I have a df like
ProjectID Dist
1 x
1 y
2 z
2 x
2 h
3 k
.... ....
I want to add a third column such that we have an incrementing counter for each ProjectID:
ProjectID Dist counter
1 x 1
1 y 2
2 z 1
2 x 2
2 h 3
3 k 1
.... ....
I've had a look at `seq`, `rank` and a couple of other bits, particularly looking to see if I could use `ddply` to help:
df$counter <- ddply(df, .(ProjectID), function(x).....? )
I think I could adapt the answer to "How to create a counter/numeration by group?", but I would prefer something like `ddply`. (I can't find an equivalent of `cumsum`, but I think it's the same principle as in "Create ascending series of integers by group in Pandas".) That would let me index occurrences in a list (and, e.g., merge on this).
`ave`, i.e. `df$counter <- with(df, ave(seq_along(ProjectID), ProjectID, FUN=seq_along))`, or a compact wrapper would be `library(splitstackshape); getanID(df, 'ProjectID')[]`, or using `plyr`: `ddply(df, .(ProjectID), mutate, counter=seq_along(Dist))` – Astatine

Grouping by `ProjectID` and creating a new column as the sequence of `Dist` within each group. You will find it easy after you read the help pages and try some examples – Astatine

It's `ave` that I (think) I'm finding confusing. I get the `ddply` example (which also works perfectly, thanks again), but I'm struggling to get my head around the use of `ave` alongside `seq_along` – Fantoccini

In `ave`, the second argument is the grouping variable, i.e. `ave(x, ..., FUN = mean)`. If you look at the description: `...: Grouping variables, typically factors, all of the same ‘length’ as ‘x’.` You can also use `ave(ProjectID, ProjectID, FUN=seq_along)`, but when you have `character`/`factor` columns, this will either result in an error or give character elements as output. – Astatine