I have a data frame df with rows that are duplicated in the name column but not in the value column:
name value etc1 etc2
A 9 1 X
A 10 1 X
A 11 1 X
B 2 1 Y
C 40 1 Y
C 50 1 Y
I need to collapse the rows with duplicate names into one row each, taking the mean of the value column. The expected output is:
name value etc1 etc2
A 10 1 X
B 2 1 Y
C 45 1 Y
I have tried df[duplicated(df$name),], but of course this does not give me the mean over the duplicates. I would like to use aggregate(), but the problem is that its FUN argument is applied to all the other columns as well and, among other problems, it cannot compute a mean of character content. Since all the other columns have the same content across the duplicates, I need them carried through as is, just like the name column. Any hints?
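For reference, one possible sketch of what I am after, using the formula interface of aggregate(). It relies on the assumption stated above, namely that etc1 and etc2 are constant within each name, so they can be treated as additional grouping columns rather than passed through FUN. The data frame is rebuilt from the example table; the variable name result is only illustrative.

```r
# Example data reconstructed from the table in the question.
df <- data.frame(
  name  = c("A", "A", "A", "B", "C", "C"),
  value = c(9, 10, 11, 2, 40, 50),
  etc1  = 1,
  etc2  = c("X", "X", "X", "Y", "Y", "Y"),
  stringsAsFactors = FALSE
)

# Assumption: etc1 and etc2 are constant within each name, so grouping by
# all three columns still yields one row per name. FUN is applied to
# value only, because value is the sole left-hand-side variable.
result <- aggregate(value ~ name + etc1 + etc2, data = df, FUN = mean)

# aggregate() puts the grouping columns first; restore the original order.
result <- result[, names(df)]
```

If the assumption does not hold and an etcX column varies within a name, this would return more than one row for that name instead of collapsing it.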
Are the etcX columns also guaranteed to be the same for rows with the same name? – Raspy