I have a data frame (df) with about 40 columns, and I want to aggregate 4 of them using a sum. Apart from the 4 columns I want to sum, each unique value in column 1 corresponds to identical values across the rest of the columns, and I want to keep all the columns in the aggregated data frame. Is there a way to specify the grouping columns in the by = list() argument without typing them all out explicitly? For example, suppose I want to sum the column "field" grouped by columns 1-36. I've tried
aggregate(df$field, by = list(df[,1:36]), FUN = sum)
but it throws an error since that isn't a list of names. I've also tried
aggregate(df$field, by = list(names(df)[1:36]), FUN = sum)
This doesn't give an error, but it returns an aggregation with my df column names as the unique observations.
Or am I missing an easy way to say "aggregate these four columns using the rest of the data frame?"
Thanks
Here's an example data frame:
  A B C D Sum
1 A B C D   1
2 A B C D   2
3 A B C D   3
4 E F 1 R   4
5 E F 1 R   5
After I aggregate I want it to look like:
  A B C D Sum
1 A B C D   6
2 E F 1 R   9
I know I can do this if I explicitly list df$A, df$B, df$C, df$D in the "by" argument of the aggregate call, but in my actual data frame this would require explicitly typing about 40 field names.
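For reference, here is a minimal sketch of one way this might be done without typing the grouping columns, using the example data frame above. It relies on the fact that a data frame is itself a list of columns, so a subset of df can be passed to by directly instead of being wrapped in list(). The names sum_cols, group_cols, result and result_formula are placeholders made up for this illustration; in the real data, sum_cols would hold the four column names to sum.

```r
# Rebuild the example data frame from the question.
# Column C is character because it mixes "C" and "1".
df <- data.frame(
  A = c("A", "A", "A", "E", "E"),
  B = c("B", "B", "B", "F", "F"),
  C = c("C", "C", "C", "1", "1"),
  D = c("D", "D", "D", "R", "R"),
  Sum = 1:5,
  stringsAsFactors = FALSE
)

# Hypothetical helper names: the columns to sum, and everything else.
sum_cols   <- "Sum"
group_cols <- setdiff(names(df), sum_cols)

# A data frame is already a list of columns, so it can be given to `by`
# as-is; wrapping it in list() would create a one-element list instead.
result <- aggregate(df[sum_cols], by = df[group_cols], FUN = sum)

# Formula interface: "." stands for every column not on the left-hand side.
# With several columns to sum, the left side would be cbind(col1, col2, ...).
result_formula <- aggregate(Sum ~ ., data = df, FUN = sum)

print(result)
```

Both calls should return one row per unique combination of the grouping columns. One caveat: aggregate drops rows that have NA in any grouping column, so with 36 grouping columns it is worth checking for missing values first.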