This is probably a silly question, but I have read through Crawley's chapter on dataframes and scoured the internet and haven't yet been able to make anything work.
Here is a sample dataset similar to mine:
> data<-data.frame(site=c("A","A","A","A","B","B"), plant=c("buttercup","buttercup",
"buttercup","rose","buttercup","rose"), treatment=c(1,1,2,1,1,1),
plant_numb=c(1,1,2,1,1,2), fruits=c(1,2,1,4,3,2),seeds=c(45,67,32,43,13,25))
> data
  site     plant treatment plant_numb fruits seeds
1    A buttercup         1          1      1    45
2    A buttercup         1          1      2    67
3    A buttercup         2          2      1    32
4    A      rose         1          1      4    43
5    B buttercup         1          1      3    13
6    B      rose         1          2      2    25
What I would like to do is sum "seeds" and "fruits" within each unique combination of site, plant, treatment, and plant_numb. Ideally, this would reduce the number of rows but preserve the original columns, i.e. I need the above example to look like this:
  site     plant treatment plant_numb fruits seeds
1    A buttercup         1          1      3   112
2    A buttercup         2          2      1    32
3    A      rose         1          1      4    43
4    B buttercup         1          1      3    13
5    B      rose         1          2      2    25
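For reference, the grouped sum described above can be sketched with base R's aggregate(). This is only a minimal sketch on the sample data, not something tested on the full ~5000-row dataset; the name `summed` is made up for illustration.

```r
# Rebuild the sample data from the question.
data <- data.frame(
  site       = c("A", "A", "A", "A", "B", "B"),
  plant      = c("buttercup", "buttercup", "buttercup", "rose", "buttercup", "rose"),
  treatment  = c(1, 1, 2, 1, 1, 1),
  plant_numb = c(1, 1, 2, 1, 1, 2),
  fruits     = c(1, 2, 1, 4, 3, 2),
  seeds      = c(45, 67, 32, 43, 13, 25)
)

# Sum fruits and seeds within each site/plant/treatment/plant_numb combination.
# Only combinations that actually occur in the data produce a row.
summed <- aggregate(cbind(fruits, seeds) ~ site + plant + treatment + plant_numb,
                    data = data, FUN = sum)

# Sort explicitly so the rows come out in the same order as the desired output,
# then reset the row names.
summed <- summed[order(summed$site, summed$plant,
                       summed$treatment, summed$plant_numb), ]
rownames(summed) <- NULL
summed
```

The grouping columns go on the right-hand side of the formula and the columns to be summed go inside cbind() on the left, so the result keeps all six original columns.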
This example is pretty basic (my dataset is ~5000 rows). Here only two rows need to be summed, but in the real data the number of rows per combination varies, ranging from 1 to ~45.
I've tried rowsum() and tapply() with pretty dismal results so far (the errors are telling me that these functions are not meaningful for factors), so if you could even point me in the right direction, I would greatly appreciate it!
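A "not meaningful for factors" message usually means a factor column ended up among the values being summed. Here that is probably site or plant, since versions of R before 4.0.0 convert strings to factors in data.frame() by default. The sketch below is an assumption about what went wrong rather than a diagnosis: it passes only the numeric columns to rowsum() and builds the grouping key separately with interaction(). The names `key` and `totals` are made up for illustration.

```r
# Rebuild the sample data from the question.
data <- data.frame(
  site       = c("A", "A", "A", "A", "B", "B"),
  plant      = c("buttercup", "buttercup", "buttercup", "rose", "buttercup", "rose"),
  treatment  = c(1, 1, 2, 1, 1, 1),
  plant_numb = c(1, 1, 2, 1, 1, 2),
  fruits     = c(1, 2, 1, 4, 3, 2),
  seeds      = c(45, 67, 32, 43, 13, 25)
)

# One grouping key per row; drop = TRUE discards combinations that never occur.
key <- interaction(data$site, data$plant, data$treatment, data$plant_numb,
                   drop = TRUE)

# Sum only the numeric columns, one output row per key.
totals <- rowsum(data[, c("fruits", "seeds")], group = key)
totals
```

The result has the sums but carries the grouping information only as row names (for example "A.buttercup.1.1"), so it does not preserve the original site, plant, treatment, and plant_numb columns.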
Thanks so much!
plyr and data.table tag. Lots of questions basically address this. Good luck! – Trunk