use rollapply and zoo to calculate rolling average of a column of variables

I want to calculate the rolling mean of "wins" separately for each value in column "sp". This is a sample of my data:

the_date    sp  wins
01-06--2012 1   305
02-06--2012 1   276
03-06--2012 1   184
04-06--2012 1   248
05-06--2012 1   243
06-06--2012 1   363
07-06--2012 1   272
01-06--2012 2   432
02-06--2012 2   369
03-06--2012 2   302
04-06--2012 2   347
05-06--2012 2   357
06-06--2012 2   331
07-06--2012 2   380
01-06--2012 3   1
02-06--2012 3   2
03-06--2012 3   3
04-06--2012 3   2
05-06--2012 3   0
06-06--2012 3   2
07-06--2012 3   0

What I want is a column added to the data that gives the moving average of wins over 3 days for each sp. This is the output I am after:

the_date    sp  wins    SMA_wins
01-06--2012 1   305     305.00
02-06--2012 1   276     290.50
03-06--2012 1   184     255.00
04-06--2012 1   248     236.00
05-06--2012 1   243     225.00
06-06--2012 1   363     284.67
07-06--2012 1   272     292.67
01-06--2012 2   432     432.00
02-06--2012 2   369     400.50
03-06--2012 2   302     367.67
04-06--2012 2   347     339.33
05-06--2012 2   357     335.33
06-06--2012 2   331     345.00
07-06--2012 2   380     356.00
01-06--2012 3   1       1.00
02-06--2012 3   2       1.50
03-06--2012 3   3       2.00
04-06--2012 3   2       2.33
05-06--2012 3   0       1.67
06-06--2012 3   2       1.33
07-06--2012 3   0       0.67

I am using rollapply.

df <- group_by(df, sp)
df_zoo <- zoo(df$wins, df$the_date) 
mutate(df, SMA_wins=rollapplyr(df_zoo, 3, mean,  align="right", partial=TRUE))

If I filter my data on a specific sp, it works perfectly.

How can I make this work when I group by sp?

Thanks

Fernanda answered 18/11, 2015 at 0:34

You can do it like this:

library(dplyr)
library(zoo)

df %>% group_by(sp) %>%
       mutate(SMA_wins=rollapplyr(wins, 3, mean, partial=TRUE))

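It looks like your use of df and df_zoo in your mutate call was messing things up. df_zoo is built from the whole wins column, so it is not split by sp inside mutate; referring to the bare column wins instead lets dplyr evaluate the rolling mean within each group.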
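
For anyone who wants to check this against the desired output in the question, here is a minimal, self-contained sketch. The data frame is rebuilt by hand from the sample in the question; the dates are written with a single hyphen and kept as plain strings, which is an assumption made for illustration and not something the original data guarantees.

```r
library(dplyr)
library(zoo)

# Sample data rebuilt from the question (date strings are an assumption:
# single hyphen, day-month-year, one row per day per sp)
df <- data.frame(
  the_date = rep(sprintf("%02d-06-2012", 1:7), times = 3),
  sp       = rep(1:3, each = 7),
  wins     = c(305, 276, 184, 248, 243, 363, 272,
               432, 369, 302, 347, 357, 331, 380,
               1, 2, 3, 2, 0, 2, 0)
)

result <- df %>%
  group_by(sp) %>%
  # partial = TRUE averages over the 1 or 2 values available
  # at the start of each group instead of returning NA
  mutate(SMA_wins = rollapplyr(wins, 3, mean, partial = TRUE)) %>%
  ungroup()

print(as.data.frame(result))
```

Because the window restarts in every group, the first row of each sp is just its own wins value, which matches the SMA_wins column asked for in the question.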

Appurtenant answered 18/11, 2015 at 0:38 Comment(5)
Thank you @jeremycg. It gives the correct result. However, it ignores the "the_date" column and simply takes the rows in sequence. It works if my input data is sorted by date, but what if it is not? - Fernanda
Add in an arrange(the_date) - Appurtenant
@jeremycg, do you know how to apply this to my question here? My situation may be different since the dates are spaced irregularly: #50024398 - Brickle
@Brickle By the time I saw this you already had an answer in the comments, so I voted to reopen it. - Appurtenant
@Appurtenant Is it possible to use rollapply to apply a function to a group? I have a question here; can you have a look at it? - Atalanta
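
To illustrate the arrange(the_date) suggestion from the comments, here is a small sketch on made-up, shuffled rows (the values are borrowed from the question, but the row order and the single-hyphen date strings are assumptions). The dates are parsed with as.Date first, on the assumption that they are day-month-year, because sorting the raw strings would be alphabetical and could go wrong across months.

```r
library(dplyr)
library(zoo)

# Hypothetical input: two sp groups with rows deliberately out of date order
df <- data.frame(
  the_date = c("03-06-2012", "01-06-2012", "02-06-2012",
               "02-06-2012", "01-06-2012", "03-06-2012"),
  sp       = c(1, 2, 1, 2, 1, 2),
  wins     = c(184, 432, 276, 369, 305, 302)
)

result <- df %>%
  # Parse as day-month-year so the sort is chronological, not alphabetical
  mutate(the_date = as.Date(the_date, format = "%d-%m-%Y")) %>%
  # Sort before the grouped mutate so each window sees rows in date order
  arrange(sp, the_date) %>%
  group_by(sp) %>%
  mutate(SMA_wins = rollapplyr(wins, 3, mean, partial = TRUE)) %>%
  ungroup()

print(as.data.frame(result))
```

Note that the width of 3 counts rows, not calendar days, so this only gives a 3-day average when every sp has exactly one row per day. Irregularly spaced dates, as raised in one of the comments, would need a different approach.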
