Which is 'right' versus 'left' rolling mean in R?

If I want to calculate the previous n mean years with a lag of the current year, how would I accomplish this? Is it as simple as a "right" rolling mean window? Or is it a "left" rolling mean window? I'm not sure which window to use here.

Sample Data

set.seed(1234)
dat <- data.frame(year = c(1990:2010), 
                  x = rnorm(21))
dat$x_lag1 <- lag(dat$x, 1)
Low answered 1/5, 2018 at 5:55 Comment(2)
This is surely a duplicate, please close-as-duplicate. - Malenamalet
No obvious dupe that I see... - Antons

It may be easier to think in terms of offsets. If you want a window of 3 then

  • align = "right" corresponds to using a window based on offsets of -2, -1, 0, i.e. the point before the prior point, the prior point and the current point. The current point is the rightmost end of the window. Note that rollapplyr, with an r on the end, is the same as specifying align = "right".
  • align = "center" corresponds to using a window based on offsets of -1, 0, 1, i.e. the prior point, the current point and the next point. The current point is the center of the window. This is the default for align= .
  • align = "left" corresponds to using a window based on offsets of 0, 1, 2, i.e. the current point, the next point and the point after that. The current point is the leftmost point of the window.

rollapply allows one to use either the align= specification or offset notation. To use the latter, pass as width a list containing a single vector that defines the offsets. (Strictly, width is either a vector of widths, one for each element of the input, or a list of offset vectors; in both cases the values are recycled, so a single scalar width and a list containing a single offset vector are the usual special cases.)
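As a small illustration of the offsets described above, the sketch below applies each alignment to a made-up vector and compares it with the corresponding offset list. The vector x_toy and its values are assumptions chosen only so the means are easy to check by hand; they are not part of the question's data.

```r
library(zoo)

# Hypothetical toy data (not from the question)
x_toy <- c(10, 20, 30, 40, 50)

# Width 3 with each alignment; fill = NA pads positions whose window is incomplete
right  <- rollapply(x_toy, 3, mean, align = "right",  fill = NA)  # NA NA 20 30 40
center <- rollapply(x_toy, 3, mean, align = "center", fill = NA)  # NA 20 30 40 NA
left   <- rollapply(x_toy, 3, mean, align = "left",   fill = NA)  # 20 30 40 NA NA

# The same windows written as offsets relative to the current point
right_off  <- rollapply(x_toy, list(-2:0), mean, fill = NA)
center_off <- rollapply(x_toy, list(-1:1), mean, fill = NA)
left_off   <- rollapply(x_toy, list(0:2),  mean, fill = NA)

# all.equal compares two objects at a time
all.equal(right, right_off)
all.equal(center, center_off)
all.equal(left, left_off)
```

Reading the NA positions is a quick way to tell the alignments apart: a right-aligned window has no result until enough earlier points exist, while a left-aligned window runs out of later points at the end.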

window ending in current point

Below we use align= to take the mean of a window of 3 ending in the current point and also use offsets as an alternative. We show both data frames and zoo objects.

We have omitted fill=NA for the zoo objects since they automatically align anyways so it is typically unnecessary to use it.

library(zoo)

r1 <- transform(dat, roll = rollapplyr(x, 3, mean, fill = NA))

r2 <- transform(dat, roll = rollapply(x, list(seq(-2, 0)), mean, fill = NA))

all.equal(r1, r2)
## [1] TRUE

z <- read.zoo(dat, FUN = identity)
r3 <- rollapplyr(z, 3, mean)

r4 <- rollmeanr(z, 3)

r5 <- rollapply(z, list(seq(-2, 0)), mean) # z from above

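# all.equal compares only two objects at a time, so check each pair
isTRUE(all.equal(r3, r4)) && isTRUE(all.equal(r3, r5))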
## [1] TRUE
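To illustrate why fill = NA is usually unnecessary for zoo objects, here is a minimal sketch on a made-up yearly series. The name z_toy and its values are assumptions for illustration only. The rolling result is shorter than the input but keeps the index of each window's last point, so merge can line it up with the original series.

```r
library(zoo)

# Hypothetical toy series indexed by year (not the question's data)
z_toy <- zoo(c(10, 20, 30, 40, 50), 2001:2005)

# No fill: three results, indexed by the last year of each window
roll <- rollmeanr(z_toy, 3)
index(roll)     # 2003 2004 2005
coredata(roll)  # 20 30 40

# merge() aligns the two series by index and pads the missing years with NA
merged <- merge(z_toy, roll)
merged
```

With a plain vector or data frame column there is no index to align on, which is why the data frame examples pass fill = NA to keep the result the same length as the input.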

window ending in prior point

If you want the 3 prior points, i.e. offsets -3, -2, -1 (not the current point but the 3 points before it), then the following would work. Note that stats::lag, used for r8 and r9, requires a time series object and should not be used with plain vectors.

# r6 is data frame
r6 <- transform(dat, roll = rollapply(x, list(-seq(3)), mean, fill = NA))

# r7, r8, r9 are zoo objects

r7 <- rollapply(z, list(-seq(3)), mean) # z from above

r8 <- stats::lag(rollapplyr(z, 3, mean), -1)

r9 <- stats::lag(rollmeanr(z, 3), -1)

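# all.equal compares only two objects at a time, so check each pair
isTRUE(all.equal(r7, r8)) && isTRUE(all.equal(r7, r9))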
## [1] TRUE
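As a sanity check on the prior-point window, the sketch below uses a made-up plain vector (x_toy is an assumption, not the question's data) and compares the offset version with a hand-written loop and with a right-aligned mean shifted down by one position. The shift is done with base R indexing rather than stats::lag, since the input is a plain vector.

```r
library(zoo)

# Hypothetical toy data (not from the question)
x_toy <- c(10, 20, 30, 40, 50, 60)

# Offsets -1, -2, -3: the three points before the current one
prior3 <- rollapply(x_toy, list(-seq(3)), mean, fill = NA)  # NA NA NA 20 30 40

# The same quantity computed by hand for each position
manual <- sapply(seq_along(x_toy), function(i) {
  if (i > 3) mean(x_toy[(i - 3):(i - 1)]) else NA_real_
})

# A right-aligned mean moved down one position, dropping the last value
shifted <- c(NA, head(rollapplyr(x_toy, 3, mean, fill = NA), -1))

all.equal(prior3, manual)
all.equal(prior3, shifted)
```

The first non-NA value appears at position 4, one later than for a window ending in the current point, because the current point itself is excluded.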
Calli answered 1/5, 2018 at 16:13 Comment(1)
Great answer! Thank you. - Low

In short, align = "right" is the answer. The align argument specifies whether the index of the result is left-, center- or right-aligned relative to the rolling window of observations. With width = 3 and align = "right", the two preceding observations are passed to FUN along with the current observation, and the result is stored at the index of the current observation.

To get the mean of the previous n observations excluding the current one, take the lag of the right-aligned rolling mean. The code below uses zoo::rollapply to calculate the mean for the previous 5 years.

set.seed(1)
dat <- data.frame(year = c(1990:2010), 
                  x = rnorm(21))

library(dplyr)
library(zoo)
#Mean for previous 5 years can be calculated as:

# dplyr::lag shifts a plain vector by one position; stats::lag would not
dat$meanx <- dplyr::lag(rollapply(dat$x, 5, mean, align = "right", fill = NA))

#Test result
dat[1:10,]
# year          x      meanx
# 1  1990 -0.6264538         NA
# 2  1991  0.1836433         NA
# 3  1992 -0.8356286         NA
# 4  1993  1.5952808         NA
# 5  1994  0.3295078         NA
# 6  1995 -0.8204684 0.12926990
# 7  1996  0.4874291 0.09046698
# 8  1997  0.7383247 0.15122413
# 9  1998  0.5757814 0.46601479
# 10 1999 -0.3053884 0.26211490
# so on
Gaudette answered 1/5, 2018 at 6:12 Comment(2)
Great answer. Thank you! - Low
Rather than using lag, use rollapply(x, list(-seq(5)), mean, fill = NA), i.e. specify offsets -1, -2, -3, -4, -5. - Calli
