Goal
My goal is to define some functions for use within dplyr verbs that rely on pre-defined variables. I have several such functions that take a bunch of arguments, many of which are always the same variable names.
My understanding: this is difficult (and perhaps impossible) because dplyr lazily evaluates user-specified variables later on, but default arguments are not part of the function call and are therefore invisible to dplyr.
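To illustrate what I think is going on, here is a minimal base R sketch. It assumes that with() is a reasonable stand-in for the way dplyr evaluates expressions against the columns of a data frame, which is only loosely true. The names f and df are hypothetical and exist only for this illustration.

```r
# Hypothetical helper: the default for order_by refers to a bare name `gear`.
f <- function(x, order_by = gear) order_by

# Hypothetical data with a `gear` column.
df <- data.frame(gear = c(4, 3, 5))

# Explicit argument: `gear` is part of the call, so it is looked up in the data.
explicit <- with(df, f(1, order_by = gear))

# Default argument: `gear` is evaluated in f's own environment, whose parent is
# the environment where f was defined, so the column is never seen.
default <- tryCatch(
  with(df, f(1)),
  error = function(e) conditionMessage(e)
)
```

The explicit call finds the column, while the call relying on the default fails with an "object not found" style error, provided no object named gear exists where f was defined.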
Toy example
Consider the following example, where I use dplyr to calculate whether a variable has changed or not (rather meaningless in this case):
library(dplyr)
mtcars %>%
mutate(cyl_change = cyl != lag(cyl))
Now, lag also supports alternate ordering like so:
mtcars %>%
mutate(cyl_change = cyl != lag(cyl, order_by = gear))
But what if I'd like to create my own version of lag that always orders by gear?
Failed attempts
The naive approach is this:
lag2 <- function(x, n = 1L, order_by = gear) lag(x, n = n, order_by = order_by)
mtcars %>%
mutate(cyl_change = cyl != lag2(cyl))
But this obviously raises the error:
no object named ‘gear’ was found
More realistic options would be these, but they also don't work:
lag2 <- function(x, n = 1L) lag(x, n = n, order_by = ~gear)
lag2 <- function(x, n = 1L) lag(x, n = n, order_by = get(gear))
lag2 <- function(x, n = 1L) lag(x, n = n, order_by = getAnywhere(gear))
lag2 <- function(x, n = 1L) lag(x, n = n, order_by = lazyeval::lazy(gear))
Question
Is there a way to get lag2 to correctly find gear within the data.frame that dplyr is operating on?
- One should be able to call lag2 without having to provide gear.
- One should be able to use lag2 on datasets that are not called mtcars (but do have gear as one of their variables).
- Preferably gear would be a default argument to the function, so it can still be changed if required, but this is not crucial.
gear is another vector right? You're not passing it to the local environment of lag2. Try lag2 <- function(x, gear) {...} (note, no need for param n as written). – Interclavicle
gear is a variable in mtcars. Yeah I goofed the n argument. – Semipalmate
lag2 requires a parameter vector gear. But you're not passing gear to the function... rewrite your function so that gear is passed to it. – Interclavicle
[...] lag, where order_by is set to gear by default, without me needing to specify that. lag2 <- function(x, gear) lag(x, order_by = gear) needs specification of the gear argument. lag2 <- function(x, gear = gear) lag(x, order_by = gear) is illegal (can't do x = x in the arguments). lag2 <- function(x, y = gear) lag(x, order_by = y) will just give the same error as my first attempt in the question. – Semipalmate
[...] gear within the context of summarize where other bare variable names are correctly evaluated. Likely by fetching it from the correct environment. – Semipalmate
[...] data.table, but neither of them would work with dplyr – Biosphere
[...] (multi)dplyr for this specific project. – Semipalmate
[...] dplyr verbs as way of dealing with a specific class of object. Since those verbs are the interface, I'd like to keep that consistent. I can run data.table code in the mutate_ method, but capturing of the arguments is done by the S3 generic and I suppose this will pose some restrictions on possible solutions. I'm still interested in seeing the data.table solutions. – Semipalmate