Why doesn't lazy evaluation work in this R function? [duplicate]

Possible Duplicate:
How to write an R function that evaluates an expression within a data-frame

I want to write a function that sorts a data.frame -- instead of using the cumbersome order(). Given something like

> x=data.frame(a=c(5,6,7),b=c(3,5,1))
> x
  a b
1 5 3
2 6 5
3 7 1

I want to say something like:

sort.df(x,b)

So here's my function:

sort.df <- function(df, ...) {
  with(df, df[order(...),])
}

I was really proud of this. Given R's lazy evaluation, I figured that the ... parameter would only be evaluated when needed -- and by that time it would be in scope, due to 'with'.

If I run the 'with' line directly, it works. But the function doesn't.

> with(x,x[order(b),])
  a b
3 7 1
1 5 3
2 6 5
> sort.df(x,b)
Error in order(...) : object 'b' not found

What's wrong, and how do I fix it? I see this sort of "magic" frequently in packages like plyr, for example. What's the trick?

Enemy answered 11/10, 2012 at 17:59 Comment(3)
sort.df(x, x$b) works, but still I have no idea why sort.df(x,b) does not work. -- Pilsudski
See also plyr::arrange which does exactly this. -- Mosstrooper
Thanks! I didn't know about arrange despite using plyr every day. Yet another example that it's hard to find the right solutions in the R world -- and so much of good R programming is learning best practices using a few good packages. -- Enemy

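It's because b is never passed as an object. The arguments in ... are promises, and R evaluates them in the environment sort.df was called from, not inside with(), so b is looked up in the caller's environment rather than among the columns of df. Put a browser() call inside your function and you'll see what I mean. The fix is to capture the unevaluated expression with substitute() and evaluate it with the data frame as the environment. I stole this from some Internet robot somewhere: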

x=data.frame(a=c(5,6,7),b=c(3,5,1))

sort.df <- function(df, ..., drop = TRUE){
    ord <- eval(substitute(order(...)), envir = df, enclos = parent.frame())
    return(df[ord, , drop = drop])
}

sort.df(x, b)

will work.

If you're looking for a ready-made way to do this in practice, this will also work:

library(taRifx)
sort(x, f=~b)
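
To see why the original version fails, it can help to watch where b actually gets looked up. The following is a small sketch, not part of the original answer; the helper names naive and capture are made up for illustration. With a global b defined, the naive function silently sorts by that global vector instead of the column, and substitute() shows the unevaluated expressions that the working version relies on.

```r
x <- data.frame(a = c(5, 6, 7), b = c(3, 5, 1))

# The question's approach, under a hypothetical name.
naive <- function(df, ...) {
  with(df, df[order(...), ])
}

# A b defined in the calling environment, unrelated to x$b.
b <- c(1, 2, 3)

# The promise for b is evaluated where naive() was called, so this outer b
# is used and the rows stay in their original order.
unsorted <- naive(x, b)

# substitute() returns the unevaluated call instead of forcing the promises.
capture <- function(...) substitute(list(...))
captured <- capture(b, -a)   # the call list(b, -a); a is never looked up
```

This is also why sort.df(x, x$b) works: x$b is a complete expression that can be evaluated in the caller's environment.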
Wellfixed answered 11/10, 2012 at 18:17 Comment(2)
+1 for the nice solution and, especially, for suggesting playing around with a browser() call inside the function. IMHO, that's far and away the best way to learn about ... and all the oddness that surrounds it. -- Brickle
Someone could correct me on this, but enclos = parent.frame() is default in eval so simply eval(substitute(order(...)), envir = df) also works :) -- Rhys
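
A caveat on dropping enclos, offered as a sketch rather than a definitive statement: the default enclos = parent.frame() is evaluated inside eval(), so it points at sort.df's own frame, whereas an explicit enclos = parent.frame() written in sort.df points at sort.df's caller. The two agree when the function is called from the top level, but they can differ when it is called from another function that has local variables. The names sort_explicit, sort_default, and caller below are hypothetical.

```r
# Explicit enclos: falls back to the frame that called the sorter.
sort_explicit <- function(df, ...) {
  ord <- eval(substitute(order(...)), envir = df, enclos = parent.frame())
  df[ord, , drop = FALSE]
}

# Default enclos: falls back to the sorter's own frame, then to the
# environment where the sorter was defined.
sort_default <- function(df, ...) {
  ord <- eval(substitute(order(...)), envir = df)
  df[ord, , drop = FALSE]
}

# A calling function whose local variable k appears in the sort expression.
caller <- function(sorter) {
  k <- -1
  x <- data.frame(a = c(5, 6, 7), b = c(3, 5, 1))
  sorter(x, b * k)
}

sorted <- caller(sort_explicit)   # sorts by -b, i.e. descending b

# With the default enclos, k is not visible and the call fails.
msg <- tryCatch(caller(sort_default), error = function(e) conditionMessage(e))
```

For interactive use at the top level both forms behave the same, so the shorter call is fine there; the explicit form is the safer choice if the function may be called from inside other functions.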

This will do what you want:

sort.df <- function(df, ...) {
  dots <- as.list(substitute(list(...)))[-1]
  ord <- with(df, do.call(order, dots))
  df[ord,]
}

## Try it out
x <- data.frame(a=1:10, b=rep(1:2, length=10), c=rep(1:3, length=10))
sort.df(x, b, c)

And so will this:

sort.df2 <- function(df, ...) {
    cl <- substitute(list(...))
    cl[[1]] <- as.symbol("order")
    df[eval(cl, envir=df),]
}
sort.df2(x, b, c)
Erlindaerline answered 11/10, 2012 at 18:17 Comment(2)
Or sort.df <- function(df, ...) df[order(eval(substitute(...), df)),] -- Filmdom
@JoshuaUlrich -- Not quite the same. Yours will only end up sorting by the first element of ..., since substitute(...) only captures that. (Put a browser() call in sort.df(), and then compare substitute(...) and substitute(list(...)) to see what I mean.) -- Brickle