How to detect free variable names in R functions [duplicate]
Suppose I have a function:

f <- function() {
  x + 1
}

Here x is a free variable since its value is not defined within function f. Is there a way that I can obtain the variable name, say x, from a defined function, say f?

I am asking because I maintain other people's old R code. It uses a lot of free variables, which makes debugging hard.

Any other suggestions are welcome as well.

Topazolite answered 18/8, 2014 at 23:12 Comment(3)
?body. Try: body(f) - Lidia
@BondedDust body(f) lists the definition, but does not give the free variable names. When f is very long, body(f) is very hard to use. - Topazolite
@Dason, I could not find a duplicate question on Stack Overflow. If you find one later, please mention it. - Topazolite

The codetools package has functions for this purpose, e.g. findGlobals:

library(codetools)
findGlobals(f, merge=FALSE)[['variables']]
# [1] "x"

If we redefine the function so that x is a named argument, no variables are returned.

f2 <- function(x){
  x+1
}
findGlobals(f2, merge=FALSE)[['variables']]
# character(0)
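
As a small illustration of what merge=FALSE returns, the sketch below inspects both components of the result. The function h and the names base_offset and scale_it are hypothetical, made up for this example; they are never defined, and they do not need to be, because findGlobals only inspects the code.

```r
library(codetools)

# Hypothetical function: base_offset and scale_it are not defined anywhere
h <- function(n) {
  y <- base_offset + n   # y is local, n is an argument
  scale_it(y)
}

globals <- findGlobals(h, merge = FALSE)

# Names used as variables that are neither arguments nor assigned locally
globals$variables

# Names used in call position, including operators such as `+` and `<-`
globals$functions
```

With merge=FALSE the result is a list with a variables component and a functions component, so undefined helper functions can be screened separately from undefined data.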
Gottlieb answered 19/8, 2014 at 2:13 Comment(2)
Thanks @Gottlieb for pointing me to this helpful function. - Topazolite
And thanks for providing the search strategy to find the duplicate, which was also closed as a duplicate. - Lidia

This is a rough stab at it.

find_vars <- function(f, vars=list(found=character(), defined=names(formals(f)))) {
    if( is.function(f) ) {
        # function, begin search on body
        return(find_vars(body(f), vars))
    } else if (is.call(f) && deparse(f[[1]]) == "<-") {
        # assignment with <- operator
        if (is.recursive(f[[2]])) {
           if (is.call(f[[2]]) && deparse(f[[2]][[1]]) == "$") {
               # a$b <- value: record the object a (f[[2]][[2]]), not the `$` operator itself
               vars$defined <- unique( c(vars$defined, deparse(f[[2]][[2]])) )
           } else {
               warning(paste("unable to determine assignments variable in", deparse(f[[2]])))
           }
        } else {
            vars$defined <- unique( c(vars$defined, deparse(f[[2]])) )  
        }
        vars <- find_vars(f[[3]], vars)
    } else if (is.call(f) && deparse(f[[1]]) == "$") {
        # assume "b" is ok in a$b
        vars <- find_vars(f[[2]], vars)
    } else if (is.call(f) && deparse(f[[1]]) == "~") {
        #skip formulas
    } else if (is.recursive(f)) {
        # compound object, iterate through sub-parts
        v <- lapply(as.list(f)[-1], find_vars, vars)
        vars$defined <- unique( c(vars$defined, unlist(sapply(v, `[[`, "defined"))) )
        vars$found <- unique( c(vars$found, unlist(sapply(v, `[[`, "found"))) )
    } else if (is(f, "name")) {
        # standard variable name/symbol
        vars$found <- unique( c(vars$found, deparse(f)))
    }
    vars
}

find_free <- function(f) {
    r <- find_vars(f)
    return(setdiff(r$found, r$defined))
}

Then you could use it like this:

f <- function() {
  z <- x + 1
  z
}
find_free(f)
# [1] "x"

I'm sure there are many possibilities for false positives, and I didn't do any special coding for functions with non-standard evaluation. For example:

g <- function(df) {
  with(df, mpg + disp)
}
g(head(mtcars))
# [1] 181 181 131 279 379 243

but

find_free(g)
# [1] "mpg"  "disp"

I already put in a special branch for the $ operator and formulas; you could add similar branches for functions that use non-standard evaluation, such as with() or subset(). It depends on what your code ends up looking like.
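
A cheaper alternative to writing a special branch is to filter the reported names against column names you already know about. The sketch below is only an assumption about how that could look: it uses codetools::findGlobals rather than find_vars so that it is self-contained, and it assumes mtcars is representative of the data frames passed to g.

```r
library(codetools)

g <- function(df) {
  with(df, mpg + disp)
}

# The analysis cannot know that with() evaluates its second argument inside df,
# so the column names are reported as if they were free variables
reported <- findGlobals(g, merge = FALSE)$variables

# Assumption: mtcars stands in for the data frames that g() receives
known_columns <- names(mtcars)

# Whatever is left after removing known columns deserves a closer look
remaining <- setdiff(reported, known_columns)
remaining
```

This only hides names you have explicitly whitelisted, so a genuinely free variable that happens to share a column name would be hidden too.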

This assumes all assignment happens via a standard <-. There are other ways to assign variables (e.g., assign()) that would go undetected. We also ignore the function name in every call, so if you call myfun(1), it will not report myfun as a free variable even though it may be a "free function" defined elsewhere in the code.

So this may not be perfect, but it should act as a decent screen for potential problems.
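
To use this kind of check as a screen over a whole code base, one option is to loop over every function in an environment and collect the reported names per function. The following is a minimal sketch under stated assumptions: the environment legacy, its two functions, and the helper free_vars are all hypothetical, and it relies on codetools::findGlobals instead of find_free so that the block runs on its own.

```r
library(codetools)

# Hypothetical stand-in for a pile of legacy functions
legacy <- new.env()
legacy$f <- function() x + 1
legacy$g <- function(a) a * rate

# Hypothetical helper: free variable names for every function in an environment
free_vars <- function(env) {
  fn_names <- Filter(function(nm) is.function(get(nm, envir = env)), ls(env))
  setNames(
    lapply(fn_names, function(nm) {
      findGlobals(get(nm, envir = env), merge = FALSE)$variables
    }),
    fn_names
  )
}

report <- free_vars(legacy)
report
```

For real code, the same helper could be pointed at globalenv() after sourcing the legacy scripts, or at a package namespace.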

Cuckoopint answered 19/8, 2014 at 2:6 Comment(4)
Thanks for this illustrative code. I learned a lot from you again. It works perfectly. - Topazolite
find_free definitely meets my needs. Just FYI, here is a corner case: f <- function() {summary(c(1))$Mean}. I expected the result to be empty, but I got [1] "summary(c(1))". Again, it does not affect my usage at all. Thank you. - Topazolite
@Topazolite I made a small adjustment to the code for your example. Like I said, it may need further adjustments to be perfect. I assume findGlobals in codetools is more complete. - Cuckoopint
Thanks @MrFlick. It works. I am using both now. - Topazolite
