R force local scope

This is probably not correct terminology, but hopefully I can get my point across.

I frequently end up doing something like:

myVar = 1
f <- function(myvar) { return(myVar); }
# f(2) = 1 now

R happily uses the variable outside of the function's scope, which leaves me scratching my head, wondering how I could possibly be getting the results I am.

Is there any option which says "force me to only use variables which have previously been assigned values in this function's scope"? Perl's use strict does something like this, for example. But I don't know that R has an equivalent of my.


EDIT: Thank you, I am aware that I capitalized them differently. Indeed, the example was created specifically to illustrate this problem!

I want to know if there is a way that R can automatically warn me when I do this.

EDIT 2: Also, if Rkward or another IDE offers this functionality I'd like to know that too.

Glister answered 2/6, 2011 at 15:55 Comment(3)
Just to clarify: it seems your initial question involves forcing local scope, but your edits and the answers involve code checking (static analysis). Which are you really trying to get at? The static checking is answered, but forcing local variables does not quite seem to be answered. - Gnu
@Glister - You got lots of good answers below, but I think my answer has a couple of useful solutions, even though I'm late to the party ;-) - Warfore
Regarding EDIT 2: the RStudio IDE will "warn" you that the symbol "myVar" is not in scope, even asking if you meant "myvar". - Cormorant

As far as I know, R does not provide a "use strict" mode. So you are left with two options:

1 - Ensure all your "strict" functions don't have globalenv as environment. You could define a nice wrapper function for this, but the simplest is to call local:

# Use "local" directly to control the function environment
f <- local( function(myvar) { return(myVar); }, as.environment(2))
f(3) # Error in f(3) : object 'myVar' not found

# Create a wrapper function "strict" to do it for you...
strict <- function(f, pos=2) eval(substitute(f), as.environment(pos))
f <- strict( function(myvar) { return(myVar); } )
f(3) # Error in f(3) : object 'myVar' not found

2 - Do a code analysis that warns you of "bad" usage.

Here's a function checkStrict that hopefully does what you want. It uses the excellent codetools package.

# Checks a function for use of global variables
# Returns TRUE if ok, FALSE if globals were found.
checkStrict <- function(f, silent=FALSE) {
    vars <- codetools::findGlobals(f)
    found <- !vapply(vars, exists, logical(1), envir=as.environment(2))
    if (!silent && any(found)) {
        warning("global variables used: ", paste(names(found)[found], collapse=', '))
        return(invisible(FALSE))
    }

    !any(found)
}

And trying it out:

> myVar = 1
> f <- function(myvar) { return(myVar); }
> checkStrict(f)
Warning message:
In checkStrict(f) : global variables used: myVar
Warfore answered 18/11, 2011 at 16:59 Comment(0)

checkUsage in the codetools package is helpful, but doesn't get you all the way there. In a clean session where myVar is not defined,

f <- function(myvar) { return(myVar); }
codetools::checkUsage(f)

gives

<anonymous>: no visible binding for global variable ‘myVar’

but once you define myVar, checkUsage is happy.

See ?codetools in the codetools package: it's possible that something there is useful:

> findGlobals(f)
[1] "{"      "myVar"  "return"
> findLocals(f)
character(0)
Muskeg answered 2/6, 2011 at 18:4 Comment(2)
Thanks Ben, I guess checkUsage isn't what I want. - Glister
@BenBolker +1 - Thanks Ben, I had not looked at this package before. In my answer I managed to use findGlobals to solve the problem... - Warfore

Using get(x, inherits=FALSE) will force local scope.

myVar = 1

f2 <- function(myvar) get("myVar", inherits=FALSE)

f3 <- function(myvar) {
  myVar <- myvar
  get("myVar", inherits=FALSE)
}

output:

> f2(8)    
Error in get("myVar", inherits = FALSE) : object 'myVar' not found
> f3(8)
[1] 8
Johannisberger answered 18/1, 2013 at 20:20 Comment(0)

You need to fix the typo: myvar != myVar. Then it will all work...

Scope resolution works 'from the inside out': R starts in the current environment, then looks in the enclosing one, and so on.
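To make the lookup order concrete, here is a minimal sketch. The names outer_fn, inner_fn and no_local are made up for illustration and do not come from the question.

```r
x <- "global"

outer_fn <- function() {
  x <- "enclosing"
  inner_fn <- function() {
    # No local 'x' here, so R looks in the environment where inner_fn
    # was defined (the frame of outer_fn) before going any further out.
    x
  }
  inner_fn()
}

# Nothing local and nothing in between, so the outer 'x' is used silently.
no_local <- function() x

res_inner  <- outer_fn()
res_global <- no_local()
print(res_inner)
print(res_global)
```

The second function is the situation from the question: a name that is not local is quietly resolved further out instead of raising an error.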

Edit: Now that you have clarified your question, look at the codetools package (a recommended package that ships with R):

R> library(codetools)
R> f <- function(myVAR) { return(myvar) }
R> checkUsage(f)
<anonymous>: no visible binding for global variable 'myvar'
R> 
Newkirk answered 2/6, 2011 at 16:10 Comment(2)
Thanks, I'm aware that this was the problem. I was asking if R has an automated way to detect when this happens (i.e. when I use a variable in a function outside of its scope). - Glister
This solution doesn't work if myvar has already been defined in the global environment ... - Muskeg

There is a new package modules on CRAN which addresses this common issue (see the vignette here). With modules, the function raises an error instead of silently returning the wrong result.

# without modules
myVar <- 1
f <- function(myvar) { return(myVar) }
f(2)
[1] 1

# with modules
library(modules)
m <- module({
  f <- function(myvar) { return(myVar) }
})
m$f(2)
Error in m$f(2) : object 'myVar' not found

This is the first time I have used it. It seems to be straightforward, so I might include it in my regular workflow to prevent time-consuming mishaps.

Therein answered 27/4, 2016 at 8:13 Comment(0)

You are of course doing it wrong. Don't expect static code checking tools to find all your mistakes. Check your code with tests. And more tests. Any decent test written to run in a clean environment will spot this kind of mistake. Write tests for your functions, and use them. Look at the glory that is the testthat package on CRAN.
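As a rough sketch of what such a test buys you, using only base R (testthat would express the same check with test_that() and expect_equal(); the variable name outcome is made up for illustration):

```r
f <- function(myvar) { return(myVar) }   # the typo from the question

# A test that pins down the expected value. It is expected to FAIL for
# this f: in a clean session 'myVar' is not found, and if a global
# 'myVar' happens to exist, the returned value is wrong instead.
outcome <- tryCatch({
  stopifnot(identical(f(2), 2))
  "passed"
}, error = function(e) paste("failed:", conditionMessage(e)))

print(outcome)
```

Because the check compares against the value you actually expect, it flags the typo whether or not a stray global is lying around.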

Skiffle answered 2/6, 2011 at 18:19 Comment(2)
Or alternatively the RUnit package. - Mutz
...but don't expect your tests to find all mistakes either! Use all the tools at your disposal: static checks, unit tests and actually running the code as the user would :). Then be prepared to fix more bugs when the REAL users finally get their hands on it. - Warfore

You can dynamically change the environment tree like this:

a <- 1

f <- function(){
    b <- 1
    print(b)
    print(a)
}

environment(f) <- new.env(parent = baseenv())

f()

Inside f, b can be found, while a cannot.

But probably it will do more harm than good.

Pad answered 2/6, 2011 at 16:53 Comment(1)
Setting the parent to baseenv is a bit restrictive - you can't call stats functions like runif then. I have a slightly different (I dare not say "better" ;-) approach in my answer. - Warfore

You can test to see if the variable is defined locally:

myVar = 1
f <- function(myvar) {
  if (exists('myVar', environment(), inherits = FALSE)) return(myVar)
  else cat("myVar was not found locally\n")
}

> f(2)
myVar was not found locally

But I find it very artificial if the only thing you are trying to do is to protect yourself from spelling mistakes.

The exists function searches for the variable name in the particular environment. inherits = FALSE tells it not to look into the enclosing frames.
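A variant of the same check that signals the problem with stop() rather than cat(), so that callers can handle it with tryCatch. This is only a sketch, and the function name f_checked is made up for illustration:

```r
myVar <- 1

f_checked <- function(myvar) {
  # inherits = FALSE restricts the lookup to this call's own frame,
  # so the 'myVar' defined outside the function does not count.
  if (!exists("myVar", envir = environment(), inherits = FALSE)) {
    stop("'myVar' is not defined inside f_checked")
  }
  myVar
}

msg <- tryCatch(f_checked(2), error = function(e) conditionMessage(e))
print(msg)
```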

Nichol answered 2/6, 2011 at 16:40 Comment(1)
To communicate with the user you should use message, warning or stop. That way suppressWarnings, suppressMessages or tryCatch can handle it. - Paxton

environment(fun) = parent.env(environment(fun))

will remove the 'workspace' (the global environment) from the function's search path and leave everything else in place. This is probably closest to what you want.
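A small sketch of the effect. It assumes the functions are defined at top level (so their environment is the global one, which is why parent.env(globalenv()) is written out explicitly here) and that the stats package is attached, as it is in a default session. The names f, g and msg are illustrative.

```r
myVar <- 1

f <- function(myvar) {
  noise <- runif(1)   # 'runif' comes from the attached stats package
  myVar + noise       # typo: should be 'myvar'
}

g <- function(n) runif(n)

# Skip the global environment: lookups go from the function's own frame
# straight to whatever follows the workspace on the search path.
environment(f) <- parent.env(globalenv())
environment(g) <- parent.env(globalenv())

msg <- tryCatch(f(2), error = function(e) conditionMessage(e))
print(msg)            # the error message names 'myVar'
print(length(g(3)))   # attached packages are still reachable
```

Unlike setting the parent to baseenv(), functions from attached packages keep working; only names that live in the workspace stop resolving.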

Turboelectric answered 17/11, 2011 at 22:3 Comment(1)
This will not work if your function loads a library or does anything else that updates the search path, because the new environment is inserted between the function's env and .GlobalEnv. See https://mcmap.net/q/372429/-force-r-function-call-to-be-self-sufficient - Oleary

@Tommy gave a very good answer and I used it to create 3 functions that I think are more convenient in practice.

strict

To make a function strict, you just have to call

strict(f,x,y)

instead of

f(x,y)

example:

my_fun1 <- function(a,b,c){a+b+c}
my_fun2 <- function(a,b,c){a+B+c}
B <- 1
my_fun1(1,2,3)        # 6
strict(my_fun1,1,2,3) # 6
my_fun2(1,2,3)        # 5
strict(my_fun2,1,2,3) # Error in (function (a, b, c)  : object 'B' not found

checkStrict1

To get a diagnosis, execute checkStrict1(f) with optional Boolean parameters to show more or less.

checkStrict1("my_fun1") # nothing
checkStrict1("my_fun2") # my_fun2  : B

A more complicated case:

A <- 1 # unambiguous variable defined OUTSIDE AND INSIDE my_fun3
# B unambiguous variable defined only INSIDE my_fun3
C <- 1 # defined OUTSIDE AND INSIDE with ambiguous name (C is also a base function)
D <- 1 # defined only OUTSIDE my_fun3 (D is also a base function)
E <- 1 # unambiguous variable defined only OUTSIDE my_fun3
# G unambiguous variable defined only INSIDE my_fun3
# H is undeclared and doesn't exist at all
# I is undeclared (though I is also a base function)
# v defined only INSIDE (v is also a base function)
my_fun3 <- function(a,b,c){
  A<-1;B<-1;C<-1;G<-1
  a+b+A+B+C+D+E+G+H+I+v+ my_fun1(1,2,3)
}
checkStrict1("my_fun3",show_global_functions = TRUE ,show_ambiguous = TRUE , show_inexistent = TRUE)

# my_fun3  : E 
# my_fun3  Ambiguous : D 
# my_fun3  Inexistent : H 
# my_fun3  Global functions : my_fun1

I chose to show only inexistent by default out of the 3 optional additions. You can change it easily in the function definition.

checkStrictAll

Get a diagnostic of all your potentially problematic functions, with the same parameters.

checkStrictAll()
my_fun2         : B 
my_fun3         : E 
my_fun3         Inexistent : H

sources

strict <- function(f1,...){
  function_text <- deparse(f1)
  function_text <- paste(function_text[1],function_text[2],paste(function_text[c(-1,-2,-length(function_text))],collapse=";"),"}",collapse="") 
  strict0 <- function(f1, pos=2) eval(substitute(f1), as.environment(pos))
  f1 <- eval(parse(text=paste0("strict0(",function_text,")")))
  do.call(f1,list(...))
}

checkStrict1 <- function(f_str,exceptions = NULL,n_char = nchar(f_str),show_global_functions = FALSE,show_ambiguous = FALSE, show_inexistent = TRUE){
  functions <-  c(lsf.str(envir=globalenv()))
  f <- try(eval(parse(text=f_str)),silent=TRUE)
  if(inherits(f, "try-error")) {return(NULL)}
  vars <- codetools::findGlobals(f)
  vars <- vars[!vars %in% exceptions]
  global_functions <- vars %in% functions

  in_global_env <- vapply(vars, exists, logical(1), envir=globalenv())
  in_local_env  <- vapply(vars, exists, logical(1), envir=as.environment(2))
  in_global_env_but_not_function <- rep(FALSE,length(vars))
  for (my_mode in c("logical", "integer", "double", "complex", "character", "raw","list", "NULL")){
    in_global_env_but_not_function <- in_global_env_but_not_function | vapply(vars, exists, logical(1), envir=globalenv(),mode = my_mode)
  }
  found     <- in_global_env_but_not_function & !in_local_env
  ambiguous <- in_global_env_but_not_function & in_local_env
  inexistent <- (!in_local_env) & (!in_global_env)
  if(typeof(f)=="closure"){
    if(any(found))           {cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),":",                  paste(names(found)[found], collapse=', '),"\n"))}
    if(show_ambiguous        & any(ambiguous))       {cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),"Ambiguous :",        paste(names(found)[ambiguous], collapse=', '),"\n"))}
    if(show_inexistent       & any(inexistent))      {cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),"Inexistent :",       paste(names(found)[inexistent], collapse=', '),"\n"))}
    if(show_global_functions & any(global_functions)){cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),"Global functions :", paste(names(found)[global_functions], collapse=', '),"\n"))}
    return(invisible(FALSE)) 
  } else {return(invisible(TRUE))}
}

checkStrictAll <-  function(exceptions = NULL,show_global_functions = FALSE,show_ambiguous = FALSE, show_inexistent = TRUE){
  functions <-  c(lsf.str(envir=globalenv()))
  n_char <- max(nchar(functions))  
  invisible(sapply(functions,checkStrict1,exceptions,n_char = n_char,show_global_functions,show_ambiguous, show_inexistent))
}
Memento answered 16/5, 2017 at 9:21 Comment(0)

What works for me, based on @c-urchin's answer, is to define a script which reads all my functions and then excludes the global environment:

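filenames <- Sys.glob('fun/*.R')
for (filename in filenames) {
    source(filename, local=TRUE)
    # Derive the function name from the file name (the dot is escaped so it matches literally)
    funname <- sub('^fun/(.*)\\.R$', "\\1", filename)
    # Re-home the function so that it skips the global environment
    eval(parse(text=paste('environment(',funname,') <- parent.env(globalenv())',sep='')))
}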

I assume that

  • all functions and nothing else are contained in the relative directory ./fun and
  • every .R file contains exactly one function with the same name as the file.

The catch is that if one of my functions calls another one of my functions, then the outer function also has to call this script first, and it is essential to call it with local=TRUE:

source('readfun.R', local=TRUE)

assuming of course that the script file is called readfun.R.

Motta answered 30/1, 2013 at 19:51 Comment(0)

This semi-manual approach is a one-liner that returns a character vector of the variables used in the function that are not arguments. You can then check by hand whether each one is defined inside the function. No packages are used.

# test function
f <- function(x) {
  z <- x + y
  z
}

setdiff(all.vars(body(f)), names(formals(f)))
## [1] "z" "y"
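If relying on codetools is acceptable (it is a recommended package that ships with standard R installations, so this goes beyond the "no packages" constraint above), the manual step can be skipped. A sketch using findGlobals with merge = FALSE, which reports functions and variables separately; the names globals and free_vars are illustrative:

```r
f <- function(x) {
  z <- x + y
  z
}

# merge = FALSE returns a list with 'functions' and 'variables' components;
# locally assigned names such as 'z' are not reported as globals.
globals   <- codetools::findGlobals(f, merge = FALSE)
free_vars <- globals$variables
print(free_vars)
## [1] "y"
```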
Moneymaker answered 6/12, 2023 at 18:53 Comment(0)
