Force R function call to be self-sufficient

I'm looking for a way to call a function that is not influenced by other objects in .GlobalEnv.

Take a look at the two functions below:

y = 3
f1 = function(x) x+y

f2 = function(x) {
   library(dplyr)
   x %>%
       mutate(area = Sepal.Length * Sepal.Width) %>%
       head()
}

In this case:

  • f1(5) should fail, because y is not defined in the function scope
  • f2(iris) should pass, because the function does not reference variables outside its scope

Now, I can overwrite the environment of f1 and f2, either to baseenv() or to new.env(parent=as.environment(2L)):

environment(f1) = baseenv()
environment(f2) = baseenv()
f1(3)    # fails, as it should
f2(iris) # fails, because %>% is not in function env

or:

# detaching here makes `dplyr` inaccessible for `f2`
# not detaching leaves `head` inaccessible for `f2`
detach("package:dplyr", unload=TRUE)
environment(f1) = new.env(parent=as.environment(2L))
environment(f2) = new.env(parent=as.environment(2L))
f1(3)    # fails, as it should
f2(iris) # fails, because %>% is not in function env

Is there a way to overwrite a function's environment so that it has to be self-sufficient, but it also always works as long as it loads its own libraries?

Gainsay answered 25/8, 2017 at 21:42 Comment(7)
As long as it is what? - Remindful
Honestly, I just wouldn't write functions that included global variables at all; it seems like a recipe for unintended errors. - Remindful
Possibly relevant: https://mcmap.net/q/361244/-r-force-local-scope/324364 - Stubblefield
@Remindful I’m like 99.9% sure that Michael not only knows this but strongly agrees. I don’t know the context of the question but I guess it has something to do with isolating self-contained user code in a library that performs cross-machine communication. - Extract
@Remindful The question is specifically about guarding against the use of global variables. - Gainsay
@Stubblefield This is indeed equivalent to my approach #2, but Konrad describes why it doesn't work: https://mcmap.net/q/361244/-r-force-local-scope - Gainsay
@MichaelSchubert Oh sorry, I must have misunderstood. - Remindful

The problem here is, fundamentally, that library and similar tools don’t provide scoping, and are not designed to be made to work with scopes:¹ Even though library is executed inside the function, its effect is actually global, not local. Ugh.
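
To make the "global, not local" point concrete, here is a minimal sketch. It assumes the splines package (which ships with R) is not yet attached in the session; attach_inside is just an illustrative name.

```r
# Assumption: 'splines' ships with R and is not attached by default.
attach_inside <- function() {
    library(splines)   # called inside the function ...
    invisible(NULL)
}

was_attached <- "package:splines" %in% search()
attach_inside()
is_attached <- "package:splines" %in% search()

# ... yet the search path seen by all code has changed
print(c(before = was_attached, after = is_attached))
```

After the call returns, package:splines is still on the search path, even though library was only ever called from within the function body.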

Specifically, your approach of isolating the function from the global environment is sound; however, library manipulates the search path (via attach), and the function’s environment isn’t “notified” of this: it will still point to the previous second search path entry as its grandparent.
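
The stale link can be observed directly. The sketch below assumes the compiler package (bundled with R) is not attached yet; isolated stands in for the environment assigned to the function in the question's second approach.

```r
# Assumption: 'compiler' ships with R and is not attached yet in this session.
isolated <- new.env(parent = as.environment(2L))
parent_before <- environmentName(parent.env(isolated))

library(compiler)  # attach() inserts package:compiler at search-path position 2

parent_after <- environmentName(parent.env(isolated))

# The environment still hangs off the old entry, not the newly attached one
print(c(parent_before = parent_before,
        parent_after  = parent_after,
        second_entry  = search()[2L]))

# compiler's exports are reachable from .GlobalEnv but not from `isolated`
visible_globally <- exists("cmpfun", envir = globalenv())
visible_isolated <- exists("cmpfun", envir = isolated)
print(c(global = visible_globally, isolated = visible_isolated))
```

The newly attached package sits between .GlobalEnv and the old second entry, so any lookup chain that starts at the old entry skips it.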

You need to find a way of updating the function environment’s grandparent environment when library/attach/… is called. You could achieve this by replacing library etc. in the function’s parent environment with your own versions that call a modified version of attach. This attach2 would then not only call the original attach but also relink your environment’s parent.
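
A rough sketch of that idea follows. It is a simplification, not a complete solution: it wraps library directly instead of writing a separate attach2, the helper name isolate is made up for illustration, and it relies on parent.env<-, which the R documentation itself flags as dangerous. The tools package is used only because it ships with R.

```r
# Hypothetical helper: give `fn` an enclosing environment that skips
# .GlobalEnv and carries its own library() replacement.
isolate <- function(fn) {
    sandbox <- new.env(parent = as.environment(2L))

    sandbox$library <- function(package, ...) {
        # Accept both library(pkg) and library("pkg")
        pkg <- as.character(substitute(package))
        result <- base::library(pkg, character.only = TRUE, ...)
        # Relink: point the sandbox at the new second search-path entry
        parent.env(sandbox) <- as.environment(2L)
        invisible(result)
    }

    environment(fn) <- sandbox
    fn
}

y <- 3
f1 <- isolate(function(x) x + y)
f2 <- isolate(function(path) {
    library(tools)   # resolves to the sandbox version above
    file_ext(path)   # found via the relinked parent
})

f1_failed <- inherits(try(f1(5), silent = TRUE), "try-error")
f2_result <- f2("data.csv")
print(f1_failed)   # TRUE: y is not reachable from the sandbox
print(f2_result)   # "csv"
```

Note that the attach itself is still global, so other code sees the package too, and only library calls that resolve through the sandbox are intercepted; require, attach and calls made from other functions would need the same treatment.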


¹ As an aside, ‘box’ fixes all of these problems. Replacing library(foo) with box::use(foo[...]) in your code makes it work. This is because modules are strongly scoped and environment-aware.

Extract answered 26/8, 2017 at 8:59 Comment(1)
Yes, it looks like there is no other way. But for this to work, this overwritten version of library would need to be attached, which I'm not sure I can enforce for local code due to potential unintended side effects. - Gainsay