How do you use "<<-" (scoping assignment) in R?
D

7

187

I just finished reading about scoping in the R intro, and am very curious about the <<- assignment.

The manual showed one (very interesting) example for <<-, which I feel I understood. What I am still missing is the context of when this can be useful.

So what I would love to read from you are examples (or links to examples) of when using <<- can be interesting or useful. What are the dangers of using it (it looks easy to lose track of), and do you have any tips you feel like sharing?

Derryberry answered 13/4, 2010 at 10:4 Comment(1)
I've used <<- to preserve key variables generated inside a function to record in failure logs when the function fails. Can help to make the failure reproducible if the function used inputs (e.g. from external APIs) that wouldn't necessarily have been preserved otherwise due to the failure.Clop
G
242

<<- is most useful in conjunction with closures to maintain state. Here's a section from a recent paper of mine:

A closure is a function written by another function. Closures are so-called because they enclose the environment of the parent function, and can access all variables and parameters in that function. This is useful because it allows us to have two levels of parameters. One level of parameters (the parent) controls how the function works. The other level (the child) does the work. The following example shows how we can use this idea to generate a family of power functions. The parent function (power) creates child functions (square and cube) that actually do the hard work.

power <- function(exponent) {
  function(x) x ^ exponent
}

square <- power(2)
square(2) # -> [1] 4
square(4) # -> [1] 16

cube <- power(3)
cube(2) # -> [1] 8
cube(4) # -> [1] 64

The ability to manage variables at two levels also makes it possible to maintain the state across function invocations by allowing a function to modify variables in the environment of its parent. The key to managing variables at different levels is the double arrow assignment operator <<-. Unlike the usual single arrow assignment (<-) that always works on the current level, the double arrow operator can modify variables in parent levels.

This makes it possible to maintain a counter that records how many times a function has been called, as the following example shows. Each time new_counter is run, it creates an environment, initialises the counter i in this environment, and then creates a new function.

new_counter <- function() {
  i <- 0
  function() {
    # do something useful, then ...
    i <<- i + 1
    i
  }
}

The new function is a closure, and its environment is the enclosing environment. When the closures counter_one and counter_two are run, each one modifies the counter in its enclosing environment and then returns the current count.

counter_one <- new_counter()
counter_two <- new_counter()

counter_one() # -> [1] 1
counter_one() # -> [1] 2
counter_two() # -> [1] 1
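
To see where each counter's state actually lives, it can help to look inside the closures' environments. The following is only a sketch that repeats the new_counter definition from above so it runs on its own; environment() is base R, and nothing else is assumed.

```r
# Same factory as above, repeated so this sketch is self-contained.
new_counter <- function() {
  i <- 0
  function() {
    i <<- i + 1
    i
  }
}

counter_one <- new_counter()
counter_two <- new_counter()

counter_one()
counter_one()
counter_two()

# Each closure carries its own enclosing environment with its own i.
state_one <- environment(counter_one)$i   # 2: counter_one was called twice
state_two <- environment(counter_two)$i   # 1: counter_two was called once

# The two counters do not share an environment.
identical(environment(counter_one), environment(counter_two))  # FALSE
```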
Gould answered 13/4, 2010 at 14:18 Comment(3)
Hey this is an unsolved R task on Rosettacode (rosettacode.org/wiki/Accumulator_factory#R) Well, it was...Palsy
Would there be any need to enclose more than 1 closures in one parent function? I just tried one snippet, it seems that only the last closure was executed...Motif
Is there any equal sign alternative to the "<<-" sign?Primaveria
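
On the comment about enclosing more than one closure in a parent function: a function only returns its last value, so one common approach is to return the closures together in a list. This is a minimal sketch; new_account, deposit, and current are hypothetical names chosen for illustration.

```r
# Hypothetical example: two closures sharing one piece of state.
new_account <- function(balance = 0) {
  deposit <- function(amount) {
    balance <<- balance + amount   # modifies balance in new_account's environment
    invisible(balance)
  }
  current <- function() balance    # reads the same balance
  # Return both closures; returning only one would drop the other.
  list(deposit = deposit, current = current)
}

acct <- new_account(100)
acct$deposit(50)
acct$current()  # 150
```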
S
59

It helps to think of <<- as equivalent to assign (if you set the inherits parameter in that function to TRUE). The benefit of assign is that it allows you to specify more parameters (e.g. the environment), so I prefer to use assign over <<- in most cases.

Using <<- and assign(x, value, inherits=TRUE) means that "enclosing environments of the supplied environment are searched until the variable 'x' is encountered." In other words, it will keep going through the environments in order until it finds a variable with that name, and it will assign it to that. This can be within the scope of a function, or in the global environment.
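
As a rough illustration of the comparison above, the sketch below updates the same variable with both forms. The function names are made up for the example. One caveat worth checking against ?assign: with its default arguments, assign(..., inherits = TRUE) starts searching in the calling function's own environment, while <<- starts one level up, so the two differ when a local variable of the same name exists.

```r
# Both forms update total in the enclosing function when no local total exists.
scope_demo <- function() {
  total <- 0
  add_with_super  <- function(v) total <<- total + v
  add_with_assign <- function(v) assign("total", total + v, inherits = TRUE)
  add_with_super(1)
  add_with_assign(2)
  total
}
scope_demo()  # 3

# When a local x exists, <<- skips it but assign(inherits = TRUE) hits it first.
shadow_demo <- function() {
  x <- "outer"
  via_super <- function() {
    x <- "local"
    x <<- "written by <<-"      # writes to shadow_demo's x
    x                           # local x is untouched
  }
  via_assign <- function() {
    x <- "local"
    assign("x", "written by assign", inherits = TRUE)  # finds the local x first
    x
  }
  seen_by_super <- via_super()
  outer_after_super <- x
  seen_by_assign <- via_assign()
  outer_after_assign <- x
  c(seen_by_super, outer_after_super, seen_by_assign, outer_after_assign)
}
shadow_demo()
```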

In order to understand what these functions do, you need to also understand R environments (e.g. using search).

I regularly use these functions when I'm running a large simulation and I want to save intermediate results. This allows you to create the object outside the scope of the given function or apply loop. That's very helpful, especially if you have any concern about a large loop ending unexpectedly (e.g. a database disconnection), in which case you could lose everything in the process. This would be equivalent to writing your results out to a database or file during a long running process, except that it's storing the results within the R environment instead.
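
A minimal sketch of that pattern, under stated assumptions: run_simulation, partial_results, and the fail_at argument are hypothetical, and the failure is simulated with stop() rather than a real database disconnection.

```r
# Results collected outside the function survive a failure part-way through.
partial_results <- numeric(0)

run_simulation <- function(n_steps, fail_at = Inf) {
  for (step in seq_len(n_steps)) {
    if (step == fail_at) stop("simulated failure at step ", step)
    # Store each intermediate result in the enclosing (here: global) variable.
    partial_results[step] <<- step^2
  }
  invisible(NULL)
}

# The run fails at step 4, but steps 1 to 3 were already saved.
outcome <- try(run_simulation(5, fail_at = 4), silent = TRUE)
partial_results  # 1 4 9
```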

My primary warning with this: be careful because you're now working with global variables, especially when using <<-. That means that you can end up with situations where a function is using an object value from the environment, when you expected it to be using one that was supplied as a parameter. This is one of the main things that functional programming tries to avoid (see side effects). I avoid this problem by assigning my values to unique variable names (using paste with a set of unique parameters) that are never used within the function, but are just used for caching and in case I need to recover later on (or do some meta-analysis on the intermediate results).
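
One way the unique-name idea might look in practice is sketched below. The helper cache_intermediate, the "sim_cache" prefix, and the run_id/step parameters are all assumptions made for the example, not part of any package.

```r
# Hypothetical helper: store a value in the global environment under a name
# built from the run parameters, so it cannot collide with working variables.
cache_intermediate <- function(value, run_id, step) {
  cache_name <- paste("sim_cache", run_id, step, sep = "_")
  assign(cache_name, value, envir = globalenv())
  invisible(cache_name)
}

stored_as <- cache_intermediate(c(0.1, 0.2), run_id = "runA", step = 3)
stored_as                             # "sim_cache_runA_3"
get(stored_as, envir = globalenv())   # 0.1 0.2
```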

Shuster answered 13/4, 2010 at 12:34 Comment(2)
Thanks Tal. I have a blog, although I don't really use it. I can never finish a post because I don't want to publish anything unless it's perfect, and I just don't have time for that...Shuster
A wise man once said to me it is not important to be perfect - only out standing - which you are, and so will your posts be. Also - sometimes readers help improve the text with the comments (that's what happens with my blog). I hope one day you will reconsider :)Derryberry
O
10

One place where I used <<- was in simple GUIs using tcl/tk. Some of the initial examples have it -- as you need to make a distinction between local and global variables for statefulness. See for example

 library(tcltk)
 demo(tkdensity)

which uses <<-. Otherwise I concur with Marek :) -- a Google search can help.

Omalley answered 13/4, 2010 at 12:12 Comment(2)
Interesting, I somehow cannot find tkdensity in R 3.6.0.Distressful
The tcltk package ships with R: github.com/wch/r-source/blob/trunk/src/library/tcltk/demo/…Omalley
C
9

On this subject I'd like to point out that the <<- operator will behave strangely when applied (incorrectly) within a for loop (there may be other cases too). Given the following code:

fortest <- function() {
    mySum <- 0
    for (i in c(1, 2, 3)) {
        mySum <<- mySum + i
    }
    mySum
}

you might expect that the function would return the expected sum, 6, but instead it returns 0, with a global variable mySum being created and assigned the value 3. I can't fully explain what is going on here but certainly the body of a for loop is not a new scope 'level'. Instead, it seems that R looks outside of the fortest function, can't find a mySum variable to assign to, so creates one and assigns the value 1, the first time through the loop. On subsequent iterations, the RHS in the assignment must be referring to the (unchanged) inner mySum variable whereas the LHS refers to the global variable. Therefore each iteration overwrites the value of the global variable to that iteration's value of i, hence it has the value 3 on exit from the function.

Hope this helps someone - this stumped me for a couple of hours today! (BTW, just replace <<- with <- and the function works as expected).
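
The behaviour described above can be reproduced without touching the global environment by wrapping the function in another one. This is only a sketch: demo_scopes and fortest_fixed are illustrative names, and the outer my_sum stands in for the global variable.

```r
demo_scopes <- function() {
  my_sum <- NA                  # plays the role of the "global" variable
  fortest <- function() {
    my_sum <- 0
    for (i in c(1, 2, 3)) {
      # Right-hand side reads the local my_sum (always 0);
      # left-hand side writes the my_sum one level up.
      my_sum <<- my_sum + i
    }
    my_sum                      # the local copy, still 0
  }
  returned <- fortest()
  c(returned = returned, outer = my_sum)
}
demo_scopes()  # returned = 0, outer = 3

# Using <- consistently keeps everything local and gives the expected sum.
fortest_fixed <- function() {
  my_sum <- 0
  for (i in c(1, 2, 3)) {
    my_sum <- my_sum + i
  }
  my_sum
}
fortest_fixed()  # 6
```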

Cultism answered 15/6, 2015 at 14:58 Comment(4)
In your example, the local mySum is never incremented, only the global mySum. Hence at each iteration of the for loop, the global mySum gets the value 0 + i. You can follow this with debug(fortest).Pase
It's got nothing to do with it being a for-loop; you're referencing two different scopes. Just use <- everywhere consistently within the function if you only want to update the local variable inside the function.Bombacaceous
Or use <<- everywhere @smci. Though best to avoid globals.Ideatum
As far as I understand, R is NOT scoped within braces { }, which is different from many other languages. The scoping is within functions. So, <<- does not access the mySum outside of the for loop as you might expect; rather, it accesses mySum outside of the fortest function. In the first iteration, mySum did not exist outside of fortest, so it was created with the value 0 + 1. Each subsequent iteration of the for loop overwrites the global mySum with 0 + i. So, the mySum in fortest always stays at zero while a global mySum is created and ends up as 3.Chaudfroid
S
5
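# Random walk: each replicate() iteration calls an anonymous function that updates x in f's environment
f <- function(n, x0) {x <- x0; replicate(n, (function(){x <<- x+rnorm(1)})())}
plot(f(1000, 0), type = "l")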
Saundra answered 13/4, 2010 at 12:23 Comment(1)
This is a good example of where not to use <<-. A for loop would be clearer in this case.Gould
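
For comparison, a for-loop version along the lines the comment suggests might look like the sketch below. random_walk is a hypothetical name; the loop keeps all state local, so no <<- is needed.

```r
# Loop-based random walk: same idea as f(n, x0), without super-assignment.
random_walk <- function(n, x0) {
  path <- numeric(n)
  x <- x0
  for (k in seq_len(n)) {
    x <- x + rnorm(1)   # plain local update
    path[k] <- x
  }
  path
}

set.seed(42)
walk <- random_walk(1000, 0)
# plot(walk, type = "l")
```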
H
4

The <<- operator can also be useful for Reference Classes when writing Reference Methods. For example:

myRFclass <- setRefClass(Class = "RF",
                         fields = list(A = "numeric",
                                       B = "numeric",
                                       C = function() A + B))
myRFclass$methods(show = function() cat("A =", A, "B =", B, "C =",C))
myRFclass$methods(changeA = function() A <<- A*B) # note the <<-
obj1 <- myRFclass(A = 2, B = 3)
obj1
# A = 2 B = 3 C = 5
obj1$changeA()
obj1
# A = 6 B = 3 C = 9
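
To show why the <<- matters in a reference method, here is a small sketch contrasting it with <-. The class Tally and its methods are hypothetical. R usually warns about a local assignment to a field name when the class is defined, so the definition is wrapped in suppressWarnings() here purely to keep the output quiet.

```r
library(methods)

# Hypothetical class with one field and two methods that differ only in the arrow.
Tally <- suppressWarnings(setRefClass("Tally",
  fields = list(count = "numeric"),
  methods = list(
    bump_local = function() {
      count <- count + 1    # creates a local variable; the field is unchanged
      invisible(.self)
    },
    bump = function() {
      count <<- count + 1   # updates the field on the object
      invisible(.self)
    }
  )))

tally <- Tally$new(count = 0)
tally$bump_local()
after_local <- tally$count   # still 0
tally$bump()
after_super <- tally$count   # 1
```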
Hua answered 15/6, 2015 at 15:12 Comment(0)
G
2

I use it to modify an object in the global environment from inside purrr::map().

a = c(1,0,0,1,0,0,0,0)

Say I want to obtain the vector c(1,2,3,1,2,3,4,5): wherever there is a 1, keep it as 1; otherwise add 1 to the previous value, until the next 1.

purrr::map(
  .x = seq(1,(length(a))),
  .f = function(x) {
    a[x] <<- ifelse(a[x]==1, a[x], a[x-1]+1)
    })
a
[1] 1 2 3 1 2 3 4 5
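
For comparison, the same result can probably be obtained without modifying a global, for example with base R's Reduce(). This is a sketch of an alternative technique, not what the code above does, and it assumes the first element of the vector is 1.

```r
a <- c(1, 0, 0, 1, 0, 0, 0, 0)

# Carry the running count forward: reset to 1 at each 1, otherwise add 1.
run_count <- Reduce(
  function(prev, cur) if (cur == 1) 1 else prev + 1,
  a,
  accumulate = TRUE
)
run_count  # 1 2 3 1 2 3 4 5
```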
Guglielma answered 2/5, 2022 at 13:10 Comment(1)
This tiny answer explains what brought me to this question--why <- often does not work in purrr::map and I have to use <<- . Now I get it: purrr::map does its work inside a function and there only function scope applies. So, super-assignment <<- is required to modify variables outside of the purrr::map internal function (the .f argument).Chaudfroid
