Explain a lazy evaluation quirk

W

2

62

I am reading Hadley Wickham's book on GitHub, in particular this part on lazy evaluation. There he gives an example of the consequences of lazy evaluation, in the part with the add/adders functions. Let me quote that bit:

This [lazy evaluation] is important when creating closures with lapply or a loop:
add <- function(x) {
  function(y) x + y
}
adders <- lapply(1:10, add)
adders[[1]](10)
adders[[10]](10)
x is lazily evaluated the first time that you call one of the adder functions. At this point, the loop is complete and the final value of x is 10. Therefore all of the adder functions will add 10 on to their input, probably not what you wanted! Manually forcing evaluation fixes the problem:
add <- function(x) {
  force(x)
  function(y) x + y
}
adders2 <- lapply(1:10, add)
adders2[[1]](10)
adders2[[10]](10)

I do not understand that bit, and the explanation there is minimal. Could someone please elaborate on that particular example and explain what happens there? I am specifically puzzled by the sentence "at this point, the loop is complete and the final value of x is 10". What loop? What final value, and where? It must be something simple I am missing, but I just don't see it. Thanks a lot in advance.

Wriggler answered 21/4, 2013 at 9:51 Comment(3)

Note that the answer to this question has changed as of R 3.2.0, see my answer below. – Pinnule 8/6, 2015 at 9:9

Complement to @jhin's comment: While lapply() has changed in recent R, the function purrr::map(), which is intended to be used wherever lapply() is, still behaves like the old lapply() vis-à-vis shared environments of closures. However, I wouldn't count on this “anachronism” of purrr::map() to stick around, as it will likely be rectified in future versions. – Tilly 21/10, 2016 at 16:31

@jhin Actually, I guess hadley's tutorial is built directly from github so reading it after R 3.2.0 is now quite bizarre as that release made the whole section about lazy evaluation in that tutorial moot: there's no more difference with adders and adders2's outputs! – Communist 23/12, 2016 at 23:13

S

36

The goal of:

adders <- lapply(1:10, function(x) add(x))

is to create a list of adder functions: the first adds 1 to its input, the second adds 2, etc. Lazy evaluation causes R to postpone evaluating the argument x until you actually start calling the adder functions. The problem is that after the first adder function has been created, x is advanced by the lapply loop, ending at a value of 10. When you call the first adder function, lazy evaluation only then looks up the value of x. By that point x is no longer equal to 1, but to the value at the end of the lapply loop, i.e. 10.

Therefore, lazy evaluation causes all adder functions to wait until after the lapply loop has completed before evaluating x. They then all see the same value, i.e. 10. The solution Hadley suggests is to force x to be evaluated immediately, avoiding lazy evaluation and giving the correct functions with the correct x values.
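The same mechanism can be reproduced with an explicit for loop, which may make "the loop" and "the final value" easier to see. This is only a minimal sketch: the names i, add_forced and the 1:3 range are illustrative choices, not taken from the book. It relies on a function argument staying unevaluated until it is first used.

```r
add <- function(x) {
  function(y) x + y
}

adders <- list()
for (i in 1:3) {
  adders[[i]] <- add(i)  # x is only a promise for the expression `i`
}

# The loop has finished, so i is now 3. The first call of each adder
# evaluates x at this point and therefore sees 3.
adders[[1]](10)  # 13, not 11
adders[[3]](10)  # 13

# Forcing x inside the factory evaluates it while i still has the
# value of the current iteration.
add_forced <- function(x) {
  force(x)
  function(y) x + y
}

adders2 <- list()
for (i in 1:3) {
  adders2[[i]] <- add_forced(i)
}

adders2[[1]](10)  # 11
adders2[[3]](10)  # 13
```

Because this version uses a plain for loop rather than lapply, it should not depend on how lapply treats its arguments.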

Sangraal answered 21/4, 2013 at 10:11 Comment(6)

Ok, let me rephrase that to see whether I am getting it right. When we call lapply, R sort of remembers the structure of all 10 adder functions, but does not evaluate x yet. When we call the first adder function, R says, aha, let's see what that is, takes x, which already is 10 at that point from the lapply call, and evaluates the first called adder function as 10 + y. Same for the remaining adder functions, rendering them all identical. Probably crudely put, but is that the logic of it? – Wriggler 21/4, 2013 at 10:26

I believe that this is the case. – Sangraal 21/4, 2013 at 11:10

@hadley When I call the first adder function, the lapply loop is already over. Where exactly does the adder function look to find x? Why does the value of x = 10 persist? – Embroil 12/8, 2014 at 22:54

How does lazy evaluation actually work? Each of the ten adder functions has its own separate environment in which to contain x. I suppose maybe they all point to somewhere prior to getting evaluated, but point to where? There's no x in the parent environment. – Douche 19/4, 2015 at 17:29

The argument x is only evaluated when the function is called for the first time. By then the lapply loop has finished, so x evaluates to 10. So they are all the same. – Sangraal 19/4, 2015 at 17:46

By the way, my example code did not include an x, whereas the example code in the book does. I edited my answer to remedy this. – Sangraal 19/4, 2015 at 18:6
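On the question in the comments of where each adder looks for x, the following sketch may help. It assumes only base R; the names i, add_one and add_two are made up for illustration. Each closure has its own environment, but the x in it starts out as an unevaluated promise that refers back to an expression in the caller's environment.

```r
add <- function(x) {
  function(y) x + y
}

i <- 1
add_one <- add(i)  # x is an unevaluated promise for the expression `i`
i <- 2
add_two <- add(i)  # a second, separate promise for the same expression

# Each closure has its own enclosing environment with its own x.
identical(environment(add_one), environment(add_two))  # FALSE

i <- 100
add_one(10)  # first use of x: it is evaluated now, with i == 100, giving 110

i <- 1
add_one(10)  # still 110: once evaluated, the promise keeps its value
add_two(10)  # first use of its x, with i == 1, giving 11
```

So the separate environments do exist from the start. What is shared is the variable that the still-unevaluated arguments refer to.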

P

58

This is no longer true as of R 3.2.0!

The corresponding line in the change log reads:

Higher order functions such as the apply functions and Reduce() now force arguments to the functions they apply in order to eliminate undesirable interactions between lazy evaluation and variable capture in closures.

And indeed:

add <- function(x) {
  function(y) x + y
}
adders <- lapply(1:10, add)
adders[[1]](10)
# [1] 11
adders[[10]](10)
# [1] 20
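To illustrate what "forcing arguments" changes for a higher-order function, here is a rough sketch with two hypothetical helpers, lazy_map and forcing_map. They are written for this example only and are not how lapply is implemented internally. The first passes its argument on unevaluated, the second evaluates it before calling the function.

```r
add <- function(x) {
  function(y) x + y
}

# Hypothetical helper: hands the expression xs[[i]] to f unevaluated, so each
# closure later reads xs[[i]] with the final value of i.
lazy_map <- function(xs, f) {
  out <- vector("list", length(xs))
  for (i in seq_along(xs)) {
    out[[i]] <- f(xs[[i]])
  }
  out
}

# Hypothetical helper: evaluates each element before handing it to f.
forcing_map <- function(xs, f) {
  call_forced <- function(value) {
    force(value)  # evaluate xs[[i]] now, while i has its current value
    f(value)
  }
  out <- vector("list", length(xs))
  for (i in seq_along(xs)) {
    out[[i]] <- call_forced(xs[[i]])
  }
  out
}

lazy_adders <- lazy_map(1:3, add)
forced_adders <- forcing_map(1:3, add)

lazy_adders[[1]](10)    # 13
forced_adders[[1]](10)  # 11
```

With forcing_map, add itself needs no force(x), which mirrors the effect described in the change log entry.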

Pinnule answered 22/4, 2015 at 3:1 Comment(0)

