In R, exactly what causes an object of type name (or symbol) to be evaluated?

After running:

x <- as.name("aa")
aa <- 2

in R, why doesn't

(x)

return 2? And why doesn't

x <- as.name("aa")
aa <- 3
get(get(x))

return 3?

I know get() expects a string, but I don't understand why it doesn't evaluate x, find the string inside, and then get that. It seems to me like sometimes functions do such evaluation of their arguments, and sometimes they don't. For instance, in the second example, if you replace get(get(x)) with eval(x), eval() evaluates the x to find the name, and then evaluates the name to find 3.

Saberhagen answered 6/5, 2016 at 0:44 Comment(1)
I'm baffled by your edit. You seem to be asking why the R developers didn't make get do what eval does, but then the answer would have to be that they do two different things: one evaluates an expression (which may be a symbol) and the other retrieves an object whose name you supply as a character string. Why wouldn't you split different functionality into two different functions? – Sensuality

I think @joran's answer is right but maybe I can try to explain a different way.

The ( "function" in R is essentially the identity function. It echoes back what you pass it. It's almost like it isn't there. There is no difference between what will be returned by these statements:

x      #1
(x)    #2
((x))  #3

The parentheses just pass through the value inside. You can add as many parentheses as you want and it will not change what's returned. The evaluator looks at ((x)), sees the outer parentheses, and knows to just return the value of the thing inside them. So now it's evaluating just (x), and again, it sees the outer parentheses and will just return the value of what is inside them, which is x. The parentheses just pass through the value from the inside; they do not evaluate that value a second time.
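As a quick check of the claim that parentheses only pass a value through, here is a minimal sketch that reuses the x and aa from the question. It relies only on base R; the backtick call is simply another way of invoking the same ( function.

```r
aa <- 2
x <- as.name("aa")

# `(` returns its (already evaluated) argument unchanged
identical(x, (x))        # TRUE
identical(x, ((x)))      # TRUE
identical(x, `(`(x))     # TRUE, calling `(` explicitly by name

# The result is still the symbol aa, not the value bound to aa
is.name((x))             # TRUE
identical((x), 2)        # FALSE
```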

The bare value x is a name (or symbol). A name is not uniquely tied to a value. The mapping between names and values differs by environment. That's why names must be evaluated in a particular context to get a value. Consider these examples:

aa <- 5
dd <- data.frame(aa=20)
x <- as.name("aa")
foo <- function(x) {aa<-10; eval(x)}

eval(x)
# [1] 5
foo(x)
# [1] 10
eval(x, dd)
# [1] 20

This behavior is actually highly desirable. It's what makes features that require non-standard evaluation work, like

subset(mtcars, hp<100)
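To make the connection to non-standard evaluation concrete, here is a rough sketch of how a subset-like function can be built from the same pieces. The function my_subset is a hypothetical helper written for illustration, not the actual implementation of subset(); it assumes the condition refers to columns of the data frame or to variables visible to the caller.

```r
# Hypothetical helper (not part of base R): a stripped-down subset()
my_subset <- function(data, condition) {
  # Capture the unevaluated expression the caller typed, e.g. hp < 100
  cond_expr <- substitute(condition)
  # Evaluate it with the data frame's columns searched first,
  # falling back to the caller's environment for anything else
  keep <- eval(cond_expr, data, parent.frame())
  data[keep & !is.na(keep), , drop = FALSE]
}

res <- my_subset(mtcars, hp < 100)
nrow(res) == nrow(subset(mtcars, hp < 100))  # TRUE
all(res$hp < 100)                            # TRUE
```

The name hp is never looked up in the global environment here; it only gets a value because eval() is told to resolve it inside mtcars.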

When you are using the R console, it behaves as a REPL -- it reads your input, evaluates it, prints the result, and then waits for the next input. Note that it only does one level of evaluation and the evaluation happens in the "current" environment. It does not recursively evaluate the value returned from an expression. So when you do

x <- as.name("aa")
x   # identical to (x)
# aa

when the REPL gets to the evaluation step, it evaluates the name x which points to the name aa. That's it. One level of evaluation. The name aa is not subsequently evaluated.
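The "one level per step" rule can be seen by chaining symbols. This is only an illustrative sketch; the extra variable y is an assumption added here and does not appear in the question.

```r
aa <- 2
x <- as.name("aa")
y <- as.name("x")   # hypothetical extra link in the chain

x                # one evaluation of x: the symbol aa
eval(x)          # argument x evaluates to aa, then eval() evaluates aa: 2
eval(y)          # argument y evaluates to x, then eval() evaluates x: the symbol aa
eval(eval(y))    # each explicit eval() adds exactly one more step: 2
```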

There is a note in the ?eval help page that says this:

eval evaluates its first argument in the current scope before passing it to the evaluator

There's not a "double" evaluation happening there. It is merely evaluating its arguments just as any other function in R does. For example:

aa <- 5 
bar <- function(x) print(x)
bar(aa+2)
# [1] 7

It prints "7", not "aa+2", because the function has evaluated its argument prior to printing. It also explains the difference between these two:

dd <- data.frame(bb=20)
xx <- as.name("bb")
eval(bb, dd)
# Error in eval(bb, dd) : object 'bb' not found
eval(xx, dd)
# [1] 20

In the first eval() call, R is unable to evaluate bb in the current environment so you get the error. But note that

evalq(bb, dd)

does work because evalq does not try to evaluate the first expression parameter.
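For comparison, here is a small sketch of the relationship between evalq() and eval(). The ?eval help page describes evalq(expr, envir) as equivalent to eval(quote(expr), envir); the data frame dd and the symbol xx are the ones defined above.

```r
dd <- data.frame(bb = 20)

# Quoting stops bb from being looked up in the current scope
evalq(bb, dd)            # 20
eval(quote(bb), dd)      # 20

# A symbol stored in a variable works with eval() for the same reason:
# evaluating the argument xx in the current scope yields the symbol bb
xx <- as.name("bb")
identical(quote(bb), xx) # TRUE
```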

Farrow answered 17/5, 2016 at 19:13 Comment(5)
Dear @Farrow That is very helpful. But I still don't understand why, given x <- as.name("aa"); aa <- 2; if (x) returns the same thing as x at the command line (which it does: the symbol aa), why doesn't ((x)) return 2? (x) returns aa at the command line, and aa returns 2. – Saberhagen
Because there is only one "return" happening here. The line is evaluated exactly once. There is no double evaluation. The parentheses do not force evaluation. – Farrow
OK, so maybe I've got it. The parentheses really, really don't do anything but group. Specifically, they don't do the same thing as typing x into the R console. It is the R console command line, which, as you put it, behaves like a Read–eval–print loop, not the ( function. So ((x)) arrives at the command line as x, not as aa. It doesn't become aa until the command line gets to it, and it is the console command line that resolves the x into aa. Putting in more layers of parentheses doesn't make any difference because even the innermost parentheses don't do anything to x. Is that right? – Saberhagen
BTW, this line: "eval evaluates its first argument in the current scope before passing it to the evaluator", which I must have read 20 times without its meaning penetrating, explains every failure to predict what eval() will do that I have ever had. Except the ones not caused by eval. A big double thanks from me. – Saberhagen
Yep, I think you've got it now! – Farrow

Because the value of x is not 2, it is the symbol (or name) aa. However, if you eval it:

> eval(x)
[1] 2

Similarly, get(x) doesn't work at all (i.e. it produces an error) because, as per the documentation for get, its first argument must be an object name (given as a character string), where the parenthetical is meant to distinguish it from a symbol/name.

get only works with a character argument:

> get("aa")
[1] 2

And a symbol (which I find less confusing than name) is not the same thing:

> identical("aa",as.name("aa"))
[1] FALSE

(as.name and as.symbol do the same thing.)
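If you do want to go through get() starting from a symbol, one option is to convert the symbol to a string first. The following is a minimal sketch using the x and aa from the question; the variables name_string and y are assumptions introduced only for illustration.

```r
aa <- 2
x <- as.name("aa")

is.character(x)                 # FALSE: x is a symbol, not a string
name_string <- as.character(x)  # "aa"
get(name_string)                # 2

# Nested get() calls make sense when a variable holds the *string* name of another
y <- "aa"
get("y")                        # "aa"
get(get("y"))                   # 2
```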

For an excellent explanation of the "evaluation of expressions" vs "evaluation of function arguments" distinction I mention below in a comment, see @MrFlick's answer.

Sensuality answered 6/5, 2016 at 0:54 Comment(3)
I think what I remain confused about is why R behaves differently if I type a symbol than it does if a function returns the same symbol. eval(x) does not return a symbol -- it evaluates the expression x, finds a symbol, and then evaluates the symbol to find the value. Evaluation happens twice. If I type (aa), R returns 2. If I type x, R returns aa, of type symbol. But if I type (x), R still returns the symbol. I don't understand why, after the inner parentheses evaluate to a symbol, the outer parentheses do not return what they do if I type that same symbol in parentheses. – Saberhagen
My get() example was badly constructed. Let me restate. If I type (get("x")), just as above, it returns a symbol again. At the other extreme, get("get("x")") returns an error. It does not evaluate get("x") to a symbol, put quotes around it, and then evaluate the quoted symbol. That makes more sense to me, but only because I do not expect quotation marks to behave like a regular function, evaluating its argument and then returning the result as a string. But in (get("x")), I do expect the argument of the outer parentheses to be evaluated, and it's not. – Saberhagen
@Saberhagen I think you're confusing "evaluating an arbitrary R expression in a specific environment" with "evaluating a function argument", where the latter really means something closer to "binding a promise to a value". () behaves semantically like the function identity, so any "evaluation" of its arguments simply means that it will resolve the promise to a value. Consider identity(x = get("x")). What's being evaluated is the argument x, and the result is it being bound to the value returned by get("x"). – Sensuality