In R, evaluate expressions within vector of strings

3

11

I wish to evaluate a vector of strings containing arithmetic expressions -- "1+2", "5*6", etc.

I know that I can parse a single string into an expression and then evaluate it as in eval(parse(text="1+2")).

However, I would prefer to evaluate the vector without using a for loop.

foo <- c("1+2","3+4","5*6","7/8") # I want to evaluate this and return c(3,7,30,0.875)
eval(parse(text=foo[1])) # correctly returns 3, so how do I vectorize the evaluation?
eval(sapply(foo, function(x) parse(text=x))) # wrong! evaluates only last element
Bissonnette answered 26/7, 2014 at 20:25 Comment(21)
How is using sapply vectorizing? - Snowclad
@DavidArenburg Because he is operating on multiple elements of a vector at one time? - Nichrome
@iShouldUseAName, sapply is the same thing as a for loop, just slower. For such a simple operation, a for loop will be a better choice. In R, this is not what is meant by a "vectorized solution". - Snowclad
Scratch that. Same time. - Nichrome
@iShouldUseAName, no, it's not. I tested it too and the for loop wins. Compare this to your sapply: for(i in seq_along(foo)){ eval(parse(text = foo[i])) } - Snowclad
@DavidArenburg I'm getting about 1.5 seconds each for my loop, your loop and the sapply over 40000 iterations. I'll expand it and see if any difference starts developing. - Nichrome
@DavidArenburg plus your for loop isn't returning a value so it isn't really a good comparison, is it? You're not assigning it to anything, saving a lot of time. - Nichrome
@iShouldUseAName, you aren't assigning your sapply to anything either, but you could add print there or something. Anyhow, I think I made my point clear so we can close this. - Snowclad
@DavidArenburg That's fine but sapply is returning a value, the intended value in a vector. Your loop doesn't. - Nichrome
@iShouldUseAName, I told you already, add print. - Snowclad
@DavidArenburg you're not understanding my point. print doesn't return a value. It just prints the values. You would need to make your loop assign the values to a vector to match the effect and have it act as a solution to the question. No one wants their values printed to the screen. They want them in an object so they can use them. What your for loop does is calculate the result and leave it be. - Nichrome
@iShouldUseAName, OK, so how is it different from your sapply? As it stands, it is only printing the output to the screen. - Snowclad
@DavidArenburg because sapply returns a vector? - Nichrome
@iShouldUseAName, I agree that for a vector of around 100K length, foo <- sapply(foo, function(x) eval(parse(text=x))) will be very slightly more efficient than for(i in seq_along(foo)) foo[i] <- eval(parse(text = foo[i])), but I don't think this is the case here and it still won't be a vectorized solution. - Snowclad
@DavidArenburg except that all vectorization is, is abstracting away the loop, so it is vectorized. - Nichrome
@iShouldUseAName, sapply is not abstracting away the loop, it is just hiding it. - Snowclad
@DavidArenburg haha then what is vectorization to you? What would abstracting away the loop be? You do realize that when you add two vectors in R there is a for loop written in C or Fortran that does it, right? - Nichrome
@iShouldUseAName, that is exactly what I mean. A C or Fortran loop is vectorized; a hidden R loop is not. See here - Snowclad
@iShouldUseAName, or even better, read R Inferno page 24. - Snowclad
@DavidArenburg for what it's worth, vectorization is not really specific to a language. Hiding an R loop, or a C loop, is still vectorization. Vectorization doesn't imply that it would be faster. - Roadhouse
@Roadhouse I'm not going to re-enter an almost decade-old discussion, but I meant vectorization as in my previous comment. Or even better, as it is defined here: "Many CPUs have "vector" or "SIMD" instruction sets which apply the same operation simultaneously to two, four, or more pieces of data." This definitely implies it would be faster than an element-by-element loop. In R's case, looping over the rows of a data.frame in R, for instance, would be slower (though more memory friendly) than a compiled function such as rowSums. - Snowclad
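The timing question argued over in these comments can be checked directly. Below is a minimal sketch, not taken from any of the commenters: the helper names loop_eval and sapply_eval and the input size are made up for illustration, and both variants store their results so the comparison is like for like. Timings depend on the machine and R version, so none are quoted here.

```r
foo <- c("1+2", "3+4", "5*6", "7/8")
big <- rep(foo, 2500)  # 10,000 strings; size chosen arbitrarily

# Explicit loop that stores its results in a preallocated vector
# (loop_eval is a hypothetical name used only for this sketch)
loop_eval <- function(x) {
    out <- numeric(length(x))
    for (i in seq_along(x)) out[i] <- eval(parse(text = x[i]))
    out
}

# sapply version; the anonymous R function is still called once per element
# (sapply_eval is likewise a hypothetical name)
sapply_eval <- function(x) {
    sapply(x, function(s) eval(parse(text = s)), USE.NAMES = FALSE)
}

system.time(res_loop <- loop_eval(big))
system.time(res_sapply <- sapply_eval(big))

# Both approaches should produce the same numeric vector
identical(res_loop, res_sapply)
```

In both variants parse() and eval() run once per string at the R level, which is the sense in which neither is "vectorized" in the compiled-loop meaning used in the comments.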
1

I just came across the same problem, with a need for speed, and ended up with the function below. The idea is that parsing and evaluating one combined c(...) expression is faster than parsing and evaluating each element separately. It uses the '%>%' pipe from the magrittr package (also available through dplyr).

library(magrittr)  # provides %>%

evaluate_string <- function(x) {
    # Build the single string "c(expr1, expr2, ...)" so that
    # parse() and eval() each run only once for the whole vector
    parse(text = paste0("c(", paste(x, collapse = ", "), ")")) %>%
        eval()
}
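For readers who want to try the idea without loading a pipe package, here is a hedged base-R sketch of the same approach. The name evaluate_string_base is hypothetical and not from the answer. It assumes every string is a single complete expression: a string containing `#`, a top-level comma, or an unbalanced parenthesis would corrupt the combined c(...) call. Note also that c() coerces all results to one common type.

```r
# Hypothetical base-R helper (the name is illustrative only):
# wrap all strings in one c(...) call, then parse and evaluate once
evaluate_string_base <- function(x) {
    eval(parse(text = paste0("c(", paste(x, collapse = ", "), ")")))
}

foo <- c("1+2", "3+4", "5*6", "7/8")
combined <- evaluate_string_base(foo)

# Element-by-element evaluation for comparison
per_element <- sapply(foo, function(x) eval(parse(text = x)), USE.NAMES = FALSE)

# Rough timing on a longer input; actual numbers depend on the machine
big <- rep(foo, 5000)
system.time(evaluate_string_base(big))
system.time(sapply(big, function(x) eval(parse(text = x)), USE.NAMES = FALSE))
```

The two system.time() calls give a quick way to check the speed claim on your own data rather than taking it on trust.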
Buonomo answered 9/9, 2024 at 9:59 Comment(0)
11

Just apply the whole function.

sapply(foo, function(x) eval(parse(text=x)))
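One detail to be aware of: when sapply is given a character vector, it names the result after the input strings (USE.NAMES = TRUE is the default). A small sketch, assuming the same foo as in the question, of getting the plain unnamed vector the question asks for. The vapply variant is a suggested alternative and not part of the original answer.

```r
foo <- c("1+2", "3+4", "5*6", "7/8")

# Default: the result is named by the input strings
named <- sapply(foo, function(x) eval(parse(text = x)))

# USE.NAMES = FALSE returns a plain, unnamed vector
plain <- sapply(foo, function(x) eval(parse(text = x)), USE.NAMES = FALSE)

# vapply additionally checks that every result is a single number
checked <- vapply(foo, function(x) eval(parse(text = x)), numeric(1),
                  USE.NAMES = FALSE)
```

unname(named) would give the same result as plain if the names are only unwanted at the end.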
Nichrome answered 26/7, 2014 at 20:27 Comment(4)
I am accepting this because it works, thank you very much! However, to David Arenburg's point, I do understand that there is a difference between sapply and a vectorized operation. Perhaps "how do I serialize this" would have been a better comment. It appears that ?eval takes only a single expr input, not a vector, and it is beyond my R abilities to vectorize that. - Bissonnette
@JohnAndrews, you could just stay with your loop and be done with it. - Snowclad
@JohnAndrews eval does evaluate multiple expressions, but it treats them as a single compound expression for the purpose of returning a value. E.g., try eval(parse(text=c("a <- 1", "b <- 2"))). You will find objects a and b in your workspace. - Clubby
But parse is vectorized, so I have an even better one, sapply(parse(text=foo), eval), or perhaps even better sapply(str2expression(foo), eval). Feel free to include it in your answer if you wish. - Roadhouse
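The last two comments can be illustrated together. This is a short sketch using the question's foo; the variable names are made up for illustration. It shows that parse accepts the whole character vector, that eval on the resulting expression vector keeps only the last value (which is why the question's attempt failed), and that evaluating element by element keeps them all. str2expression requires R 3.6.0 or later.

```r
foo <- c("1+2", "3+4", "5*6", "7/8")

# parse() accepts the whole character vector and returns an expression vector
exprs <- parse(text = foo)
length(exprs)  # one element per input string

# eval() on the whole expression vector evaluates every element in turn
# but returns only the value of the last one
last_only <- eval(exprs)

# Evaluating element by element keeps every result
all_results <- sapply(exprs, eval)

# str2expression() (R >= 3.6.0) can stand in for parse(text = )
all_results2 <- sapply(str2expression(foo), eval)
```

Because the input to sapply is an expression vector rather than a character vector, the result here carries no names.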
2

Just to show that you can also do this with a for loop:

result <- numeric(length(foo))
foo <- parse(text=foo)
for(i in seq_along(foo))
    result[i] <- eval(foo[[i]])

I'm not a fan of using the *apply functions for their own sake, but in this case, sapply really does lead to simpler, clearer code.

Clubby answered 27/7, 2014 at 4:19 Comment(0)
