I'm trying to gain a deeper understanding of loops vs. *apply functions in R. Here, I did an experiment where I compute the first 10,000 triangular numbers in 3 different ways.
- `unwrapped`: a simple for loop
- `wrapped`: I take the exact same loop from before, but wrap it in a function.
- `vapply`: Using `vapply` and an anonymous function.
The results surprised me in two different ways.
- Why is `wrapped` 8x faster than `unwrapped` (?!?!) My intuition is that given `wrapped` actually does more stuff (defines a function and then calls it), it should have been slower.
- Why are they both so much faster than `vapply`? I would have expected `vapply` to be able to do some kind of optimization that performs at least as well as the loops.
microbenchmark::microbenchmark(
  unwrapped = {
    x <- numeric(10000)
    for (i in 1:10000) {
      x[i] <- i * (i + 1) / 2
    }
    x
  },
  wrapped = {
    tri_nums <- function(n) {
      x <- numeric(n)
      for (i in 1:n) {
        x[i] <- i * (i + 1) / 2
      }
      x
    }
    tri_nums(10000)
  },
  vapply = vapply(1:10000, \(i) i * (i + 1) / 2, numeric(1)),
  check = 'equal'
)
#> Unit: microseconds
#>       expr      min       lq     mean    median       uq       max neval
#>  unwrapped 2652.487 3006.888 3445.896 3150.7555 3832.094  7029.949   100
#>    wrapped  398.534  414.010  455.333  439.7445  469.307   656.074   100
#>     vapply 4942.000 5154.639 5937.333 5453.2880 5969.760 13730.718   100
Created on 2023-01-04 with reprex v2.0.2
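One follow-up experiment I could run on the first question, assuming (and this is only a guess on my part, not something the benchmark above shows) that the gap has something to do with R's byte-code compiler treating code inside a function differently from code evaluated as a bare expression. The sketch below uses `compiler::enableJIT()` and `compiler::cmpfun()` from the `compiler` package that ships with R. The names `tri_nums_compiled`, `time_interpreted` and `time_compiled` are just ones I made up for illustration.

```r
# Sketch only: the byte-compilation explanation is an assumption to be tested.
tri_nums <- function(n) {
  x <- numeric(n)
  for (i in 1:n) {
    x[i] <- i * (i + 1) / 2
  }
  x
}

# Turn the JIT off. enableJIT() returns the previous level so it can be restored.
old_level <- compiler::enableJIT(0)

# With the JIT off, calling the closure should not trigger automatic compilation.
time_interpreted <- system.time(
  for (k in 1:20) res_interpreted <- tri_nums(10000)
)[["elapsed"]]

# Explicitly byte-compile the same function and time it the same way.
tri_nums_compiled <- compiler::cmpfun(tri_nums)
time_compiled <- system.time(
  for (k in 1:20) res_compiled <- tri_nums_compiled(10000)
)[["elapsed"]]

# Restore whatever JIT level was active before.
invisible(compiler::enableJIT(old_level))

# Both versions must produce the same numbers for the timing to mean anything.
stopifnot(identical(res_interpreted, res_compiled))
c(interpreted = time_interpreted, compiled = time_compiled)
```

If the compiled version turns out to be noticeably faster here, that would at least be consistent with the guess. If the two timings are about the same, the explanation must lie elsewhere.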