Why is an R for loop 10 times slower than foreach?

This is really blowing my mind. The basic loop takes like 8 seconds on my computer:

system.time({
x <- 0
for (p in 1:2) {
    for (i in 1:500) {
        for (j in 1:5000) {
            x <- x + i * j
        }
    }
}
})
x

Whereas if I use foreach in non-parallel mode, it takes only 0.7 seconds!

library(foreach)

system.time({
x <- 0
foreach(p = 1:2, .combine = rbind) %do% 
    for (i in 1:500) {
        for (j in 1:5000) {
            x <- x + i * j
        }
    }
})
x

The result is the same, but foreach was somehow able to reach it much faster than basic R! Where is the inefficiency of basic R?

How is this possible?

In fact, I got the complete opposite of the result in this question: Why is foreach() %do% sometimes slower than for?

Chan answered 9/7, 2014 at 10:49 Comment(6)
This is a perfect example of how writing a package can improve upon/enhance base methods. - Glissando
@Richard great, if you understand why and what happens, please post an answer. - Chan
The code gets compiled, ultimately by make.codeBuf. - Butacaine
@James, yeah, that sounds like it! - Chan
So if you take your triple for loop, make it into a function, and use compiler::cmpfun, will the resulting function be as fast as foreach? - Phebephedra
Yes, I'll post an answer. - Butacaine

When used sequentially, foreach eventually calls the compiler package to turn the loop body into byte code, using the non-exported functions make.codeBuf and cmp. You can simulate this by compiling the inner loops to byte code yourself with cmpfun, which gives a similar speedup.

f.original <- function() {
x <- 0
for (p in 1:2) {
    for (i in 1:500) {
        for (j in 1:5000) {
            x <- x + i * j
        }
    }
}
x
}

f.foreach <- function() {
x <- 0
foreach(p = 1:2, .combine = rbind) %do% 
    for (i in 1:500) {
        for (j in 1:5000) {
            x <- x + i * j
        }
    }
x
}

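f.cmpfun <- function() {
    # Byte-compile the two inner loops, to simulate what foreach does
    f <- cmpfun(function(x) {
        for (i in 1:500) {
            for (j in 1:5000) {
                x <- x + i * j
            }
        }
        x
    })
    # Two chained calls stand in for the outer loop over p
    f(f(0))
}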

Results

library(foreach)
library(compiler)
library(microbenchmark)
microbenchmark(f.original(), f.foreach(), f.cmpfun(), times = 5)
Unit: milliseconds
         expr       min        lq    median        uq       max neval
 f.original() 4033.6114 4051.5422 4061.7211 4072.6700 4079.0338     5
  f.foreach()  426.0977  429.6853  434.0246  437.0178  447.9809     5
   f.cmpfun()  418.2016  427.9036  441.7873  444.1142  444.4260     5
isTRUE(all.equal(f.original(), f.foreach())) && isTRUE(all.equal(f.original(), f.cmpfun()))
[1] TRUE
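
As a follow-up to the question in the comments, here is a minimal sketch of compiling the whole triple loop at once instead of only the inner loops. The names triple_loop and triple_loop_cmp are made up for this illustration. It assumes the compiler package that ships with R. Note that from R 3.4.0 onward the JIT byte-code compiler is enabled by default, so on a recent R the two timings may come out nearly identical. Calling compiler::enableJIT(0) beforehand should switch the JIT off if you want to see the uncompiled timing.

```r
library(compiler)

# The triple loop from the question, wrapped in a function so that
# it can be handed to cmpfun(). (triple_loop is a hypothetical name.)
triple_loop <- function() {
  x <- 0
  for (p in 1:2) {
    for (i in 1:500) {
      for (j in 1:5000) {
        x <- x + i * j
      }
    }
  }
  x
}

# cmpfun() returns a byte-compiled copy of the function.
triple_loop_cmp <- cmpfun(triple_loop)

# Both versions should return the same value; only the timing
# is expected to differ.
stopifnot(isTRUE(all.equal(triple_loop(), triple_loop_cmp())))

system.time(triple_loop())
system.time(triple_loop_cmp())
```

No timings are shown here because they depend on the R version and the machine.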
Butacaine answered 9/7, 2014 at 12:43 Comment(0)
