I am trying to use the doParallel and foreach packages, but I'm getting a reduction in performance when I run the bootstrapping example from the guide found on the CRAN page.
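    library(doParallel)
    library(foreach)
    registerDoParallel(3)  # 3 parallel workers

    # Two-class subset of iris: sepal length (column 1) and species (column 5)
    x <- iris[which(iris[,5] != "setosa"), c(1,5)]
    trials <- 10000

    # Elapsed time for the bootstrap loop
    ptime <- system.time({
        r <- foreach(icount(trials), .combine=cbind) %dopar% {
            ind <- sample(100, 100, replace=TRUE)  # resample rows with replacement
            result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
            coefficients(result1)
        }
    })[3]
    ptime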
This example returns 56.87 seconds.
When I change %dopar% to just %do% to run it sequentially instead of in parallel, it returns 36.65 seconds.
If I use registerDoParallel(6), the parallel time drops to 42.11 seconds, but that is still slower than the sequential run. registerDoParallel(8) gets 40.31 seconds, still worse than sequential.
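To put those numbers side by side, here is a small sketch that computes the speedup relative to the sequential run. The timings are the ones reported above for 10,000 trials; the variable names are only for illustration.

```r
# Elapsed times in seconds, as reported above for trials = 10000
sequential <- 36.65
parallel <- c("3 workers" = 56.87, "6 workers" = 42.11, "8 workers" = 40.31)

# Speedup = sequential time / parallel time.
# A value below 1 means the parallel run was slower than the sequential one.
speedup <- sequential / parallel
round(speedup, 2)
```

Every ratio comes out below 1, which is the problem in a nutshell: adding workers narrows the gap but never beats the sequential run.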
If I increase trials to 100,000, the sequential run takes 417.16 seconds and the parallel run with 3 workers takes 597.31 seconds. With 6 workers in parallel it takes 425.85 seconds.
My system is:
Dell Optiplex 990
Windows 7 Professional 64-bit
16GB RAM
Intel i7-2600 3.6GHz quad-core with hyperthreading
Am I doing something wrong here? If I do the most contrived thing I can think of (replacing the computational code with Sys.sleep(1)), then I get an actual reduction in run time roughly proportional to the number of workers. I'm left wondering why the example in the guide decreases performance for me when it sped things up for them.
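For reference, the sleep test was along these lines. This is a minimal sketch rather than the exact code I ran: the worker count (2) and task count (4) are assumed values chosen to keep it quick, and an explicit cluster is created so it can be shut down afterwards.

```r
library(doParallel)  # also attaches foreach, iterators and parallel

cl <- makeCluster(2)      # assumed worker count, for illustration only
registerDoParallel(cl)

n_tasks <- 4              # assumed number of one-second tasks

# Sequential: each task sleeps for one second, one after another
seq_time <- system.time(
    foreach(icount(n_tasks)) %do% Sys.sleep(1)
)[["elapsed"]]

# Parallel: the same tasks are spread across the registered workers
par_time <- system.time(
    foreach(icount(n_tasks)) %dopar% Sys.sleep(1)
)[["elapsed"]]

stopCluster(cl)

c(sequential = seq_time, parallel = par_time)
```

Here each task does no real computation and returns nothing of note, so the parallel elapsed time shrinks as workers are added, unlike the glm bootstrap above.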