Parallel processing in R limited

I'm running Ubuntu 12.04 and R 2.15.1 with the parallel and doParallel packages. When I run anything in parallel, I'm limited to 100% of a single core, when I should be getting up to 800%, since I'm running on 8 cores. The system monitor shows each child process getting only 12%.

What is going on that is limiting my execution speed?

Claudicant answered 16/10, 2012 at 22:56 Comment(3)
You'll have to post more of the code you're running for anyone to know for sure. But on Ubuntu 12.04 I don't have this issue at all and have pleasantly pegged 24 cores at 100% many times. Invigilate
It is not specific to any particular code. Claudicant
Then I can't help much, because it works for me! You can also try the doMC package. But without seeing how you're using doParallel and which implementation you're using (foreach? .parallel from plyr?), I don't think anyone can help. Invigilate
A
11

The problem may be that the R process is restricted to one core (and the subprocesses inherit that).

Try this:

> system(sprintf("taskset -p 0xffffffff %d", Sys.getpid()))
pid 3064's current affinity mask: fff
pid 3064's new affinity mask: fff

If the current affinity mask on your machine reports 1, then this was the problem. The command above should fix it, i.e. the second line of output should report fff (or similar).
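To check the mask from inside R rather than by eye, something like the following might help. It is only a sketch: `cpus_in_mask` and `current_mask` are hypothetical helper names, and it assumes `taskset` is installed and prints the mask in the format shown above.

```r
# Hypothetical helper: count the CPUs allowed by a hex affinity mask
# such as "fff" or "0xffffffff" (the format used with taskset -p).
cpus_in_mask <- function(hex_mask) {
  digits <- strsplit(tolower(sub("^0x", "", hex_mask)), "")[[1]]
  values <- strtoi(digits, base = 16L)
  # each hex digit contributes one CPU per set bit
  sum(vapply(values,
             function(v) sum(bitwAnd(v, c(1L, 2L, 4L, 8L)) > 0L),
             integer(1)))
}

# Hypothetical helper: read the current mask of a process via taskset.
current_mask <- function(pid = Sys.getpid()) {
  out <- system2("taskset", c("-p", as.character(pid)), stdout = TRUE)
  sub(".*: ", "", out[1])
}

# Only query taskset where it is actually available.
if (nzchar(Sys.which("taskset"))) {
  n <- cpus_in_mask(current_mask())
  if (!is.na(n) && n == 1) {
    message("R is pinned to a single CPU; child processes will inherit this")
  }
}
```

A count of 1 here corresponds to the "reports 1" case described above.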

Simon Urbanek wrote a function mcaffinity that allows this control for multicore. As far as I know, it's still in R-devel.
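In current R releases, `mcaffinity` is available from the parallel package. A minimal sketch of how it might be used on a Unix-alike follows; `widen_affinity` is a hypothetical wrapper name, and I am assuming the CPU ids passed to `mcaffinity` are 1-based.

```r
library(parallel)

# Sketch: widen the affinity of the current R process to all detected CPUs.
# mcaffinity() returns NULL where affinity control is not supported.
widen_affinity <- function() {
  current <- mcaffinity()
  n <- detectCores()
  if (is.null(current) || is.na(n)) return(current)
  if (length(current) < n) {
    # Setting may fail if some CPUs are not available to this process
    # (for example inside a container); fall back to the current mask.
    tryCatch(mcaffinity(seq_len(n)), error = function(e) current)
  } else {
    current
  }
}

widen_affinity()
```

This does the same job as the `taskset` call, without shelling out.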

For details, see e.g. this discussion on R-sig-hpc.

Update, and addition to Xin Guo's answer:

If you use implicit parallelization via OpenBLAS together with explicit parallelization (via parallel/snow/multicore), you may want to change the number of threads OpenBLAS uses depending on whether or not you are inside an explicitly parallel part.
This is possible with OpenBLAS under Linux (I'm not aware of any other of the usual optimized BLAS libraries that provides a function to set the number of threads); see Simon Fuller's blog post for details.
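A simpler alternative, which is not the runtime function the blog post describes, is the `OPENBLAS_NUM_THREADS` environment variable that OpenBLAS reads when it is loaded. The sketch below assumes a socket cluster from the parallel package, whose workers are fresh R processes that inherit the master's environment; the worker count of 2 is arbitrary.

```r
library(parallel)

# Remember the caller's setting so it can be restored afterwards.
old <- Sys.getenv("OPENBLAS_NUM_THREADS", unset = NA)

# Workers started from here on inherit this variable, so an OpenBLAS
# loaded in each worker should run single-threaded.
Sys.setenv(OPENBLAS_NUM_THREADS = "1")
cl <- makeCluster(2)   # 2 workers, chosen arbitrarily for illustration
worker_setting <- unlist(clusterEvalQ(cl, Sys.getenv("OPENBLAS_NUM_THREADS")))
stopCluster(cl)

# Restore the previous state in the master process.
if (is.na(old)) {
  Sys.unsetenv("OPENBLAS_NUM_THREADS")
} else {
  Sys.setenv(OPENBLAS_NUM_THREADS = old)
}
```

As far as I know, changing the variable in the master after its BLAS is already loaded does not affect the master's own thread count, so this only controls the workers.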

Aguirre answered 17/10, 2012 at 0:15 Comment(0)
U
5

I experienced the same problem because of the libblas.so(.3gf) packages, though I do not know whether this is also the cause in your case. When R starts, it loads the BLAS library installed on your system to perform linear algebra computations. I have libopenblas.so(.3gf), which is highly optimized and built with the "CPU Affinity" option: when you do numerical vector or matrix computations, OpenBLAS creates 8 threads and pins each one to a fixed CPU to speed up the code. However, with this setting the system is told that all the CPUs are very busy, so when further parallel tasks arrive, it tries to squeeze them onto a single CPU to avoid interfering with the busy ones.

Here is the solution that worked for me: I downloaded the OpenBLAS source and compiled it with one change to the file "Makefile.rule". It contains the line "#NO_AFFINITY = 1", and I deleted the "#" so that the library is built without the affinity option. Then I installed the package and the problem was solved.

For reference, see https://github.com/ipython/ipython/issues/840

Please note that this is a trade-off. Removing CPU affinity costs some efficiency in numerical computations, which is why the OpenBLAS maintainer (Dr. Xianyi Zhang), although aware of the problem, still publishes the code with CPU affinity enabled by default.

Upstate answered 17/10, 2012 at 0:15 Comment(4)
Yes, this can happen as well. Aguirre
In my Makefile.rule this line was already uncommented, so it compiles with NO_AFFINITY=1, yet the problem still remains for me. Reider
Hi Lindon, could you please check three things: that you have compiled the source (the makefile builds with multiple threads, so the output should scroll very fast), that you have installed the OpenBLAS you compiled yourself, and that you have run sudo update-alternatives --config libblas.so.3gf and sudo update-alternatives --config libblas.so, so that the current BLAS really is OpenBLAS? Hopefully this helps. Upstate
xianyi disabled the affinity in the 17th May commit github.com/xianyi/OpenBLAS/blob/… , so this problem should not recur once this gets into mainstream distributions. (Well, technically, it was wernsaar. Maybe xianyi didn't notice yet :-P ) Anisole
N
0

My guess is that the problem is in your code. Here is an example copied from http://www.r-bloggers.com/parallel-r-loops-for-windows-and-linux/ :

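library(doMC)    # also attaches foreach and iterators (for icount)
registerDoMC()   # register the multicore backend
# keep the non-setosa rows; columns 1 (Sepal.Length) and 5 (Species)
x <- iris[which(iris[, 5] != 'setosa'), c(1, 5)]
trials <- 10000
r <- foreach(icount(trials), .combine = cbind) %dopar% {
    ind <- sample(100, 100, replace = TRUE)   # bootstrap resample of the 100 rows
    result1 <- glm(x[ind, 2] ~ x[ind, 1], family = binomial(logit))
    coefficients(result1)
}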

You can also set how many cores the parallel backend should use:

options(cores=4)
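If doMC is not installed, roughly the same bootstrap can be sketched with only the base parallel package. This is a swapped-in variant, not the blog's code: it uses `mclapply` (forking, so Unix-alikes only) instead of foreach, 20 trials instead of 10000, and an arbitrary 2 cores. It also drops the unused `setosa` level so the response is a two-level factor, and `fit_once` is a hypothetical helper name.

```r
library(parallel)

# Non-setosa rows only; drop the now-unused factor level.
x <- iris[iris$Species != "setosa", c("Sepal.Length", "Species")]
x$Species <- droplevels(x$Species)

# One bootstrap replicate: resample rows, fit a logistic regression,
# return the two coefficients (intercept and slope).
fit_once <- function(i) {
  ind <- sample(nrow(x), nrow(x), replace = TRUE)
  coefficients(glm(Species ~ Sepal.Length, data = x[ind, ],
                   family = binomial(logit)))
}

n_cores <- 2L   # assumption for illustration; adjust to your machine
res <- mclapply(seq_len(20), fit_once, mc.cores = n_cores)
r <- do.call(cbind, res)   # 2 x 20 matrix, one column per replicate
```

Watching the system monitor while this runs with a larger trial count is a quick way to see whether the child processes are spread across cores or stuck on one.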
Nepali answered 17/10, 2012 at 0:15 Comment(1)
But he reports that he sees 8 child processes with 12% CPU usage each.Aguirre
