R package that automatically uses several cores?
Asked Answered

I have noticed that R only uses one core while executing one of my programs, which requires lots of calculations. I would like to take advantage of my multi-core processor to make the program run faster. I have not yet investigated the question in depth, but I would welcome your comments, because I do not have a strong background in computer science and it is difficult for me to find easily understandable information on the subject.

Is there a package that allows R to automatically use several cores when needed?

I guess it is not that simple.

Pinwheel answered 23/1, 2011 at 17:9 Comment(4)
Revolutions (revolutionanalytics.com) offers a multi-threaded version of R. Of course, their commercial status does seem to have a polarising effect within the R community.Trine
I would also add that a great many typical uses of R will not be amenable to automatic parallelisation. If you were to tell us what your programs do then you might get better answers.Trine
possible duplicate of Using Multicore in R for a pentium 4 HT machineMadeline
@DavidHeffernan it is not a multi-threaded version of R, but a particular multi-threaded library that R links against, and that library serves only a tiny subset of all R functions.Velvety

R can only make use of multiple cores with the help of add-on packages, and only for some types of operation. The options are discussed in detail in the High Performance Computing Task View on CRAN.

Update: From R version 2.14.0, add-on packages are not strictly required, because the parallel package is now shipped with R as a recommended package. parallel includes functionality from the multicore and snow packages, largely unchanged.
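A minimal sketch of the built-in parallel package mentioned above (the worker count of 2 is an arbitrary assumption; adjust to your machine). parLapply() works on all platforms; the fork-based mclapply() is Unix/macOS only:

```r
# Sketch using base R's 'parallel' package (R >= 2.14.0).
library(parallel)

x <- 1:100

# Portable: a PSOCK cluster of 2 worker processes
cl <- makeCluster(2)
res <- parLapply(cl, x, function(i) i^2)
stopCluster(cl)   # always release the workers

# Unix/macOS only: fork-based, no cluster setup needed
# res <- mclapply(x, function(i) i^2, mc.cores = 2)
```

detectCores() reports how many cores are available, but leaving one free for the rest of the system is usually a good idea.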

Multiple answered 23/1, 2011 at 17:12 Comment(1)
See also: a 2009 paper comparing different methods of parallel computing in R: jstatsoft.org/v31/i01Skerl

The easiest way to take advantage of multiprocessors is the multicore package which includes the function mclapply(). mclapply() is a multicore version of lapply(). So any process that can use lapply() can be easily converted to an mclapply() process. However, multicore does not work on Windows. I wrote a blog post about this last year which might be helpful. The package Revolution Analytics created, doSMP, is NOT a multi-threaded version of R. It's effectively a Windows version of multicore.

If your work is embarrassingly parallel, it's a good idea to get comfortable with the lapply() type of structuring. That will give you an easy segue into mclapply() and even distributed computing using the same abstraction.

Things get much more difficult for operations that are not "embarrassingly parallel".

[EDIT]

As a side note, RStudio is getting increasingly popular as a front end for R. I love RStudio and use it daily. However, it should be noted that RStudio does not play nice with multicore (at least as of Oct 2011... I understand that the RStudio team is going to fix this). This is because RStudio does some forking behind the scenes, and these forks conflict with multicore's attempts to fork. So if you need multicore, you can write your code in RStudio, but run it in a plain-Jane R session.

Intramundane answered 23/1, 2011 at 17:52 Comment(4)
What about Mac OS X? I know it's not Linux, but at least somewhat close. Just curious what the status is there...Bombard
I opted for snowfall because I can use it on all my machines, which run either Windows or Ubuntu, without hassle.English
I see the multicore package has now been deprecated in favor of the 'parallel' package.Westnorthwest
the times... they are a changin'Intramundane

On this question you always get very short answers. The easiest solution, in my opinion, is the snowfall package, which is based on snow. That is, on a single Windows computer with multiple cores. See also the article by Knaus et al. for a simple example. snowfall is a wrapper around the snow package and lets you set up a multicore cluster with a few commands. It's definitely less hassle than most of the other packages (I didn't try all of them).

On a side note, only a limited class of tasks can be parallelized, for the very simple reason that you have to be able to split the work into independent pieces before multicore calculation makes sense. The apply family is the obvious candidate for this: multiple, independent computations, which is crucial for multicore use. Anything else is not always that easy to run on multiple cores.

Read also this discussion on sfApply and custom functions.
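The "few commands" claim above can be made concrete with a minimal snowfall sketch (the CPU count of 2 and the toy workload are assumptions; sfInit(), sfLapply(), and sfStop() are the core of the snowfall API):

```r
# Minimal snowfall session: start a local 2-CPU cluster,
# run a parallel lapply(), shut the cluster down.
library(snowfall)

sfInit(parallel = TRUE, cpus = 2)       # spawn 2 worker processes
res <- sfLapply(1:10, function(i) i^2)  # parallel drop-in for lapply()
sfStop()                                # always release the workers
```

If workers need packages or data, sfLibrary() and sfExport() push them out to the cluster before the sfLapply() call.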

Madeline answered 24/1, 2011 at 9:50 Comment(1)
The first link no longer leads to a page with any info on the package.Athirst

Microsoft R Open includes multi-threaded math libraries to improve the performance of R. It works on Windows, Unix, and Mac. It's open source and can be installed in a separate directory if you have an existing R (from CRAN) installation. You can also use the popular RStudio IDE with it. From its inception, R was designed to use only a single thread (processor) at a time. Even today, R works that way unless linked with multi-threaded BLAS/LAPACK libraries.

The multi-core machines of today offer parallel processing power. To take advantage of this, Microsoft R Open includes multi-threaded math libraries. These libraries make it possible for many common R operations, such as matrix multiply/inverse, matrix decomposition, and some higher-level matrix operations, to compute in parallel and use all of the processing power available to reduce computation times.
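A rough way to check whether your R installation links a multi-threaded BLAS, as described above, is to time a large matrix multiply and watch CPU usage: with a multi-threaded BLAS (MKL, OpenBLAS) all cores spike, with the reference BLAS only one does. The matrix size here is an arbitrary choice:

```r
# Time a large dense matrix multiply; this is exactly the kind of
# operation a multi-threaded BLAS parallelizes automatically.
n <- 2000
m <- matrix(rnorm(n * n), nrow = n)
timing <- system.time(p <- m %*% m)
print(timing)
```

sessionInfo() also reports which BLAS/LAPACK libraries R is linked against.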

Please check the links below:

https://mran.revolutionanalytics.com/rro/#about-rro

http://www.r-bloggers.com/using-microsoft-r-open-with-rstudio/

Aloysius answered 6/2, 2016 at 17:48 Comment(0)

As David Heffernan said, take a look at the blog of Revolution Analytics. But you should know that most packages are for Linux, so if you use Windows it will be much harder. Anyway, take a look at these sites:

Revolution. Here you will find a lecture about parallelization in R. The lecture is actually very good, but, as I said, most tips are for Linux.

And this thread here at Stack Overflow discusses some implementations on Windows.

Mashhad answered 23/1, 2011 at 17:57 Comment(5)
I thought Revolutions products were biased towards Windows.Trine
@David -- They also offer a Red Hat Enterprise Linux version. Students/educators can get a free single-user license.Airmail
@richardh My point is that their stuff isn't exclusively for Linux as Manoel seemed to implyTrine
@David -- Oh, yes, you're correct (sorry, no snark intended in my comment :) ).Airmail
When I said most packages, I was thinking of most packages out there, not only Revolution's packages. Sorry. And Roman, which link are you referring to? Revolution's, or the one at Stack Overflow?Mashhad

The future package makes it extremely simple to work with parallel and distributed processing in R. More info here. If you want to apply a function to elements in parallel, the future.apply package provides parallel versions of the "apply" family of functions (e.g. apply(), lapply(), and vapply()).

Example:

library("future.apply")
library("stats")

x <- 1:10

# Single core
y <- lapply(x, FUN = quantile, probs = 1:3/4)

# Multicore in parallel
plan(multiprocess)
y <- future_lapply(x, FUN = quantile, probs = 1:3/4)

Sika answered 19/8, 2019 at 11:13 Comment(0)
