Custom package using parallel or doParallel for multiple OS as a CRAN package

I am building an R package that I want to be cross-platform. I am developing under Linux and plan to use mclapply from the parallel package. This package is not supported on Windows (which uses doParallel). I really like the parallel package for its simplicity and speed, though, and I do not know whether that is reason enough to publish two versions of the package on CRAN, one per OS (which seems like extra work to maintain), or whether that is even allowed.

Thoughts?

Also, for now I am treating parallel's

mclapply(ldata, FUN, mc.cores = cores)  # FUN: the function applied to each element

as equivalent to doParallel's

cl <- makeCluster(cores)
parLapply(cl, ldata, FUN)
stopCluster(cl)  # release the workers when done

Is that correct?

Anaxagoras answered 3/9, 2013 at 9:39 Comment(4)
Why not use parLapply, also from the parallel package? I believe it is cross-platform (or I hope it is, since I use it in one of my packages). You can also use an if(){}else{} with Sys.info()["sysname"] to pick the correct setup. - Kelcie
@Tyler Rinker, does parLapply show the .Rprofile loading for every new script? If so, I think it should be fine, since it isn't a fork. - Anaxagoras
parallel is supported on Windows, and mclapply can be used. It just reverts to serial evaluation, like a simple lapply. - Renatarenate
@MartinMorgan But then that is not the desired fallback behavior, unfortunately... - Anaxagoras

First, both mclapply and parLapply are in the parallel package, although mclapply doesn't actually run in parallel on Windows. parLapply runs in parallel on all supported platforms, but isn't always as efficient as mclapply. The doParallel package is used with the foreach package, and acts as an adapter to the parallel package.

To write a package that works on both Windows and non-Windows, you have a variety of reasonable options:

  • Just use parLapply since it works everywhere
  • Use parLapply on Windows and mclapply elsewhere
  • Use doParallel with foreach
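The second option can be sketched roughly as follows. The wrapper name cross_lapply and its cores argument are hypothetical, chosen only for illustration; the Windows branch on its own is also what the first option amounts to. Note that on Windows mclapply only accepts mc.cores = 1, which is why the sketch avoids it there.

```r
library(parallel)

# Hypothetical helper (not part of any package): pick the backend by platform.
cross_lapply <- function(X, FUN, cores = 2L) {
  if (.Platform$OS.type == "windows") {
    # PSOCK cluster: works on every platform, at some startup cost
    cl <- makeCluster(cores)
    on.exit(stopCluster(cl), add = TRUE)  # always shut the workers down
    parLapply(cl, X, FUN)
  } else {
    # Forked workers: only available on Unix-alikes
    mclapply(X, FUN, mc.cores = cores)
  }
}

res <- cross_lapply(1:3, function(x) x + 1L)
```

Because PSOCK workers start as fresh R sessions, FUN has to be self-contained there (or the required objects exported with clusterExport), whereas forked workers see the parent's environment.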

The doParallel package is convenient because it makes use of mclapply on non-Windows platforms. For example:

library(doParallel)
registerDoParallel()
foreach(i=1:10, .options.snow=list(preschedule=TRUE)) %dopar% {
    Sys.sleep(2)
}

This uses mclapply on Linux and Mac OS X, but will automatically create a PSOCK cluster object behind the scenes on Windows. The use of preschedule=TRUE (added in doParallel 1.0.3) will cause doParallel to preschedule the tasks using clusterApply internally, much like parLapply.

Note that if you explicitly create and register a cluster object, then mclapply will not be used, regardless of the platform. It will work fine, but may not be as efficient. To use mclapply, you must call registerDoParallel with a numeric argument, or no argument at all.
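A small sketch of the two registration styles described above, assuming two workers are available. The loop body (sqrt) and the variable names r1, r2 are placeholders for illustration.

```r
library(doParallel)

# Numeric argument: lets doParallel choose the backend
# (mclapply on non-Windows, an implicit PSOCK cluster on Windows).
registerDoParallel(cores = 2)
r1 <- foreach(i = 1:4, .combine = c) %dopar% sqrt(i)

# Explicit cluster object: the cluster backend is used on every platform,
# so mclapply is not involved even on Linux.
cl <- makeCluster(2)
registerDoParallel(cl)
r2 <- foreach(i = 1:4, .combine = c) %dopar% sqrt(i)
stopCluster(cl)

# Fall back to the sequential backend so later %dopar% calls
# do not try to use the stopped cluster.
registerDoSEQ()
```

Both loops should return the same values; only the mechanism used to run the tasks differs.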

You can look at the source code for the boot package for an example of how to use either mclapply or parLapply depending on your platform.
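As a rough illustration of that style of interface, here is a minimal sketch loosely modeled on the arguments boot exposes (parallel, ncpus, cl). It is not boot's actual code, and the function name run_tasks is hypothetical.

```r
library(parallel)

# Hypothetical function: the caller chooses the backend explicitly.
run_tasks <- function(X, FUN,
                      parallel = c("no", "multicore", "snow"),
                      ncpus = 1L, cl = NULL) {
  parallel <- match.arg(parallel)
  on_windows <- .Platform$OS.type == "windows"

  if (parallel == "multicore" && !on_windows && ncpus > 1L) {
    # Forking is only available on Unix-alikes
    mclapply(X, FUN, mc.cores = ncpus)
  } else if (parallel == "snow" && ncpus > 1L) {
    if (is.null(cl)) {
      # No cluster supplied: create a temporary one and clean it up on exit
      cl <- makeCluster(ncpus)
      on.exit(stopCluster(cl), add = TRUE)
    }
    parLapply(cl, X, FUN)
  } else {
    # Serial fallback
    lapply(X, FUN)
  }
}

out <- run_tasks(1:3, function(x) x * 2, parallel = "snow", ncpus = 2L)
```

Leaving the choice to the caller, with a serial default, keeps the package usable on any platform without parallel-specific surprises.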

Jinni answered 3/9, 2013 at 14:3 Comment(0)
