R doesn't reset the seed when "L'Ecuyer-CMRG" RNG is used?
Asked Answered
A

2

10

I was doing some parallel simulations in R and I notice that the seed is not changed when the "L'Ecuyer-CMRG" rng is used. I was reading the book "Parallel R", and the option mc.set.seed = TRUE should give each worker a new seed each time mclapply() is called.

Here is my code:

library(parallel)
RNGkind("L'Ecuyer-CMRG")

mclapply(1:2, function(n) rnorm(n), mc.set.seed = TRUE)
[[1]]
[1] -0.7125037

[[2]]
[1] -0.9013552  0.3445190

mclapply(1:2, function(n) rnorm(n), mc.set.seed = TRUE)
[[1]]
[1] -0.7125037

[[2]]
[1] -0.9013552  0.3445190

EDIT: same thing happens both on my desktop and on my laptop (both Ubuntu 12.04 LTS).

Audley answered 25/2, 2013 at 15:25 Comment(4)
are you in windows or UNIX? looks like this option is Ignored on Windows.Botsford
works fine for me on mac with multicore package.Infundibuliform
This works for me in Windows 7 and R-2.15-2. I get different values each time.Ballard
I'm using Ubuntu 12.04 LTS, I have this problem only if I use RNGkind("L'Ecuyer-CMRG") the other rng work fine.Audley
N
5

It appears to me that if you want to guarantee that subsequent calls to mclapply in an R session get different random numbers, you need to either call set.seed with a different value, remove the global variable ".Random.seed", or generate at least one random number in that R session before calling mclapply again.

The reason for this behavior is that mclapply (unlike mcparallel for example) calls mc.reset.stream internally. This resets the seed that is stashed in the "parallel" package to the value of ".Random.seed", so if ".Random.seed" hasn't changed when mclapply is called again, the workers started by mclapply will get the same random numbers as they did previously.

Note that this is not the case with functions such as clusterApply and parLapply, since they use persistent workers, and therefore continue to draw random numbers from their RNG stream. But new workers are forked every time mclapply is called, presumably making it much harder to have that type of behavior.

Here's an example of setting the seed to different values in order to get different random numbers using mclapply:

RNGkind("L'Ecuyer-CMRG")
set.seed(100)
mclapply(1:2, function(i) rnorm(2))
set.seed(101)
mclapply(1:2, function(i) rnorm(2))

Here's an example of removing ".Random.seed":

RNGkind("L'Ecuyer-CMRG")
mclapply(1:2, function(i) rnorm(2))
rm(.Random.seed)
mclapply(1:2, function(i) rnorm(2))

And here's an example of generating random numbers on the master:

RNGkind("L'Ecuyer-CMRG")
mclapply(1:2, function(i) rnorm(2))
rnorm(1)
mclapply(1:2, function(i) rnorm(2))

I'm not sure which is the best approach, but that may depend on what you're trying to do.

Although it appears that simply calling mclapply multiple times without changing ".Random.seed" results in reproducible results, I don't know if that is guaranteed. To guarantee reproducible results, I think you need to call set.seed:

RNGkind("L'Ecuyer-CMRG")
set.seed(1234)
mclapply(1:2, function(i) rnorm(2))
set.seed(1234)
mclapply(1:2, function(i) rnorm(2))
Najera answered 27/2, 2013 at 20:32 Comment(5)
Thanks for your answer Steve! (Cannot vote it up because I don't have enough reputation). What looks strange is that this kind of behaviour should take place on different platform, while it looks like the code in my question gave different results on each mclapply() call on @Ballard Windows 7.Audley
It's strange because I didn't think that parallel provides the mclapply function on Windows.Najera
That's also true! Apart from the RNG, in my system there are many other strange things going on: 1) if I use mclapply and the package "debug" is loaded often R freezes and I have to kill it and its workers 2) options(cores = detectCores()) simply doesn't work and I always have to choose the number of cores manually in mclapply.Audley
They switched the option name from "cores" to "mc.cores", so you have to use options(mc.cores=detectCores()) to work with the parallel version of mclapply.Najera
@SteveWeston, thanks for this answer. It helped me fixed something I was struggling with. However, I am afraid this approach might result in some hard to debug side effects. Would it be possible to take a quick look at this question: stackoverflow.com/q/68638116/5252007? I appreciate it.Binah
D
1

Have you considered using clusterApply instead of mclapply? I think it's easier to guarantee reproducible examples using this framework. I'm attaching an example.

library(parallel)

#---- creating local cluster ----

clust <- makeCluster(detectCores())

#---- seed ----

RNGkind(kind = "L'Ecuyer-CMRG")
set.seed(1234)
s <- .Random.seed
clusterSetRNGStream(cl = clust, iseed = s)

#---- generating random numbers ----

clusterApply(cl = clust, x = 1:2, fun = function(i) rnorm(2))
#> [[1]]
#> [1] -0.0514128  0.7344781
#> 
#> [[2]]
#> [1]  0.3946233 -0.6649782
clusterApply(cl = clust, x = 1:2, fun = function(i) rnorm(2))
#> [[1]]
#> [1] -1.57875323 -0.07533176
#> 
#> [[2]]
#> [1] -0.3698359 -0.1564795

#---- repeating the process ----

set.seed(1234)
s <- .Random.seed
clusterSetRNGStream(cl = clust, iseed = s)

clusterApply(cl = clust, x = 1:2, fun = function(i) rnorm(2))
#> [[1]]
#> [1] -0.0514128  0.7344781
#> 
#> [[2]]
#> [1]  0.3946233 -0.6649782
clusterApply(cl = clust, x = 1:2, fun = function(i) rnorm(2))
#> [[1]]
#> [1] -1.57875323 -0.07533176
#> 
#> [[2]]
#> [1] -0.3698359 -0.1564795

# same result

Created on 2019-03-04 by the reprex package (v0.2.1)

devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.5.1 (2018-07-02)
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  tz       America/Sao_Paulo           
#>  date     2019-03-04
#> Packages -----------------------------------------------------------------
#>  package   * version date       source         
#>  backports   1.1.2   2017-12-13 CRAN (R 3.5.0) 
#>  base      * 3.5.1   2018-07-03 local          
#>  compiler    3.5.1   2018-07-03 local          
#>  datasets  * 3.5.1   2018-07-03 local          
#>  devtools    1.13.6  2018-06-27 CRAN (R 3.5.0) 
#>  digest      0.6.17  2018-09-12 cran (@0.6.17) 
#>  evaluate    0.11    2018-07-17 cran (@0.11)   
#>  graphics  * 3.5.1   2018-07-03 local          
#>  grDevices * 3.5.1   2018-07-03 local          
#>  htmltools   0.3.6   2017-04-28 CRAN (R 3.5.0) 
#>  knitr       1.20    2018-02-20 CRAN (R 3.5.0) 
#>  magrittr    1.5     2014-11-22 cran (@1.5)    
#>  memoise     1.1.0   2017-04-21 CRAN (R 3.5.0) 
#>  methods   * 3.5.1   2018-07-03 local          
#>  parallel  * 3.5.1   2018-07-03 local          
#>  Rcpp        0.12.18 2018-07-23 cran (@0.12.18)
#>  rmarkdown   1.10    2018-06-11 CRAN (R 3.5.1) 
#>  rprojroot   1.3-2   2018-01-03 CRAN (R 3.5.0) 
#>  stats     * 3.5.1   2018-07-03 local          
#>  stringi     1.2.4   2018-07-20 cran (@1.2.4)  
#>  stringr     1.3.1   2018-05-10 CRAN (R 3.5.0) 
#>  tools       3.5.1   2018-07-03 local          
#>  utils     * 3.5.1   2018-07-03 local          
#>  withr       2.1.2   2018-03-15 CRAN (R 3.5.0) 
#>  yaml        2.2.0   2018-07-25 cran (@2.2.0)
Discontinuous answered 4/3, 2019 at 22:55 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.