Is it possible to run multiple chains with JAGS on multiple cores (subdividing chains)

I’m wondering if it’s possible to subdivide 3 chains in JAGS on 5 or 6 cores, for example. Here is my code:

  library(R2jags)    # jags.parallel() is provided by R2jags
  library(parallel)
  # Note: no progress bar is shown when running in parallel
  jags.parallel(data = d$data,
                inits = d$inits,
                parameters.to.save = d$params,
                model.file = model.jags,
                n.chains = 3,
                n.thin = 10,
                n.iter = 9000,
                n.burnin = 3000,
                working.directory = NULL,
                n.cluster = 3)  # number of cluster workers (cores) used

As you can see, and this is the default, the number of chains (n.chains, which is 3 in my case) equals the number of cores used (n.cluster).

  1. How is this influencing the way the MCMC is sampled?
  2. Is there an optimal number of cores to use with R when running MCMC chains in parallel?
  3. I saw that I cannot go under 3 cores if I have 3 chains. It gives me this error: Error in res[[ch]] : subscript out of bounds. Why?
  4. If I increase the number of cores, it takes longer (as a comparison, with 12 cores it takes 7.2 times longer than with 3 cores)! Shouldn’t it be the reverse?
  5. How can I make the script faster without reducing the number of iterations or the burn-in, and without adding thinning (more cores? more RAM?)?

My computer has 16 cores, so I have flexibility in the number of cores (it also has 64 GB of RAM and a 3 GHz Intel Xeon E5 processor).

Aldos answered 24/5, 2016 at 15:50 Comment(1)
You can't split a single chain across multiple cores, because each iteration of the chain depends on the previous iteration. - Governance

It is not possible to split 3 chains across more than 3 cores, because a single chain cannot be divided between cores. When running JAGS in parallel, here is effectively what happens:

  1. Do the specified burn-in for each chain. In your example, each of the three chains would run the model for 3000 steps and not store those draws.

  2. Once each chain has finished its burn-in, it keeps sampling from the posterior distribution and stores every n.thin-th draw. In your example, each chain would run for a further 6000 steps and return 600 samples ((n.iter - n.burnin)/n.thin).
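
The bookkeeping for the call in the question can be checked with a few lines of R. This is only a sketch, assuming the R2jags convention discussed in the comments, where n.iter is the total number of iterations per chain including burn-in. The names post_burnin, saved_per_chain and saved_total are made up for illustration.

```r
# Values from the jags.parallel() call in the question
n.iter   <- 9000   # total iterations per chain, burn-in included
n.burnin <- 3000   # iterations discarded per chain
n.thin   <- 10     # keep every 10th post-burn-in iteration
n.chains <- 3

post_burnin     <- n.iter - n.burnin           # iterations after burn-in, per chain
saved_per_chain <- post_burnin / n.thin        # draws kept per chain
saved_total     <- saved_per_chain * n.chains  # draws kept across all chains

saved_per_chain  # 600
saved_total      # 1800
```

Note that the number of cores does not appear anywhere in this calculation: it only determines how many chains can run at the same time.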

So, let's move on to your other questions (question 1 is explained above).

  2. The answer depends on what else you are doing on the computer at the time. Avoid running on all K cores, since that leaves little computing power for anything else. I generally run K-1 chains on K-1 cores for larger models. For simple models, it does not really matter.

  3. You could run multiple chains on fewer cores, but that slows things down because the chains sharing a core have to be computed sequentially. Conversely, it does not work to farm out fewer chains onto more cores. If you have x chains, you should not use more than x cores.

  4. This follows from the answers to questions 2 and 3. More chains increase the total amount of computation, but more cores without more chains will not make it faster.

  5. This really cannot be answered without looking at your model.
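
The chain-per-core idea can be illustrated without JAGS. The following is a minimal sketch using only the parallel package that ships with R. The toy Metropolis sampler run_chain() and the variable n_workers are hypothetical stand-ins, not how jags.parallel is implemented. Each chain is a sequential loop, so the only thing that can be distributed is whole chains, and the number of workers is capped at the number of chains.

```r
library(parallel)

# Toy random-walk Metropolis sampler for a standard normal target.
# Iteration i needs the value from iteration i - 1, so one chain
# cannot be divided between cores.
run_chain <- function(seed, n_iter = 9000, n_burnin = 3000, n_thin = 10) {
  set.seed(seed)
  x <- numeric(n_iter)  # chain starts at 0
  for (i in 2:n_iter) {
    proposal  <- x[i - 1] + rnorm(1)
    log_ratio <- dnorm(proposal, log = TRUE) - dnorm(x[i - 1], log = TRUE)
    x[i] <- if (log(runif(1)) < log_ratio) proposal else x[i - 1]
  }
  # Drop the burn-in, then keep every n_thin-th draw
  x[seq(n_burnin + n_thin, n_iter, by = n_thin)]
}

n_chains <- 3
cores <- detectCores()
if (is.na(cores)) cores <- 2  # detectCores() can return NA on some platforms

# Leave one core free and never start more workers than chains:
# extra workers would have nothing to do.
n_workers <- max(1, min(n_chains, cores - 1))

cl <- makeCluster(n_workers)
chains <- parLapply(cl, seq_len(n_chains), run_chain)  # one task per chain
stopCluster(cl)

lengths(chains)  # saved draws per chain
```

With fewer workers than chains, parLapply simply gives some workers more than one chain to run one after another, which is slower but still works in this sketch.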

Sycosis answered 24/5, 2016 at 19:55 Comment(4)
A sample of (n.iter - n.burnin)/n.thin is taken from each chain (not n.iter * n.thin/n.chains). So for the OP's example, there will be 600 iterations returned per chain ((9000-3000)/10). - Governance
Is that the way rjags does it? I primarily use runjags and it will increase the number of iterations based on how much it is thinned. I'll edit the answer accordingly. - Sycosis
I'm pretty sure that jags.parallel is from R2jags, which calculates it as I described (I'm pretty sure...). Not sure about rjags - I thought it was the same. - Governance
I was told that it might be possible to run 3 chains on more cores with this package: cran.r-project.org/web/packages/Rmpi/index.html. But that I would have to add a lot of code. - Aldos
