Allow foreach workers to register and distribute sub-tasks to other workers

I have R code that runs several tasks in parallel, using foreach with doMC as the backend. I want each foreach worker to recruit new workers of its own and hand the parallelizable parts of its code to them.

The current code looks like:

require(doMC)
require(foreach)
registerDoMC(cores = 8)

foreach (i = (1:8)) %dopar% {
<<some code here>>
    for (j in c(1:4))  {
    <<some other code here>>
    }
}

I am looking for an ideal code that would look like:

require(doMC)
require(foreach)
registerDoMC(cores = 8)

foreach (i = (1:8)) %dopar% {
<<some code here>>
    foreach (j = (1:4)) %dopar% {
    <<some other code here>>
    }
}

I saw an example of multi-paradigm parallelism using doSNOW and doMC here (https://www.rmetrics.org/files/Meielisalp2009/Presentations/Lewis.pdf#page=17). However, I do not know whether it does what I want or not.

Also, it seems that nested foreach does not apply here because it requires merging the two loops (see here), which I would rather not do: the inner loop only helps the outer one with a portion of its code. Please correct me if I am wrong.

Thanks.

Hirza answered 20/6, 2013 at 20:14 Comment(2)
Maybe not exactly what you want, but you can have nested foreach expressions: cran.r-project.org/web/packages/foreach/vignettes/nested.pdf . I don't know about recruiting more workers within the loops, however. - Brainwashing
Thanks. However, it seems nested foreach is not applicable to my case because it requires merging the two nested loops, while I need an internal loop that is called only for a portion of the code. I will update the question to reflect this. - Hirza

There's no particular problem with having a foreach loop inside of a foreach loop. Here's an example of a doMC loop inside a doSNOW loop:

library(doSNOW)
hosts <- c('host-1', 'host-2')
cl <- makeSOCKcluster(hosts)
registerDoSNOW(cl)
r <- foreach(i=1:4, .packages='doMC') %dopar% {
  registerDoMC(2)
  foreach(j=1:8, .combine='c') %dopar% {
    i * j
  }
}
stopCluster(cl)

It seems natural to me to use doMC for the inner loop, but you can do it any way you want. You could also use doSNOW for both loops, but then you would need to create and stop the inner snow cluster inside the outer foreach loop.
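As a rough sketch of the doSNOW-for-both-loops variant, the following assumes every worker runs on the local machine (the "localhost" entries are placeholders, not host names from the question) and that doSNOW is installed wherever a worker starts. Each outer worker creates, registers, and stops its own inner cluster:

```r
library(doSNOW)

# Outer cluster: two local workers (placeholder hosts; replace with real ones).
outer_cl <- makeSOCKcluster(rep("localhost", 2))
registerDoSNOW(outer_cl)

r <- foreach(i = 1:2, .packages = "doSNOW") %dopar% {
  # Each outer worker builds its own inner cluster and registers it locally.
  inner_cl <- makeSOCKcluster(rep("localhost", 2))
  registerDoSNOW(inner_cl)
  res <- foreach(j = 1:4, .combine = "c") %dopar% {
    i * j
  }
  # Stop the inner cluster before the outer task returns.
  stopCluster(inner_cl)
  res
}

stopCluster(outer_cl)
```

Starting a SOCK cluster costs noticeably more than forking, so this layout probably only pays off when the inner tasks are long-running.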

Here's an example of using doMC inside a doMC loop:

library(doMC)
registerDoMC(2)
r <- foreach(i=1:2, .packages='doMC') %dopar% {
  ppid <- Sys.getpid()
  registerDoMC(2)
  foreach(j=1:2) %dopar% {
    c(ppid, Sys.getpid())
  }
}

The results demonstrate that a total of six processes are forked by the doMC package, although only four execute the body of the inner loop:

> r
[[1]]
[[1]][[1]]
[1] 14946 14949

[[1]][[2]]
[1] 14946 14951


[[2]]
[[2]][[1]]
[1] 14947 14948

[[2]][[2]]
[1] 14947 14950

Of course, you need to be careful not to start too many processes on a single node. I found this kind of nesting a bit awkward, which led to the development of the nesting operator, %:%.
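For comparison, here is a minimal sketch of the nesting operator, assuming a doMC backend with two cores. It merges both loops into a single stream of tasks, so it only fits when nothing else has to run in the outer loop body, which is the limitation raised in the question:

```r
library(doMC)
registerDoMC(2)

# %:% turns the two loops into one stream of (i, j) tasks for the backend.
r <- foreach(i = 1:2, .combine = "rbind") %:%
  foreach(j = 1:2, .combine = "c") %dopar% {
    i * j
  }
```

Here the inner results are concatenated with c and the outer results are stacked with rbind, so r ends up as a 2 x 2 matrix.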

Merwyn answered 21/6, 2013 at 19:2 Comment(5)
Thanks. I am using MOAB to submit the jobs, therefore I cannot choose the host names. Is there any best practice to set the number of nodes and processors in the MOAB request in order to avoid Processor Affinity? For example, in the first example above, should I ask for nodes=4;ppn=8? - Hirza
@Hirza Do you want to use an MPI or a SOCK cluster? If MPI, are you using Open MPI or something else? And are you using Torque as the resource manager with Moab? - Merwyn
The msub script lines start with #PBS, therefore I think it is Torque/Moab. In the current code, I use doMC and foreach for the outer loop, while the inner loop is serial. I do not have administrative privilege, but I can install packages in my home folder. Thanks. - Hirza
@Hirza Yes, you should use nodes=4:ppn=8, but your R script should read the contents of $PBS_NODEFILE to determine how to create the SOCK cluster and how many cores to use in the inner foreach loop. I can show you the code to do that, but it won't fit in a comment. I suggest that you ask this as a new question. - Merwyn
I was thinking the same. This is the question: https://mcmap.net/q/2035296/-how-to-set-up-dosnow-and-sock-cluster-with-torque-moab-scheduler - Hirza
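The $PBS_NODEFILE suggestion in the comments could be sketched roughly as follows. This is not the commenter's code: read_pbs_layout is a hypothetical helper, and it assumes the usual Torque convention that the node file lists each host once per allocated core.

```r
# Hypothetical helper (not from the thread): summarise a Torque node file.
read_pbs_layout <- function(path = Sys.getenv("PBS_NODEFILE")) {
  if (!nzchar(path) || !file.exists(path)) {
    stop("node file not found; is this running inside a Torque/PBS job?")
  }
  nodes <- readLines(path)
  nodes <- nodes[nzchar(nodes)]   # drop empty lines
  slots <- table(nodes)           # assumed: a host appears once per allocated core
  list(
    hosts = names(slots),         # one outer (SOCK) worker per host
    cores = as.integer(slots)     # cores available to the inner loop on each host
  )
}
```

The hosts element could then be passed to makeSOCKcluster for the outer loop, and the matching cores value to registerDoMC inside it.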
