I am trying to parallelize an R script on a SLURM HPC cluster using the future.batchtools package. While the script is executed on multiple nodes, it only uses 1 CPU instead of the 12 that are available.
So far I have tried different configurations (cf. the code attached), none of which lead to the expected result. My bash file with the configuration is as follows:
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --cpus-per-task=12
R CMD BATCH test.R output
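Note that future.batchtools submits each future in the first layer as its own SLURM job through a batchtools template file (it looks for a file such as batchtools.slurm.tmpl), so the resources of the R workers are governed by that template rather than by the launch script above. As a rough sketch only, such a template might look like the following; the placeholders follow batchtools' brew-style templates, and the exact resource fields are assumptions that must match whatever you pass via resources:
#!/bin/bash
#SBATCH --job-name=<%= job.name %>
#SBATCH --output=<%= log.file %>
#SBATCH --error=<%= log.file %>
#SBATCH --ntasks-per-node=<%= resources[["ntasks-per-node"]] %>
Rscript -e 'batchtools::doJobCollection("<%= uri %>")'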
In R, I use a foreach loop:
library(future.batchtools)
library(doFuture)  # provides the foreach %dopar% backend for futures
registerDoFuture()
# First level = cluster (one SLURM job per future)
# Second level = multiprocess (cores within a job)
# https://cran.r-project.org/web/packages/future.batchtools/vignettes/future.batchtools.html
plan(list(batchtools_slurm, multiprocess))
# Parallel for loop
result <- foreach(i in 100) %dopar% {
  Sys.sleep(100)
  return(i)
}
I would appreciate it if someone could give me guidance on how to configure the code to use multiple nodes and multiple cores.
The first layer of parallelization is handled by batchtools_slurm. If you use two layers of foreach parallelization, the second layer will use multiprocess (given that plan() is set up as above). FYI, foreach(i in 100) will only iterate over a single value. – Etoile

result <- foreach(i = 1:100) %dopar% {
  foreach(jRun = 1:100) %dopar% {
    # calculation
  }
}

This code is closer to the one I am using. Still, I do not exploit multiple nodes on the HPC. Is there anything else I am overlooking? – Granddaughter

Use plan(list(tweak(batchtools_slurm, resources = list("ntasks-per-node" = 4)), multiprocess)), cf. github.com/HenrikBengtsson/future.batchtools. To verify that this works: with that new plan, call f <- future(availableCores()); ncores <- value(f). That will show how many cores the second layer will have available. You should get four (4). – Etoile
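Putting the comments together, a minimal end-to-end sketch of the two-layer setup might look as follows. This is an illustration under assumptions, not code from the thread: it presumes a working batchtools SLURM template, the resource name ntasks-per-node is cluster-specific, and registerDoFuture() is repeated inside the outer loop as a defensive measure so the inner %dopar% is also backed by futures on the worker:

library(doFuture)           # attaches foreach and future; %dopar% runs via futures
library(future.batchtools)
registerDoFuture()

# First layer: each outer iteration becomes a SLURM job requesting 4 tasks per node.
# Second layer: multiprocess parallelization across the cores of that job.
plan(list(
  tweak(batchtools_slurm, resources = list("ntasks-per-node" = 4)),
  multiprocess
))

# Sanity check from the comments: how many cores does the second layer see?
f <- future(availableCores())
ncores <- value(f)  # per the comment above, this should report 4
print(ncores)

result <- foreach(i = 1:100) %dopar% {
  registerDoFuture()  # re-register on the worker (assumption; may be redundant)
  foreach(jRun = 1:100) %dopar% {
    i * jRun  # placeholder calculation
  }
}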