Parallel R on a Windows cluster

I have a Windows HPC Server with several backend nodes, and I would like to run parallel R across more than one of them. I think parallel R uses snow on Windows, but I'm not sure. My question is: do I also need to install R on the backend nodes? Say I want to use two nodes with 32 cores each:

cl <- makeCluster(c(rep("COMP01",32),rep("COMP02",32)),type="SOCK")

Right now, it just hangs.

What else do I need to do? Do the backend nodes need some kind of sshd running to be able to communicate with each other?

Ruyter answered 9/7, 2013 at 12:19 Comment(2)
Yes, you do need to install R on each node. - Whiplash
@HongOoi Do I need to specify the location of R, or does it just use the default one? - Ruyter

Setting up snow on a Windows cluster is rather difficult. Each machine needs to have R and snow installed, but that's the easy part. To start a SOCK cluster, you also need an sshd server running on each worker machine, because snow launches remote workers over ssh by default. Even with sshd in place you can still run into trouble, so I wouldn't recommend it unless you're good at debugging and Windows system administration.

I think your best option on a Windows cluster is to use MPI. I don't have any experience with MPI on Windows myself, but I've heard of people having success with the MPICH and DeinoMPI MPI distributions for Windows. Once MPI is installed on your cluster, you also need to install the Rmpi package from source on each of your worker machines. You would then create the cluster object using the makeMPIcluster function. It's a lot of work, but I think it's more likely to eventually work than trying to use a SOCK cluster due to the problems with ssh/sshd on Windows.
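As a rough sketch of the MPI route, the check below starts an MPI cluster through snow and asks each worker which machine it is running on. It assumes an MPI runtime, Rmpi, and snow are already installed on every node; the helper name run_mpi_check is hypothetical and not part of either package.

```r
# Sketch only: assumes an MPI runtime plus the Rmpi and snow packages are
# installed on every node. run_mpi_check is a hypothetical helper name.
run_mpi_check <- function(n_workers) {
  # Spawn the requested number of MPI workers via snow.
  cl <- snow::makeMPIcluster(n_workers)
  # Always shut the workers down, even if a call below fails.
  on.exit(snow::stopCluster(cl), add = TRUE)
  # Ask every worker for the name of the machine it runs on.
  hosts <- snow::clusterCall(cl, function() Sys.info()[["nodename"]])
  # Count workers per machine.
  table(unlist(hosts))
}

# 64 workers would match the two 32-core nodes in the question:
# run_mpi_check(64)
```

If the resulting table lists only one machine, the workers are probably not being spread across the nodes, which is worth checking before running real jobs.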

If you're desperate to run a parallel job once or twice on a Windows cluster, you could try using manual mode. It allows you to create a SOCK cluster without ssh:

workers <- c(rep("COMP01",32), rep("COMP02",32))
cl <- makeSOCKcluster(workers, manual=TRUE)

The makeSOCKcluster function will prompt you to start each one of the workers, displaying the command to use for each. You have to manually open a command window on the specified machine and execute the specified command. It can be extremely tedious, particularly with many workers, but at least it's not complicated or tricky. It can also be very useful for debugging in combination with the outfile='' option.
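A minimal sketch of combining manual mode with outfile, starting with only a couple of workers per node so the manual startup stays manageable while debugging. The helper worker_list is a hypothetical name introduced here; the cluster is only created in an interactive session with snow installed.

```r
# Hypothetical helper: expand node names into one entry per worker process.
worker_list <- function(nodes, cores_per_node) {
  rep(nodes, each = cores_per_node)
}

# Start small while debugging; use 32 per node once everything works.
workers <- worker_list(c("COMP01", "COMP02"), 2)
print(workers)

if (interactive() && requireNamespace("snow", quietly = TRUE)) {
  # manual = TRUE displays the command to run by hand on each worker machine.
  # outfile = "" leaves worker output unredirected, so messages should show
  # up in the command window where each worker was started.
  cl <- snow::makeSOCKcluster(workers, manual = TRUE, outfile = "")
  # Confirm that each worker answers and report its machine name.
  print(snow::clusterCall(cl, function() Sys.info()[["nodename"]]))
  snow::stopCluster(cl)
}
```

Once the small cluster responds, the same call with 32 workers per node reproduces the setup from the question.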

Descendible answered 9/7, 2013 at 14:48 Comment(4)
@_Steve Weston: Thanks for your help, Steve. When you say "manual mode" to run a parallel job, what do you mean? - Ruyter
@Ruyter I added more information on manual mode to answer your question. - Descendible
+1, the first two paragraphs are extremely useful. Why is there no guide or documentation that explains this? - Liverish
I'm sure the asker and the answerer both already know this, but for those of us stumbling around: as of Rmpi 0.6-4, per cran.r-project.org/web/packages/Rmpi/README, Microsoft MPI is used on Windows machines by default, so it needs to be installed first. - Politesse
