Using snow (and snowfall) with AWS for parallel processing in R

Following up on my earlier, similar SO question, I tried using snow/snowfall on AWS for parallel computing.

What I did was:

  • In sfInit(), I passed the public DNS name to the socketHosts parameter: sfInit(parallel=TRUE, socketHosts=list("ec2-00-00-00-000.compute-1.amazonaws.com"))
  • This returned the error: Permission denied (publickey)
  • I then followed the instructions (correctly, I presume!) in the 'Passwordless Secure Shell (SSH) login' section of http://www.imbi.uni-freiburg.de/parallel/
  • I simply appended (with cat) the contents of the .pem file that I created on AWS to ~/.ssh/authorized_keys, both on the AWS instance I want to connect to from my master AWS instance and on the master instance itself

Is there anything I am missing? I would be very grateful if users could share their experiences of using snow on AWS.

Thank you very much for your suggestions.

UPDATE: Here is the solution I found to my specific problem:

  • I used StarCluster to set up my AWS cluster
  • Installed the snowfall package on all the nodes of the cluster
  • From the master node, issued the following commands:
  • hostslist <- list("ec2-xxx-xx-xxx-xxx.compute-1.amazonaws.com", "ec2-xx-xx-xxx-xxx.compute-1.amazonaws.com")
  • sfInit(parallel=TRUE, cpus=2, type="SOCK", socketHosts=hostslist)
  • l <- sfLapply(1:2, function(x) system("ifconfig", intern=TRUE))
  • lapply(l, function(x) x[2])
  • sfStop()
  • The IP information in the output confirmed that the AWS nodes were being used
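
For anyone who wants the steps above as a single script, here is a minimal sketch. It assumes snowfall is installed on every node and that passwordless SSH from the master already works. The names check_nodes and second_lines are made up for illustration, and the host names are the same placeholders as above.

```r
# Hypothetical helper: keep the second line of each worker's ifconfig output,
# which is what lapply(l, function(x) x[2]) does in the steps above.
second_lines <- function(outputs) {
  lapply(outputs, function(x) x[2])
}

# Hypothetical wrapper around the snowfall calls from the update.
# hosts: list of public DNS names of the worker nodes.
check_nodes <- function(hosts) {
  # Loaded here so that defining the function does not require the package.
  library(snowfall)
  sfInit(parallel = TRUE, cpus = length(hosts),
         type = "SOCK", socketHosts = hosts)
  # Shut the cluster down even if a worker call fails.
  on.exit(sfStop(), add = TRUE)
  # Run ifconfig once per task and capture its output as a character vector.
  out <- sfLapply(seq_along(hosts),
                  function(i) system("ifconfig", intern = TRUE))
  second_lines(out)
}

# Placeholder host names; replace them with your own nodes before running.
hostslist <- list("ec2-xxx-xx-xxx-xxx.compute-1.amazonaws.com",
                  "ec2-xx-xx-xxx-xxx.compute-1.amazonaws.com")
# check_nodes(hostslist)
```

The call is left commented out because it only makes sense on a running cluster. Which line of the ifconfig output holds the address depends on the operating system, so x[2] may need adjusting.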
Vergara answered 7/9, 2011 at 12:20 Comment(3)
I believe the .pem file is an X.509 certificate, not an RSA public key. You should generate the key pair on the master node, as described in that section, and copy the public key to the authorized_keys file of the slave node(s). - Broadsword
I believe @Broadsword is correct; he should post that as an answer, rather than just a comment, so we can upvote. :) - Jacobo
Perhaps disregard my "use StarCluster" answer on your other question, as I now see you already have. But try running the whole cluster within the private IP range; I had no need to fuss with keys or certificates once I started doing that. - Ipswich
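
The key setup suggested in the first comment (generate a key pair on the master, then append the public key to authorized_keys on each worker) could be sketched as below. This is only a dry run: the hypothetical helper key_setup_commands builds the shell commands and prints them without running anything. The .pem path, the login name "ubuntu" and the key file name are assumptions to replace with your own values.

```r
# Hypothetical helper: build (but do not run) the shell commands for the
# key setup described in the comment above.
# hosts: character vector of worker host names
# pem:   EC2 key pair file used for the one-time first login (assumed path)
# user:  login name on the workers (assumed; depends on the AMI)
# key:   where the new key pair is written on the master (assumed path)
key_setup_commands <- function(hosts, pem = "~/.ssh/mykey.pem",
                               user = "ubuntu", key = "~/.ssh/id_rsa") {
  # 1. Generate an RSA key pair on the master with an empty passphrase.
  gen <- sprintf("ssh-keygen -t rsa -N '' -f %s", key)
  # 2. Append the public half to authorized_keys on every worker,
  #    authenticating this one time with the EC2 key pair file.
  copy <- sprintf(
    "cat %s.pub | ssh -i %s %s@%s 'cat >> ~/.ssh/authorized_keys'",
    key, pem, user, hosts
  )
  c(gen, copy)
}

cmds <- key_setup_commands("ec2-00-00-00-000.compute-1.amazonaws.com")
writeLines(cmds)
```

Printing the commands first makes it easy to check paths and user names before pasting them into a shell on the master node.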

I believe @Anatoliy is correct: you're using an X.509 certificate. For the precise steps to take to add the SSH keys, look at the "Types of credentials" section of the EC2 Starters Guide.

To upload your own SSH keys, take a look at this page from Alestic.

It is a little confusing at first, but you'll want to keep straight which are your access keys, which are your certificates, and which are your key pairs (the latter may appear in text files marked DSA or RSA).

Jacobo answered 7/9, 2011 at 13:17 Comment(4)
Thanks for the links. Somehow this is turning out to be more hairy than I expected. - Vergara
It gets easier, but no thanks to Amazon's intro materials - they generally make sense only after a person masters everything. Alestic is a good site to know. - Jacobo
What do you think about web.mit.edu/stardev/cluster/docs/0.92rc2/quickstart.html ? - Vergara
Just for future reference, StarCluster has a 'createkey' command that will create a new EC2 key pair for you: $ starcluster createkey mykey -o ~/.ssh/ec2key.rsa - Opiate

That doesn't look too bad, but the .pem file is the wrong key. It is not always that simple, though, and many people struggle with these issues. You can find a lot of tips in this post:

In my experience, most people have problems with these steps:

  • Can you log on to the machines via ssh? (ssh ec2-00-00-00-000.compute-1.amazonaws.com). Try to connect using the public DNS name, not the public IP.
  • Check in your AWS "Security groups" that port 22 is open for all machines!
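
Both checks can also be run from R on the master node before calling sfInit(). The following is a rough sketch: port_open and ssh_ok are hypothetical helper names, and the host is the placeholder from the question. port_open only tests whether the port accepts TCP connections, while ssh_ok tests whether a non-interactive login succeeds.

```r
# Hypothetical helper: TRUE if a TCP connection to host:port can be opened
# within `timeout` seconds, FALSE otherwise.
port_open <- function(host, port = 22, timeout = 5) {
  tryCatch({
    con <- suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = timeout)
    )
    close(con)
    TRUE
  }, error = function(e) FALSE)
}

# Hypothetical helper: TRUE if ssh can log in without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password or
# a host key confirmation.
ssh_ok <- function(host) {
  status <- system2("ssh",
                    c("-o", "BatchMode=yes", "-o", "ConnectTimeout=5",
                      host, "true"),
                    stdout = FALSE, stderr = FALSE)
  identical(status, 0L)
}

# Placeholder host name; replace it with your own nodes before running.
hosts <- c("ec2-00-00-00-000.compute-1.amazonaws.com")
# vapply(hosts, port_open, logical(1))
# vapply(hosts, ssh_ok, logical(1))
```

If port_open is TRUE but ssh_ok is FALSE, the security group is probably fine and the problem lies with the keys or an unconfirmed host key.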

If you plan to start more than 10 worker machines, you should set up an MPI installation on your machines (much better performance!).

Markus from cloudnumbers.com :-)

Derose answered 7/9, 2011 at 13:13 Comment(1)
I can ssh into the slave nodes, and all the machines belong to the same security group. I also use the public DNS to connect. - Vergara
