How to configure parallel remote kernels in Mathematica?

When I try to configure remote kernels in Mathematica via Evaluation > Parallel Kernel Configuration..., I go to "Remote Kernels" and add hosts. When I then try to launch the remote kernels, only some of them start (the number varies), and I get messages like the following:

KernelObject::rdead: Subkernel connected through remote[nodo2] appears dead. >>
LinkConnect::linkc: Unable to connect to LinkObject[[email protected],[email protected],38,12]. >>
General::stop: Further output of LinkConnect::linkc will be suppressed during this calculation. >>

Any ideas how to get this working?

Take into account it sometimes does load some of the remote kernels but never all of them. Thanks in advance.


This is my output for $ConfiguredKernels // InputForm:

{SubKernels`LocalKernels`LocalMachine[4], 
 SubKernels`RemoteKernels`RemoteMachine["nodo2", 2], 
 SubKernels`RemoteKernels`RemoteMachine["nodo1", 2], 
 SubKernels`RemoteKernels`RemoteMachine["nodo3", 2], 
 SubKernels`RemoteKernels`RemoteMachine["nodo4", 2], 
 SubKernels`RemoteKernels`RemoteMachine["nodo5", 2]}

It did load all of the kernels once, but usually only one or two remote kernels start.

Quevedo answered 21/7, 2011 at 22:58 Comment(5)
BTW, this happens with Mathematica 8; it used to work with Mathematica 7. - Quevedo
Has anyone confirmed or replicated this problem? - Kershner
I can confirm this doesn't work. I tried connecting to my own local kernel via the remote kernel interface on Mathematica 8.0.1, and it failed with the same error message. - Why
To help diagnose/reproduce the problem, the following information would be useful: $ConfiguredKernels // InputForm - Savonarola
I just encountered the same problem, and in my case the cause appears to be that the disk on the remote machine is full. This might only apply to my case, though. - Predominate

There is very little information given, so this answer may not be 100% useful.

The first issue to always consider is licensing on the remote machine. If some kernels launch, but others don't, it is possible you have run out of licenses for kernels on that machine. The rest of this post will assume licensing is not the issue.

Connection Method

The remote kernel interface in Mathematica by default assumes the rsh protocol, which is not the right choice for many environments, because rsh is not a very secure protocol.

The other option is ssh, which is much more widely supported. There are many ssh clients, but I will focus on a client included with Mathematica, namely WolframSSH.jar. This client is Java-based, which has the added benefit of working the same way on all platforms supported by Mathematica (Mac, Windows and Linux).

To avoid having to type a password for every kernel connection, it is convenient to create a private/public key pair. The private key stays on your computer and the public key needs to be placed on the remote computer (usually in the .ssh folder of the remote home directory).

To generate a private/public key pair you can use the WolframSSHKeyGen.jar file, like so:

java -jar c:\path\to\mathematica\SystemFiles\Java\WolframSSHKeyGen.jar

and follow the instructions in the dialogs that come up. When done, copy the public key to the .ssh folder on the remote machine. In my case, I named the private key kernel_key, and the public key was automatically named kernel_key.pub.

You can now test the connection from a command line, like so (using the ls command on the remote machine):

java -jar c:\path\to\mathematica\SystemFiles\Java\WolframSSH.jar --keyfile kernel_key [email protected] ls

If this works, you should be able to finish on the Mathematica side of things.

Remote Kernel Connection

To make a connection you need the following settings. First, the name of the remote machine:

machine = "machine.example.com";

The login name, usually $UserName:

user = $UserName;

The ssh binary location:

ssh = FileNameJoin[{$InstallationDirectory, "SystemFiles", "Java", "WolframSSH.jar"}];

The private key as described above:

privatekey = "c:\\users\\arnoudb\\kernel_key";

The launch command for the kernel:

math = "math -mathlink -linkmode Connect `4` -linkname `2` -subkernel -noinit >& /dev/null &";

A configuration function to put everything together:

ConfigureKernel[machine_, user_, ssh_, privatekey_, math_, number_] :=
 SubKernels`RemoteKernels`RemoteMachine[
  machine,
  "java -jar \"" <> ssh <> "\" --keyfile \"" <> privatekey <> "\" " <> user <> "@" <> machine <> " \"" <> math <> "\"", number]

This uses the configuration function and defines it to use 4 remote kernels:

remote = ConfigureKernel[machine, user, ssh, privatekey, math, 4]

This launches the kernels:

LaunchKernels[remote]
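
For several hosts, as in the question, the same function can be mapped over a list of machine names. This is only a sketch: it reuses ConfigureKernel, user, ssh, privatekey and math as defined above, takes the host names from the question, and assumes the same login name, key and kernel count work on every host. The names hosts, kernelsPerHost, remotes and launched are introduced here for illustration.

```mathematica
(* Host names taken from the question; adjust to your own cluster. *)
hosts = {"nodo1", "nodo2", "nodo3", "nodo4", "nodo5"};

(* Assumption: the same number of kernels is wanted on every host. *)
kernelsPerHost = 2;

(* One RemoteMachine description per host, built with ConfigureKernel from above. *)
remotes = ConfigureKernel[#, user, ssh, privatekey, math, kernelsPerHost] & /@ hosts;

(* Launch host by host, so that any error messages can be matched to a machine. *)
launched = LaunchKernels /@ remotes;
```

Launching one host at a time is slower than a single LaunchKernels call, but it makes it easier to see which machine is the one that fails.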

This command verifies that the kernels are all connected and remote:

ParallelEvaluate[$MachineName]
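
Since the original problem was that only some kernels start, it may also help to count the kernels per machine rather than read through the raw list. A minimal sketch, using only built-in functions:

```mathematica
(* Count how many running subkernels report each machine name. *)
Tally[ParallelEvaluate[$MachineName]]

(* Total number of running subkernels, local and remote. *)
Length[Kernels[]]
```

If a host shows fewer kernels than were requested for it, that host is the one to check for licensing or connection problems.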
Savonarola answered 11/11, 2011 at 20:20 Comment(0)
