Any use case for mpirun on a SLURM-managed cluster?

I was recently looking at this post about mpirun vs mpiexec and this post about srun vs sbatch, but I am wondering how mpirun relates to SLURM and srun.

In most examples I see, the scripts submitted with sbatch contain srun <program> to launch an MPI program, but I sometimes see ones that use mpirun or mpiexec instead. However, I don't understand why one would do this. As exemplified in another question I recently asked, it seems that using mpirun or mpiexec can produce all sorts of (implementation-dependent?) errors, and there is no reason not to use srun.
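
For concreteness, the kind of submission script I mean looks roughly like this (the program name and resource numbers are just placeholders):

    #!/bin/bash
    #SBATCH --job-name=mpi-test
    #SBATCH --nodes=2                # two nodes
    #SBATCH --ntasks-per-node=16     # 16 MPI ranks per node
    #SBATCH --time=00:10:00

    # launch the MPI program through SLURM's own launcher
    srun ./my_mpi_program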

Is this accurate, or is there a good reason to use mpirun or mpiexec instead of srun to execute programs on a SLURM-managed cluster?

Busily asked 12/7/2018 at 7:54 (3 comments)
mpirun starts a proxy on each node, and the proxies then start the MPI tasks (so the MPI tasks are not directly known by the resource manager). srun, on the other hand, starts the MPI tasks directly, but that requires some support (PMI or PMIx) from SLURM. – Milagro
@GillesGouaillardet So you would be using mpirun to get parallelization just across the (perhaps 16) cores of one node? Isn't the idea of srun to be able to create multiple processes across nodes and within them? – Busily
From a high-level point of view, the result is the same. srun (aka direct launch) uses the resource manager for the "wire-up" (i.e. initialization of the parallel job), whereas mpirun uses its own proxies. – Milagro

The answer depends heavily on the flavor of MPI you are using and how well it is integrated with SLURM.

For myself, and I fully appreciate that this is a matter of personal preference, I'd say that, since I have to juggle a multitude of different clusters and environments, I try to reduce the variability as much as possible. So if SLURM is available on the cluster I use, I will try to make all the run-time adjustments for my code via SLURM and sbatch, and let MPI inherit them.

For that, I define what I want and how I want my MPI code to run via my #SBATCH submission parameters: number of nodes, number of cores per process, number of processes per node, etc. Then the MPI launch will hopefully be as simple as possible via the mpirun, mpiexec, or similar command that the MPI library provides. Most (if not all) recent MPI libraries can directly detect that the job has been submitted within SLURM and inherit SLURM's process placement without any added effort. With Intel MPI, for example, I usually use mpirun -bootstrap slurm <mycode> and all processes are placed as expected. In fact, this -bootstrap slurm option might not even be necessary, but I keep it just in case.
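
As a sketch of what I mean (the module name, program name, and resource numbers are placeholders and will differ between clusters), the whole layout is expressed in the #SBATCH directives and mpirun simply inherits it:

    #!/bin/bash
    #SBATCH --nodes=4                # number of nodes
    #SBATCH --ntasks-per-node=8      # MPI processes per node
    #SBATCH --cpus-per-task=2        # cores per MPI process

    module load intel-mpi            # placeholder module name, site-dependent

    # Intel MPI's mpirun detects the SLURM allocation and inherits the
    # placement defined above; -bootstrap slurm just makes that explicit
    mpirun -bootstrap slurm ./my_code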

Conversely, using srun instead of the library's mpirun or mpiexec requires that the MPI code has been linked against SLURM's process management library. This may or may not be the case, so it may or may not do what you want. More importantly, even if it does work, it won't give you any extra advantage compared to just using the MPI default launcher, since the process management will already have been done by SLURM at job submission via sbatch. So for me, except for the rare quick-and-dirty test, whenever SLURM is used for batch scheduling, I don't use srun but rather the MPI library's default mpirun or mpiexec command.
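
For comparison, direct launch with srun would look something like the sketch below (again with placeholder names); it only works if the MPI library was built against a PMI/PMIx interface that the SLURM installation provides (srun --mpi=list shows which ones are available):

    #!/bin/bash
    #SBATCH --nodes=4
    #SBATCH --ntasks-per-node=8

    # direct launch: SLURM itself starts the MPI ranks, so the MPI library
    # must support the PMI/PMIx flavor selected here
    srun --mpi=pmix ./my_code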

Duhamel answered 12/7/2018 at 8:32 (3 comments)
So if I am just logging onto a cluster managed with SLURM and running, for instance, module load intel-mpi mpi4py, is this a case where using srun makes sense? – Busily
Also, you say that "most (if not all) recent MPI libraries can directly detect that the job has been submitted within SLURM and inherit SLURM's process placement without any added effort"... is the question I linked, with errors using Intel MPI under SLURM, an exception, or was I just using something incorrectly? – Busily
I also don't see why using srun would require that the MPI code has been linked with SLURM's library, while having mpirun be aware of and inherit from SLURM's process placement automatically would not. – Busily
