Slurm: Why use srun inside sbatch?

1

40

In an sbatch script, you can launch programs or scripts directly (for example, an executable file myapp), but many tutorials use srun myapp instead.

Despite reading some documentation on the topic, I do not understand the difference and when to use each of those syntaxes.

I hope this question is precise enough (1st question on SO), thanks in advance for your answers.

Vulgarize answered 5/12, 2018 at 16:32 Comment(3)
is the scenario you have the same as the submission script I provided as an example in this question: #72092772 ? – Supernaturalism
can you provide an example sbatch submission script for your question? – Supernaturalism
@CharlieParker This question dates from my previous job: I don't have any access to a Slurm HPC now and won't be able to provide any reliable example 🤷‍♂️ – Vulgarize
34

The srun command is used to create job 'steps'.

First, it gives better reporting of resource usage: the sstat command provides real-time resource usage for processes that are started with srun, and each step (each call to srun) is reported individually in the accounting.
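
As a rough illustration of per-step reporting, the sketch below writes a small batch script in which each srun call becomes its own job step. The file name steps_job.sh, the job name, the time limit, and the ./preprocess executable are hypothetical; myapp is the executable from the question. The sstat and sacct lines are shown only as comments because they need a real job ID on a Slurm cluster.

```shell
# Write a hypothetical job script to disk (all names here are illustrative).
cat > steps_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=steps-demo
#SBATCH --ntasks=1
#SBATCH --time=00:10:00

# Each srun call below becomes its own job step (<jobid>.0, then <jobid>.1).
srun ./preprocess
srun ./myapp
EOF

# On a Slurm cluster you could then submit and inspect it, for example:
#   sbatch steps_job.sh
#   sstat -j <jobid>.0 --format=JobID,MaxRSS,AveCPU      # while step 0 is running
#   sacct -j <jobid> --format=JobID,JobName,Elapsed,MaxRSS,State
```

If the two programs were launched without srun, they would run inside the batch step, and the accounting would show them lumped together rather than as separate steps.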

Second, it can be used to set up many instances of a serial program (a program that uses only one CPU) within a single job, and to micro-schedule those programs inside the job allocation.
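
A minimal sketch of that pattern, assuming a hypothetical serial executable ./myprog and made-up input file names: the job asks for four single-CPU tasks, and each srun step is put in the background so the four runs proceed side by side. How strictly steps are kept on separate CPUs depends on the Slurm version and site configuration; --exclusive is used here, and some newer installations may want --exact instead.

```shell
# Write a hypothetical job script that packs several serial runs into one job.
cat > serial_job.sh <<'EOF'
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=1

# One single-task step per input file, each started in the background.
for i in 1 2 3 4; do
    srun --nodes=1 --ntasks=1 --exclusive ./myprog "input_${i}.dat" &
done

# Block until every background step has finished; otherwise the batch
# script would exit early and the job would end with the steps still running.
wait
EOF

# Submit on a Slurm cluster with:  sbatch serial_job.sh
```

If the loop launched more steps than the allocation has tasks, the extra steps would be expected to wait for a free slot, which is the micro-scheduling mentioned above.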

Finally, for parallel jobs, srun also plays the important role of starting the parallel program and setting up the parallel environment. It starts as many instances of the program as were requested with the --ntasks option, on the CPUs that were allocated for the job. In the case of an MPI program, it also handles the communication between the MPI library and Slurm.
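
For the parallel case, a sketch along these lines may help. The executable ./my_mpi_app, the file name mpi_job.sh, and the node and task counts are assumptions for illustration. srun picks up the task count from the allocation, so it does not need to be repeated on the srun line.

```shell
# Write a hypothetical job script for an MPI program.
cat > mpi_job.sh <<'EOF'
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks=8

# srun inherits the allocation and starts $SLURM_NTASKS (here 8) copies of
# the program, spread over the allocated nodes.
echo "Launching ${SLURM_NTASKS} tasks"
srun ./my_mpi_app

# By contrast, a bare "./my_mpi_app" line would start a single process on
# the first node of the allocation only.
EOF

# Submit on a Slurm cluster with:  sbatch mpi_job.sh
```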

Superbomb answered 5/12, 2018 at 20:48 Comment(5)
Thanks a lot for this precise answer – Vulgarize
In the case of setting up many instances of a serial program, a typical case is srun -N1 -n1 myprog & right? If the sbatch job allocation is over > 1 node, then will srun ensure each instance runs on an independent CPU better than just myprog &? In fact, what happens if the script simply has myprog & and the allocation is over > 1 node? – Coarsen
if the script simply has myprog & and the allocation is over > 1 node, only the first node will have processes running, and those processes will fight for access to the same CPUs – Superbomb
what if I have GPUs -- single and multiple? – Supernaturalism
would an example run with srun be srun python main.py? Asking cuz I only know with srun hostname – Supernaturalism
