How do I get the Slurm job ID?

#!/bin/bash
#SBATCH -N 1      # nodes requested
#SBATCH -n 1      # tasks requested
#SBATCH -c 4      # cores requested
#SBATCH --mem=10  # memory in MB
#SBATCH -o outfile  # send stdout to outfile
#SBATCH -e errfile  # send stderr to errfile
#SBATCH -t 0:01:00  # time requested in hour:minute:second

module load anaconda
python hello.py jobid

Let's say I have this script and I want to pass the job ID to Python. How do I get the job ID? For example, when I run

sbatch script.sh
Submitted batch job 10514

How do I get the number 10514 and pass it to Python?

Sinusoid answered 31/7, 2022 at 20:55 Comment(3)
Impeditive: The shell script you posted does nothing but call a Python script. If you simply echo a result it goes to stdout, and you can catch it with var = sys.argv[1] in Python, as an example.
Sinusoid: Can you be more specific about how I do that?
December: You can get the job ID like this: jobid=$(sbatch --parsable script.sh)

You can read it from an environment variable: Slurm sets SLURM_JOB_ID for the batch script.

module load anaconda
python hello.py $SLURM_JOB_ID

All the environment variables that are available in the batch script are listed in the OUTPUT ENVIRONMENT VARIABLES section of the sbatch documentation.
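On the Python side, here is a minimal sketch of what hello.py could look like, assuming the job ID is passed as the first command-line argument as in the script above, with a fallback to reading SLURM_JOB_ID from the environment. The helper name get_job_id is hypothetical; the real contents of hello.py are not shown in the question.

```python
import os
import sys


def get_job_id(argv, environ):
    """Return the job ID from the first CLI argument, else from SLURM_JOB_ID, else None.

    get_job_id is a hypothetical helper written for illustration.
    """
    if len(argv) > 1:
        return argv[1]
    # Slurm sets SLURM_JOB_ID inside the batch script's environment.
    return environ.get("SLURM_JOB_ID")


if __name__ == "__main__":
    job_id = get_job_id(sys.argv, os.environ)
    if job_id is None:
        print("No job ID found (not running under Slurm?)")
    else:
        print(f"Running as Slurm job {job_id}")
```

Since child processes inherit the batch script's environment, the fallback means the script should also work when it is called as plain python hello.py inside the job.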

Saucedo answered 25/10, 2022 at 0:41 Comment(0)

You can use squeue. Its usage summary is as follows.

Usage: squeue [-A account] [--clusters names] [-i seconds] [--job jobid]
              [-n name] [-o format] [-p partitions] [--qos qos]
              [--reservation reservation] [--sort fields] [--start]
              [--step step_id] [-t states] [-u user_name] [--usage]
              [-L licenses] [-w nodes] [--federation] [--local] [--sibling]
              [-ahjlrsv]

I will show you how to do it with squeue -u, which lists the jobs of a given user. In my case, my username is s.1915438.

Here I submit a job.

[s.1915438@cl2 ~]$ sbatch jupyter.sh 
Submitted batch job 38529784
[s.1915438@cl2 ~]$ squeue -u s.1915438
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          38529784  gpu_v100 jupyter- s.191543  R       2:09      1 ccs2101

Here the job ID is 38529784. You can also use the USER variable as follows.

[s.1915438@cl2 ~]$ squeue -u $USER
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          38529784  gpu_v100 jupyter- s.191543  R       0:47      1 ccs2101

If you echo the USER variable then you will see it outputs your username. This is particularly useful when you write scripts.

[s.1915438@cl2 ~]$ echo $USER
s.1915438

You can do the same if you know the job name using squeue -n.

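To do this from Python, use the subprocess module to capture the command's output. (os.system only returns the exit status; it does not return what the command prints.)

>>> import subprocess
>>> cmd = "squeue -u $USER | tail -1 | awk '{print $1}'"
>>> subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout.strip()
'38529793'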

Here tail -1 keeps the last row of the squeue output and awk '{print $1}' prints its first column, the job ID. As an extra, if you want to cancel a job, use scancel as follows.

[s.1915438@cl2 ~]$ scancel 38529784 

Sometimes scancel may take 5-10 seconds.
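The squeue lookup above can also be done without tail and awk by parsing the table in Python. This is only a sketch: parse_job_ids and my_job_ids are hypothetical helper names, and the parser assumes the default squeue column layout shown above, with JOBID as the first column.

```python
import getpass
import shutil
import subprocess


def parse_job_ids(squeue_output):
    """Return the JOBID column of default squeue output, skipping the header row."""
    job_ids = []
    for line in squeue_output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "JOBID":
            continue  # skip blank lines and the header
        job_ids.append(fields[0])
    return job_ids


def my_job_ids():
    """Run squeue for the current user and return that user's job IDs as strings."""
    result = subprocess.run(
        ["squeue", "-u", getpass.getuser()],
        capture_output=True, text=True, check=True,
    )
    return parse_job_ids(result.stdout)


# Only query the scheduler when squeue is actually installed.
if __name__ == "__main__" and shutil.which("squeue"):
    print(my_job_ids())
```

Note that this returns every job the user currently has in the queue, so it cannot tell which entry belongs to a particular submission when several jobs are pending or running.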

Tamarind answered 1/8, 2022 at 10:9 Comment(6)
Sinusoid: But this only gets the number after I submit a job. I want to pass the job number to the Python file from within the same bash script.
Tamarind: I think that is not possible. The job ID is completely random. Alternatively, you can use an interactive session: use salloc to start an interactive job, run all your bash files using srun, and then you can access the job ID using squeue.
Tamarind: Normally you don't need the job ID. What is your application? Maybe we can find a way around it.
Sinusoid: I have this for loop that runs sbatch gbash.sh for different scenarios:
arr=(0.000001 0.00001 0.0001 0.001)
for lr in "${arr[@]}"  # First loop.
do
    sbatch gbash.sh $l1 $l2 $l3 False 'SGD' $lr .2
done
gbash.sh contains the Python file that runs the code, and I want to know the job ID that gbash.sh gets so I can find it later if I need to.
Tamarind: It's easy with an interactive session. You just need to allocate the resource using salloc and then run any bash file using srun. This way you can easily print the job ID.
Terris: This project (slurmio) might help. It allows reading out the parameters after submission in the Python script.
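For the submission loop discussed in these comments, one option is to record the job ID on the submitting side with sbatch --parsable, which prints the job ID (optionally followed by a semicolon and the cluster name) instead of the "Submitted batch job" message. The sketch below assumes that approach; submit and parse_parsable are hypothetical helper names, and the argument list passed to gbash.sh is simplified to the learning rate only.

```python
import shutil
import subprocess


def parse_parsable(output):
    """Extract the job ID from `sbatch --parsable` output ("jobid" or "jobid;cluster")."""
    return output.strip().split(";")[0]


def submit(script, *args):
    """Submit a batch script with sbatch and return its job ID as a string."""
    result = subprocess.run(
        ["sbatch", "--parsable", script, *args],
        capture_output=True, text=True, check=True,
    )
    return parse_parsable(result.stdout)


# Only submit when sbatch is actually available on this machine.
if __name__ == "__main__" and shutil.which("sbatch"):
    learning_rates = ["0.000001", "0.00001", "0.0001", "0.001"]
    for lr in learning_rates:
        # Simplified, assumed argument list: the original loop passes more arguments.
        job_id = submit("gbash.sh", lr)
        print(f"lr={lr} -> job {job_id}")
```

Writing each learning rate next to its job ID, here to stdout, makes it possible to find the matching job later.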
#!/bin/bash
#SBATCH -N 1      # nodes requested
#SBATCH -n 1      # tasks requested
#SBATCH -c 4      # cores requested
#SBATCH --mem=10  # memory in MB
#SBATCH -o outfile  # send stdout to outfile
#SBATCH -e errfile  # send stderr to errfile
#SBATCH -t 0:01:00  # time requested in hour:minute:second

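# Note: basename "$0" returns the file name of the running script; it is not a Slurm variable.
ME=$(basename "$0")
echo "My slurm job id is $ME"

You can run this file as sbatch script.sh

Depending on how the cluster is configured, in the outfile you may find:

My slurm job id is 12345678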

Tenet answered 9/7, 2023 at 14:9 Comment(1)
This is dependent on the configuration of the cluster. It might just output the name of script.shMouse
