SLURM display the stdout and stderr of an unfinished job

I used to use a server running LSF, but I have just moved to one running SLURM.

What is the SLURM equivalent of LSF's bpeek command?

bpeek: Displays the stdout and stderr output of an unfinished job

I couldn't find this in the documentation anywhere. If you have some good references for SLURM, please let me know as well. Thanks!

Trapes answered 28/9, 2013 at 1:0 Comment(0)

You might also want to have a look at the sattach command.

Aleciaaleck answered 4/10, 2013 at 22:32 Comment(2)
sattach doesn't believe in my jobid (Invalid job id specified). The id I specify is identical to the one output by squeue, though I'm not sure what 'step' to use. I don't use srun in my script; is that relevant? I ran my job with sbatch and a bash script that sets a few SLURM parameters, loads a few modules, changes directory, and then runs a single Python program. - Doralin
Apparently it's possible to have different steps in your batch file, drevicko. I'm not sure what steps are, but using <jobid>.0 allowed this to work for me. (I guess not setting up steps means you are at step 0 by default.) - Kirby
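As a rough sketch of what the comments above describe: sattach takes a target of the form <jobid>.<stepid>, and the second comment reports that step 0 worked for a job without explicit steps. The helper name step_target is hypothetical, and defaulting to step 0 is an assumption taken from that comment rather than a documented guarantee.

```shell
#!/bin/bash
# Hypothetical helper: build the "<jobid>.<stepid>" target that sattach expects.
# Defaulting to step 0 is an assumption based on the comment above.
step_target() {
    local jobid=$1
    local stepid=${2:-0}
    printf '%s.%s\n' "$jobid" "$stepid"
}

# Usage (needs a running job step on a SLURM cluster):
#   sattach "$(step_target 123456)"      # attaches to 123456.0
#   sattach "$(step_target 123456 2)"    # attaches to 123456.2
```

If sattach still reports an invalid job id, the job may not have a step with that number, so this is only a starting point.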

I just learned that in SLURM there is no need for a bpeek equivalent to check the current standard output and standard error, since they are written at run time to the files specified for stdout and stderr.

Trapes answered 28/9, 2013 at 18:3 Comment(3)
Not in my (admittedly somewhat limited) experience, though perhaps the output is buffered. - Doralin
I'd add that you can use scontrol show job <jobid> to figure out where the standard output and standard error are written. Useful if you have many jobs and can't easily track which job id writes to which output. - Thermotaxis
In my case there is no StdOut info in scontrol show job my_job_id. I guess that means it isn't saving it anywhere? - Rudolfrudolfo
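To make the scontrol suggestion concrete, here is a minimal sketch that filters the output of scontrol show job down to the StdOut and StdErr lines. The function name job_stdio_paths is hypothetical, and it assumes those fields appear one per line in the form StdOut=/path and StdErr=/path.

```shell
#!/bin/bash
# Hypothetical helper: read `scontrol show job <jobid>` output on stdin and
# print only the StdOut= and StdErr= lines, with leading whitespace removed.
# Assumes each field sits on its own line, e.g. "   StdOut=/home/u/job.out".
job_stdio_paths() {
    grep -E '^[[:space:]]*(StdOut|StdErr)=' | sed 's/^[[:space:]]*//'
}

# Usage (on a SLURM cluster):
#   scontrol show job 123456 | job_stdio_paths
```

When the job record has no such fields, as in the last comment, the function prints nothing.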

Here's a workaround that I use. It mimics the bpeek functionality from LSF.

Create a file bpeek.sh:

#!/bin/bash
# Usage: bpeek.sh JOBID [NROWS]
# The first argument is the SLURM job id.
jobid=$1
# Ask scontrol for the job record, keep the StdOut= line, and strip the
# leading whitespace and the "StdOut=" prefix to get the output file path.
stdout=$(scontrol show job "$jobid" | sed -n 's/^[[:space:]]*StdOut=//p')
# Stop with a message if the job record has no StdOut entry.
if [ -z "$stdout" ]; then
    echo "No StdOut found for job $jobid" >&2
    exit 1
fi
# Show the last NROWS lines (10 if no second argument is given) and keep following the file.
nrows=${2:-10}
tail -f -n "$nrows" "$stdout"

Then you can use it: bash bpeek.sh JOBID [NROWS]

Or add an alias to your ~/.bashrc file: alias bpeek='bash ~/bpeek.sh'

and then use it: bpeek JOBID [NROWS]

Stanger answered 11/1, 2023 at 12:18 Comment(0)
