How to get GPU (GRES) Allocation Reports using SLURM

3

7

I read in the Slurm docs that, once accounting is set up, we can use sacct --format="JobID,AllocCPUS,ReqGRES" to get statistics on GRES requests. I have also configured my GPUs (there are 2) in gres.conf, but this command always returns 0 for ReqGRES and AllocGRES. Any ideas? Thanks in advance.

Tomblin answered 6/6, 2016 at 14:49
4

There are several possible reasons. If you are not the root user, sacct displays only your own jobs, so you must add the -a option. Otherwise there may be a problem in your slurm.conf configuration file, in which case it is worth checking the Slurm log files.

sacct -a -X --format=JobID,AllocCPUS,Reqgres

It works.
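One configuration detail that commonly matters here is whether gres/gpu is listed in AccountingStorageTRES in slurm.conf, since GPU usage is only recorded in accounting when it is tracked as a TRES. A minimal sketch of a check, assuming the usual "Key = value" layout of scontrol show config output; the function name tracks_gpu_tres is made up for illustration:

```shell
# Hypothetical helper: succeeds if the AccountingStorageTRES line read
# from stdin mentions gres/gpu, fails otherwise.
tracks_gpu_tres() {
  grep -i '^AccountingStorageTRES' | grep -q 'gres/gpu'
}

# Intended usage on a node with Slurm installed:
#   scontrol show config | tracks_gpu_tres \
#     && echo "gres/gpu is tracked" \
#     || echo "gres/gpu is not in AccountingStorageTRES"
```

If the check fails, adding gres/gpu to AccountingStorageTRES and restarting the Slurm daemons would be the first thing to try; jobs that ran before the change will still show no GPU data.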

Johen answered 17/10, 2017 at 13:42
3

I always find the reports from sreport very helpful. Just specify the TRES as configured in gres.conf and slurm.conf.

$ sreport -t minper cluster utilization --tres="gres/gpu" start=2019-05-01T00:00:00
--------------------------------------------------------------------------------
Cluster Utilization 2019-05-01T00:00:00 - 2019-05-14T23:59:59
Usage reported in TRES Minutes/Percentage of Total
--------------------------------------------------------------------------------
  Cluster      TRES Name         Allocated              Down         PLND Down              Idle          Reserved           Reported 
--------- -------------- ----------------- ----------------- ----------------- ----------------- ----------------- ------------------ 
gpugrid+       gres/gpu   8186500(70.06%)     17889(0.96%)          0(0.00%)    1289051(22.97%)          0(0.00%)   9693440(100.00%) 

You can also report per user or per GRES type, e.g. --tres="gres/gpu:v100" (the type has to be configured in slurm.conf).
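For scripting, sreport can also emit pipe-delimited output with -P (--parsable2) and drop the header with -n. A minimal sketch, assuming the parsable columns come in the same order as in the table above and that each value keeps the count(percent%) form; the function name allocated_pct is made up for illustration:

```shell
# Hypothetical helper: reads pipe-delimited sreport utilization lines
# from stdin and prints "cluster tres allocated-percent" for each line.
allocated_pct() {
  awk -F'|' '
    {
      pct = $3                  # third column is Allocated, e.g. 8186500(70.06%)
      sub(/^.*\(/, "", pct)     # drop everything up to the opening parenthesis
      sub(/%\).*$/, "", pct)    # drop the trailing %)
      print $1, $2, pct
    }
  '
}

# Intended usage on a node with Slurm installed:
#   sreport -n -P -t minper cluster utilization --tres="gres/gpu" \
#     start=2019-05-01T00:00:00 | allocated_pct
```

Fed a line built from the report above, this prints the cluster name, the TRES name, and 70.06.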

Punishment answered 15/5, 2019 at 9:27 Comment(2)
Is there a way to see these reports live? If I understand correctly, this shows only a static snapshot of the report at the current time, right? - Archean
You would likely have to query the database for this, build a plugin yourself, or run a shell script via cron. There may also be an improvement in recent versions of Slurm; I have not checked. - Punishment
0

Try using AllocTRES

sacct -X --format="JobID, State%-10, JobName%-30, Elapsed, AllocTRES%-42"

You can also use -e to look at the list of available fields that can be specified in the format option. You can also see the list here: https://slurm.schedmd.com/sacct.html#OPT_helpformat

sacct -e
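
To turn the AllocTRES column into a single number, the output can be made machine-readable with -n (no header) and -P (pipe-delimited) and then summed. A minimal sketch, assuming the AllocTRES value is a comma-separated list such as cpu=4,gres/gpu=2,mem=16G; the function name sum_alloc_gpus is made up for illustration:

```shell
# Hypothetical helper: reads "JobID|AllocTRES" lines from stdin and
# prints the total number of allocated GPUs across all jobs.
sum_alloc_gpus() {
  awk -F'|' '
    {
      n = split($2, tres, ",")
      for (i = 1; i <= n; i++) {
        # Count only the untyped gres/gpu=N entry so that typed entries
        # such as gres/gpu:v100=N are not counted a second time.
        if (tres[i] ~ /^gres\/gpu=/) {
          sub(/^gres\/gpu=/, "", tres[i])
          total += tres[i]
        }
      }
    }
    END { print total + 0 }
  '
}

# Intended usage on a node with Slurm installed:
#   sacct -a -X -n -P --format=JobID,AllocTRES | sum_alloc_gpus
```

Jobs without a gres/gpu entry contribute nothing, so a total of 0 can also mean that GPUs are not being tracked in accounting at all.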
Roseline answered 11/8, 2022 at 0:23
