Why are repetitive calls to squeue in Slurm frowned upon?

Asked 22/6, 2020 at 11:24 Answered 25/6, 2020 at 11:45

Solved cluster-computing slurm sungridengine lsf

Why is running squeue in a loop discouraged in order to avoid overloading Slurm, while no such limitation is mentioned for the bjobs tool from LSF or qstat from SGE?

The man page for squeue states:

PERFORMANCE

Executing squeue sends a remote procedure call to slurmctld. If enough calls from squeue or other Slurm client commands that send remote procedure calls to the slurmctld daemon come in at once, it can result in a degradation of performance of the slurmctld daemon, possibly resulting in a denial of service.

Do not run squeue or other Slurm client commands that send remote procedure calls to slurmctld from loops in shell scripts or other programs. Ensure that programs limit calls to squeue to the minimum necessary for the information you are trying to gather.

which, to my understanding, discourages the use of e.g. watch squeue. Such a warning is also commonly found in site-specific documentation, e.g. here:

Although squeue is a convenient command to query the status of jobs and queues, please be careful not to issue the command excessively, for example, invoking the query for the status of a job every five seconds or so using a script after a job is submitted.

In comparison, I could find no such warning for similar tools on other engines, e.g. qstat or bjobs. I see people using all of these tools repetitively without distinction, e.g. here for squeue, here for bjobs.

The quote above from the Slurm documentation mentions an RPC. Is that a different mechanism from what other engines use? Is there an architectural difference between Slurm and other grid engines that makes querying the status of all jobs more costly?

Actinopod answered 22/6, 2020 at 11:24 Comment(2)

Every service has its scalability limits. LSF has a dedicated query daemon, which helps. But even then there are limits, e.g., if you have millions of jobs in the cluster, and many jobs check their dependencies by calling something like bjobs -uall -a | grep blah, then there will be a service degradation. – Unfaithful 24/6, 2020 at 17:49

Although the docs say this, squeue itself has a -i or --iterate option which will re-run it every N seconds, down to every second. It looks like it can request updates since its last query, which presumably helps a bit, but as far as I can tell it's still sending a new RPC request every time. There's no warning when I use this. – Conchoid 1/12, 2021 at 11:51

The concern about running squeue too frequently comes more from cluster administrators than from developers. In this particular case, the commit message for that section of the documentation shows that it was requested by a customer of SchedMD, so most probably an entity running a production cluster.

The importance of that advice increases with the size of the cluster and the job turnover. On a 10-node cluster running 5-6 jobs per day on average, with a dozen users, you will be fine hitting the Slurm controller with many squeue requests. But on a cluster with 4000 nodes, 10000 users, and 10k jobs/day, you might visibly interfere with Slurm's performance.

I have seen at least one site that overwrote the qstat command with a rate-limiting version based on cached information.
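To illustrate the idea of a rate-limiting wrapper based on cached information, here is a minimal sketch. It is not the implementation used by that site, which I have not seen; the function name cached_run, its arguments, and the cache file location are all hypothetical. It serves the previous output while it is younger than a given age, and only runs the real command when the cache is stale:

```shell
# cached_run MAX_AGE_SECONDS CACHE_FILE COMMAND [ARGS...]
# Hypothetical helper: prints the cached output if CACHE_FILE is younger
# than MAX_AGE_SECONDS, otherwise runs COMMAND once and refreshes the cache.
cached_run() {
    local max_age="$1" cache_file="$2" now mtime tmp status
    shift 2
    now=$(date +%s)
    if [ -f "$cache_file" ]; then
        # GNU stat uses "-c %Y", BSD stat uses "-f %m"; if neither works,
        # treat the cache as infinitely old so that it gets refreshed.
        mtime=$(stat -c %Y "$cache_file" 2>/dev/null ||
                stat -f %m "$cache_file" 2>/dev/null) || mtime=0
        if [ $((now - mtime)) -lt "$max_age" ]; then
            cat "$cache_file"
            return 0
        fi
    fi
    # Write to a temporary file first so that concurrent readers never
    # see partially written output.
    tmp=$(mktemp "${cache_file}.XXXXXX") || return 1
    if "$@" > "$tmp"; then
        mv "$tmp" "$cache_file"
        cat "$cache_file"
    else
        status=$?
        rm -f "$tmp"
        return "$status"
    fi
}
```

With this in place, something like cached_run 60 "$HOME/.cache/squeue.out" squeue -u "$USER" would contact the controller at most once per minute per cache file, however often it is called (assuming the cache directory exists). A site-wide version would presumably share one cache between all users, which this sketch does not attempt.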

From a technical point of view, RPC is what most of the alternatives use.

Slag answered 25/6, 2020 at 11:45 Comment(0)

You are correct that squeue should not be used this way. An HPC resource in Canada also states this:

Do not run sq or squeue from a script or program at high frequency, e.g., every few seconds. Responding to squeue adds load to Slurm, and may interfere with its performance or correct operation. See Email notification below for a much better way to learn when your job starts or ends.

Source: https://docs.computecanada.ca/wiki/Running_jobs#Use_squeue_or_sq_to_list_jobs

As you can see, the HPC resource in Canada recommends using email notification with sbatch:

#SBATCH --mail-user=[email protected]
#SBATCH --mail-type=BEGIN
#SBATCH --mail-type=END
#SBATCH --mail-type=FAIL
#SBATCH --mail-type=REQUEUE
#SBATCH --mail-type=ALL

Source: https://docs.computecanada.ca/wiki/Running_jobs#Email_notification

I was a heavy qstat user on SGE and now prefer email notification. I don't need to actively monitor job status, and I get a record of when each job passed through its various milestones.

Bant answered 23/6, 2020 at 11:19 Comment(0)
