LSF (bsub): how to specify a single "wrap-up" job to be run after all others finish?

BASIC PROBLEM: I want to submit N + 1 jobs to an LSF-managed Linux cluster in such a way that the (N + 1)-st "wrap-up" job is not run until all the preceding N jobs have finished.

EXTRA: If possible, it would be ideal if I could arrange matters so that the (N + 1)-st ("wrap-up") job receives, as its first argument, a value of 0 (say) if all the previous N jobs terminated successfully, and a value different from 0 otherwise.

This problem (or at least the part labeled "BASIC PROBLEM") is vastly simpler than what LSF's bsub appears to be designed to handle, so I have a hard time wading through the voluminous documentation for bsub to figure out the simplest way to do what I want to do.

What would be the simplest bsub commands to achieve this arrangement?


To be more concrete, what would I have to put in place of the various ??? slots below to ensure that wrapup is executed only after all the foo jobs have finished (ideally with an argument that reflects the ending status of the foo jobs)?

bsub -q someq ??? foo 1
bsub -q someq ??? foo 2
bsub -q someq ??? foo 3
bsub -q someq ??? wrapup [???]
Charlena answered 17/10, 2012 at 22:13 Comment(1)
job dependencies? bsub -J 'myjob[1-10]' mycmd; bsub -w myjob wrapup; bjdepinfo -l <jobid> (Synonymy)

To expand on Michael Closson's answer, what you're looking for here is bsub's -w option, which allows you to submit a job that will only be scheduled if some dependency condition is met.

The most common condition to use is the exit status of another job. To refer to your jobs in such a condition, name each of your "foo $i" jobs with -J:

bsub -q someq -J "job_1" foo 1
bsub -q someq -J "job_2" foo 2
bsub -q someq -J "job_3" foo 3
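For a larger N, typing one bsub line per job gets tedious, so the naming step can be scripted. The following is a minimal sketch, not a definitive recipe: gen_worker_cmds is a hypothetical helper name, and the queue someq, the command foo, and the job_<i> naming scheme are simply carried over from the example above. It only prints the commands, so nothing is submitted until you choose to run them.

```shell
#!/bin/sh
# Print one bsub command per worker job, naming each job_<i> via -J.
# gen_worker_cmds is a hypothetical helper; "someq", "foo" and the
# job_<i> names are taken from the example in this answer.
gen_worker_cmds() {
    n=$1
    i=1
    while [ "$i" -le "$n" ]; do
        printf 'bsub -q someq -J "job_%s" foo %s\n' "$i" "$i"
        i=$((i + 1))
    done
}

# Dry run: show the commands for N=3 without submitting anything.
gen_worker_cmds 3
```

Once the printed commands look right, piping them into sh on a submit host (gen_worker_cmds 3 | sh) would perform the actual submissions.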

Then you can submit another job that depends on the exit status of these jobs as follows:

bsub -q someq -w "done(job_1) && done(job_2) && done(job_3)" wrapup

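This tells LSF to schedule "wrapup" only if the jobs named job_1, job_2, and job_3 all terminate with DONE status. Note that if any of them ends with EXIT status instead, this condition can never be met and wrapup will not run. You can also use job IDs instead of job names, or specify the particular status you want to test for with expressions like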

done("job_1")   // termination status is DONE
exit("job_1")   // termination status is EXIT
ended("job_1")  // termination status is EXIT or DONE

You can combine these with the logical operators &&, ||, and !.
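For the "EXTRA" part of the question, one possible approach is to submit two mutually exclusive wrap-up jobs: one that runs "wrapup 0" when every job is DONE, and one that runs "wrapup 1" when every job has ended and at least one is in EXIT. The sketch below only builds and prints the two commands. The helper dep_expr and the job names wrapup_ok and wrapup_fail are hypothetical, and the approach assumes the workers are named job_1 through job_N. Whichever wrap-up job does not run is left pending with a dependency that can no longer be satisfied, so it would need to be removed afterwards (for example with bkill) unless your cluster is configured to clean up such jobs.

```shell
#!/bin/sh
# dep_expr STATE OP N: join STATE(job_1) ... STATE(job_N) with OP.
# Hypothetical helper; the job_<i> names are assumed to match the workers.
dep_expr() {
    state=$1
    op=$2
    n=$3
    expr_out=""
    i=1
    while [ "$i" -le "$n" ]; do
        term="${state}(job_${i})"
        if [ -z "$expr_out" ]; then
            expr_out=$term
        else
            expr_out="$expr_out $op $term"
        fi
        i=$((i + 1))
    done
    printf '%s\n' "$expr_out"
}

N=3
all_done=$(dep_expr done '&&' "$N")    # every job finished with DONE
all_ended=$(dep_expr ended '&&' "$N")  # every job finished, DONE or EXIT
any_exit=$(dep_expr exit '||' "$N")    # at least one job finished with EXIT

# Dry run: print the two wrap-up submissions instead of executing them.
printf 'bsub -q someq -J wrapup_ok -w "%s" wrapup 0\n' "$all_done"
printf 'bsub -q someq -J wrapup_fail -w "%s && (%s)" wrapup 1\n' "$all_ended" "$any_exit"
```

If the status argument is not needed, a single wrap-up job with the ended(...) expression alone covers the "BASIC PROBLEM", since it fires once all N jobs have finished regardless of how they ended.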

Buitenzorg answered 11/9, 2013 at 16:35 Comment(2)
What if job_1 is an array, job_1[1-1000]? Would all the jobs in the array have to be complete for done("job_1") to evaluate to true? (Nodose)
@par Yes, I think so. You can depend on a particular element using syntax like "done(job_1[27])", or you can have a pointwise dependency, where each element of array job_2[1-1000] depends on the corresponding element of job_1, by using "done(job_1[*])" in job_2's dependency expression. See this answer. (Buitenzorg)
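Putting the array idea from these comments together with the original question, the N jobs could be submitted as a single job array with one wrap-up job depending on the whole array. This is a sketch under assumptions rather than a tested recipe: gen_array_cmds and the array name foo_arr are hypothetical, and it relies on LSF setting the LSB_JOBINDEX environment variable for each array element and on the quoted command being expanded by a shell at run time. As above, it prints the commands instead of submitting them.

```shell
#!/bin/sh
# Print a job-array submission for N elements plus one dependent wrap-up job.
# gen_array_cmds and the array name foo_arr are hypothetical; LSB_JOBINDEX
# is the per-element index that LSF exports to each array element.
gen_array_cmds() {
    n=$1
    # Single quotes in the printed command keep $LSB_JOBINDEX unexpanded
    # until the job actually runs on the execution host.
    printf "bsub -q someq -J 'foo_arr[1-%s]' 'foo \$LSB_JOBINDEX'\n" "$n"
    # ended(foo_arr) waits for the whole array, whatever the exit statuses.
    printf "bsub -q someq -w 'ended(foo_arr)' wrapup\n"
}

# Dry run for N=3.
gen_array_cmds 3
```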
