How to run several commands in one PBS job submission

I have written a program that uses only 1-4 CPUs. But when I submit a job on the cluster, I have to request at least one node with 16 cores per job. So I want to run several simulations on each node with each job I submit. Is there a way to run the simulations in parallel within one job?

Here's an example: my code uses 4 CPUs. I submit a job for one node, and I want the node to run 4 instances of my code (each instance with different parameters) so that all 16 cores are used.

Actualize answered 8/11, 2012 at 2:38 Comment(0)

Yes, of course; generally such systems will have instructions for how to do this, like these.

If you have (say) 4x 4-cpu jobs that you know will each take the same amount of time, and (say) you want them to run in 4 different directories (so the output files are easier to keep track of), use the shell ampersand to run them each in the background and then wait for all background tasks to finish:

(cd jobdir1 && myexecutable argument1 argument2) &
(cd jobdir2 && myexecutable argument1 argument2) &
(cd jobdir3 && myexecutable argument1 argument2) &
(cd jobdir4 && myexecutable argument1 argument2) &
wait

Here, myexecutable argument1 argument2 is just a placeholder for however you usually run your program; if you use mpiexec or something similar, that goes in there just as you'd normally use it. If you're using OpenMP, you can export the environment variable OMP_NUM_THREADS before the first line above (for example, set it to 4 so that four instances fill the 16 cores).
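Put together, a complete job script might look roughly like the sketch below. Several things here are assumptions rather than part of the answer above: the `#PBS -l nodes=1:ppn=16` line uses Torque-style resource syntax (your site may use a different form), `run_sim` is a hypothetical stand-in for the real program, and the `jobdirN` directories and `result.txt` file are just illustrative names. Waiting on each process ID separately is an optional extra so that a failed instance is noticed.

```shell
#!/bin/sh
#PBS -l nodes=1:ppn=16
#PBS -l walltime=01:00:00

# PBS sets PBS_O_WORKDIR to the directory qsub was run from; fall back
# to the current directory when the script is run outside PBS.
cd "${PBS_O_WORKDIR:-$PWD}" || exit 1

# 4 threads per instance (only relevant for OpenMP programs).
export OMP_NUM_THREADS=4

# Hypothetical stand-in for the real program. Replace the body with
# your own command, e.g. myexecutable "$1" or your usual mpiexec line.
run_sim() {
    echo "param=$1 threads=$OMP_NUM_THREADS" > result.txt
}

status=0
pids=""
for i in 1 2 3 4; do
    mkdir -p "jobdir$i"
    # The subshell keeps each cd local; && skips the run if cd fails.
    (cd "jobdir$i" && run_sim "$i") &
    pids="$pids $!"
done

# Wait for every instance and remember whether any of them failed.
for pid in $pids; do
    wait "$pid" || status=1
done
echo "all instances finished, status=$status"
```

The script is submitted with qsub as usual; the job ends only after the slowest of the four instances has finished.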

If you have a number of tasks that won't all take the same length of time, it's easiest to queue up many more than the (say) 4 tasks above and let a tool like GNU parallel launch them as slots become free, as described in this answer.
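As a rough sketch of that approach: the snippet below creates ten hypothetical task directories (`task1` to `task10`) and keeps four tasks running at a time until all are done. The `echo` line stands in for the real program, and the directory and file names are made up for illustration. The `xargs -P` fallback is an addition of this sketch, not something the answer mentions; `-P` is supported by GNU and BSD xargs but is not required by POSIX.

```shell
#!/bin/sh
# Ten hypothetical task directories: more tasks than the 4 slots.
for i in 1 2 3 4 5 6 7 8 9 10; do
    mkdir -p "task$i"
done

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # GNU parallel: -j 4 keeps 4 tasks running; {} is the directory.
    parallel -j 4 'cd {} && echo "done {}" > result.txt' ::: task*
else
    # Fallback when GNU parallel is not installed: xargs with -P 4
    # also starts a new task as soon as one of the 4 slots frees up.
    printf '%s\n' task* |
        xargs -P 4 -I{} sh -c 'cd "$1" && echo "done $1" > result.txt' _ {}
fi
```

With 4-CPU tasks on a 16-core node, 4 slots fills the node; for single-CPU tasks the slot count would be 16 instead.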

Elyot answered 8/11, 2012 at 6:9 Comment(3)
Thanks, I'll look into it. But if the system doesn't have GNU parallel, can I install it in my home folder and use it? I don't have root privileges. - Actualize
Yes; just download it and run ./configure --prefix=$HOME && make && make install. It's a fairly simple install, and it's useful enough that your sysadmins should be amenable to installing it systemwide anyway. The syntax can be a little complicated; there's a good tutorial here -- unethicalblogger.com/2010/11/11/… - Elyot
How can I be sure that the processes won't end up sharing the same processor? Also, what about running across multiple nodes? For example, a queue in my case strictly allocates 2 nodes, but most of the time I only need a single node. How can I run two different jobs on the two nodes in parallel? I have to submit with qsub and I can't specify the node name during submission. - Oceanic
