PBS, refresh stdout
I have a long-running Torque/PBS job and I'd like to monitor its output, but the log file only gets copied back after the job has finished. Is there a way to convince PBS to refresh it?

Abri answered 10/5, 2012 at 3:17 Comment(1)
For the benefit of people (like me) looking for "how do I do this with that?": the Platform/Spectrum LSF analogue of this is bpeek. – Losing
Unfortunately, AFAIK, that is not possible with PBS/Torque - the stdout/stderr streams are spooled locally on the execution host and transferred to the submit host only after the job has finished. If you'd like to monitor the output during execution, you can redirect the standard output of the program to a file yourself (this makes sense only if the execution and submit hosts share a common filesystem).

I suspect the rationale is that it allows jobs to be executed on nodes that don't share a filesystem with the submit node.
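A sketch of that redirection done inside the job script itself (the job name, program, and log filename here are made up for illustration):

```shell
#!/bin/sh
#PBS -N myjob
# Send the program's own stdout/stderr to a file in the submit directory.
# If that directory lives on a filesystem shared with the submit host,
# the file can be tailed from there while the job is still running.
cd "$PBS_O_WORKDIR"
./my_program > run.log 2>&1
```

From the submit host you could then follow progress with `tail -f run.log`.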

Arria answered 10/5, 2012 at 17:20 Comment(2)
I found a -k flag, which is not very nice though - so I ended up capturing stdout outside the queue. :/ – Abri
As a long-time SGE user used to being able to see the output files immediately, I do feel your pain. By coincidence, I've spent half an afternoon today looking for an alternative to LSF's bpeek command on a MOAB/Torque system and frustratingly found none. – Arria
This is possible in TORQUE. If you have a shared filesystem you can set

$spool_as_final_name true

in the mom's config file. This makes the job's output get written directly to its final destination instead of being spooled in the spool directory. Once that is set up, you can tail -f the output file and monitor anything you want.

http://www.adaptivecomputing.com/resources/docs/torque/3-0-3/a.cmomconfig.php (search for spool_as_final_name)
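As a sketch, the setup might look like this on each execution host (the config path, restart command, and output filename are assumptions and vary between installations):

```shell
# Append the option to the MOM config (commonly mom_priv/config under
# the TORQUE home directory) and restart pbs_mom so it takes effect.
echo '$spool_as_final_name true' >> /var/spool/torque/mom_priv/config
service pbs_mom restart

# While a job is running, watch its output file grow in place:
tail -f myjob.o12345
```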

Luigiluigino answered 16/5, 2012 at 21:32 Comment(1)
No, it can't be controlled by the user, but a lot of sysadmins like this feature as well. – Luigiluigino
For me, ssh-ing to the node where the job is running and looking at the files under /var/spool/torque/spool/ works, but this might be specific to this particular environment.
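A sketch of that approach (the job ID, node name, and spool filename are illustrative; TORQUE typically names spooled stdout <jobid>.OU):

```shell
# Find which execution host job 12345 landed on...
qstat -f 12345 | grep exec_host
# ...then follow its spooled stdout on that node directly:
ssh node042 tail -f /var/spool/torque/spool/12345.server.OU
```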

Taneshatang answered 4/11, 2012 at 23:40 Comment(0)
If you submit a shell script, you can also put these two commands at the beginning of the script.

exec 1>file.stdout
exec 2>file.stderr

This will put the output from stdout and stderr in the working directory of your job.
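The effect of those exec lines can be seen in a plain shell session, independent of the batch system (filenames as in the answer; the echoed messages are made up):

```shell
#!/bin/sh
# After these two execs, everything the script writes goes to the files,
# which can be tailed from another shell while the script keeps running.
exec 1>file.stdout
exec 2>file.stderr
echo "step 1 done"              # appended to file.stdout
echo "warning: low disk" >&2    # appended to file.stderr
```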

Veto answered 1/11, 2018 at 13:37 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.