I am trying to find a way in Python to run other programs in such a way that:
- The stdout and stderr of the program being run can be logged separately.
- The stdout and stderr of the program being run can be viewed in near-real time, so that if the child process hangs, the user can see it (i.e. we do not wait for execution to complete before printing the stdout/stderr to the user).
- Bonus criterion: The program being run does not know it is being run via Python, and thus will not do unexpected things (like chunk its output instead of printing it in real-time, or exit because it demands a terminal to view its output). This last criterion pretty much means we will need to use a pty, I think.
Here is what I've got so far.
Method 1:
def method1(command):
    ## subprocess.communicate() will give us the stdout and stderr separately,
    ## but we will have to wait until the end of command execution to print anything.
    ## This means if the child process hangs, we will never know....
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True, executable='/bin/bash')
    stdout, stderr = proc.communicate()  # record both, but no way to print stdout/stderr in real-time
    print ' ######### REAL-TIME ######### '
    ########  Not Possible
    print ' ########## RESULTS ########## '
    print 'STDOUT:'
    print stdout
    print 'STDERR:'
    print stderr
Method 2:
def method2(command):
    ## Using pexpect to run our command in a pty, we can see the child's stdout in real-time,
    ## however we cannot see the stderr from "curl google.com", presumably because it is not connected to a pty?
    ## Furthermore, I do not know how to log it beyond writing out to a file (p.logfile). I need the stdout and stderr
    ## as strings, not files on disk! (A StringIO-based sketch follows this function.)
    ## On the upside, pexpect would give a lot of extra functionality (if it worked!)
    proc = pexpect.spawn('/bin/bash', ['-c', command])
    print ' ######### REAL-TIME ######### '
    proc.interact()
    print ' ########## RESULTS ########## '
    ########  Not Possible
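On the logging-to-strings point specifically: pexpect's logfile only needs a file-like object with write() and flush(), so (if I am reading pexpect correctly) a StringIO should capture the output in memory instead of on disk. This does nothing for the missing/merged stderr; it is only a sketch of the "strings, not files on disk" part, written Python-2-style like the rest of the question:

import pexpect
from StringIO import StringIO    # io.StringIO on Python 3

def method2_with_string_log(command):
    proc = pexpect.spawn('/bin/bash', ['-c', command])
    log = StringIO()
    proc.logfile = log           # any object with write()/flush() will do
    proc.interact()              # still echoes to the terminal in real time
    proc.close()
    return log.getvalue()        # whatever was logged, as an in-memory string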
Method 3:
def method3(command):
    ## This method is very much like method1, and would work exactly as desired
    ## if only proc.xxx.read(1) wouldn't block waiting for something. Which it does. So this is useless.
    ## (A non-blocking workaround is sketched just after this function.)
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True, executable='/bin/bash')
    print ' ######### REAL-TIME ######### '
    out, err, outbuf, errbuf = '', '', '', ''
    firstToSpeak = None
    while proc.poll() is None:
        stdout = proc.stdout.read(1)  # blocks
        stderr = proc.stderr.read(1)  # also blocks
        if firstToSpeak is None:
            if stdout != '': firstToSpeak = 'stdout'; outbuf, errbuf = stdout, stderr
            elif stderr != '': firstToSpeak = 'stderr'; outbuf, errbuf = stdout, stderr
        else:
            if (stdout != '') or (stderr != ''): outbuf += stdout; errbuf += stderr
            else:
                out += outbuf; err += errbuf
                if firstToSpeak == 'stdout': sys.stdout.write(outbuf + errbuf); sys.stdout.flush()
                else: sys.stdout.write(errbuf + outbuf); sys.stdout.flush()
                firstToSpeak = None
    print ''
    print ' ########## RESULTS ########## '
    print 'STDOUT:'
    print out
    print 'STDERR:'
    print err
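One possible way to unblock Method 3 on POSIX systems (this is my own workaround, not part of the code above) is to flip the pipe descriptors into non-blocking mode with fcntl, then read with os.read and treat EAGAIN as "nothing available yet". A polling loop built on this still burns CPU, so select (mentioned further down) is nicer, but it shows the idea:

import errno, fcntl, os

def set_nonblocking(stream):
    ## Put the stream's underlying file descriptor into non-blocking mode.
    fd = stream.fileno()
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

def read_available(stream):
    ## Return whatever is currently buffered, or '' if nothing is ready yet.
    try:
        return os.read(stream.fileno(), 4096)
    except OSError as e:
        if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return ''
        raise

## Hypothetical usage inside method3, right after creating proc:
##   set_nonblocking(proc.stdout); set_nonblocking(proc.stderr)
##   stdout = read_available(proc.stdout)   # never blocks
##   stderr = read_available(proc.stderr)   # never blocks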
To try these methods out, you will need to import sys, subprocess, and pexpect. pexpect is pure Python and can be installed with:
sudo pip install pexpect
I think the solution will involve Python's pty module - which is somewhat of a black art; I cannot find anyone who knows how to use it. Perhaps SO knows :) As a heads-up, I recommend you use 'curl www.google.com' as a test command, because it prints its status out on stderr for some reason :D
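For what it's worth, here is the smallest piece of the pty puzzle I have been able to work out (the fd handling reflects my best understanding, so treat it as a sketch). Both of the child's streams are attached to a single pty here, so it demonstrates the "child thinks it has a terminal" part but not the stdout/stderr separation:

import os, pty, subprocess

def pty_demo(command):
    master, slave = pty.openpty()        # master end for us, slave end for the child
    proc = subprocess.Popen(command, shell=True, executable='/bin/bash',
                            stdout=slave, stderr=slave)
    os.close(slave)                      # the parent only needs the master end
    while True:
        try:
            data = os.read(master, 1024)
        except OSError:                  # EIO on Linux once the child is gone
            break
        if not data:                     # EOF on platforms that report it
            break
        os.write(1, data)                # echo to our stdout in real time
    proc.wait()
    os.close(master)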
UPDATE-1:
OK so the pty library is not fit for human consumption. The docs, essentially, are the source code.
Any presented solution that is blocking and not async is not going to work here. The Threads/Queue method by Padraic Cunningham works great, although adding pty support is not possible - and it's 'dirty' (to quote Freenode's #python).
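For reference, here is my understanding of that Threads/Queue technique as a minimal sketch (not Padraic Cunningham's exact code): one reader thread per stream pushes tagged lines onto a shared Queue, and the main thread drains the queue as lines arrive, so neither stream can block the other's display:

import subprocess, sys
from threading import Thread
from Queue import Queue   # 'queue' on Python 3

def _reader(pipe, tag, q):
    ## Push each line from this pipe onto the queue, tagged with its origin.
    for line in iter(pipe.readline, ''):
        q.put((tag, line))
    pipe.close()
    q.put((tag, None))     # sentinel: this stream is finished

def method_threads(command):
    proc = subprocess.Popen(command, shell=True, executable='/bin/bash',
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            bufsize=1)
    q = Queue()
    Thread(target=_reader, args=(proc.stdout, 'stdout', q)).start()
    Thread(target=_reader, args=(proc.stderr, 'stderr', q)).start()
    out, err, finished = '', '', 0
    while finished < 2:
        tag, line = q.get()              # wakes whenever either stream has a line
        if line is None:
            finished += 1
            continue
        if tag == 'stdout': out += line
        else: err += line
        sys.stdout.write(line); sys.stdout.flush()   # near-real-time echo
    proc.wait()
    return out, err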
It seems like the only solution fit for production-standard code is using the Twisted framework, which even supports pty as a boolean switch to run processes exactly as if they were invoked from the shell.
But adding Twisted into a project requires a total rewrite of all the code. This is a total bummer :/
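For completeness, here is my rough sketch of what the Twisted version might look like (ProcessProtocol with outReceived/errReceived is real Twisted API; the claim about usePTY merging stderr is my understanding, so take it with a grain of salt):

import sys
from twisted.internet import protocol, reactor

class _Logger(protocol.ProcessProtocol):
    def __init__(self):
        self.out, self.err = '', ''
    def outReceived(self, data):
        self.out += data
        sys.stdout.write(data); sys.stdout.flush()   # real-time echo
    def errReceived(self, data):
        self.err += data
        sys.stderr.write(data); sys.stderr.flush()
    def processEnded(self, reason):
        reactor.stop()

def method_twisted(command):
    proto = _Logger()
    ## usePTY=True is the boolean switch mentioned above, but (as far as I can
    ## tell) it then merges stderr into the pty, so it is left off in this sketch.
    reactor.spawnProcess(proto, '/bin/bash', ['/bin/bash', '-c', command],
                         env=None, usePTY=False)
    reactor.run()
    return proto.out, proto.err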
UPDATE-2:
Two answers were provided. One addresses the first two criteria and will work well where you just need both the stdout and stderr, using Threads and Queue. The other answer uses select, a non-blocking method for reading file descriptors, and pty, a method to "trick" the spawned process into believing it is running in a real terminal just as if it was run from Bash directly - but it may or may not have side-effects. I wish I could accept both answers, because the "correct" method really depends on the situation and why you are subprocessing in the first place, but alas, I could only accept one.
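To give a flavour of that second approach, here is my own sketch of how select and pty can be combined (this is not the accepted answer's code verbatim). Each stream gets its own pty pair so the child believes both are terminals, and select tells us which master has data; note that a pty will turn the child's \n line endings into \r\n:

import os, pty, select, subprocess, sys

def method_select_pty(command):
    out_master, out_slave = pty.openpty()   # pty pair for the child's stdout
    err_master, err_slave = pty.openpty()   # separate pair for its stderr
    proc = subprocess.Popen(command, shell=True, executable='/bin/bash',
                            stdout=out_slave, stderr=err_slave, close_fds=True)
    os.close(out_slave); os.close(err_slave)   # only the child needs the slave ends
    out, err = '', ''
    fds = [out_master, err_master]
    while fds:
        readable, _, _ = select.select(fds, [], [])
        for fd in readable:
            try:
                data = os.read(fd, 1024)
            except OSError:                  # EIO on Linux when the child is done
                data = ''
            if not data:
                fds.remove(fd)               # this stream is finished
                continue
            if fd == out_master: out += data
            else: err += data
            sys.stdout.write(data); sys.stdout.flush()   # near-real-time echo
    proc.wait()
    os.close(out_master); os.close(err_master)
    return out, err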
Comments:
pyinvoke would help here. Without pty=True you have buffering issues. With pty=True, you can't separate stdout and stderr. – Romelda
sh's callbacks might work to get stdout and stderr separately and incrementally, but I don't see that it solves the buffering issue with stderr (I see only tty_in, tty_out); it might be enough in most cases, though there could be issues with stderr != STDOUT. – Romelda
sh performs some ugly non-portable hacks (like the stdbuf utility), and then _out_bufsize controls the wrong buffer. It is likely that it controls the buffer in the parent Python process. Like bufsize, it won't fix the buffering behavior in the child. Look at the picture that shows the libc stdio buffers in my answer (the parent process's (the shell's) buffers are not shown). – Romelda