Intercepting stdout of a subprocess while it is running

If this is my subprocess:

import time, sys
for i in range(200):
    sys.stdout.write( 'reading %i\n'%i )
    time.sleep(.02)

And this is the script controlling and modifying the output of the subprocess:

import subprocess, time, sys

print 'starting'
    
proc = subprocess.Popen(
    'c:/test_apps/testcr.py',
    shell=True,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE  )

print 'process created'

while True:
    #next_line = proc.communicate()[0]
    next_line = proc.stdout.readline()
    # readline() returns '' at end of file; stop once the child has also exited
    if next_line == '' and proc.poll() is not None:
        break
    sys.stdout.write(next_line)
    sys.stdout.flush()
    
print 'done'

Why do readline and communicate wait until the process has finished running? Is there a simple way to pass along (and modify) the subprocess's stdout in real time?

I'm on Windows XP.

Aisne answered 9/2, 2009 at 5:43 Comment(1)
Related: How to flush output of Python print? - Fishbein

As Charles already mentioned, the problem is buffering. I ran into a similar problem when writing some modules for SNMPd, and solved it by replacing stdout with an auto-flushing version.

I used the following code, inspired by some posts on ActiveState:

import sys

class FlushFile(object):
    """Write-only flushing wrapper for file-type objects."""
    def __init__(self, f):
        self.f = f
    def write(self, x):
        self.f.write(x)
        self.f.flush()

# Replace stdout with an automatically flushing version
sys.stdout = FlushFile(sys.__stdout__)
Liguria answered 9/2, 2009 at 6:12 Comment(6)
I don't see how that is any different from calling sys.stdout.flush() after every proc.stdout.readline(), which is what I do. I also tried setting bufsize=0 for the subprocess. - Aisne
The flush is needed in the subprocess, not the parent process. - Lindahl
Yes, in the example the subprocess is also a Python script, so replace stdout in the subprocess. Calling sys.stdout.flush() in the parent process doesn't affect the child's buffering. - Liguria
OK, I see what I did there. Of course this child process is just a sample. My real process is a giant piece of compiled FORTRAN whose source I do not have access to. In that case, do I just need to hope that the child does not buffer its output? And what does subprocess.Popen's bufsize do, then? - Aisne
As far as I know, it's the application code that determines the size of the output buffer. I don't think you can do anything externally, unless it's dynamically linked and you preload a library that replaces the system calls. But that's a huge hack and out of the scope of this question :) - Liguria
...well, when output goes through the standard C library, there may be tweaks to the buffering available -- modern versions of glibc can have their buffering configuration tweaked using stdbuf: gnu.org/software/coreutils/manual/… -- though I don't know if that'll do any good for a Fortran application. - Fullmouthed
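On the bufsize question raised above: according to the subprocess documentation, bufsize is applied to the pipe file objects that Popen creates in the parent, so it does not change how the child buffers its own stdout. When the child happens to be a Python script, one way to avoid editing it is to start the interpreter with -u (setting the PYTHONUNBUFFERED environment variable has the same effect). The following is a minimal Python 3 sketch, not the original poster's setup; CHILD_SOURCE and stream_lines are hypothetical names used only for illustration, and the child deliberately never flushes.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for the child script; note that it never flushes.
CHILD_SOURCE = (
    "import sys, time\n"
    "for i in range(5):\n"
    "    sys.stdout.write('reading %i\\n' % i)\n"
    "    time.sleep(0.2)\n"
)

def stream_lines(unbuffered=True):
    """Run the child and return (line, arrival_time) pairs as lines arrive."""
    argv = [sys.executable]
    if unbuffered:
        argv.append("-u")  # ask the child interpreter not to buffer stdout
    argv += ["-c", CHILD_SOURCE]
    start = time.monotonic()
    arrivals = []
    with subprocess.Popen(argv, stdout=subprocess.PIPE) as proc:
        # readline() returns b"" only at end of file, so this loop ends
        # when the child closes its stdout.
        for raw in iter(proc.stdout.readline, b""):
            arrivals.append((raw.decode().rstrip(), time.monotonic() - start))
    return arrivals

if __name__ == "__main__":
    for line, seconds in stream_lines():
        print("%.2fs  %s" % (seconds, line))
```

With -u the arrival times should be spread across the child's run time. Calling stream_lines(unbuffered=False) typically shows all lines arriving together at the end, which is the behaviour described in the question. None of this helps with a compiled child that buffers internally.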

Process output is buffered. On more UNIXy operating systems (or Cygwin), the pexpect module is available, which recites all the necessary incantations to avoid buffering-related issues. However, these incantations require a working pty module, which is not available on native (non-Cygwin) win32 Python builds.
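For readers on a Unix-like system, the core of that trick can be sketched with the standard-library pty module: when a child's stdout is a terminal rather than a pipe, C stdio (and Python) typically switch to line buffering. This is only an illustration of the idea under that assumption, not pexpect's actual implementation; run_under_pty is a hypothetical helper, and it will not work on native Windows Python.

```python
import os
import pty
import subprocess
import sys

def run_under_pty(argv):
    """Run argv with stdout/stderr on a pseudo-terminal; return the raw bytes."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(argv, stdout=slave_fd, stderr=slave_fd)
    os.close(slave_fd)  # the child keeps its own copy of the slave end
    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            break  # Linux reports an I/O error once the child side is closed
        if not data:
            break  # other platforms signal end of output with an empty read
        chunks.append(data)
        # A real relay would process each chunk here instead of collecting it.
    os.close(master_fd)
    proc.wait()
    return b"".join(chunks)

if __name__ == "__main__":
    child = "for i in range(3): print('reading %i' % i)"
    sys.stdout.write(run_under_pty([sys.executable, "-c", child]).decode())
```

Note that the terminal layer usually translates each newline into a carriage return plus newline, so the bytes read from the master end may need normalising before further processing.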

In the example case where you control the subprocess, you can just have it call sys.stdout.flush() where necessary -- but for arbitrary subprocesses, that option isn't available.
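For the case where the child can be edited, a minimal Python 3 sketch of both sides follows. The child flushes after every write, and the parent rewrites each line as it arrives. CHILD_SOURCE, relay, and the upper-casing transform are hypothetical choices for illustration, and text=True assumes Python 3.7 or later.

```python
import subprocess
import sys

# Hypothetical child: the loop from the question, plus an explicit flush.
CHILD_SOURCE = (
    "import sys, time\n"
    "for i in range(5):\n"
    "    sys.stdout.write('reading %i\\n' % i)\n"
    "    sys.stdout.flush()  # push the line into the pipe right away\n"
    "    time.sleep(0.02)\n"
)

def relay(transform, out=sys.stdout):
    """Run the child, pass each line through transform, and echo the result."""
    seen = []
    with subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE,
        text=True,  # decode the child's output to str
    ) as proc:
        for line in proc.stdout:
            modified = transform(line.rstrip("\n"))
            out.write(modified + "\n")
            out.flush()  # keep the parent's own output prompt as well
            seen.append(modified)
    return seen

if __name__ == "__main__":
    relay(str.upper)
```

The flush in the parent only affects the parent's own stdout. It is the flush inside the child that lets each line reach the pipe as soon as it is written.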

See also the question "Why not just use a pipe (popen())?" in the pexpect FAQ.

Fullmouthed answered 9/2, 2009 at 5:49 Comment(0)
