Why is piping output of subprocess so unreliable with Python?

(Windows)

I wrote some Python code that uses the subprocess module to call SoX, which reports its progress on STDERR if you ask it to. I want to extract the percentage from that output. When I run SoX directly, outside the Python script, it starts immediately and progresses smoothly to 100%.

When I call it from the Python script, it takes a few seconds before any output appears, and then the output alternates between slow and fast. Although I read character by character, sometimes a large block RUSHES out at once, so I don't understand why at other times I can watch the characters trickle in one by one. (It generates 15 KiB of data in my test, by the way.)

I have tested the same with mkvmerge and mkvextract. They output percentages, too. Reading STDOUT there is smooth.

This is so unreliable! How can I make the reading of sox's stderr stream smoother, and perhaps prevent the delay at the beginning?


How I call and read:

import subprocess
import sys
process = subprocess.Popen('sox_call_dummy.bat', stderr = subprocess.PIPE, stdout = subprocess.PIPE)
while True:
    char = process.stderr.read(1)
    if not char:
        break
    sys.stdout.write(char.encode('string-escape'))
Select answered 14/4, 2012 at 1:58 Comment(5)
What is your bufsize value? Can you show your subprocess snippet? - Dreeda
Zero (default). But I've just tested 1, 1024, 8*1024, 16*1024, 160*1024, it's the same with every value. - Select
Post a code example of how you are calling and reading your process. - Dreeda
Probably, you need to disable buffering. - Vaughan
Possible duplicate: #1184143 - Dreeda

As per this closely related thread: Unbuffered read from process using subprocess in Python

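import subprocess

# bufsize=0 makes Python's end of the pipe unbuffered
process = subprocess.Popen('sox_call_dummy.bat',
                stderr = subprocess.PIPE, bufsize=0)
while True:
    line = process.stderr.readline()
    if not line:  # an empty string means the child closed stderr
        break
    print line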

Since you aren't reading stdout, I don't think you need a pipe for it.
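
The point about the unused stdout pipe can be sketched as follows. This is only a minimal Python 3 sketch under stated assumptions, not the asker's actual setup: `CHILD_CODE` is a hypothetical stand-in for the real command, `read_stderr_only` is an illustrative name, and `subprocess.DEVNULL` needs Python 3.3 or later. A pipe that nobody reads can fill up and stall the child, so discarding stdout avoids that risk.

```python
import subprocess
import sys

# Hypothetical stand-in for the real command: a child that writes a large
# amount of data to stdout and a single status line to stderr.
CHILD_CODE = (
    "import sys\n"
    "sys.stdout.write('x' * 200000)\n"
    "sys.stderr.write('done\\n')\n"
)


def read_stderr_only(cmd):
    """Run cmd, discard its stdout, and return (stderr text, exit code)."""
    process = subprocess.Popen(
        cmd,
        stdout=subprocess.DEVNULL,  # nothing reads stdout, so do not pipe it
        stderr=subprocess.PIPE,
    )
    chunks = []
    for line in process.stderr:  # ends when the child closes stderr
        chunks.append(line.decode(errors="replace"))
    returncode = process.wait()
    return "".join(chunks), returncode
```

Calling `read_stderr_only([sys.executable, "-c", CHILD_CODE])` should return the child's stderr text even though the child writes far more to stdout than a pipe buffer typically holds.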

If you want to try reading char by char as in your original example, try adding a flush each time:

sys.stdout.write(char)
sys.stdout.flush()

Flushing stdout after every write is the manual equivalent of running the Python process unbuffered, either with python.exe -u <script> or by setting the environment variable PYTHONUNBUFFERED=1.
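
Putting the unbuffered read and the flush together, a rough Python 3 sketch might look like the one below. Several things here are assumptions rather than anything confirmed in this thread: `FAKE_CHILD` is a hypothetical stand-in for the real command, `iter_progress` and `show_progress` are illustrative names, and the code assumes the child rewrites its progress line using carriage returns and prints the percentage as a number followed by `%`.

```python
import re
import subprocess
import sys

# Hypothetical stand-in for the real command: prints three progress updates
# to stderr, each terminated by a carriage return (assumed output format).
FAKE_CHILD = (
    "import sys\n"
    "for p in (0, 50, 100):\n"
    "    sys.stderr.write('In:%d%%\\r' % p)\n"
    "    sys.stderr.flush()\n"
)

PERCENT = re.compile(rb"(\d+(?:\.\d+)?)%")


def iter_progress(cmd):
    """Yield each percentage seen on the child's stderr as a float."""
    process = subprocess.Popen(cmd, stderr=subprocess.PIPE, bufsize=0)
    pending = b""
    while True:
        byte = process.stderr.read(1)  # unbuffered read of a single byte
        if not byte:  # empty bytes object means end of stream
            break
        if byte in (b"\r", b"\n"):  # one progress update is complete
            match = PERCENT.search(pending)
            if match:
                yield float(match.group(1))
            pending = b""
        else:
            pending += byte
    match = PERCENT.search(pending)  # update without a trailing terminator
    if match:
        yield float(match.group(1))
    process.wait()


def show_progress(cmd):
    """Print each percentage in place, flushing after every write."""
    for percent in iter_progress(cmd):
        sys.stdout.write("\r%5.1f%%" % percent)
        sys.stdout.flush()
    sys.stdout.write("\n")
```

For example, `show_progress([sys.executable, "-c", FAKE_CHILD])` would display each update as it arrives. This only removes buffering on the Python side; if the child itself batches its writes, the parent still receives them in bursts.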

Dreeda answered 14/4, 2012 at 2:20 Comment(9)
I create a pipe for stdout because of a bug when I try to run the script if there's no console window but an IDLE GUI window. - Select
@rynd: Ok well then add it back by all means, if something is reading it. - Dreeda
The problem remains. If I look at the CPU load of the subprocess, I see sox is working with 90% load while my script still has a delay. Why does read() not work at this point? Then the first output (information not related to the progress) is output slowly. - Select
@rynd: Did the flush not help with your write() ? - Dreeda
I think flush() on sys.stdout did a little bit, but there are still points where it twiddles its thumbs although sox already has generated output. "Oops, sox already has 25%, I should hurry." - Select
Not sure what to tell ya beyond this point because you would be unbuffered. My experience is that it should be live. - Dreeda
OK, thanks! I'll try Python 3, suspending the process shortly and some other things. - Select
With Python 3 it's the same and suspending is an unclear approach. I've just tested adding "2> stderr" to the called command, i.e. redirecting stderr to a file called "stderr", and then reading the file. It's the same: alternating slow and fast output. Unbelievable! Seems that Windows doesn't like SoX. - Select
@rynd: Maybe windows specific. You might try doing some searching on subprocess buffered output and windows. - Dreeda
