Why a script that uses threads prints extra lines occasionally?
Asked Answered
H

2

7

If print s is replaced by print >>sys.stderr, s then the effect vanishes.

import random, sys, time
import threading

lock = threading.Lock()

def echo(s):
    time.sleep(1e-3*random.random()) # instead of threading.Timer()
    with lock:
        print s

for c in 'abc':
    threading.Thread(target=echo, args=(c,)).start()

Example

# Run until empty line is found:
$ while ! python example.py 2>&1|tee out|grep '^$';do echo -n .;done;cat out

Output

....................
b

c
a

The output should not contain empty lines, but it does. I understand that print is not thread-safe, but I would've thought the lock should help.

The question is why this happens?

My machine:

$ python -mplatform
Linux-2.6.38-11-generic-x86_64-with-Ubuntu-11.04-natty

Extra lines are printed on py26, py27, pypy.

py24, py25, py31, py32 behave as expected (no empty lines).

Variations

  • sys.stdout.flush() after the print doesn't solve the problem:

    with lock:
        print(s)
        sys.stdout.flush()
    
  • even more strange that ordinary sys.stdout.write() doesn't produce empty lines with lock:

    with lock:
        sys.stdout.write(s)
        sys.stdout.write('\n') #NOTE: no .flush()
    
  • print function works as expected (no empty lines).

To reproduce download files and run:

$ tox
Health answered 7/10, 2011 at 13:21 Comment(12)
FWIW, I don't seem to be able to reproduce this (have run a couple of thousand iterations - is that enough?) I'm using Python 2.6.5 on 64-bit Ubuntu 10.04.Angelika
Nor does it happen for me with EPD Python 2.7.1 (both 32- and 64-bit) on the same box.Angelika
@aix: thanks. Yes 1000 is enough. The test runs ~10 iteration on my machine.Health
@aix: I've added tox.ini. Could you run the test?Health
Same problem observed (from time to time) with Python 2.7.2 on OS X Lion (MacPorts version).Ticktack
The different behavior on Python 2 suggests that this has something to do with softspace. There's probably good reason print() was revamped in Python 3.Shurlocke
@aix: how many core do you have? The behavior might be connected with multiple CPUs.Health
Four cores, no hyperthreading.Angelika
@aix: Then it is not it. I also have 4 cores (although with hyperthreading).Health
Possible duplicate of How do I get a thread safe print in Python 2.6?Am‚lie
@CiroSantilli新疆改造中心六四事件法轮功 If you think it is a duplicate, then what answer from the question you've linked does answer this question on your opinion. Do read the code examples in the question carefully.Health
Ah OK you are right.Am‚lie
A
4

Take a look at this stackoverflow thread: How do I get a thread safe print in Python 2.6?. Apparently, printing to sout is not thread-safe.

If you turn on verbose threading, you can see this better:

threading.Thread(target=echo, args=(c,), verbose=True).start()

I get output like this:

MainThread: <Thread(Thread-1, initial)>.start(): starting thread
Thread-1: <Thread(Thread-1, started 6204)>.__bootstrap(): thread started
MainThread: <Thread(Thread-2, initial)>.start(): starting thread
Thread-2: <Thread(Thread-2, started 3752)>.__bootstrap(): thread started
MainThread: <Thread(Thread-3, initial)>.start(): starting thread
Thread-3: <Thread(Thread-3, started 4412)>.__bootstrap(): thread started
MainThread: <Thread(Thread-2, started 3752)>.join(): waiting until thread stops
a
b
Thread-1: <Thread(Thread-1, started 6204)>.__bootstrap(): normal return
Thread-2: <Thread(Thread-2, started 3752)>.__bootstrap(): normal return
MainThread: <Thread(Thread-2, stopped 3752)>.join(): thread stopped
MainThread: <Thread(Thread-3, started 4412)>.join(): waiting until thread stops
Thread-3: <Thread(Thread-3, started 4412)>.__bootstrap(): normal return
MainThread: <Thread(Thread-3, stopped 4412)>.join(): thread stopped
c

You can see that thread 3 is shown as finishing before printing the 'c' character. This clearly cannot be the case, so this leads me to assume that printing to the console is not thread-safe.

This, however, does not explain why printing to sys.stderr appears to work correctly.

Astonishment answered 7/10, 2011 at 14:49 Comment(7)
I don't thing that the output means that thread 3 finishes before printing the 'c' character example.Health
Then I don't understand what "clearly see" means in this context?Health
I've rephrased my explanation. I hope it is clearer now. I was trying to say that the output in the console cannot be correct thread-wise, because it shows that the character c is printed even after all threads are finished. This lead me to believe that printing to standard output is not threadsafe. As you already mentioned in your first comment.Astonishment
thanks, It is clear now. But the issue is an extra output (a newline) not the order of the lines.Health
I've edited my question to include variations on the code. The only thing that doesn't work is the print statement on py26, py27 (older or newer versions are fine); print() as a function works; sys.stdout.write() works.Health
what is the difference between print() as a function and print " " as a statement???Bravery
@Bravery from __future__ import print_functionClariceclarie
A
0

Because print write to stdout text first, then end string. Pseudo-code to explain:

def print(*args, **kwargs):
    write_to_stdout(to_single_string(args))
    write_to_stdout(end)  # usually a newline "\n"

so, in multithreading first string of both threads executes before second, so two newlines printed at the same time. But why lines don't on same line? I don't know. Needs to check realization of python print more deeply.

Alberto answered 18/6, 2019 at 10:34 Comment(1)
Note: there is a lock (it protects both parts). The "Variations" section in the question literally has the exact code example (that works as expected unlike the print on some Python versions).Health

© 2022 - 2024 — McMap. All rights reserved.