Why does this Python code with threading have race conditions?

This code creates a race condition:

import threading

ITERS = 100000
x = [0]

def worker():
    for _ in range(ITERS):
        x[0] += 1  # this line creates a race condition
        # because it reads the value, increments it, and then writes it back;
        # two threads can read the same value, so some increments are lost

def main():
    x[0] = 0  # you may use `global x` instead of this list trick too
    t1 = threading.Thread(target=worker)
    t2 = threading.Thread(target=worker)
    t1.start()
    t2.start()
    t1.join()
    t2.join()

for i in range(5):
    main()
    print(f'iteration {i}. expected x = {ITERS*2}, got {x[0]}')

Output:

$ python3 test.py
iteration 0. expected x = 200000, got 200000
iteration 1. expected x = 200000, got 148115
iteration 2. expected x = 200000, got 155071
iteration 3. expected x = 200000, got 200000
iteration 4. expected x = 200000, got 200000

Python3 version:

Python 3.9.7 (default, Sep 10 2021, 14:59:43) 
[GCC 11.2.0] on linux

I thought the GIL would prevent this by not letting two threads run at the same time until one of them does something I/O-related or calls into a C library. At least, that is what you might conclude from the docs.

Turns out I was wrong. So what does the GIL actually do, and when do threads run in parallel?

Lafollette answered 27/12, 2021 at 8:55 Comment(10)
See: #1717893 and especially #38266686 - Australopithecus
Shouldn't this be a job for a lock? I mean, why don't you use a lock when incrementing the list element? Operations on a list do not guarantee atomicity. Have you tried it with a lock, and does the same thing occur? Also, I think a deque should give more atomicity than a plain list. There was a Stack Overflow answer where Alex Martelli points out that you should use a combination of deque and queue when using threading. - Bezique
aaaaand web.archive.org/web/20201108091210/http://effbot.org/pyfaq/… - Australopithecus
@FedericoBaù it should, of course, when you know this is going to happen. When I learned Python and tried multithreading a decade ago, it seemed that it should work like JavaScript, which executes an entire function until it ends and then lets the event loop go on. - Zoril
@Lafollette found the Alex Martelli opinion regarding threading: https://mcmap.net/q/45218/-how-do-i-use-threading-in-python - Bezique
@Australopithecus oh, that's where the example code originally came from. - Zoril
@Lafollette ahh yes, but it is quite different from JavaScript for sure. I understand the JavaScript background now, because you say "race condition", a term used of course for async work. When talking about Python threading I normally read about blocking vs. non-blocking ;) - Bezique
Try it with Python 3.10, I think there was a recent interesting Q&A that pointed out that this doesn't happen anymore (though it's not safe to assume it doesn't). - Mahone
I ran this code (Python 3.10, Windows 10) and did not get the issue. All my "got" values were as expected... so a version/platform issue. - Fonteyn
@D.L I tested it on my new Ubuntu with Python 3.10, and it also stopped showing the race condition. Interesting. - Zoril

After reading the docs more carefully, I think the answer is there:

The mechanism used by the CPython interpreter to assure that only one thread executes Python bytecode at a time. This simplifies the CPython implementation by making the object model (including critical built-in types such as dict) implicitly safe against concurrent access. Locking the entire interpreter makes it easier for the interpreter to be multi-threaded, at the expense of much of the parallelism afforded by multi-processor machines.

However, some extension modules, either standard or third-party, are designed so as to release the GIL when doing computationally-intensive tasks such as compression or hashing. Also, the GIL is always released when doing I/O.

I guess this means that each line of source code compiles to several bytecode instructions. An individual bytecode instruction runs without interruption from other threads, but a source line does not: the interpreter may switch threads between its instructions.

Here's the bytecode that x[0] += 1 expands to (run dis.dis('x[0] += 1') to see):

          0 LOAD_NAME                0 (x)
          2 LOAD_CONST               0 (0)
          4 DUP_TOP_TWO
          6 BINARY_SUBSCR
          8 LOAD_CONST               1 (1)
         10 INPLACE_ADD
         12 ROT_THREE
         14 STORE_SUBSCR
         16 LOAD_CONST               2 (None)
         18 RETURN_VALUE

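When two threads interleave these instructions, a race condition occurs: both can execute BINARY_SUBSCR and read the same value before either reaches STORE_SUBSCR, so one increment overwrites the other.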
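To check this on whatever interpreter you happen to run, here is a small sketch using the standard dis module. The opcode names differ between CPython versions (the listing above is from 3.9), so the names it prints are only illustrative; the point is that one source line becomes several instructions.

import dis

# Compile the statement and collect the opcode names it expands to.
# Names vary by CPython version; what matters is that there are several.
opnames = [ins.opname for ins in dis.get_instructions("x[0] += 1")]

for name in opnames:
    print(name)

print(f"{len(opnames)} bytecode instructions for one source line")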

So the GIL does not save you from it. It only prevents races that could corrupt the internals of complex structures like list or dict.
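As the comments suggest, a lock is the usual way to make the increment safe. A minimal sketch of the question's code with a threading.Lock around the increment follows; the name lock is my own and not part of the original code.

import threading

ITERS = 100000
x = [0]
lock = threading.Lock()  # hypothetical name, not in the original code

def worker():
    for _ in range(ITERS):
        # Holding the lock turns the load/add/store sequence into one
        # critical section, so no other thread can interleave with it.
        with lock:
            x[0] += 1

def main():
    x[0] = 0
    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

main()
print(f'expected x = {ITERS*2}, got {x[0]}')

The cost is one lock acquire and release per increment, which is slower than the unlocked loop but does not depend on the interpreter version for correctness.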

Lafollette answered 27/12, 2021 at 9:23 Comment(1)
use dis.dis("x[0] += 1") - Australopithecus

As per our final comments, this particular example no longer appears to lose updates on Python 3.10 and above (Ubuntu, Windows); the issue was not reproduced there. As noted in the comments, that is not something to rely on.

However, there are other scenarios where race conditions can be observed. For example:

import threading
import time
 
x = 10
 
def increment(by):
    global x
 
    local_counter = x
    local_counter += by
 
    time.sleep(1)
 
    x = local_counter
    print(f'{threading.current_thread().name} inc x {by}, x: {x}')
 
def main():
    # creating threads
    t1 = threading.Thread(target=increment, args=(5,))
    t2 = threading.Thread(target=increment, args=(10,))
   
    # starting the threads
    t1.start()
    t2.start()
   
    # waiting for the threads to complete
    t1.join()
    t2.join()
   
    print(f'The final value of x is {x}')
 
for i in range(10):
    main()

which produces this:

Thread-56 (increment) inc x 10, x: 20Thread-55 (increment) inc x 5, x: 15
 
The final value of x is 15
Thread-57 (increment) inc x 5, x: 20Thread-58 (increment) inc x 10, x: 25
 
The final value of x is 25
Thread-60 (increment) inc x 10, x: 35Thread-59 (increment) inc x 5, x: 30
 
The final value of x is 30
Thread-61 (increment) inc x 5, x: 35
Thread-62 (increment) inc x 10, x: 40
The final value of x is 40
Thread-64 (increment) inc x 10, x: 50Thread-63 (increment) inc x 5, x: 45
 
The final value of x is 45

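The fix here is to make the read-modify-write of x happen as one step, for example by guarding it with a lock. The asyncio module is one way to control the flow of the code so that task switches happen only at explicit points.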
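A rough sketch of what an asyncio version could look like is below. Note that asyncio on its own does not remove this race: tasks switch at every await, and here an await sits between the read and the write, so the sketch also takes an asyncio.Lock. The shortened sleep and the lock parameter are my assumptions, not part of the original code.

import asyncio

x = 10

async def increment(by, lock):
    global x
    # The lock keeps the read, the pause, and the write together, so the
    # other task cannot read a stale x in between.
    async with lock:
        local_counter = x
        local_counter += by
        await asyncio.sleep(0.01)  # stands in for the original time.sleep(1)
        x = local_counter

async def main():
    lock = asyncio.Lock()  # created inside the running event loop
    await asyncio.gather(increment(5, lock), increment(10, lock))
    print(f'The final value of x is {x}')

asyncio.run(main())

With threads, wrapping the same three steps in a threading.Lock would have the same effect.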

Fonteyn answered 13/9, 2022 at 6:44 Comment(0)
