Why does Python provide locking mechanisms if it's subject to a GIL?

I'm aware that Python threads can only execute bytecode one at a time, so why would the threading library provide locks? I'm assuming race conditions can't occur if only one thread is executing at a time.

The library provides locks, conditions, and semaphores. Is the only purpose of this to synchronize execution?

Update:

I performed a small experiment:

from threading import Thread
from multiprocessing import Process

num = 0

def f():
    global num
    num += 1

def thread(func):
    # return Process(target=func)
    return Thread(target=func)


if __name__ == '__main__':
    t_list = []
    for i in xrange(1, 100000):
        t = thread(f)
        t.start()
        t_list.append(t)

    for t in t_list:
        t.join()

    print num

This should start roughly 100k threads (99,999 to be exact, since xrange(1, 100000) excludes the upper bound), each incrementing num by 1. The result returned was 99993.

a) How can the result not be 99999 if there's a GIL syncing and avoiding race conditions? b) Is it even possible to start 100k OS threads?

Update 2, after seeing answers:

If the GIL doesn't really provide a way to perform a simple operation like incrementing atomically, what's the purpose of having it there? It doesn't help with nasty concurrency issues, so why was it put in place? I've heard it matters for C extensions; can someone give an example of this?

Eastbound answered 11/11, 2014 at 20:4 Comment(1)
The GIL is there to protect the Python interpreter itself from concurrency issues, rather than code you write. It's really an implementation detail of CPython, and you shouldn't rely on its behavior in your own code, even though it's not likely to go away any time soon. - Kempe

The GIL synchronizes bytecode operations: only one bytecode can execute at a time. But if an operation requires more than one bytecode, the interpreter can switch threads between those bytecodes. If you need the whole operation to be atomic, you need synchronization above and beyond the GIL.

For example, incrementing an integer is not a single bytecode:

>>> import dis
>>> def f():
...   global num
...   num += 1
...
>>> dis.dis(f)
  3           0 LOAD_GLOBAL              0 (num)
              3 LOAD_CONST               1 (1)
              6 INPLACE_ADD
              7 STORE_GLOBAL             0 (num)
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE

Here it took four bytecodes to implement num += 1 (LOAD_GLOBAL, LOAD_CONST, INPLACE_ADD, STORE_GLOBAL). The GIL will not ensure that num is incremented atomically. Your experiment demonstrates the problem: you have lost updates because the threads switched between the LOAD_GLOBAL and the STORE_GLOBAL.
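
The lost update can also be reproduced deterministically instead of waiting for an unlucky thread switch. The following is a minimal Python 3 sketch, not the original experiment: the helper name run_lost_update is hypothetical, and a threading.Barrier is used (as an assumption, purely for illustration) to force both threads to finish their read before either one writes, which mimics a switch between LOAD_GLOBAL and STORE_GLOBAL.

```python
import threading


def run_lost_update():
    """Force two threads to read the counter before either writes it back."""
    counter = {"num": 0}
    barrier = threading.Barrier(2)  # both threads must arrive before any proceeds

    def unsafe_increment():
        local = counter["num"]      # the "load" step
        barrier.wait()              # stand-in for a thread switch at the worst moment
        counter["num"] = local + 1  # the "add and store" step, using a stale value

    threads = [threading.Thread(target=unsafe_increment) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["num"]


if __name__ == "__main__":
    # Two increments ran, but one of them was overwritten.
    print(run_lost_update())
```

Both threads read 0 and both store 1, so one increment is lost even though only one thread ever runs Python code at a time.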

The purpose of the GIL is to ensure that the reference counts on Python objects are incremented and decremented atomically. It isn't meant to help you with your own data structures.
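
To make the reference-count point a little more concrete, here is a small CPython-specific sketch. It assumes CPython's sys.getrefcount, and the function name refcount_delta is hypothetical. It only shows that an ordinary name binding changes an object's reference count; the bookkeeping itself is done by the interpreter's C code, which is what runs while the GIL is held.

```python
import sys


def refcount_delta():
    """Return how much binding one extra name changes an object's refcount."""
    obj = object()
    before = sys.getrefcount(obj)  # includes temporary references held by the call
    alias = obj                    # a second name now refers to the same object
    after = sys.getrefcount(obj)
    del alias                      # dropping the name decrements the count again
    return after - before


if __name__ == "__main__":
    print(refcount_delta())
```

Every assignment, argument pass, and container insertion triggers this kind of update, which is why the interpreter needs it to be safe without any help from user code.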

Creese answered 11/11, 2014 at 20:18 Comment(3)
I apologize for resurrecting this thread, but I still cannot wrap my head around it: if incrementing (and decrementing) a variable in Python is not guarded by the GIL (as threads may switch due to I/O, the 100-tick GIL release, or a GIL release request in py3k), how can the refcount be guarded by it? Can't the GIL be released, for the reasons mentioned above, right in the middle of a refcount -= 1 operation? - Humanitarian
Can someone explain this? - Slotter
The GIL ensures that one Python bytecode operates atomically. The refcount manipulations are written in C and happen while the GIL is held. But incrementing a Python integer takes four bytecodes, and it can be interrupted between those bytecodes. - Creese

Python's native threading works at the bytecode level. That is, a thread may yield control to another thread between bytecodes (not necessarily after every single one; I believe the interval is configurable).
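
On the "configurable" part: Python 2 exposed a bytecode count through sys.setcheckinterval, while CPython 3.2 and later use a time-based setting, read and written with sys.getswitchinterval() and sys.setswitchinterval(). A minimal Python 3 sketch follows; the helper name with_switch_interval is hypothetical, and it restores the previous value so the change stays local.

```python
import sys


def with_switch_interval(seconds, func):
    """Run func() with a temporary thread switch interval, then restore it."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func()
    finally:
        sys.setswitchinterval(previous)


if __name__ == "__main__":
    print("current interval:", sys.getswitchinterval())
    # Read the interval back while the temporary setting is active.
    print("temporary interval:", with_switch_interval(0.001, sys.getswitchinterval))
```

Tuning this value changes how often switches are requested, not whether they can land in the middle of a multi-bytecode operation, so it is not a substitute for a lock.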

Any operation on a shared resource that's not a single bytecode needs a lock. And even if a given operation is, in a certain version of CPython, a single bytecode, that might not be the case in every version of every interpreter, so you'd better use a lock anyway.
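
As a minimal sketch of that advice, written for Python 3 with hypothetical names (count_with_lock, worker), the shared counter below is only ever touched inside a `with lock:` block, so the read-modify-write cannot be interleaved with another thread's update:

```python
import threading


def count_with_lock(n_threads=8, increments=10_000):
    """Increment a shared counter from several threads, guarded by a Lock."""
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        for _ in range(increments):
            with lock:       # only one thread at a time runs the increment
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    print(count_with_lock())
```

Holding the lock for just the increment keeps the critical section small; the result equals n_threads * increments regardless of how the interpreter schedules the threads.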

Same reason you need locks to begin with, really, except at a VM level rather than at a hardware level.

Unobtrusive answered 11/11, 2014 at 20:13 Comment(0)
