Are Python built-in containers thread-safe?

4

68

I would like to know if the Python built-in containers (list, vector, set...) are thread-safe? Or do I need to implement a locking/unlocking environment for my shared variable?

Leotaleotard answered 9/2, 2010 at 6:17 Comment(0)
M
60

You need to implement your own locking for all shared variables that will be modified in Python. You don't have to worry about reading from variables that won't be modified (i.e., concurrent reads are OK), so immutable types (frozenset, tuple, str) are probably safe, but it wouldn't hurt to lock them anyway. For things you're going to be changing (list, set, dict, and most other objects), you should have your own locking mechanism. While individual in-place operations are OK on most of these, threads can lead to super-nasty bugs, so you might as well implement locking; it's pretty easy.

By the way, I don't know if you know this, but locking is very easy in Python: create a threading.Lock object, and then you can acquire/release it like this:

import threading
list1Lock = threading.Lock()

with list1Lock:
    # change or read from the list here
# continue doing other stuff (the lock is released when you leave the with block)

In Python 2.5, do from __future__ import with_statement; Python 2.4 and before don't have this, so you'll want to put the acquire()/release() calls in try:...finally: blocks:

import threading
list1Lock = threading.Lock()

list1Lock.acquire()  # acquire outside the try, so release() only runs if the lock is held
try:
    # change or read from the list here
finally:
    list1Lock.release()
# continue doing other stuff (the lock is released in the finally block)
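
For a fuller picture, here is a minimal runnable sketch of the same locking pattern with several threads. The function name collect_unique and its parameters are made up for illustration. The point it tries to show is that a check followed by a modification is a compound step, so both parts should run while holding one lock object that all the threads share.

```python
import threading


def collect_unique(n_threads=8, n_values=100):
    """Have several threads add the same values to one list without duplicates."""
    shared = []                 # list mutated by every thread
    lock = threading.Lock()     # a single lock object shared by all threads

    def worker():
        for value in range(n_values):
            # The membership test and the append form one compound step,
            # so both run while holding the same lock.
            with lock:
                if value not in shared:
                    shared.append(value)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared


if __name__ == "__main__":
    print(len(collect_unique()))
```

Without the lock, two threads could both see that a value is missing and both append it. Creating a separate Lock inside each worker would not help either, since the threads would never contend for the same lock.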

Some very good information about thread synchronization in Python.

Mullane answered 9/2, 2010 at 6:29 Comment(5)
I believe that, for someone who hasn't used threading locks before, it should be noted that the lock (in your example, list1Lock) must be shared between the threads in order for it to work correctly. Two independent locks, one for each thread, would lock nothing and just add silly overhead. – Preconceive
Shouldn't this be: with list1Lock: # Do stuff – Vey
But consider this: “In practice, it means that operations on shared variables of builtin data types (int, list, dict, etc) that “look atomic” really are.” (effbot source). So such operations are thread-safe? – Fountain
"but it wouldn't hurt"... locking is either necessary or it isn't. – Diseuse
The Python docs make only very occasional guarantees about thread-safety (e.g., deque). So if you want to write code that works on all implementations, you typically need locks. However, many people only care about their own Python implementation. For example, in CPython many collection methods are thread-safe (and will likely remain so in the future, since a lot of code relies on that). Locking may carry a significant overhead; to avoid it, it may be necessary to rely on implementation-specific behavior. – Booze
T
12

Yes, but you still need to be careful of course

For example:

If two threads are racing to pop() from a list with only one item, one thread will get the item successfully and the other will get an IndexError.

Code like this is not thread-safe

if L:
    item=L.pop() # L might be empty by the time this line gets executed

You should write it like this

try:
    item = L.pop()
except IndexError:
    pass  # No items left
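
As a rough sketch of how that pattern looks with several threads draining one list: the helper name drain and its parameters are hypothetical, and the code assumes the CPython behaviour discussed here, where a single list.pop() call is atomic.

```python
import threading


def drain(items, n_threads=4):
    """Pop every element of `items` from several threads; return what each thread took."""
    taken = [[] for _ in range(n_threads)]   # one private result list per thread

    def worker(bucket):
        while True:
            try:
                item = items.pop()   # assumes pop() itself is atomic (CPython)
            except IndexError:
                return               # another thread took the last item
            bucket.append(item)

    threads = [threading.Thread(target=worker, args=(b,)) for b in taken]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return taken
```

Each element ends up in exactly one thread's bucket, and no thread ever has to ask "is the list non-empty?" as a separate step.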
Tetrachord answered 9/2, 2010 at 6:31 Comment(3)
I want pop() to be threadsafe, but I can't find this fact anywhere in the documentation. Can someone help me source this claim before I take it as gospel? – Patter
Really? list.pop() is not thread-safe? I saw another article claiming the opposite. effbot.org/pyfaq/… – Colter
@Zhongjun'Mark'Jin He said it IS thread-safe, but that doesn't mean you don't have to consider the other threads. If one thread pops the last item and then another thread tries to pop too, it'll get IndexError, as he says. – Volt
B
7

They are thread-safe as long as you don't disable the GIL in C code for the thread.
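
One caveat, sketched under the assumption of CPython: the GIL serializes bytecode execution, so a statement that compiles to several instructions can still be interleaved with another thread between those instructions. The names counter, bump and opnames below are made up for illustration. The sketch only inspects the bytecode, and the exact instruction names vary between CPython versions.

```python
import dis

counter = 0  # hypothetical shared global


def bump():
    global counter
    counter += 1   # one statement, but several bytecode instructions


def opnames(func):
    """Return the names of the bytecode instructions CPython generates for func."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    print(opnames(bump))
```

The listing should contain a separate load and store of the global, with the addition in between. A thread switch between the load and the store is how an update can get lost, even though each container method call on its own behaves atomically.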

Burrstone answered 9/2, 2010 at 6:28 Comment(6)
This is an implementation detail of CPython that you shouldn't rely on. It may change in the future, and other implementations don't have it. – Rutland
Georg - this aspect of Python kind of terrifies me. Never mind all the bugs that will drop out of Java programs when 8 cores become common on the desktop - what happens when the GIL is removed and multithreaded Python apps are suddenly running on 8-core boxes? – Tupler
It shouldn't terrify anybody if they don't pretend their code is thread-safe when it clearly isn't. :) – Micrococcus
Kylotan - I take this as tongue in cheek (hope that's correct!), but the following is still worth saying. Can you (or anybody) say, hand on heart, that you have only ever written multithreaded code that is free of race conditions? I may +think+ I have, but I am realistic enough to accept that I have not achieved this in all cases. The GIL is a safety net that could give false confidence in this regard, inculcating bad habits that are then exposed when/if the default Python implementation drops the GIL at a time when 8+ core PCs are common. This may be a moot point, as some say the GIL will never go. Time will tell! – Tupler
@GeorgSchölly "This is an implementation detail of CPython you shouldn't rely on." Any reference? – Eaglestone
@PiotrDobrogost: The GIL doesn't exist in all Python implementations. python.org: Jython and IronPython have no GIL and can fully exploit multiprocessor systems. Also, CPython is trying to move away from the GIL in the long run. – Rutland
E
2

The queue module implements multi-producer, multi-consumer queues. It is especially useful in threaded programming when information must be exchanged safely between multiple threads. The Queue class in this module implements all the required locking semantics.

https://docs.python.org/3/library/queue.html
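
A minimal sketch of what that can look like, using only queue.Queue and threading from the standard library. The function square_all, the _DONE sentinel and the squaring task are placeholders invented for this example; the pattern is that threads exchange items through the queue instead of touching a shared list directly.

```python
import queue
import threading

_DONE = object()  # hypothetical sentinel telling a consumer to stop


def square_all(values, n_consumers=3):
    """Square each value in consumer threads fed through a queue.Queue."""
    tasks = queue.Queue()
    results = queue.Queue()

    def consumer():
        while True:
            item = tasks.get()       # blocks until an item is available
            if item is _DONE:
                return
            results.put(item * item)

    workers = [threading.Thread(target=consumer) for _ in range(n_consumers)]
    for w in workers:
        w.start()
    for v in values:
        tasks.put(v)
    for _ in workers:
        tasks.put(_DONE)             # one sentinel per consumer
    for w in workers:
        w.join()
    # All consumers have exited, so qsize() is exact at this point.
    return sorted(results.get() for _ in range(results.qsize()))
```

No explicit Lock appears anywhere: put() and get() do the locking internally, which is the main reason to reach for Queue when the goal is handing work between threads.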

Exserviceman answered 30/10, 2019 at 17:44 Comment(0)
