An equivalent to Java volatile in Python

Does Python have the equivalent of the Java volatile concept?

In Java there is a keyword volatile. As far as I know, when we use volatile while declaring a variable, any change to the value of that variable will be visible to all threads running at the same time.

I wanted to know if there is something similar in Python, so that when the value of a variable is changed in a function, its value will be visible to all threads running at the same time.

Giaimo answered 14/12, 2018 at 12:57 Comment(0)
48

As far as I know, when we use volatile while declaring a variable, any change to the value of that variable will be visible to all threads running at the same time.

volatile is a little more nuanced than that. It guarantees that every read of the variable sees the most recent write made by any thread; in effect, reads and writes go through main memory. Without volatile, the JVM is free to keep the value in the CPU cache instead, which has the side effect that updates can stay invisible to threads running on other CPU cores (threads that run concurrently on the same core would see the value).

Python doesn't ever do this. Python stores all objects on a heap, in main memory. Moreover, because of how the CPython interpreter loop uses locking (the global interpreter lock, or GIL), only one thread at a time actively runs Python code. Two threads never execute Python bytecode at the same moment, even when they are scheduled on different CPU cores.

So you don't need volatile in Python. There is no such concept, and you don't need to worry about it.
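
A minimal sketch of what this looks like in practice, assuming CPython and the standard threading module. The names State, worker, and stop are made up for illustration and are not from the question. A plain attribute written by one thread is observed by another without any special declaration:

```python
import threading
import time


class State:
    """Hypothetical shared object; `stop` is an ordinary attribute."""

    def __init__(self):
        self.stop = False
        self.iterations = 0


def worker(state):
    # Poll the plain attribute. No volatile-like marker is needed for
    # the write made by the main thread to become visible here.
    while not state.stop:
        state.iterations += 1
        time.sleep(0.001)


state = State()
t = threading.Thread(target=worker, args=(state,))
t.start()
time.sleep(0.05)      # let the worker spin for a moment
state.stop = True     # write from the main thread
t.join(timeout=2.0)   # the worker sees the write and leaves its loop
worker_exited = not t.is_alive()
print("worker exited:", worker_exited)
```

For real code, threading.Event is usually the clearer way to signal between threads; the bare attribute is used here only to mirror the volatile flag pattern from Java.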

Swain answered 14/12, 2018 at 13:6 Comment(11)
Interesting. Does it make Python slower for some multi-threaded applications? - Ridenour
Isn't the GIL just an implementation artifact of CPython? - Kasandrakasevich
@Dici: yes, CPython can't run Python code in parallel, only concurrently. (Native code spawned from Python is not limited that way, so NumPy and other extensions are not bound by this restriction.) - Swain
@MarkRotteveel: yes, but IronPython and Jython and PyPy also don't have volatile because they too just use objects on a heap. - Swain
I'm annoying. What difference do you make between running code in parallel and running it concurrently? Do you mean it's parallel as long as no shared data is modified? - Ridenour
@Dici: switching rapidly between threads on a single CPU means the threads run concurrently. Running two threads at the same time on two different CPU cores means they run in parallel. Python can only do the former. - Swain
@Dici: Concurrency is a broader concept that includes parallel execution, but also covers other forms of concurrency. - Swain
@MartijnPieters what about read-modify-write operations such as a = a + 1 in Python? As far as I know they can cause a race condition if we don't use a lock. Say Thread A reads the value of a and it is 5, but doesn't get to modify it and write it back before Python gives control to Thread B. Thread B also reads 5, since Thread A hasn't modified it yet. Now both threads have read 5. Thread B continues, adds 1 to 5, and writes 6 to a. When Thread A resumes, it still holds the 5 it read earlier, adds 1 to it, and also writes 6 back to a. - Prefigure
@MartijnPieters does this mean that locking a code block that does a = a + 1 somehow guarantees visibility of Thread B's update when we continue with Thread A? How does it work? - Prefigure
@pavel_orekhov I think you're correct, i.e. updating variable a in multiple threads without locking might (will) lead to incorrect behavior, but that is due to a lack of atomicity rather than an issue with CPU cache / memory visibility. As a comparison, Java's volatile does not guarantee atomicity either, so when you run volatile int a; a = a + 1 in multiple threads, a similar problem will occur. - Christos
@pavel_orekhov: anything that requires multiple bytecode operations, or opcodes, is subject to race conditions. a = a + 1 requires several opcodes (load, add, store), and the + operator can be overloaded in Python code, so it can trigger code paths with many more opcodes. If in doubt, check with the Python bytecode disassembler how many opcodes an expression uses, but take into account any hooks that could be invoked. - Swain
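
The point made in the comments above can be sketched as follows. This is an illustrative example, not code from the thread: counter, unsafe_increment, and safe_increment are hypothetical names, and the exact opcode names printed by dis vary between CPython versions. The first part lists the opcodes behind counter = counter + 1; the second part shows the same update made safe by holding a lock around the whole read-modify-write.

```python
import dis
import threading

counter = 0
lock = threading.Lock()


def unsafe_increment():
    global counter
    counter = counter + 1   # load, add, store: several opcodes


def safe_increment(times):
    global counter
    for _ in range(times):
        with lock:           # only one thread at a time runs this block
            counter = counter + 1


# List the opcode names used by the unlocked version. A thread switch
# can happen between any two of them.
opnames = [ins.opname for ins in dis.get_instructions(unsafe_increment)]
print(opnames)

# Four threads each add 10,000 under the lock, so no update is lost.
threads = [
    threading.Thread(target=safe_increment, args=(10_000,))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000
```

The lock does not add any memory visibility that Python lacked; it only ensures that the load and the store of one thread are not interleaved with those of another.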
-5

The keyword "global" is what you are looking for:

import threading

queue = []
l = threading.Lock()

def f():
    global queue
    with l:  # hold the lock while mutating the shared list
        queue.append(1)

def g():
    print(queue)

threads = [
    threading.Thread(target=f),
    threading.Thread(target=g)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
Corycorybant answered 14/12, 2018 at 13:10 Comment(4)
Thanks, how about queue = [] in your code? Since it's declared outside all functions, isn't it also considered a global variable? - Giaimo
The keyword global is there to allow you to modify a variable bound elsewhere in the program. If you don't use it, you will only have read access to the variable. - Corycorybant
This has nothing to do with global. The issue the OP is talking about would apply equally to attributes of instances, which are visible in Python just fine, and can be made public in Java. - Swain
Moreover, because you never attempt to bind to the queue name in your code, the global queue statement is entirely redundant. It can be removed from your example with no difference in behaviour. - Swain
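
To illustrate the last two comments, here is a small sketch (the names items, count, and the three functions are hypothetical). It shows that global only matters when a function rebinds a module-level name; mutating the object the name refers to works without it.

```python
items = []
count = 0


def mutate_without_global():
    # Mutating the list object needs no `global`: the name `items`
    # is only read, then a method is called on the object.
    items.append(1)


def rebind_without_global():
    # Assignment makes `count` local to this function, so reading it
    # before it is assigned raises UnboundLocalError.
    count = count + 1


def rebind_with_global():
    global count      # required because the name itself is rebound
    count = count + 1


mutate_without_global()
rebind_with_global()

try:
    rebind_without_global()
    rebind_failed = False
except UnboundLocalError:
    rebind_failed = True

print(items, count, rebind_failed)  # [1] 1 True
```

Neither case affects whether other threads can see the change; both items and count live at module level and are visible to every thread.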
