Embedding python in multithreaded C application
I'm embedding the Python interpreter in a multithreaded C application and I'm a little confused as to which APIs I should use to ensure thread safety.

From what I gathered, when embedding Python it is up to the embedder to acquire the GIL before making any other Python C API call. This is done with these functions:

PyGILState_STATE gstate = PyGILState_Ensure();
// do some python api calls, run python scripts
PyGILState_Release(gstate);

But this alone doesn't seem to be enough. I still got random crashes since it doesn't seem to provide mutual exclusion for the Python APIs.

After reading some more docs I also added:

PyEval_InitThreads();

right after the call to Py_IsInitialized() but that's where the confusing part comes. The docs state that this function:

Initialize and acquire the global interpreter lock

This suggests that when this function returns, the GIL is supposed to be locked and should be unlocked somehow, but in practice this doesn't seem to be required. With this line in place my multithreaded application worked perfectly and mutual exclusion was maintained by the PyGILState_Ensure/Release functions.
When I tried adding PyEval_ReleaseLock() after PyEval_InitThreads(), the app deadlocked pretty quickly in a subsequent call to PyImport_ExecCodeModule().

So what am I missing here?

Materiality answered 16/5, 2012 at 19:43 Comment(0)
M
3

Eventually I figured it out.
After

PyEval_InitThreads();

You need to call

PyEval_SaveThread();

This properly releases the GIL for the main thread.

Materiality answered 22/5, 2012 at 12:33 Comment(4)
This is wrong and potentially harmful: PyEval_SaveThread should always be used in conjunction with PyEval_RestoreThread. As explained elsewhere, you shouldn't try to release the lock after initializing it; just leave it to Python to release it as part of its regular work. - Limestone
I don't see why it is harmful if you put all the calls to Python in Block Allow blocks. On the other hand, if you don't call PyEval_SaveThread(), then your main thread will block the access of other threads to Python. In other words, PyGILState_Ensure() deadlocks. - Fairchild
This is the only thing that works for both embedding Python and calling into an extension module. - Matron
Indeed, PyEval_SaveThread() must be called by the thread that called PyEval_InitThreads(), or a deadlock will happen when a thread tries to call PyGILState_Ensure() (since the GIL is not available for retrieval). PyEval_RestoreThread() should eventually be called by the same thread that called PyEval_SaveThread(), but at that point it is important that all threads that may call PyGILState_Ensure() have finished, or a deadlock may happen for just the same reason. - Subminiature
C
9

I had exactly the same problem, and it is now solved by calling PyEval_SaveThread() immediately after PyEval_InitThreads(), as you suggest above. However, my actual problem was that I called PyEval_InitThreads() after Py_Initialize(), which then caused PyGILState_Ensure() to block when called from different, subsequent native threads. In summary, this is what I do now:

  1. There is global variable:

    static int gil_init = 0; 
    
  2. From a main thread load the native C extension and start the Python interpreter:

    Py_Initialize();
    
  3. From multiple other threads my app concurrently makes a lot of calls into the Python/C API:

    if (!gil_init) {
        gil_init = 1;
        PyEval_InitThreads();
        PyEval_SaveThread();
    }
    PyGILState_STATE state = PyGILState_Ensure();
    // Call Python/C API functions...    
    PyGILState_Release(state);
    
  4. From the main thread stop the Python interpreter

    Py_Finalize();
    

All other solutions I've tried caused either random Python segfaults or deadlocks/blocking in PyGILState_Ensure().

The Python documentation really should be more clear on this and at least provide an example for both the embedding and extension use cases.

Cadell answered 26/1, 2014 at 15:56 Comment(0)
R
2

Note that the if (!gil_init) { code in @forman's answer runs only once, so it can be just as well done in the main thread, which allows us to drop the flag (gil_init would properly have to be atomic or otherwise synchronized).
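If the lazy first-use initialization is kept rather than moved to the main thread, the unsynchronized flag check is a data race. One way to synchronize it is POSIX `pthread_once`, sketched below. This is only an illustration under assumptions: `init_python_threads()` is a hypothetical stand-in for the one-time setup (it just counts its invocations so the effect is visible), and `worker()` and `run_workers()` are made-up names, not part of the Python C API.

```c
#include <pthread.h>

enum { MAX_WORKERS = 16 };

static pthread_once_t gil_once = PTHREAD_ONCE_INIT;
static int init_calls = 0;   /* how many times the initializer actually ran */

/* Hypothetical stand-in for the one-time interpreter threading setup. */
static void init_python_threads(void)
{
    init_calls++;
}

static void *worker(void *arg)
{
    (void)arg;
    /* Every thread calls this; the initializer runs exactly once, and all
       callers wait until it has completed before continuing. */
    pthread_once(&gil_once, init_python_threads);
    /* ... PyGILState_Ensure(), Python calls, PyGILState_Release() ... */
    return NULL;
}

/* Starts up to n workers, waits for them, and returns the number of times
   the initializer ran. */
int run_workers(int n)
{
    pthread_t threads[MAX_WORKERS];
    int started = 0;

    if (n > MAX_WORKERS)
        n = MAX_WORKERS;
    for (int i = 0; i < n; i++) {
        if (pthread_create(&threads[started], NULL, worker, NULL) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return init_calls;
}
```

Unlike a plain `int` flag, `pthread_once` also makes late arrivals wait until the initializer has finished, which a bare atomic flag would not guarantee.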

PyEval_InitThreads() is meaningful only in CPython 3.6 and older, and has been deprecated in CPython 3.9, so it has to be guarded with a macro.

Given all this, what I am currently using is the following:

In the main thread, run all of

Py_Initialize();
PyEval_InitThreads(); // only on Python 3.6 or older!
/* tstate = */ PyEval_SaveThread(); // maybe save the return value if you need it later

Now, whenever you need to call into Python, do

PyGILState_STATE state = PyGILState_Ensure();
// Call Python/C API functions...    
PyGILState_Release(state);

Finally, from the main thread, stop the Python interpreter

PyGILState_Ensure(); // PyEval_RestoreThread(tstate); seems to work just as well
Py_Finalize();
Roundabout answered 22/1, 2022 at 18:16 Comment(0)
D
0

For those wondering how to make the other suggestions work with OpenMP multithreading, here is a solution.

First, a few definitions:

#define PY_SAVE_MASTER_THREAD \
    PyThreadState *_save = NULL; \
    _Pragma("omp master") \
    _save = PyEval_SaveThread();
#define PY_RESTORE_MASTER_THREAD \
    _Pragma("omp master") \
    PyEval_RestoreThread( _save );
#define PY_ACQUIRE_GIL PyGILState_STATE _state = PyGILState_Ensure();
#define PY_RELEASE_GIL PyGILState_Release( _state );

Then, whenever you start an OpenMP loop:

PY_SAVE_MASTER_THREAD
#pragma omp for
for( /*....*/ ) {
    PY_ACQUIRE_GIL
    // python calls
    PY_RELEASE_GIL
}
PY_RESTORE_MASTER_THREAD
Deity answered 28/6, 2024 at 7:43 Comment(0)
H
-3

Having a multi-threaded C app trying to communicate from multiple threads with multiple Python threads of a single CPython instance looks risky to me.

As long as only one C thread communicates with Python, you should not have to worry about locking, even if the Python application is multi-threaded. If you need multiple Python threads, you can set the application up this way and have multiple C threads communicate via a queue with that single C thread, which farms the work out to multiple Python threads.

An alternative that might work for you is to have multiple CPython instances, one for each C thread that needs it (of course, communication between the Python programs should then go via the C program).

Another alternative might be the Stackless Python interpreter. That does away with the GIL, but I am not sure whether you would run into other problems binding it to multiple threads. Stackless was a drop-in replacement for my (single-threaded) C application.

Hestia answered 17/5, 2012 at 5:50 Comment(6)
You did your best not actually answering the question. I'm not interested in queuing the work to a single thread. - Materiality
Your not answering the question - Anacoluthon
@SimonBanks ^Your^You're^ - Hestia
@Athhon, I'm dyslexic they all look the same to me. You get the point I'm sure.. ;-) - Anacoluthon
Your answer didn't attempt to give the problem context or any form of definition. - Anon
You offered what seems to be multiple solutions without any way to differentiate how one would be different from the other, let alone the specific problem/s each may solve. - Anon
