What's the point of multithreading in Python if the GIL exists?
From what I understand, the GIL makes it impossible to have threads that harness a core each individually.

This is a basic question, but, what is then the point of the threading library? It seems useless if the threaded code has equivalent speed to a normal program.

Kenosis answered 25/9, 2018 at 22:46 Comment(4)
It can be used to keep the main thread unblocked (for example in a GUI application or similar). If you want to use multiple cores, you should try multiprocessing (docs.python.org/3.7/library/multiprocessing.html)Martyrdom
Try this question. Short answer: it can be useful, but maybe not in the way you're imagining. Only one thread can run Python code at a time due to the GIL, meaning threaded programs still run serially. The multiprocessing library is more helpful for what you seem to be looking for, as it can actually spawn processes that harness individual cores.Retinoscopy
Thanks @questionable_code and @Tom for your help. I'm looking into multi-processing and I'll probably have to use it. I'm still curious as to why they even have the threading library. It seems it's more for code organization.Kenosis
See these talks (dabeaz.com/python/UnderstandingGIL.pdf and dabeaz.com/python/GIL.pdf), quite interesting. It seems that multithreaded programs work much faster on 1 core than on 2 or 4. The talks are quite old (2010) and there's some mention of a new GIL in Python 3.x, but I didn't try it.Martyrdom

In some cases an application may not fully utilize even one core, and using threads (or processes) can help it do that.

Think of a typical web application. It receives requests from clients, makes some queries to the database, and returns data to the client. Given that an IO operation is orders of magnitude slower than a CPU operation, such an application spends most of its time waiting for IO to complete. First, it waits to read the request from the socket. Then it waits until the request is written to the socket opened to the DB. Then it waits for the response from the database, and then for the response to be written to the client socket.

Waiting for IO to complete may take 90% (or more) of the time a request is processed. When a single-threaded application is waiting on IO, it is simply not using the core, and the core is available for execution. So such an application has room for other threads to execute, even on a single core.

In this case, when one thread waits for IO to complete, it releases the GIL and another thread can continue executing.
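This is easy to see with a small sketch (hypothetical handler names; time.sleep stands in for a blocking socket or database call, and releases the GIL the same way blocking I/O does):

```python
import threading
import time

def handle_request(results, i):
    # Simulated blocking I/O: sleep releases the GIL while waiting,
    # just like a blocking socket read or database query would.
    time.sleep(0.2)
    results[i] = i * 2

results = [None] * 4
start = time.perf_counter()
threads = [threading.Thread(target=handle_request, args=(results, i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
# The four 0.2 s waits overlap, so the total is ~0.2 s rather than ~0.8 s,
# even on a single core.
```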

Surcingle answered 25/9, 2018 at 23:23 Comment(3)
So, one can conclude that the python threading module is useful when writing IO bound programs. Is that correct? Are there other situations to consider?Mullet
@Mullet plenty of others, even some CPU bound.Trimester
@Mullet - sounds about rightKeek

The threading library works very well despite the presence of the GIL.

Before I explain, you should know that Python's threads are real threads - they are normal operating system threads running the Python interpreter. The GIL (or Global Interpreter Lock) is only taken when running pure Python code, and in many cases is completely released and not even checked.

The GIL does not prevent these operations from running in parallel:

  1. IO operations, such as sending & receiving network data or reading/writing to a file.
  2. Heavy built-in CPU-bound operations, such as hashing or compressing.
  3. Some C extension operations, such as numpy calculations.

Any of these (and plenty more) would run perfectly fine in a parallel fashion, and in the majority of programs these are the heftier parts that take the longest time.
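As an illustration of point 2 (a sketch, relying on the fact that CPython's hashlib releases the GIL while hashing buffers larger than a couple of kilobytes):

```python
import hashlib
import threading

data = b"x" * (1 << 24)  # 16 MiB of input per thread

digests = {}

def hash_it(name):
    # hashlib drops the GIL for large updates, so these threads can
    # genuinely run on separate cores at the same time.
    digests[name] = hashlib.sha256(data).hexdigest()

threads = [threading.Thread(target=hash_it, args=(f"t{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the GIL held throughout, the four hashes would have to run one after another; here the wall-clock time scales with the number of free cores.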

Building an example API in Python that takes astronomical data and calculates trajectories would mean that:

  • Processing the input and assembling the network packets would be done in parallel.
  • The trajectory calculations, should they be in numpy, would all be parallel.
  • Adding the data to a database would be parallel.
  • Returning the data over the network would be parallel.

Basically the GIL won't affect the vast majority of the program runtime.

Moreover, at least for networking, other methodologies are more prevalent these days, such as asyncio, which offers cooperative multi-tasking on the same thread, effectively eliminating the overhead of excess threads and allowing considerably more connections to run at the same time. When using that, the GIL is not even relevant.
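A tiny sketch of that model: one thread, one event loop, many overlapping waits, with asyncio.sleep standing in for awaiting a socket:

```python
import asyncio
import time

async def handle(i):
    # Cooperative multi-tasking: "await" yields to the event loop, so
    # hundreds of connections interleave on a single thread.
    await asyncio.sleep(0.2)
    return i * 2

async def main():
    return await asyncio.gather(*(handle(i) for i in range(100)))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
# 100 overlapping 0.2 s waits complete in roughly 0.2 s total.
```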

The GIL can be a problem and make threading useless in programs that are CPU-intensive while running pure Python code, such as a simple program calculating Fibonacci numbers. But in the majority of real-world cases, unless you're running an enormously scaled website such as YouTube (which admittedly has encountered problems), the GIL is not a significant concern.
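The Fibonacci case is easy to demonstrate (a sketch; exact timings will vary by machine):

```python
import threading
import time

def fib(n):
    # Pure-Python recursion: the GIL is held for the whole computation,
    # so two threads running this can never execute simultaneously.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

N = 25

start = time.perf_counter()
fib(N)
fib(N)
serial = time.perf_counter() - start

start = time.perf_counter()
threads = [threading.Thread(target=fib, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start
# "threaded" comes out about the same as "serial" (often a bit worse, due
# to GIL hand-off overhead): the two threads take turns, they don't overlap.
```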

Trimester answered 28/12, 2022 at 6:3 Comment(9)
When you say "The GIL does not prevent these operations from running concurrently", does that mean the points you laid down are run in a concurrent fashion and NOT in parallel? In that case, how is the statement "Python's threads are real threads" true? Because thread means parallel.Dockhand
@Dockhand for the purposes of this answer, concurrent == parallel. I apologize for the confusion, and have updated the answer accordingly.Trimester
Thanks for the reply. So just to clarify, Are Python threads actually parallel when doing IO using Python C wrappers like: file.open, file.write, file.read, socket.send, socket.recv ?Dockhand
@Dockhand Yup.Trimester
So, if threads are parallel for calls to the C API, for networking specifically, let's suppose you are creating a socket server. Why would one choose the concurrent (async) version, e.g. asyncio.start_server, over a server that spawns a new thread for each new connection? I know there is an overhead for creating (and managing) threads, but in the concurrent version you also have context-switching overhead. And in the concurrent version, the clients would not see the messages in real time; yes, it's so fast you don't notice, but it wouldn't be real time.Dockhand
@Dockhand the overhead for creating threads is considerable. Having 5k threads sleeping and waiting for an event is slower than filling the system buffer and context switching between 5k connections. It is applicable regardless of the language you choose to program with, GIL or not. The majority of languages multiplex connections over each system thread. In Python you multiplex over a single thread (using asyncio) until the CPU usage becomes a considerable portion (due to "glue" code), at which point you use multiple processes (using for example gunicorn).Trimester
@Dockhand asyncio has further optimizations such as using epoll / IO completion ports which are more effective single-thread scheduling mechanisms, each using different kernel / network driver abilities. Many of the speed gains are within the operating system itself, regardless of Python.Trimester
Thanks for this answer. My understanding from this is -- basically, if a Python thread is not running Python byte-code but C code (the 3 cases mentioned above), then the GIL is released.Solarium
@YunWu more accurately, Python never releases the GIL while running byte-code, but sometimes releases it when running C code.Trimester

Strictly speaking, CPython supports many IO-bound threads, but effectively only a single CPU-bound thread at a time.

  • I/O-bound methods: file.open, file.write, file.read, socket.send, socket.recv, etc. When Python calls these I/O functions, it releases the GIL and implicitly re-acquires it after the I/O function returns.

  • CPU-bound methods: arithmetic calculation, etc.

  • C extension methods: the method must call PyEval_SaveThread and PyEval_RestoreThread explicitly to tell the Python interpreter what it is doing.

Visually answered 30/4, 2020 at 10:42 Comment(0)

Please read this: https://opensource.com/article/17/4/grok-gil There are two concepts here:

  1. Cooperative multi-tasking: when one thread performs I/O-bound tasks, it surrenders the GIL so other threads may proceed.
  2. Preemptive multi-tasking: every thread runs for a certain duration (in terms of number of byte codes executed, or time), then surrenders the lock so other threads can proceed.

So while only one thread runs at a time, (1) means we're still utilizing the core efficiently (note this does not help with CPU-bound workloads), and (2) means each thread gets a fair share of CPU time.
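The "certain duration" in (2) is concrete and tunable: since the new GIL in Python 3.2 it is a wall-clock switch interval (5 ms by default) rather than a bytecode count. For example:

```python
import sys

# CPython asks the running thread to release the GIL every "switch
# interval"; in Python 3 this defaults to 5 ms.
default = sys.getswitchinterval()

# A larger interval means fewer forced GIL hand-offs (less overhead),
# at the cost of responsiveness for the other threads.
sys.setswitchinterval(0.01)
tuned = sys.getswitchinterval()

sys.setswitchinterval(default)  # restore the default
```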
Orthocephalic answered 28/12, 2022 at 5:29 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.