I've read some posts from 2013 that the Gunicorn team was planning to build a threaded buffering layer worker model, similar to how Waitress works. Is that what the gthread async worker does? The gthread workers were released with version 19.0 in 2014.
Waitress has a master async thread that buffers requests, and enqueues each request to one of its sync worker threads when the request I/O is finished.
Gunicorn gthread doesn't have much documentation, but it sounds similar. From the docs:
The worker gthread is a threaded worker. It accepts connections in the main loop, accepted connections are are added to the thread pool as a connection job.
I only ask because I am not super knowledgeable about python async I/O code, though a cursory reading of the gthread.py seems to indicate that it is a socket-buffering process that protects worker threads from long-I/O requests (and buffers the response I/O as well).
https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/gthread.py
concurrent.futures
. docs.python.org/3/library/concurrent.futures.html – Sanbenito