From the docs, on how many workers to use:
DO NOT scale the number of workers to the number of clients you expect to have. Gunicorn should only need 4-12 worker processes to handle hundreds or thousands of requests per second.
Generally we recommend (2 x $num_cores) + 1 as the number of workers to start off with.
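To make that starting point concrete, here is a minimal sketch of the formula in Python. The helper name `suggested_workers` is made up for illustration and is not part of Gunicorn; only the `(2 x $num_cores) + 1` rule comes from the docs.

```python
import multiprocessing


def suggested_workers(num_cores: int) -> int:
    """Starting point quoted from the docs: (2 x num_cores) + 1.

    Hypothetical helper for illustration, not a Gunicorn API.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    return 2 * num_cores + 1


if __name__ == "__main__":
    # cpu_count() reports the logical cores visible to Python on this machine.
    cores = multiprocessing.cpu_count()
    print(f"{cores} cores: start with {suggested_workers(cores)} workers")
```

For a dual-core machine this gives 5 workers, which is the number used in the example further down.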
And from the docs on threads:
The number of worker threads for handling requests.
Run each worker with the specified number of threads.
A positive integer generally in the 2-4 x $(NUM_CORES) range. You’ll want to vary this a bit to find the best for your particular application’s workload.
Now the question is: what number of workers and threads can serve hundreds or thousands of requests per second?
Let's say I have a dual-core machine and I set 5 workers and 8 threads. Does that mean I can serve 40 concurrent requests?
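The arithmetic behind that 40 can be sketched as follows. This assumes threaded workers where each thread handles one request at a time, so workers x threads is an upper bound on requests in flight, not a guarantee. The function name is hypothetical.

```python
def max_concurrent_requests(workers: int, threads: int) -> int:
    """Upper bound on requests being handled at the same moment.

    Assumption: every worker process runs `threads` threads and each
    thread serves exactly one request at a time.
    """
    if workers < 1 or threads < 1:
        raise ValueError("workers and threads must be positive")
    return workers * threads


if __name__ == "__main__":
    # Roughly what `gunicorn --workers 5 --threads 8 ...` would allow.
    print(max_concurrent_requests(workers=5, threads=8))
```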
If I want to serve hundreds or thousands of requests, will I need a hundred cores?
This line in particular is very hard to understand:
Gunicorn should only need 4-12 worker processes to handle hundreds or thousands of requests per second.
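One possible reading of that line is that "requests per second" and "concurrent requests" are different quantities. As a back-of-the-envelope sketch using Little's law (throughput = concurrency / average time per request): the 50 ms average request time below is an assumed figure, not a measurement, and the function name is hypothetical. It also assumes every thread stays busy and the requests are mostly waiting on I/O rather than competing for CPU.

```python
def estimated_requests_per_second(
    workers: int, threads: int, avg_request_ms: float
) -> float:
    """Ideal-case throughput from Little's law.

    concurrency = workers * threads
    throughput  = concurrency / average time per request
    """
    if avg_request_ms <= 0:
        raise ValueError("avg_request_ms must be positive")
    concurrency = workers * threads
    return concurrency * 1000 / avg_request_ms


if __name__ == "__main__":
    # 5 workers x 8 threads, each request assumed to take 50 ms on average.
    print(estimated_requests_per_second(5, 8, avg_request_ms=50))
```

Under those assumptions, 40 concurrent slots would turn over many times per second, so the per-second figure can be far larger than the number of workers or cores.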