I am fairly new to creating web services in Python. I have built a Flask web service and run it with Gunicorn, since Flask's built-in server is not suitable for production. This is how I start the app, with 4 worker processes:
gunicorn --bind 0.0.0.0:5000 My_Web_Service:app -w 4
The problem is that this setup only handles 4 requests at a time, one per worker. I want it to handle potentially thousands of concurrent requests. Should I be using multi-threading? Are there any other options or suggestions?
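To make the multi-threading idea concrete, this is roughly what I have in mind: a minimal sketch of a Gunicorn config file, assuming the threaded `gthread` worker class. The file name `gunicorn.conf.py` and the thread count are my own guesses and I have not tested them under load.

```python
# gunicorn.conf.py (hypothetical file name; loaded with
#   gunicorn -c gunicorn.conf.py My_Web_Service:app)

# Same address and port as the command line above.
bind = "0.0.0.0:5000"

# Same number of worker processes as -w 4.
workers = 4

# Assumption: use Gunicorn's threaded worker so each process
# can serve several requests at once.
worker_class = "gthread"

# Assumption: 8 threads per worker is an untested guess, giving
# at most workers * threads = 32 requests in flight.
threads = 8
```

Even with this, the ceiling is still only workers times threads, so I am not sure whether threads alone can get me to thousands of concurrent requests.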