It may be that your liveness check in Kubernetes is killing your workers.
If your liveness check is configured as an HTTP request to an endpoint in your service, a long-running main request can block the health check request, and the platform kills the worker because it thinks the worker is unresponsive.
That was my case. I have a Gunicorn app with a single Uvicorn worker, which only handles one request at a time. It worked fine locally, but the worker would be sporadically killed when deployed to Kubernetes. It only happened during a call that takes about 25 seconds, and not every time.
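Roughly, the setup was equivalent to something like this (`main:app` and the port are placeholders for your own ASGI app and bind address):

```
gunicorn --workers 1 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000 main:app
```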
It turned out that my liveness check was configured to hit the `/health` route every 10 seconds, time out after 1 second, and retry 3 times. So the probe would sometimes time out during that slow call, but not always.
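In Kubernetes terms, that corresponds to a liveness probe along these lines (the port is whatever your container actually listens on):

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8000         # placeholder: your service's port
  periodSeconds: 10    # probe every 10 seconds
  timeoutSeconds: 1    # give up on each probe after 1 second
  failureThreshold: 3  # restart the container after 3 consecutive failures
```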
If this is your case, a possible solution is to reconfigure your liveness check (or whatever health check mechanism your platform uses) so it tolerates the duration of your typical requests, or to allow for more threads or workers: anything that makes sure the health check is not blocked long enough to trigger a worker kill.
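As a sketch, assuming the ~25-second call above is the slowest request you expect, a more tolerant probe could look like this (the exact numbers are illustrative, not a recommendation):

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8000
  periodSeconds: 30    # probe less often
  timeoutSeconds: 30   # longer than the slowest expected request
  failureThreshold: 3
```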
This also explains why adding more workers may help with (or hide) the problem.
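For illustration only (same placeholder command as above, not a tuned configuration), running several workers means one slow request no longer ties up the only worker that could answer the probe:

```
gunicorn --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000 main:app
```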
Also, see this reply to a similar question: https://mcmap.net/q/243355/-why-are-my-gunicorn-python-flask-workers-exiting-from-signal-term