Sharing data across my gunicorn workers
Asked Answered

I have a Flask app, served by Nginx and Gunicorn with 3 workers. My Flask app is an API microservice designed for NLP tasks, and I am using the spaCy library for it.

My problem is that the workers take a huge amount of RAM: loading the spaCy pipeline with spacy.load('en') is very memory-intensive, and since I have 3 Gunicorn workers, each one takes about 400 MB of RAM.

My question is, is there a way to load the pipeline once and share it across all my gunicorn workers?

Certainly answered 18/4, 2017 at 8:29 Comment(6)
Maybe you can use preload_app of gunicorn. See https://mcmap.net/q/49124/-sharing-memory-in-gunicorn – Lucianolucias
Have you found a solution? – Colander
@Lee, have you found out anything in this area? – Mattress
How do you run your Gunicorn workers, i.e., as threads or processes? If processes, could you use Redis? – Odilia
This question reads almost as if it were posted by me. I have the exact same setup! – Meal
Honestly, I sort of redesigned my NLP pipeline to be deployed on AWS Lambda; having no nginx/gunicorn/servers to manage is indeed a blessing. – Certainly

I need to share gigabytes of data among instances and use a memory-mapped file (https://docs.python.org/3/library/mmap.html). If the amount of data you need to retrieve from the pool per request is small, this works fine. Otherwise you can mount a ramdisk and put the mapped file there.

As I am not familiar with spaCy, I am not sure if this helps. I would have one worker actually process the data, loading it (spacy.load?) and writing the resulting doc (pickled/marshalled) to the memory-mapped file, from which the other workers can read.
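
For illustration, a minimal sketch of the idea in Python (the file path and helper names are made up): one process serializes an object into a shared file, and each worker maps that file read-only and deserializes it instead of building the object itself.

    import mmap
    import pickle

    SHARED_PATH = "/tmp/shared_doc.bin"  # hypothetical location; could also sit on a ramdisk mount

    def write_shared(obj):
        # Run once, before the workers start: serialize obj into the shared file.
        with open(SHARED_PATH, "wb") as f:
            f.write(pickle.dumps(obj))

    def read_shared():
        # Run inside a worker: map the file read-only and deserialize its contents.
        with open(SHARED_PATH, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                return pickle.loads(mm[:])

Note that pickle.loads still copies the bytes into the worker, so the saving depends on how much of the mapped data each request actually needs.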

To get a better feel for mmap, have a look at https://realpython.com/python-mmap/

Peraza answered 9/3, 2021 at 9:7 Comment(0)

One workaround: you can load the spaCy pipeline beforehand, pickle the resulting object (or serialize it any other convenient way), and store it in a database or on the file system. Each worker can then fetch the serialized object and simply deserialize it.
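
A minimal sketch of that workaround (the file path is made up). As the comment below points out, each worker still ends up with its own in-memory copy once it deserializes the object.

    import pickle
    import spacy

    PIPELINE_PATH = "/tmp/spacy_pipeline.pkl"  # hypothetical shared location

    def dump_pipeline():
        # Run once, before starting Gunicorn: load and serialize the pipeline.
        nlp = spacy.load("en")
        with open(PIPELINE_PATH, "wb") as f:
            pickle.dump(nlp, f)

    def load_pipeline():
        # Run inside each worker: deserialize instead of calling spacy.load().
        with open(PIPELINE_PATH, "rb") as f:
            return pickle.load(f)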

Moire answered 20/1, 2021 at 7:53 Comment(2)
Would pickling it in a module work? I am thinking of loading the module in a script, importing it, and then launching the workers. If I do this, will the workers pick up the import? – Caduceus
When you load the module in a script and import it, the import itself will certainly work, but you will still face high memory consumption because each worker will load it again independently. You have to pickle it separately before launching the Flask app and store the serialized file in a place accessible to the workers. – Moire

Sharing the pipeline in memory between workers may help you.

Please check gc.freeze.

I think you can just do this in your app.py (see the sketch at the end of this answer):

  1. freeze the gc
  2. load the pipeline or any other resources that are going to use a big amount of memory
  3. unfreeze the gc

and,

  • make sure your workers will not modify (directly or indirectly) any object created while the gc was frozen
  • pass the app.py to gunicorn

When the fork happens, the memory pages holding those big resources will not actually be copied by the OS, because you have made sure there are no write operations on them.

If you do not freeze the gc, those memory pages will still be written to, because the garbage collector writes to the objects' bookkeeping data (such as reference counts) when it runs. That is why freezing matters.

I only know of this approach; I haven't tried it myself.
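
A minimal sketch of those steps, assuming Python 3.7+ for gc.freeze()/gc.unfreeze() and that Gunicorn is started with --preload so app.py runs once in the master before the workers are forked:

    import gc

    import spacy
    from flask import Flask

    gc.freeze()             # 1. freeze the gc
    nlp = spacy.load("en")  # 2. load the memory-heavy resources
    gc.unfreeze()           # 3. unfreeze the gc

    app = Flask(__name__)
    # ... register routes here that only read from nlp and never modify it ...

(The comment below reports better results loading first and then freezing.)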

Astronautics answered 20/1, 2021 at 9:14 Comment(2)
By the way, gc freezing was first added in Python 3.7. – Astronautics
freeze > load > unfreeze resulted in unresponsive workers when they restarted after serving max_requests; load followed by freeze is what worked for us. – Shifty

This is an answer that works in 2021, tested on Python 3.6 and 3.9. I had the same setup as you, using Flask to deploy a spaCy NLU API. The solution was simply to append --preload to the gunicorn command, like so: gunicorn src.main:myFlaskApp --preload. This causes the fork to happen after the entire src/main.py file has been executed, and not right after myFlaskApp = Flask(__name__).
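
For illustration, a hypothetical src/main.py that matches that command (the route is made up): the pipeline is loaded at module level, so with --preload it is loaded once in the Gunicorn master and the forked workers share those pages copy-on-write.

    import spacy
    from flask import Flask, jsonify

    nlp = spacy.load("en")          # executed once in the master when --preload is used
    myFlaskApp = Flask(__name__)

    @myFlaskApp.route("/tokenize/<text>")
    def tokenize(text):
        # Workers only read from nlp, so the shared pages are not copied.
        return jsonify(tokens=[t.text for t in nlp(text)])

You would then start it with something like gunicorn src.main:myFlaskApp --preload --workers 3 (the worker count from the question).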

Meal answered 21/6, 2021 at 14:38 Comment(0)
