Loading an eventstream through Gunicorn + Flask

I'm trying to generate a large PDF using a Flask application. The PDF generation involves generating ten long PDFs and then merging them together. The application runs under Gunicorn with the flags: --worker-class gevent --workers 2.
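
For reference, the same two flags could also be written as a Gunicorn config file. This is only a sketch of an equivalent setup: the file name gunicorn.conf.py and the explicit timeout line are assumptions, not part of the original command. The timeout value shown is Gunicorn's documented default of 30 seconds.

```python
# gunicorn.conf.py (hypothetical file name; the question passes flags on the command line)

# Same as --worker-class gevent
worker_class = "gevent"

# Same as --workers 2
workers = 2

# Not set in the question; 30 seconds is Gunicorn's default worker timeout.
# A worker that does not check in with the master within this window is killed.
timeout = 30
```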

Here's what my server-side code looks like:

from flask import Flask, Response, stream_with_context

app = Flask(__name__)

@app.route('/pdf/create', methods=['POST', 'GET'])
def create_pdf():
    def generate():
        # pdfs is defined elsewhere (the sections to generate)
        for section in pdfs:
            yield "data: Generating %s pdf\n\n" % section
            # Generate pdf with pisa (takes up to 2 minutes)

        yield "data: Merging PDFs\n\n"
        # Merge pdfs (takes up to 2 minutes)
        yield "data: /user/pdf_filename.pdf\n\n"

    return Response(stream_with_context(generate()), mimetype='text/event-stream')

The client side code looks like:

var source = new EventSource(create_pdf_url);
source.onopen = function (event) {
    console.log("Creating PDF");
};
source.onmessage = function (event) {
    console.log(event.data);
};
source.onerror = function (event) {
    console.log("ERROR");
};

When I run the app without Gunicorn, I get steady, real-time updates in the console log. They look like:

Creating PDF
Generating section one
Generating section two
Generating section three
...
Generating section ten
Merging PDFS
/user/pdf_filename.pdf

When I run this code with Gunicorn, I don't get regular updates. The worker runs until Gunicorn's timeout kills it, and only then do I get a dump of all the messages that should have arrived earlier, followed by a final error:

Creating PDF
Generating section one
Generating section two
ERROR

The Gunicorn log looks like:

[2015-03-19 21:57:27 +0000] [3163] [CRITICAL] WORKER TIMEOUT (pid:3174)

How can I keep Gunicorn from killing the process? I don't think setting a very large timeout is a good idea. Perhaps there's something in Gunicorn's worker classes that I can use to make sure the process is handled correctly?

Erythrocytometer answered 20/3, 2015 at 16:18 Comment(2)
The second source.onmessage should be source.onerror, but that is probably unrelated to the solution. - Tallboy
Good catch. Edited to fix the typo. - Erythrocytometer

I ended up solving the problem using Celery.

I used this example to guide me in setting up Celery.

Then I used Grinberg's Celery tutorial to stream real-time updates to the user's browser.
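
For anyone who wants the shape of this approach without the full Celery setup, here is a minimal sketch of the pattern only. It does not use Celery: a standard-library thread stands in for the Celery worker, and an in-memory dict stands in for the result backend. The names build_pdf, start_job, job_status and the work placeholder are all hypothetical. With Celery, the job would instead report progress from a bound task via self.update_state(state='PROGRESS', meta=...), and the web side would read it back through AsyncResult(task_id).state and .info.

```python
import threading
import time
import uuid

# In-memory stand-in for a Celery result backend (hypothetical).
# It is NOT shared between Gunicorn worker processes; a real setup
# would use a broker/backend such as Redis.
_jobs = {}
_lock = threading.Lock()


def _set_status(job_id, state, message):
    with _lock:
        _jobs[job_id] = {"state": state, "message": message}


def build_pdf(job_id, sections, work=lambda name: time.sleep(0.01)):
    """Background job: stands in for the Celery task that renders and merges PDFs."""
    for name in sections:
        _set_status(job_id, "PROGRESS", "Generating %s pdf" % name)
        work(name)  # placeholder for the slow pisa call
    _set_status(job_id, "PROGRESS", "Merging PDFs")
    work("merge")  # placeholder for the slow merge step
    _set_status(job_id, "SUCCESS", "/user/pdf_filename.pdf")


def start_job(sections):
    """Start the job and return at once, so the request handler never blocks."""
    job_id = uuid.uuid4().hex
    _set_status(job_id, "PENDING", "Queued")
    thread = threading.Thread(target=build_pdf, args=(job_id, sections), daemon=True)
    thread.start()
    return job_id, thread


def job_status(job_id):
    """What a status endpoint could return for the browser to poll."""
    with _lock:
        return dict(_jobs.get(job_id, {"state": "UNKNOWN", "message": ""}))
```

The point of the design is that the web request only starts the job and returns a job id, so no Gunicorn worker is tied up for minutes. The browser then asks for job_status(job_id) at intervals until the state is SUCCESS, at which point the message holds the PDF path.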

Erythrocytometer answered 30/3, 2015 at 15:29
