I'm trying to generate a large PDF using a Flask application. The PDF generation involves generating ten long PDFs and then merging them together. The application runs under Gunicorn with the flags: --worker-class gevent --workers 2.
Here's what my server-side code looks like:
@app.route('/pdf/create', methods=['POST', 'GET'])
def create_pdf():
    def generate():
        for section in pdfs:
            yield "data: Generating %s pdf\n\n" % section
            # Generate pdf with pisa (takes up to 2 minutes)
        yield "data: Merging PDFs\n\n"
        # Merge pdfs (takes up to 2 minutes)
        yield "data: /user/pdf_filename.pdf\n\n"
    return Response(stream_with_context(generate()), mimetype='text/event-stream')
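To make the message framing easier to reproduce outside the app, here is a minimal sketch of the same generator with Flask and pisa removed. The helper sse_message and the build_section and merge_sections callables are hypothetical stand-ins for the real rendering and merge steps, not part of my actual code; the generator would still be wrapped in Response(stream_with_context(...)) as above.

```python
def sse_message(data):
    """Format one Server-Sent Events message: a 'data:' line ended by a blank line."""
    return "data: %s\n\n" % data


def generate(sections, build_section, merge_sections):
    """Yield one progress message before each slow step.

    build_section and merge_sections are hypothetical callables standing in
    for the pisa rendering and the PDF merge; each may take minutes to run.
    """
    built = []
    for section in sections:
        yield sse_message("Generating %s pdf" % section)
        built.append(build_section(section))  # slow step (pisa in the real app)
    yield sse_message("Merging PDFs")
    # The final message carries the path of the merged file.
    yield sse_message(merge_sections(built))


if __name__ == "__main__":
    # Stub callables so the sketch runs without any PDF library.
    for message in generate(["one", "two"],
                            lambda name: name,
                            lambda parts: "/user/pdf_filename.pdf"):
        print(repr(message))
```

Each yielded string is one complete event, so the browser's onmessage handler should fire once per yield as long as the server flushes the chunks as they are produced.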
The client-side code looks like:
var source = new EventSource(create_pdf_url);
source.onopen = function (event) {
    console.log("Creating PDF");
};
source.onmessage = function (event) {
    console.log(event.data);
};
source.onerror = function (event) {
    console.log("ERROR");
};
When I run without Gunicorn, I get steady, real-time updates in the console log. They look like:
Creating PDF
Generating section one
Generating section two
Generating section three
...
Generating section ten
Merging PDFs
/user/pdf_filename.pdf
When I run this code with Gunicorn, I don't get regular updates. The worker runs until Gunicorn's timeout kills it, and only then do I get a dump of all the messages that should have arrived earlier, followed by a final error:
Creating PDF
Generating section one
Generating section two
ERROR
The Gunicorn log looks like:
[2015-03-19 21:57:27 +0000] [3163] [CRITICAL] WORKER TIMEOUT (pid:3174)
How can I keep Gunicorn from killing the process? I don't think setting a very large timeout is a good idea. Perhaps there's something in Gunicorn's worker classes that I can use to make sure the process is handled correctly?
source.onmessage should be source.onerror, but probably unrelated to solution. – Tallboy