we have the need to export a csv-file comprising data from the model from Django admin which runs on Heroku. Therefore we created an action where we created the csv and returned it in the response. This worked fine until our client started exporting huge sets of data and we run into the 30 second timeout of the Web worker.
To circumvent this problem we thought about streaming the csv to the client instead of building it first in memory and sending it in one piece. Trigger was this piece of information:
Cedar supports long-polling and streaming responses. Your app has an initial 30 second window to respond with a single byte back to the client. After each byte sent (either recieved from > the client or sent by your application) you reset a rolling 55 second window. If no data is > sent during the 55 second window your connection will be terminated.
We therefore implemented something that looks like this to test it:
import cStringIO as StringIO
import csv, time
def csv(request):
csvfile = StringIO.StringIO()
csvwriter = csv.writer(csvfile)
def read_and_flush():
csvfile.seek(0)
data = csvfile.read()
csvfile.seek(0)
csvfile.truncate()
return data
def data():
for i in xrange(100000):
csvwriter.writerow([i,"a","b","c"])
time.sleep(1)
data = read_and_flush()
yield data
response = HttpResponse(data(), mimetype="text/csv")
response["Content-Disposition"] = "attachment; filename=test.csv"
return response
The HTTP header of the download looks like this (from FireBug):
HTTP/1.1 200 OK
Cache-Control: max-age=0
Content-Disposition: attachment; filename=jobentity-job2.csv
Content-Type: text/csv
Date: Tue, 27 Nov 2012 13:56:42 GMT
Expires: Tue, 27 Nov 2012 13:56:41 GMT
Last-Modified: Tue, 27 Nov 2012 13:56:41 GMT
Server: gunicorn/0.14.6
Vary: Cookie
Transfer-Encoding: chunked
Connection: keep-alive
"Transfer-encoding: chunked" would indicate that Cedar is actually streaming the content chunkwise we guess.
Problem is that the download of the csv is still interrupted after 30 seconds with these lines in the Heroku log:
2012-11-27T13:00:24+00:00 app[web.1]: DEBUG: exporting tasks in csv-stream for job id: 56,
2012-11-27T13:00:54+00:00 app[web.1]: 2012-11-27 13:00:54 [2] [CRITICAL] WORKER TIMEOUT (pid:5)
2012-11-27T13:00:54+00:00 heroku[router]: at=info method=POST path=/admin/jobentity/ host=myapp.herokuapp.com fwd= dyno=web.1 queue=0 wait=0ms connect=2ms service=29480ms status=200 bytes=51092
2012-11-27T13:00:54+00:00 app[web.1]: 2012-11-27 13:00:54 [2] [CRITICAL] WORKER TIMEOUT (pid:5)
2012-11-27T13:00:54+00:00 app[web.1]: 2012-11-27 13:00:54 [12] [INFO] Booting worker with pid: 12
This should work conceptually, right? Is there anything we missed?
We really appreciate your help. Tom