I've written an HTTP server that produces endless HTTP streams of JSON-structured events, similar to Twitter's streaming API. These events are separated by \n (according to Server-sent events with Content-Type: text/event-stream) and can vary in length.
The response is
- chunked (HTTP 1.1 Transfer-Encoding:chunked) due to the endless stream
- compressed (Content-Encoding: gzip) to save bandwidth.
I want to consume these lines in Python as soon as they arrive and as resource-efficiently as possible, without reinventing the wheel.
As I'm currently using python-requests, do you know how to make it work? If you think python-requests cannot help here, I'm totally open to alternative frameworks/libraries.
My current implementation is based on requests and uses iter_lines(...) to receive the lines. But the chunk_size parameter is tricky. If set to 1, it is very CPU-intensive, since some events can be several kilobytes. If set to any value above 1, some events get stuck until the next ones arrive and the whole buffer "gets filled". And the time between events can be several seconds.
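For reference, my consumer looks roughly like the sketch below. The helper name consume_events, the function main and the URL are placeholders I made up for this post; the only requests calls involved are requests.get(..., stream=True) and Response.iter_lines(chunk_size=...).

```python
import json


def consume_events(response, chunk_size=1):
    """Yield one decoded JSON event per non-empty line of a streaming response.

    `response` is expected to behave like a requests.Response that was
    opened with stream=True; `chunk_size` is passed straight to iter_lines().
    """
    for line in response.iter_lines(chunk_size=chunk_size):
        if not line:
            # Skip blank lines between events.
            continue
        yield json.loads(line)


def main():
    # Third-party import kept local so the helper above has no dependencies.
    import requests

    url = "http://localhost:8080/events"  # hypothetical endpoint
    with requests.get(url, stream=True) as response:
        # chunk_size=1 delivers events promptly but burns CPU;
        # larger values delay events until the buffer is full.
        for event in consume_events(response, chunk_size=1):
            print(event)
```

Calling main() against the running server shows the behaviour described above: with chunk_size=1 every event is printed immediately, with a larger value events show up late.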
I expected chunk_size to be some sort of "maximum number of bytes to receive", as in Unix's recv(...). The corresponding man page says:
The receive calls normally return any data available, up to the requested amount, rather than waiting for receipt of the full amount requested.
But this is obviously not how it works in the requests library. It uses chunk_size more or less as an "exact number of bytes to receive". While looking at the source code, I couldn't identify which part is responsible for that. Maybe httplib's Response or ssl's SSLSocket.
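To illustrate the recv semantics I had in mind, here is a minimal sketch using a local socket pair instead of a real HTTP connection (the function name recv_demo is just for this example). Only 5 bytes are sent, yet a recv with a much larger limit returns right away with what is available.

```python
import socket


def recv_demo():
    """Send 5 bytes over a connected socket pair and read with a 4096-byte limit."""
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(b"hello")
        # recv() treats 4096 as an upper bound: it returns the bytes that
        # are already available instead of waiting for 4096 bytes.
        return receiver.recv(4096)
    finally:
        sender.close()
        receiver.close()


received = recv_demo()
print(len(received), "bytes received with a 4096-byte limit")
```

This is the behaviour I would like to get from chunk_size: a maximum, not an exact amount.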
As a workaround, I tried padding my lines on the server to a multiple of the chunk size. But the requests library uses the chunk size to fetch bytes from the compressed response stream. So this won't work unless I can pad my lines so that their compressed byte sequence is a multiple of the chunk size. But this seems far too hacky.
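The following sketch shows why the padding does not help. It assumes the server runs one gzip compressor over the whole stream and does a sync flush after every event, which may differ from what a particular server actually does; CHUNK_SIZE, padded_event and compressed_sizes are names invented for this example.

```python
import json
import zlib

CHUNK_SIZE = 512  # hypothetical chunk_size used by the client


def padded_event(i):
    """Build one JSON line padded with spaces to exactly CHUNK_SIZE bytes."""
    line = json.dumps({"id": i, "text": "event %d" % i})
    return (line.ljust(CHUNK_SIZE - 1) + "\n").encode("ascii")


def compressed_sizes(count=5):
    """Return (plain_length, compressed_length) for each padded event."""
    # wbits = 16 + MAX_WBITS selects the gzip container format.
    compressor = zlib.compressobj(wbits=16 + zlib.MAX_WBITS)
    sizes = []
    for i in range(count):
        plain = padded_event(i)
        # Assumption: the server flushes after each event so it is sent at once.
        out = compressor.compress(plain) + compressor.flush(zlib.Z_SYNC_FLUSH)
        sizes.append((len(plain), len(out)))
    return sizes


for plain_len, compressed_len in compressed_sizes():
    print(plain_len, "plain bytes ->", compressed_len, "compressed bytes")
```

Every plain line is exactly CHUNK_SIZE bytes long, but the compressed output per event is much shorter and varies, so a client reading CHUNK_SIZE compressed bytes at a time still waits for several events.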
I've read that Twisted could be used for non-blocking, non-buffered processing of HTTP streams on the client, but I only found code for creating stream responses on the server.