Content-Length
The Content-Length header states the length of the request or response body in bytes. If you do not specify a Content-Length header, HTTP servers will implicitly add a Transfer-Encoding: chunked header instead. The Content-Length and Transfer-Encoding headers should not be used together. With chunked encoding, the receiver has no idea what the total length of the body is and cannot estimate the download completion time. If you do add a Content-Length header, make sure it matches the size of the entire body in bytes. If it is incorrect, the behaviour of receivers is undefined.
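As a minimal sketch of the "in bytes" requirement (the body text and header dictionary here are invented for illustration), the value has to be computed from the encoded bytes, not from the number of characters:

```python
# Hypothetical body text, used only for illustration.
# \u00ef and \u00e9 are non-ASCII characters that take 2 bytes each in UTF-8.
body_text = "na\u00efve caf\u00e9"

# Content-Length counts bytes on the wire, so encode before measuring.
body_bytes = body_text.encode("utf-8")

char_count = len(body_text)   # number of characters
byte_count = len(body_bytes)  # number of bytes actually sent

# Illustrative header set; a real server or framework would build this for you.
headers = {
    "Content-Type": "text/plain; charset=utf-8",
    "Content-Length": str(byte_count),
}
```

Using the character count here would understate the body by two bytes, which is exactly the kind of mismatch that leaves receivers in an undefined state.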
The Content-Length header does not allow streaming a body whose size is not known up front, but it is useful for large binary files where you want to support partial content serving. This basically means resumable downloads, paused downloads, partial downloads, and multi-homed downloads. It requires the use of an additional request header called Range. This technique is called byte serving.
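To make byte serving more concrete, here is a rough sketch of how a server might answer a single-range request. The function name serve_range and its return shape are assumptions made for this example, and multi-range requests are deliberately not handled:

```python
import re


def serve_range(body: bytes, range_header: str):
    """Return (status, headers, payload) for a simple 'bytes=start-end' Range.

    Hypothetical helper: supports 'bytes=a-b', 'bytes=a-' and 'bytes=-n' only.
    """
    total = len(body)
    full = (200, {"Content-Length": str(total)}, body)

    match = re.fullmatch(r"bytes=(\d*)-(\d*)", range_header.strip())
    if not match or (match.group(1) == "" and match.group(2) == ""):
        # Unparseable range: ignore it and serve the whole body.
        return full

    start_s, end_s = match.groups()
    if start_s == "":
        # Suffix form: the last N bytes of the body.
        start = max(total - int(end_s), 0)
        end = total - 1
    else:
        start = int(start_s)
        if end_s and int(end_s) < start:
            # An end before the start is invalid, so the Range is ignored.
            return full
        end = min(int(end_s), total - 1) if end_s else total - 1

    if start >= total or start > end:
        # Nothing in the body satisfies the requested range.
        return 416, {"Content-Range": f"bytes */{total}"}, b""

    payload = body[start:end + 1]
    headers = {
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Content-Length": str(len(payload)),
    }
    return 206, headers, payload
```

A client resuming a download at byte 7 of a 10-byte file would send Range: bytes=7- and receive a 206 response carrying only the last three bytes. Note that Content-Length then describes the partial payload, while Content-Range reports the full size.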
Transfer-Encoding
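The use of Transfer-Encoding: chunked is what allows streaming within a single request or response. The data is transmitted as a series of chunks, each prefixed with its own size, and a zero-length chunk marks the end of the body. This only affects how the message is transferred and does not impact the representation of the content.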
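The wire format of chunked encoding is simple enough to sketch by hand. The helper names encode_chunked and decode_chunked below are made up for this example, and the decoder ignores trailers and assumes the whole message is already in memory:

```python
from typing import Iterable, Iterator


def encode_chunked(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Frame each part as: hex size, CRLF, data, CRLF."""
    for part in parts:
        if not part:
            # A zero-length chunk would terminate the body early, so skip it.
            continue
        yield f"{len(part):X}".encode("ascii") + b"\r\n" + part + b"\r\n"
    # The zero-length chunk marks the end of the body.
    yield b"0\r\n\r\n"


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body from a complete chunked message (no trailers)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Drop any chunk extensions after ';' before parsing the hex size.
        size = int(data[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            return bytes(body)
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
```

Because the sender only needs to know the size of each chunk, it can start transmitting before the total length of the body is known.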
Officially, an HTTP client is meant to send a request with a TE header field that specifies which transfer encodings it is willing to accept. This header is not always sent, but most servers assume that clients can process the chunked encoding, which HTTP/1.1 requires every recipient to understand.
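The chunked transfer encoding makes better use of persistent TCP connections, which HTTP/1.1 assumes by default. Because the final zero-length chunk marks the end of the body, the sender does not have to close the connection to signal that the message is complete.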
Content-Encoding
It is also possible to compress chunked or non-chunked data. In practice, this is done via the Content-Encoding header.
Note that the Content-Length is equal to the length of the body after the Content-Encoding has been applied. This means that if you have gzipped your response, the length is calculated after compression. You will need to be able to load the entire compressed body in memory if you want to calculate the length (unless you have that information elsewhere).
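A short sketch of that ordering, using an invented repetitive payload: compress first, then measure the compressed bytes for Content-Length.

```python
import gzip

# Hypothetical payload, repetitive so that it compresses well.
body = b"<p>hello</p>" * 100

# Apply the content encoding first...
compressed = gzip.compress(body)

# ...then measure the encoded bytes, since that is what goes on the wire.
headers = {
    "Content-Type": "text/html",
    "Content-Encoding": "gzip",
    "Content-Length": str(len(compressed)),
}
```

Note that gzip.compress needs the whole body up front, which is the buffering cost described above.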
When streaming using chunked encoding, the compression algorithm must also support online processing. Thankfully, gzip supports stream compression. I believe that the content gets compressed first and then cut up into chunks. That way, the chunks are received and reassembled, then decompressed to recover the real content. If it were the other way around, you would receive a compressed stream, and decompressing it would give you chunks, which doesn't make sense.
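The compress-then-chunk ordering can be sketched with Python's zlib module, which supports incremental gzip output when wbits is set to 31. The function names frame, gzip_then_chunk and unchunk_then_gunzip are assumptions for this example, and the receiver side assumes the whole message is already buffered:

```python
import zlib
from typing import Iterable, Iterator


def frame(data: bytes) -> bytes:
    """Wrap data in chunked framing: hex size, CRLF, data, CRLF."""
    return f"{len(data):X}".encode("ascii") + b"\r\n" + data + b"\r\n"


def gzip_then_chunk(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Sender side: compress incrementally, then frame whatever comes out."""
    compressor = zlib.compressobj(wbits=31)  # wbits=31 selects the gzip format
    for part in parts:
        data = compressor.compress(part)
        if data:  # the compressor may buffer input and emit nothing yet
            yield frame(data)
    tail = compressor.flush()
    if tail:
        yield frame(tail)
    yield b"0\r\n\r\n"  # the zero-length chunk ends the body


def unchunk_then_gunzip(wire: bytes) -> bytes:
    """Receiver side: strip the chunk framing, then decompress the stream."""
    decompressor = zlib.decompressobj(wbits=31)
    out = bytearray()
    pos = 0
    while True:
        line_end = wire.index(b"\r\n", pos)
        size = int(wire[pos:line_end], 16)
        pos = line_end + 2
        if size == 0:
            break
        out += decompressor.decompress(wire[pos:pos + size])
        pos += size + 2  # skip the chunk data and its trailing CRLF
    out += decompressor.flush()
    return bytes(out)
```

The chunk boundaries bear no relation to the boundaries of the original parts. They only reflect when the compressor happened to produce output.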
A typical compressed stream response may have these headers:
Content-Type: text/html
Content-Encoding: gzip
Transfer-Encoding: chunked
Semantically, the use of Content-Encoding indicates an "end to end" encoding scheme, which means only the final client or final server is supposed to decode the content. Proxies in the middle are not supposed to decode the content.
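If you want to allow proxies in the middle to decode the content, the correct header to use is in fact the Transfer-Encoding header. If the HTTP request carries a TE: gzip header, then it is legal to respond with Transfer-Encoding: gzip, chunked. The codings are listed in the order they were applied, with chunked last. The chunked coding does not need to be listed in TE because it is always acceptable in HTTP/1.1.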
However, this is very rarely supported, so for now you should only use Content-Encoding for your compression.
Chunked vs Store & Forward