Tomcat gzip while chunked issue

I'm experiencing a problem with one of my data source services. According to its HTTP response headers, it runs on Apache-Coyote/1.1. The server sends responses with Transfer-Encoding: chunked. Here is a sample response:

HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: text/xml;charset=utf-8
Transfer-Encoding: chunked
Content-Encoding: gzip
Date: Tue, 30 Mar 2010 06:13:52 GMT

The problem is that when I ask the server to send a gzipped response, it often sends an incomplete one. I receive the response and see that the last chunk has arrived, but after ungzipping I find that the content is partial. I have never seen this behavior with gzip turned off in the request headers.

So my question is: is this a common Tomcat issue? Maybe it's caused by one of its modules that does the compression? Or maybe it's some kind of proxy issue? I can't tell which Tomcat version or gzip module they use, but feel free to ask and I'll try to ask my service provider.

Thanks.

Doyle answered 7/4, 2010 at 3:38 Comment(3)
What client/library are you using to make the request?Rabassa
Can you post your request headers?Rabassa
I'm using my own partial HTTP implementation. As I said, it works well without gzip encoding and in most cases works fine for gzipped responses, but about 30% of gzipped responses are garbage after decompression! My request looks like this:
POST example.com/Service HTTP/1.1
Content-Length: 1081
Content-Encoding: gzip
Accept-Encoding: gzip
Host: example.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 1.0.3705)
Authorization: Basic UENDN0IySjpTb3KxdWE3YjJq
SOAPAction: example.com/Service
// and here goes my compressed request..Doyle

The content length of a gzipped response is not known in advance, and it is potentially expensive and slow to compress the whole response in memory first, calculate its length, and only then stream it. For that reason a typical web server sends gzipped responses in chunks, using Transfer-Encoding: chunked without a Content-Length header.
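To illustrate why the server cannot state a Content-Length up front, here is a minimal Python sketch of streaming gzip output as chunks. The function name gzip_chunked is made up for illustration and this is not how Tomcat implements it; it only shows that each compressed piece can be framed and sent before the total size is known.

```python
import zlib

def gzip_chunked(pieces):
    """Yield an HTTP/1.1 chunked body carrying the gzip of `pieces` (hypothetical helper)."""
    # wbits=31 (16 + 15) makes zlib write a gzip header and trailer.
    compressor = zlib.compressobj(wbits=31)
    for piece in pieces:
        data = compressor.compress(piece)
        if data:  # the compressor may buffer input and emit nothing yet
            yield b"%x\r\n%s\r\n" % (len(data), data)
    # flush() emits whatever is still buffered plus the gzip trailer.
    data = compressor.flush()
    if data:
        yield b"%x\r\n%s\r\n" % (len(data), data)
    # A zero-length chunk marks the end of the body (no trailer fields here).
    yield b"0\r\n\r\n"
```

Note that the chunk boundaries depend on when the compressor happens to emit data, so they have no relation to the structure of the gzip stream itself.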

Since it's a homegrown HTTP client, it sounds as if it doesn't handle chunked responses correctly. You have to check the Transfer-Encoding response header, and if it equals chunked, parse the body as a chunked stream.

You can learn from the aforementioned HTTP spec links and from Wikipedia how to parse a chunked stream. Each chunk consists of a header line giving the chunk length in hexadecimal, then a CRLF, then the actual chunk content, then a CRLF. This repeats until a chunk whose header gives a length of 0. The gzip stream spans chunk boundaries, so you need to strip the chunk framing, glue the chunk contents together, and only then ungzip the result as a whole.
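As a rough illustration of that parsing, here is a minimal Python sketch. The function names dechunk and decode_response_body are hypothetical, and the sketch assumes the whole response body is already in memory and that any trailer fields after the last chunk can be ignored. Because the gzip stream spans chunk boundaries, it strips the framing first and decompresses the joined payload once.

```python
import gzip

def dechunk(body: bytes) -> bytes:
    """Strip HTTP/1.1 chunked framing and return the raw payload (hypothetical helper)."""
    payload = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)  # raises ValueError if the size line is cut off
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size_text = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_text.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # last chunk; trailer fields, if any, are ignored in this sketch
        chunk = body[pos:pos + size]
        # Each chunk must be complete and followed by its own CRLF.
        if len(chunk) != size or body[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("truncated or malformed chunk")
        payload += chunk
        pos += size + 2
    return bytes(payload)

def decode_response_body(body: bytes) -> bytes:
    """Remove the chunked framing, then gunzip the concatenated payload once."""
    return gzip.decompress(dechunk(body))
```

A client that reads from a socket would do the same thing incrementally, but the order stays the same: remove the transfer encoding first, then the content encoding.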

To save all the tedious coding work (likely also for the rest of your homegrown HTTP client), I strongly recommend having a look at Apache HttpComponents Client.

Melise answered 7/4, 2010 at 22:11 Comment(3)
It works perfectly with other sites, and with this service if I turn gzipping off. I actually installed Tomcat myself on my work machine and it sometimes fails to deliver content too. I would be happy to think that it's my issue, but if I use a .NET wrapper to call methods of this service (not my HTTP implementation), it too sometimes fails to get the full XML response, just like my client. Are you familiar with Tomcat?Doyle
How are you sure that the problem is in Tomcat and not in the server-side application running on Tomcat? If we look in this direction for the cause, I would check whether any Java (servlet) code manually gzips the output using GZIPOutputStream and, if so, whether it properly invokes close() on the output stream.Melise
Actually, it really was MY code doing the chunk removal :) Strange that I didn't experience this with web servers other than Tomcat! I will look further to find the difference.Doyle
