okhttp 3: how to decompress gzip/deflate response manually using Java/Android
B

4

10

I know that the okhttp3 library adds the header Accept-Encoding: gzip by default and decodes the response automatically for us.

The problem is that I'm dealing with a host that only accepts a header like Accept-Encoding: gzip, deflate; if I don't add the deflate part, the request fails. But when I add that header to the okhttp client manually, the library no longer decompresses the response for me.

I've tried several ways of taking the response and decompressing it manually, but I always end up with an exception, e.g. java.util.zip.ZipException: Not in GZIP format. Here's what I've tried so far:

//decompressor
public static String decompressGZIP(InputStream inputStream) throws IOException
{
    InputStream bodyStream = new GZIPInputStream(inputStream);
    ByteArrayOutputStream outStream = new ByteArrayOutputStream();
    byte[] buffer = new byte[4096];
    int length;
    // read() returns -1 at end of stream
    while ((length = bodyStream.read(buffer)) != -1)
    {
        outStream.write(buffer, 0, length);
    }

    // decode explicitly instead of relying on the platform default charset (UTF-8 assumed here)
    return new String(outStream.toByteArray(), "UTF-8");
}


//run scraper
scrape(api, new Callback()
{
    // Something went wrong
    @Override
    public void onFailure(@NonNull Call call, @NonNull IOException e)
    {
    }

    @Override
    public void onResponse(@NonNull Call call, @NonNull Response response) throws IOException
    {
        // close the response on every path, not only on success
        try
        {
            if (response.isSuccessful())
            {
                InputStream responseBodyBytes = response.body().byteStream();
                String htmlResponse = decompressGZIP(responseBodyBytes);
            }
        }
        catch (ProtocolException e)
        {
            // swallowed here; log it in real code
        }
        finally
        {
            response.close();
        }
    }
});



private Call scrape(Map<?, ?> api, Callback callback)
{
    MediaType JSON = MediaType.parse("application/json; charset=utf-8");
    String method = (String) api.get("method");
    String url = (String) api.get("url");
    Request.Builder requestBuilder = new Request.Builder().url(url);
    RequestBody requestBody;

    requestBuilder.header("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0");
    requestBuilder.header("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
    requestBuilder.header("Accept-Language", "en-US,en;q=0.5");
    requestBuilder.header("Accept-Encoding", "gzip, deflate");
    requestBuilder.header("Connection", "keep-alive");
    requestBuilder.header("Upgrade-Insecure-Requests", "1");
    requestBuilder.header("Cache-Control", "max-age=0");

    Request request = requestBuilder.build();

    Call call = client.newCall(request);
    call.enqueue(callback);

    return call;
}

Just a note: the response headers always include Content-Encoding: gzip and Transfer-Encoding: chunked.

One more thing: I've also tried the solution in this topic, and it still fails with D/OkHttp: java.io.IOException: ID1ID2: actual 0x00003c68 != expected 0x00001f8b.

Any help would be appreciated.

Backspace answered 17/8, 2018 at 18:51 Comment(0)
B
25

After six hours of digging I found the solution, and as usual it was simpler than I thought. I was trying to decompress a page that wasn't gzipped, which is why it failed. Once I hit the second page (which is compressed), I get a gzipped response that the code above can handle. If anyone wants the solution: I used a modified interceptor, just like the one in this answer, so you don't need a custom function to handle the decompression.

I modified the unzip method so that the okhttp interceptor works with both compressed and uncompressed responses:

OkHttpClient.Builder clientBuilder = new OkHttpClient.Builder().addInterceptor(new UnzippingInterceptor());
OkHttpClient client = clientBuilder.build();

And the interceptor looks like this:

private class UnzippingInterceptor implements Interceptor {
    @Override
    public Response intercept(Chain chain) throws IOException {
        Response response = chain.proceed(chain.request());
        return unzip(response);
    }

    // copied from okhttp3.internal.http.HttpEngine (because it is private)
    private Response unzip(final Response response) throws IOException {
        if (response.body() == null) {
            return response;
        }

        // check whether we have a gzip response
        String contentEncoding = response.header("Content-Encoding");

        // only gzipped responses are decompressed; everything else passes through
        if (!"gzip".equalsIgnoreCase(contentEncoding)) {
            return response;
        }

        GzipSource responseBody = new GzipSource(response.body().source());
        // these headers describe the compressed body, so drop them
        Headers strippedHeaders = response.headers().newBuilder()
                .removeAll("Content-Encoding")
                .removeAll("Content-Length")
                .build();
        // header() returns null when Content-Type is absent, avoiding an NPE
        String contentType = response.header("Content-Type");
        // the decompressed length is unknown, hence -1
        return response.newBuilder()
                .headers(strippedHeaders)
                .body(new RealResponseBody(contentType, -1L, Okio.buffer(responseBody)))
                .build();
    }
}
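
The error in the question hints at the same cause: the two bytes 0x3c 0x68 are the ASCII characters "<h", which is what a plain HTML body would start with, whereas a gzip stream starts with 0x1f 0x8b. As a minimal sketch that relies only on the JDK, the body can be checked for those magic bytes before deciding whether to gunzip it. The class name GzipSniffer and its methods are hypothetical, and UTF-8 is assumed for the text:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper: reads a body that may or may not be gzip-compressed. */
final class GzipSniffer {
    private GzipSniffer() {}

    /** Reads the stream as UTF-8 text, gunzipping only if the gzip magic bytes are present. */
    static String readMaybeGzipped(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 2);
        byte[] magic = new byte[2];
        int n = 0;
        // Peek at up to two bytes; a short or empty body is treated as uncompressed.
        while (n < 2) {
            int r = pushback.read(magic, n, 2 - n);
            if (r == -1) {
                break;
            }
            n += r;
        }
        if (n > 0) {
            pushback.unread(magic, 0, n); // put the peeked bytes back
        }
        boolean gzipped = n == 2
                && (magic[0] & 0xff) == 0x1f
                && (magic[1] & 0xff) == 0x8b;

        InputStream body = gzipped ? new GZIPInputStream(pushback) : pushback;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int length;
        while ((length = body.read(buffer)) != -1) {
            out.write(buffer, 0, length);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Demo-only helper that gzips a string. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body>hello</body></html>";
        byte[] plain = html.getBytes(StandardCharsets.UTF_8);
        // Both calls yield the same text, compressed or not.
        System.out.println(readMaybeGzipped(new ByteArrayInputStream(plain)));
        System.out.println(readMaybeGzipped(new ByteArrayInputStream(gzip(html))));
    }
}
```

With okhttp, the input would presumably be response.body().byteStream(). Checking the Content-Encoding header, as the interceptor does, is the more conventional route; sniffing is only a fallback for servers whose headers cannot be trusted.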
Backspace answered 17/8, 2018 at 21:2 Comment(2)
I hate it when people don't include import statements. - Crowe
Thanks, it works. The imports are: import okhttp3.Headers; import okhttp3.Interceptor; import okhttp3.OkHttpClient; import okhttp3.Response; import okhttp3.internal.http.RealResponseBody; import okio.GzipSource; import okio.Okio; import org.jetbrains.annotations.NotNull; - Pepi
U
1

Version 4.10.0 can already do it automatically if your header contains gzip

Unrivalled answered 28/9, 2022 at 12:6 Comment(0)
P
0

This happens because okhttp does not support deflate: its transparent decompression only checks for gzip.

See BridgeInterceptor.java or BridgeInterceptor.kt:

    if (transparentGzip &&
        "gzip".equals(networkResponse.header("Content-Encoding"), ignoreCase = true) &&
        networkResponse.promisesBody()) {
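
If the server really does answer with deflate, the body has to be decoded by hand. The following is a rough sketch using only java.util.zip; the ContentDecoder class and its method names are hypothetical and not part of okhttp. It assumes that a "deflate" body is zlib-wrapped, which is what the HTTP specification describes, although some servers are known to send raw deflate instead:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical helper, not part of okhttp: picks a decoder from the Content-Encoding value. */
final class ContentDecoder {
    private ContentDecoder() {}

    /** Wraps the raw body stream according to the given Content-Encoding (may be null). */
    static InputStream decode(String contentEncoding, InputStream raw) throws IOException {
        if (contentEncoding == null) {
            return raw; // no encoding declared: pass through
        }
        switch (contentEncoding.trim().toLowerCase(Locale.ROOT)) {
            case "gzip":
                return new GZIPInputStream(raw);
            case "deflate":
                // Assumes zlib-wrapped data. A server sending raw deflate would
                // need new InflaterInputStream(raw, new Inflater(true)) instead.
                return new InflaterInputStream(raw);
            default:
                return raw; // unknown encodings are left untouched in this sketch
        }
    }

    /** Reads the whole stream and decodes it as UTF-8 (an assumption about the payload). */
    static String readUtf8(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int length;
        while ((length = in.read(buffer)) != -1) {
            out.write(buffer, 0, length);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String text = "hello deflate";
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        // DeflaterOutputStream writes zlib-wrapped data by default.
        try (DeflaterOutputStream out = new DeflaterOutputStream(compressed)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        InputStream body = decode("deflate", new ByteArrayInputStream(compressed.toByteArray()));
        System.out.println(readUtf8(body));
    }
}
```

With okhttp, the arguments would presumably be response.header("Content-Encoding") and response.body().byteStream(), since the library leaves the body untouched once Accept-Encoding has been set by hand.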
Polydactyl answered 4/10, 2022 at 10:4 Comment(0)
S
0

Thank you for Aksenov Vladimir's reply. Your answer saved me a lot of time. Everything is working fine after I upgraded okhttp from 3.x to 4.11.

Here are some additional details:

  1. When users explicitly include the "Accept-Encoding: gzip" header, they need to handle the decompression of the response content themselves.
  2. When users specify neither "Accept-Encoding" nor "Range", okhttp will automatically add "Accept-Encoding: gzip" to the request header, and automatically decompress the response content (if "Content-Encoding" is gzip).

The relevant code, from okhttp3.internal.http.BridgeInterceptor, is as follows:

    // If we add an "Accept-Encoding: gzip" header field we're responsible for also decompressing
    // the transfer stream.
    var transparentGzip = false
    if (userRequest.header("Accept-Encoding") == null && userRequest.header("Range") == null) {
      transparentGzip = true
      requestBuilder.header("Accept-Encoding", "gzip")
    }

    if (transparentGzip &&
        "gzip".equals(networkResponse.header("Content-Encoding"), ignoreCase = true) &&
        networkResponse.promisesBody()) {
      val responseBody = networkResponse.body
      if (responseBody != null) {
        val gzipSource = GzipSource(responseBody.source())
        val strippedHeaders = networkResponse.headers.newBuilder()
            .removeAll("Content-Encoding")
            .removeAll("Content-Length")
            .build()
        responseBuilder.headers(strippedHeaders)
        val contentType = networkResponse.header("Content-Type")
        responseBuilder.body(RealResponseBody(contentType, -1L, gzipSource.buffer()))
      }
    }

Sateia answered 27/3 at 13:9 Comment(0)
