How to decode/inflate a chunked gzip string?
After making a request for gzip/deflate-encoded content in PHP, I receive the deflated string in chunks, each preceded by a hexadecimal offset line. It looks like the following:

Example shortened greatly to show format:

00001B4E
¾”kŒj…Øæ’ìÑ«F1ìÊ`+ƒQì¹UÜjùJƒZ\µy¡ÓUžGr‡J&=KLËÙÍ~=ÍkR
0000102F
ñÞœÞôΑüo[¾”+’Ñ8#à»0±R-4VÕ’n›êˆÍ.MCŽ…ÏÖr¿3M—èßñ°r¡\+
00000000

I'm unable to inflate that, presumably because of the chunked format. I can confirm the data is not corrupt: after manually removing the offsets with a hex editor, the gzip archive reads fine. Is there a proper method to parse this chunked, gzip-deflated response into a readable string?

I might be able to split these offsets and join the data together in one string to call gzinflate, but it seems there must be an easier way.

Ratchford answered 3/4, 2012 at 12:50 Comment(0)
The proper method to decode a chunked response is roughly as follows:

initialise string to hold result
for each chunk {
  check that the stated chunk length equals the string length of the chunk
  append the chunk data to the result variable
}

Here's a handy PHP function to do that for you (FIXED):

function unchunk_string ($str) {

  // A string to hold the result
  $result = '';

  // Split input by CRLF
  $parts = explode("\r\n", $str);

  // These vars track the current chunk
  $chunkLen = 0;
  $thisChunk = '';

  // Loop the data
  while (($part = array_shift($parts)) !== NULL) {
    if ($chunkLen) {
      // Add the data to the string
      // Don't forget, the data might contain a literal CRLF
      $thisChunk .= $part."\r\n";
      if (strlen($thisChunk) == $chunkLen) {
        // Chunk is complete
        $result .= $thisChunk;
        $chunkLen = 0;
        $thisChunk = '';
      } else if (strlen($thisChunk) == $chunkLen + 2) {
        // Chunk is complete, remove trailing CRLF
        $result .= substr($thisChunk, 0, -2);
        $chunkLen = 0;
        $thisChunk = '';
      } else if (strlen($thisChunk) > $chunkLen) {
        // Data is malformed
        return FALSE;
      }
    } else {
      // If we are not in a chunk, get length of the new one
      if ($part === '') continue;
      if (!$chunkLen = hexdec($part)) break;
    }
  }

  // Return the decoded data, or FALSE if it is incomplete
  return ($chunkLen) ? FALSE : $result;

}
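
For comparison, here is a minimal sketch of the same idea that walks the body by byte offset instead of splitting on CRLF, so binary data containing a literal CRLF needs no special casing. It is not the function above: the name unchunk_by_offset is hypothetical, it assumes PHP 7 or later, and it assumes $body holds only the chunked body (headers already removed). Trailers after the final zero-length chunk are ignored.

```php
<?php
// Hypothetical helper: decode an HTTP chunked body by byte offset.
// Returns the decoded string, or false if the body is malformed or truncated.
function unchunk_by_offset(string $body)
{
    $result = '';
    $pos = 0;
    $len = strlen($body);

    while ($pos < $len) {
        // The chunk-size line ends at the next CRLF
        $eol = strpos($body, "\r\n", $pos);
        if ($eol === false) {
            return false;
        }

        // Drop any chunk extension (";name=value") before parsing the hex size
        $sizeLine = substr($body, $pos, $eol - $pos);
        $sizeField = trim(explode(';', $sizeLine, 2)[0]);
        if ($sizeField === '' || !ctype_xdigit($sizeField)) {
            return false;
        }
        $size = hexdec($sizeField);
        $pos = $eol + 2;

        // A zero-length chunk marks the end of the body
        if ($size === 0) {
            return $result;
        }

        // The chunk data plus its trailing CRLF must fit in what is left
        if ($pos + $size + 2 > $len) {
            return false;
        }
        $result .= substr($body, $pos, $size);
        if (substr($body, $pos + $size, 2) !== "\r\n") {
            return false;
        }
        $pos += $size + 2;
    }

    // Ran out of data before seeing the zero-length chunk
    return false;
}

// Usage: the first chunk's data ends with a literal CRLF
$chunked = "7\r\nHello\r\n\r\n5\r\nWorld\r\n0\r\n\r\n";
var_dump(unchunk_by_offset($chunked) === "Hello\r\nWorld");
```

Reading exactly as many bytes as the size line states avoids having to guess whether a CRLF belongs to the data or to the chunk framing.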
Vascular answered 3/4, 2012 at 13:11 Comment(3)
Excellent, works just as expected. That is a handy PHP function indeed, I've been seeking this for a while now. Many thanks! - Ratchford
@Ratchford I have updated the above function; it had an error in its behaviour when the string contains a literal CRLF. This has now been fixed, which also gives better detection of malformed strings. - Vascular
Thanks again! For anyone still having problems, after calling unchunk_string all I need to do is remove the first 10 bytes using: $data = gzinflate(substr($data,10)); - Ratchford
To decode a string, use gzinflate. The Zend_Http_Client library helps with this kind of common task and is easy to use. Refer to the Zend_Http_Response code if you need to do it on your own.

Witting answered 3/4, 2012 at 12:55 Comment(1)
Unfortunately I already tried the method that lib uses, but it does contain some code I might need in the future, thanks! - Ratchford
The solution from user @user1309276 really helped me! I received a gzip-compressed JSON response from the server with a Transfer-Encoding: chunked header. None of the other solutions helped, but this one works like magic for me: it just removes the first 10 bytes.

$data = json_decode(gzinflate(substr($response->getContent(), 10)), true);
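
The reason skipping 10 bytes works is that a gzip stream begins with a fixed 10-byte header, while gzinflate expects raw DEFLATE data with no header. The sketch below illustrates this with a locally generated payload, assuming the zlib extension is loaded and PHP 5.4 or later (for gzdecode). The manual slice is only valid when the header's flag byte is zero, meaning no optional fields such as a file name follow the fixed header; gzdecode parses the header and trailer itself and is probably the safer choice.

```php
<?php
// Stand-in for an already unchunked gzip response body (assumption:
// a real body would come from the HTTP client instead of gzencode).
$json = '{"status":"ok","items":[1,2,3]}';
$gz = gzencode($json);

// A gzip stream starts with the magic bytes 1f 8b, followed by
// compression method 08 (DEFLATE)
$isGzip = strncmp($gz, "\x1f\x8b\x08", 3) === 0;

// Option 1: let gzdecode() handle the gzip header and trailer
$viaGzdecode = gzdecode($gz);

// Option 2: strip the fixed 10-byte header and the 8-byte trailer
// (CRC32 + size) by hand, then inflate the raw DEFLATE data.
// Only valid when the flag byte at offset 3 is zero.
$flags = ord($gz[3]);
$viaInflate = ($flags === 0) ? gzinflate(substr($gz, 10, -8)) : false;

var_dump($isGzip, $viaGzdecode === $json, $viaInflate === $json);
```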
Recycle answered 21/12, 2022 at 15:31 Comment(0)