Is it necessary to use multiple gzip members for input larger than 4GB?

By stating

  Features:

    • no 4GB limit

  ...

  Idzip just uses multiple gzip members to have no file size limit.

the author of idzip seems to imply that multiple gzip members are necessary to support data > 4GB.

But the deflate algorithm, whose output a gzip member merely wraps in a header and footer, evidently supports more than 4GB of input.

So is it really necessary to use multiple gzip members to compress more than 4GB of data?

Carbonization answered 6/3, 2015 at 2:29 Comment(0)

Even .NET's GZipStream, which does not support multiple members (contrary to the spec, btw), nevertheless supports gzip files with more than 4GB, now that (since .NET 4.0) the underlying DeflateStream supports it.

So that would seal it: Multiple gzip members are NOT necessary for input greater than 4GB.
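To make that concrete, here is a minimal Python sketch (not idzip's or .NET's code) that streams input of any length into a single gzip member. The helper name `gzip_single_member` is made up for illustration, and it relies on zlib's `wbits=31` setting to emit the gzip header and trailer. Nothing in it tracks or caps the total input size.

```python
import gzip
import zlib


def gzip_single_member(chunks, level=6):
    """Yield the bytes of ONE gzip member built from an iterable of byte chunks.

    Hypothetical helper for illustration only.
    """
    # wbits=31 (16 + 15) makes zlib write a gzip header and trailer
    # around the raw deflate stream instead of a zlib wrapper.
    comp = zlib.compressobj(level=level, wbits=31)
    for chunk in chunks:
        piece = comp.compress(chunk)
        if piece:
            yield piece
    # flush() ends the deflate stream and appends the CRC32/ISIZE trailer.
    yield comp.flush()


if __name__ == "__main__":
    parts = [b"abc" * 1000, b"def" * 1000]
    blob = b"".join(gzip_single_member(parts))
    # Round-trips through the standard gzip module.
    assert gzip.decompress(blob) == b"".join(parts)
```

In real use you would write each yielded piece to a file as it arrives instead of joining them in memory, so the input can be far larger than RAM.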

The gzip spec (RFC 1952) does not constrain the size either:

  Each member has the following structure:

     +---+---+---+---+---+---+---+---+---+---+
     |ID1|ID2|CM |FLG|     MTIME     |XFL|OS | (more-->)
     +---+---+---+---+---+---+---+---+---+---+

... [omitting optional headers]

     +=======================+
     |...compressed blocks...| (more-->)
     +=======================+

       0   1   2   3   4   5   6   7
     +---+---+---+---+---+---+---+---+
     |     CRC32     |     ISIZE     |
     +---+---+---+---+---+---+---+---+

     ISIZE (Input SIZE)
        This contains the size of the original (uncompressed) input
        data modulo 2^32.

The important part here is

  size of original (uncompressed) input data modulo 2^32.

So ISIZE simply wraps around for inputs of 4GB or more; it does not impose a limit on them.
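As a rough illustration of the trailer layout quoted above, the following Python sketch reads CRC32 and ISIZE from the last 8 bytes of a single gzip member. It also shows what the modulo implies for a larger input. The helper names `read_trailer` and `expected_isize` are made up for this example, and it assumes the data holds exactly one member with no trailing bytes.

```python
import gzip
import struct
import zlib


def read_trailer(member: bytes):
    """Return (crc32, isize) from the last 8 bytes of a single gzip member."""
    # Both fields are 32-bit unsigned little-endian integers.
    crc32, isize = struct.unpack("<II", member[-8:])
    return crc32, isize


def expected_isize(uncompressed_len: int) -> int:
    """ISIZE as the spec defines it: the input size modulo 2^32."""
    return uncompressed_len % 2**32


if __name__ == "__main__":
    payload = b"x" * 100_000
    crc, isize = read_trailer(gzip.compress(payload))
    assert isize == len(payload)
    assert crc == zlib.crc32(payload)
    # A 5 GiB input would store only the low 32 bits of its length: 1 GiB.
    assert expected_isize(5 * 2**30) == 2**30
```

This is also why ISIZE cannot be trusted as the real uncompressed size once the input may exceed 4GB.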

Carbonization answered 6/3, 2015 at 2:29 Comment(0)
