According to the gzip specification, the uncompressed file size is stored in the last 4 bytes of a .gz file.
I have created 2 files with
dd if=/dev/urandom of=500M bs=1024 count=500000
dd if=/dev/urandom of=5G bs=1024 count=5000000
I gzipped them
gzip 500M 5G
I checked the last 4 bytes of the compressed files with
tail -c4 500M.gz|od -I (returns 512000000 as expected)
tail -c4 5G.gz|od -I (returns 825032704, which is not expected)
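To make the check above easier to reproduce on a small file, here is a minimal sketch that reads the trailer as one explicit unsigned 32-bit integer. It assumes a little-endian host (ISIZE is stored little-endian and od uses host byte order), and the `sample.gz` file and `tmpdir` directory are hypothetical names created only for this demonstration.

```shell
# Hypothetical sample file, created only for this demonstration.
tmpdir=$(mktemp -d)
printf 'hello world\n' | gzip -c > "$tmpdir/sample.gz"

# ISIZE is the last 4 bytes of the .gz file.
# -tu4 reads them as one unsigned 32-bit integer (host byte order,
# so this assumes a little-endian machine); -An drops the offset column.
isize=$(tail -c4 "$tmpdir/sample.gz" | od -An -tu4 | tr -d '[:space:]')
echo "$isize"   # 12, the input is 12 bytes long

rm -rf "$tmpdir"
```

Using `-tu4` instead of `-I` pins the width to 4 bytes, so the result should not depend on the platform's size of `long`.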
It seems that crossing the invisible 32-bit barrier makes the value written into ISIZE useless as a size, which is more annoying than if they had used some error bit instead.
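For what it's worth, the odd value does appear to follow a pattern: the linked specification defines ISIZE as the size of the original input modulo 2^32, and the numbers above fit that. A quick arithmetic sketch, assuming a shell with 64-bit integer arithmetic (for example bash on a 64-bit system); the variable names are only illustrative:

```shell
# Byte counts produced by the two dd commands (bs * count).
small=$((1024 * 500000))     # 512000000
large=$((1024 * 5000000))    # 5120000000

# ISIZE holds only 32 bits, so reduce each size modulo 2^32.
modulus=$((1 << 32))         # 4294967296
isize_small=$((small % modulus))
isize_large=$((large % modulus))

echo "$isize_small"   # 512000000
echo "$isize_large"   # 825032704
```

So the trailer alone can only give the size up to an unknown multiple of 4 GiB, which is why it cannot be trusted for large files.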
Does anyone know of a way to get the uncompressed size of a .gz file without extracting it?
thanks
specification: http://www.gzip.org/zlib/rfc-gzip.html
edit: if anyone wants to try it out, you could use /dev/zero instead of /dev/urandom
dd seek=10G if=/dev/zero of=out.dat count=0
is handier on most filesystems – Juvenal