gzip + chunked: must I wait for the whole file to be downloaded before unzipping?
H

2

0

I am pretty sure of the answer, but I would like someone to confirm it, please.

There is no way to unzip only part of a file when gzip is used in the HTTP headers. I have to download the whole file before I can unzip it and get the data.

Right?

For example, if I get the first 100 bytes with some code like this:

myfile.read(100)

I won't be able to unzip it at this point.

Thanks.

Hutment answered 4/3, 2012 at 21:37 Comment(0)
S
3

You can start decompressing a gzip stream immediately, for whatever amount of data you have so far. You will be able to extract all of the uncompressed bytes represented in the compressed data you have available so far.

You must always decompress from the beginning. So what you can't do is start decompressing in the middle of a gzip stream. If you want to access data in the middle, you need to decompress all of the data up to that point.
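
To illustrate the point above, here is a minimal sketch using Python's standard `zlib` module. It assumes the HTTP library has already removed the chunked framing, so the bytes you read are the raw gzip stream. The names `iter_decompressed`, `sample_payload`, and the 100-byte chunk size are made up for this example; a real response body would take the place of `compressed`.

```python
import gzip
import zlib

# 16 + MAX_WBITS tells zlib to expect a gzip header and trailer
# rather than a bare zlib stream.
GZIP_WBITS = 16 + zlib.MAX_WBITS


def iter_decompressed(chunks):
    """Yield uncompressed bytes as soon as each compressed chunk arrives."""
    decompressor = zlib.decompressobj(GZIP_WBITS)
    for chunk in chunks:
        data = decompressor.decompress(chunk)
        if data:
            yield data
    # Emit anything still buffered once the input is exhausted.
    tail = decompressor.flush()
    if tail:
        yield tail


# Hypothetical payload standing in for a downloaded file.
sample_payload = b"".join(b"line %d of the payload\n" % i for i in range(1000))
compressed = gzip.compress(sample_payload)

# Decompress only the first half of the compressed bytes: the result is a
# prefix of the original data, available before the rest has arrived.
first_half = zlib.decompressobj(GZIP_WBITS).decompress(
    compressed[: len(compressed) // 2]
)

# Feed the stream in 100-byte pieces, as if reading with myfile.read(100).
pieces = [compressed[i : i + 100] for i in range(0, len(compressed), 100)]
reassembled = b"".join(iter_decompressed(pieces))
```

Each call to `decompress` returns whatever output the bytes seen so far can produce, which may be empty for a very short first read. The decompressor object must be the same one for the whole stream, since it carries the state built up from the beginning.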

Sturgis answered 6/3, 2012 at 23:47 Comment(1)
Ok, thank you. I ended up using Requests, and indeed it looks like it handles it correctly, so I might have done something wrong when I used urllib2. - Hutment
C
1

Wrong. GZIP allows streaming. You might be confusing the format with the ZIP archive format.

Cordwain answered 5/3, 2012 at 8:51 Comment(0)
