Python ZipFile module extracts password protected zips slowly
I am trying to write a Python script that extracts a zip file:

Board: BeagleBone Black (~1 GHz ARM Cortex-A8), Debian Wheezy
Zip file: /home/milo/my.zip, ~8 MB

>>> from zipfile import ZipFile
>>> zip = ZipFile("/home/milo/my.zip")
>>> zip.extractall(pwd="tst")

Other approaches, such as opening the zip file and reading and writing its contents, or extracting only one particular file, have the same effect: extraction takes about 3-4 minutes.

Extracting the same file with the unzip tool takes less than 2 seconds.

Does anyone know what is wrong with my code, or with the Python zipfile library?

Thanks Ajava

Rademacher answered 1/9, 2014 at 7:2 Comment(3)
Does it affect the extraction speed whether the zip is password protected or not? – Brame
No, it does not. If the same zip file is not password protected, the same code extracts everything as fast as unzip does. – Rademacher
Even on my own PC (i5, 8 GB RAM, Debian Wheezy), extracting a 30 MB password-protected zip file through Python takes more than 1 minute. – Rademacher

This seems to be a documented issue with the ZipFile module in Python 2.7. If you look at the documentation for ZipFile, it clearly mentions:

Decryption is extremely slow as it is implemented in native Python rather than C.
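To illustrate why a pure-Python implementation is slow, here is a minimal sketch of the traditional PKWARE ("ZipCrypto") stream cipher as it is commonly described. It is not the code from CPython's zipfile module; the class name `ZipCryptoSketch` is hypothetical, and the 12-byte encryption header that real zip entries carry is left out. The point is only that every single byte goes through several interpreted operations.

```python
# Sketch of the traditional zip stream cipher (assumption: written from the
# commonly documented algorithm, not copied from zipfile).

def _make_crc_table():
    # Standard reflected CRC-32 table (polynomial 0xEDB88320).
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
        table.append(crc)
    return table

CRC_TABLE = _make_crc_table()

class ZipCryptoSketch:  # hypothetical name, for illustration only
    def __init__(self, password):
        # Fixed initial key values, then mixed with each password byte.
        self.key0, self.key1, self.key2 = 0x12345678, 0x23456789, 0x34567890
        for byte in password:
            self._update_keys(byte)

    def _crc32(self, byte, crc):
        # One CRC-32 step, without the usual initial/final inversion.
        return (crc >> 8) ^ CRC_TABLE[(crc ^ byte) & 0xFF]

    def _update_keys(self, byte):
        self.key0 = self._crc32(byte, self.key0)
        self.key1 = (self.key1 + (self.key0 & 0xFF)) & 0xFFFFFFFF
        self.key1 = (self.key1 * 134775813 + 1) & 0xFFFFFFFF
        self.key2 = self._crc32(self.key1 >> 24, self.key2)

    def _keystream_byte(self):
        k = self.key2 | 2
        return ((k * (k ^ 1)) >> 8) & 0xFF

    def decrypt(self, data):
        # The per-byte Python loop is what makes this slow on large files.
        out = bytearray()
        for c in data:
            p = c ^ self._keystream_byte()
            self._update_keys(p)  # keys are updated with the plaintext byte
            out.append(p)
        return bytes(out)

    def encrypt(self, data):
        out = bytearray()
        for p in data:
            out.append(p ^ self._keystream_byte())
            self._update_keys(p)
        return bytes(out)
```

For an 8 MB archive this loop runs roughly eight million times, each iteration doing several table lookups and big-integer operations in the interpreter, which is consistent with the slowdown described in the question.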

If you need faster performance, you can either invoke an external program (like unzip or 7zip) from your code, or make sure the zip files you are working with are not password protected.
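As a rough sketch of the first option, assuming Python 3.5+ and that the Info-ZIP `unzip` tool is on the PATH, something like the following could be used. The helper names `build_unzip_command` and `extract_with_unzip` are made up for this example. Note that a password passed with `-P` can be visible to other users in the process list.

```python
import shutil
import subprocess

def build_unzip_command(archive, dest, password):
    # -o: overwrite existing files without prompting
    # -P: password (visible in the process list, so use with care)
    # -d: directory to extract into
    return ["unzip", "-o", "-P", password, archive, "-d", dest]

def extract_with_unzip(archive, dest, password):
    # Hypothetical helper: delegates decryption to the external unzip tool.
    if shutil.which("unzip") is None:
        raise RuntimeError("unzip was not found on PATH")
    # check=True raises CalledProcessError if unzip exits with an error.
    subprocess.run(build_unzip_command(archive, dest, password), check=True)
```

With the file from the question, the call would look like `extract_with_unzip("/home/milo/my.zip", "/home/milo/out", "tst")`, where the output directory is an assumption.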

Semang answered 1/9, 2014 at 7:38 Comment(0)

Copied from my answer: https://mcmap.net/q/661127/-fast-zip-decryption-in-python

It's quite stupid that Python doesn't implement zip decryption in C.

So I implemented it in Cython, which is 17 times faster.

Just get the dezip.pyx and setup.py from this gist.

https://gist.github.com/zylo117/cb2794c84b459eba301df7b82ddbc1ec

Then install Cython and build the extension module:

pip3 install cython
python3 setup.py build_ext --inplace

Then run the original script with two more lines.

import zipfile

# add these two lines
from dezip import _ZipDecrypter_C
setattr(zipfile, '_ZipDecrypter', _ZipDecrypter_C)

z = zipfile.ZipFile('./test.zip', 'r')
z.extractall('/tmp/123', None, b'password')
Skiagraph answered 6/6, 2022 at 6:4 Comment(1)
This works for large zip files. – Avra
