Extracting a multi-part zip with Python

I'm looking for a way to use Python to extract multi-part zip files (e.g. blah.zip, blah.z01, blah.z02, blah.z03, etc.) on Windows without any prerequisite installs (like 7-Zip). This question has been asked already, but the only answer there says to use a local 7-Zip install, which I'd rather avoid, and I don't have enough reputation to add a comment to that thread.

I'm developing a standalone Windows desktop application that uses Python for its main functionality, and an important part of one of its main features is being able to extract zip files given to it. Currently I'm doing this with a portable 7-Zip (7za) distributed with the tool, but due to a new situation I also need to support multi-part zip files, and I can't seem to do this.

My current code is simply

import subprocess
subprocess.call(['7za', 'x', '-o' + destinationPath, zipPath])  # argument list avoids manual quoting of paths

which works for normal zips, but for multi-part zips nothing happens. Running it manually through cmd, I get the output

7-Zip (A) 9.20  Copyright (c) 1999-2010 Igor Pavlov  2010-11-18
Processing archive: zipname.zip
Error: Can not open file as archive

This happens even though desktop 7-Zip is perfectly capable of extracting the exact same archive. Am I missing something in the syntax of the 7za command? If not, are there any alternatives besides asking users to ensure they have desktop 7-Zip installed and trying to detect its location through the registry, etc.?

I also tried using Python's zipfile module, but that gets me an

error: "BadZipFile: Bad magic number for file header"

as noted by the other thread.

Many thanks in advance!

Handoff answered 11/2, 2019 at 22:6 Comment(1)
Did you figure it out? - Charleycharlie

This sounds more like a bash problem than a Python problem, considering you're using subprocess.call(). You probably want to cat the parts together and then unzip, like so:

cat test.zip.* >test.zip
unzip test.zip

Except you would run these as separate subprocess.call()s.

See here: https://unix.stackexchange.com/questions/40480/how-to-unzip-a-multipart-spanned-zip-on-linux
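Since cat and unzip are not available on a stock Windows machine, the same concatenate-then-unzip idea can be sketched in pure Python. This is a minimal sketch, not a tested fix for the archive in the question: the function names `join_parts` and `extract_joined` are made up for illustration, the caller is assumed to pass the parts in the correct order, and it assumes the archive was cut into raw byte chunks (as with the test.zip.* parts above). Archives that a zip tool wrote as .z01/.z02/.zip segments may still be rejected by zipfile after joining.

```python
import shutil
from pathlib import Path
from zipfile import ZipFile


def join_parts(parts, merged_path):
    """Concatenate the files in `parts`, in the given order, into `merged_path`."""
    with open(merged_path, "wb") as merged:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so large parts are never fully held in memory.
                shutil.copyfileobj(src, merged)
    return Path(merged_path)


def extract_joined(parts, merged_path, destination):
    """Join the parts, then extract the merged archive into `destination`."""
    with ZipFile(join_parts(parts, merged_path)) as archive:
        archive.extractall(destination)
```

A call might look like `extract_joined(["test.zip.001", "test.zip.002"], "test.zip", "out")`, where the file names are placeholders.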

Mckelvey answered 11/2, 2019 at 22:12 Comment(5)
OP is on Windows, not Linux. - Arrhenius
Good point, but the general technique might still work though? Concatenate the files, then unzip the result? - Mckelvey
So if I directly opened the files as binary in Python and strung them together, would I get a working zip at the end? It wouldn't be fast, but if that's how it works then it is an option. I tried to use the Windows copy file.zip + file.z01 + file.z02 outputfile.zip syntax but got a 3 KB unreadable zip file out of it. - Handoff
I would imagine so. It's possible this could work with io.BytesIO, since that can act as a file-like object. The only issue could be EOF markers in each .zip archive. - Arrhenius
In Windows and MS-DOS you can use copy in binary mode, like "copy /b file.zip.part1 + file.zip.part2 + file.zip.part3 file.zip". - Entoderm
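The io.BytesIO idea from the comments could look roughly like the following. It is only a sketch: `open_joined_in_memory` is a hypothetical helper name, the parts are assumed to be listed in the right order, and the whole archive ends up in memory, so it only suits archives that comfortably fit in RAM. As with any plain concatenation, it is only expected to work when the parts are raw byte slices of one valid zip file.

```python
import io
from zipfile import ZipFile


def open_joined_in_memory(parts):
    """Return a ZipFile backed by the in-memory concatenation of `parts`."""
    buffer = io.BytesIO()
    for part in parts:
        with open(part, "rb") as src:
            buffer.write(src.read())
    # Rewind so the buffer can be read from the start like a freshly opened file.
    buffer.seek(0)
    return ZipFile(buffer)
```

The returned object can be used as a context manager, for example `with open_joined_in_memory(parts) as archive: archive.extractall(destination)`.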

I wrote this code. It reads the parts as binary and appends them to one big zip file, which you can then extract.

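import os
from zipfile import ZipFile

# Assumes zipPath holds only the parts, and that their names sort into the right order.
merged = os.path.join(zipPath, "data.zip")
zips = sorted(name for name in os.listdir(zipPath) if name != "data.zip")
with open(merged, "wb") as f:  # "wb" so a rerun does not append to an old data.zip
    for zipName in zips:
        with open(os.path.join(zipPath, zipName), "rb") as z:
            f.write(z.read())

with ZipFile(merged, "r") as zipObj:
    zipObj.extractall(zipPath)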
Peabody answered 10/12, 2020 at 8:7 Comment(3)
When I merge the parts into one big file this way, the resulting zip file is corrupted and cannot be unzipped. - Charleycharlie
Maybe your zip files are not multi-part zip files. Multi-part zip files are basically one big zip file split into pieces; maybe yours are just standalone zip files. - Peabody
They are. I created a multi-part zip file using Windows, with files like Test.zip, Test.z01 ... Test.z80, and as I said, when I merge them the file is corrupted. This is probably because of the different extensions (zip, z01, etc.). - Charleycharlie
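One thing the different extensions do affect is ordering: with the Test.zip, Test.z01 ... Test.z80 naming, a plain alphabetical listing puts the parts in the wrong sequence, because (as far as I understand the ZIP format's split-archive convention) the .zip file is the last segment, not the first. A small sketch of a helper that lists the segments in that order follows. `ordered_segments` is a hypothetical name, and getting the order right does not by itself guarantee that the joined file will be readable by zipfile.

```python
import re
from pathlib import Path


def ordered_segments(zip_path):
    """List the segments of a split archive in order: .z01, .z02, ..., then .zip."""
    zip_path = Path(zip_path)
    numbered = []
    for candidate in zip_path.parent.iterdir():
        match = re.fullmatch(r"\.z(\d+)", candidate.suffix, flags=re.IGNORECASE)
        # Compare names case-insensitively, since Windows file names are.
        if match and candidate.stem.lower() == zip_path.stem.lower():
            numbered.append((int(match.group(1)), candidate))
    # Sort by segment number so that .z10 comes after .z09, not after .z01.
    numbered.sort(key=lambda item: item[0])
    return [path for _, path in numbered] + [zip_path]
```

The resulting list could then be fed to whatever joining or extraction step is being tried.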
