Unzip zip files in folders and subfolders
A

3

8

I am trying to unzip 150 zip files. The zip files all have different names, and they are spread across one big folder that is divided into many subfolders and sub-subfolders. I want to extract each archive to a separate folder with the same name as the original zip file, in the same place as the original zip file. My code is:

import zipfile    
import os,os.path,sys  

pattern = '*.zip'  
folder = r"C:\Project\layers"   
files_process = []  
for root,dirs,files in os.walk(r"C:\Project\layers"):  
    for filenames in files:  
        if filenames == pattern:  
            files_process.append(os.path.join(root, filenames))  
            zip.extract() 

When I run the code, nothing happens. Thanks in advance for any help on this.

Adermin answered 5/2, 2015 at 8:6 Comment(2)
Use the zipfile module: docs.python.org/2/library/zipfile.html - Jacalynjacamar
extract_zip - Jacobsohn
A
16

UPDATE:

Finally, this code worked for me:

import fnmatch
import os
import zipfile

rootPath = r"C:\Project"
pattern = '*.zip'
for root, dirs, files in os.walk(rootPath):
    for filename in fnmatch.filter(files, pattern):
        zip_path = os.path.join(root, filename)
        print(zip_path)
        # Extract into a sibling folder named after the archive (without ".zip").
        target = os.path.join(root, os.path.splitext(filename)[0])
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(target)
Adermin answered 2/6, 2015 at 9:15 Comment(1)
Use os.path.splitext(filename)[0] instead of filename.split('.')[0]. The latter returns a wrong result if there are multiple dots in the filename. - Farrica
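To illustrate the point made in the comment above, here is a small comparison of the two ways of stripping the extension. The helper names and the sample file names are made up for this example.

```python
import os.path


def stem_with_split(filename):
    # Keeps only the text before the FIRST dot.
    return filename.split('.')[0]


def stem_with_splitext(filename):
    # Drops only the LAST extension.
    return os.path.splitext(filename)[0]


for name in ["roads.zip", "roads.v2.2015.zip"]:
    print(name, "->", stem_with_split(name), "|", stem_with_splitext(name))
```

With a single dot both approaches agree. With several dots, split('.') truncates the name at the first dot, so two different archives such as roads.v1.zip and roads.v2.zip would end up being extracted into the same folder.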
F
7

You could use Path.rglob() to enumerate zip-files recursively and shutil.unpack_archive() to unpack zip files:

#!/usr/bin/env python3
import logging
from pathlib import Path
from shutil import unpack_archive

zip_files = Path(r"C:\Project\layers").rglob("*.zip")
while True:
    try:
        path = next(zip_files)
    except StopIteration:
        break # no more files
    except PermissionError:
        logging.exception("permission error")
    else:
        extract_dir = path.with_name(path.stem)
        unpack_archive(str(path), str(extract_dir), 'zip')

It "extract[s] each archive to separate folder with the same name as the original zip file name and also in the same place as the original zip file" e.g., it extracts 'layers/dir/file.zip' archive into 'layers/dir/file' directory.
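As a rough way to check that layout, the following sketch wraps the same idea in a function and runs it against a throwaway directory tree. The function name extract_all_zips and the demo files are assumptions made for this example, and unlike the script above it does not handle PermissionError while walking the tree.

```python
import tempfile
import zipfile
from pathlib import Path
from shutil import unpack_archive


def extract_all_zips(top):
    """Unpack every *.zip under `top` into a sibling folder named after the archive."""
    extracted = []
    # sorted() collects all matches first, so folders created while
    # unpacking are not scanned during the same pass.
    for path in sorted(Path(top).rglob("*.zip")):
        extract_dir = path.with_name(path.stem)  # layers/dir/file.zip -> layers/dir/file
        unpack_archive(str(path), str(extract_dir), "zip")
        extracted.append(extract_dir)
    return extracted


if __name__ == "__main__":
    # Build a temporary tree containing one archive, then show where it was unpacked.
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "layers" / "dir" / "file.zip"
        archive.parent.mkdir(parents=True)
        with zipfile.ZipFile(str(archive), "w") as zf:
            zf.writestr("readme.txt", "hello")
        for folder in extract_all_zips(Path(tmp) / "layers"):
            print(folder.relative_to(tmp))
```

Archives that sit inside other archives are not unpacked by this sketch, because the list of zip files is collected before anything is extracted.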

Farrica answered 5/2, 2015 at 13:24 Comment(11)
I get an error: Traceback (most recent call last): File "D:\desktop\python.py", line 2, in <module> from pathlib import Path ImportError: No module named pathlib - Adermin
I am using the Python 2.7.8 shell. - Adermin
@Y.Y.C: there is a backport of pathlib for Python 2. Run pip install pathlib or install Python 3. - Farrica
Where do I install pathlib? - Adermin
@Y.Y.C: just run the pip command. If you don't know what the command line is or what the pip command does, ask a separate question (or, more likely, find an existing one). - Farrica
J.F.Sebastian, I must work with Python 2.7.8, so I installed pathlib. But now I get an error: Traceback (most recent call last): File "C:\yaron\shonot\software\gis\tools\YARON_SCRIPTS\unzip.py", line 3, in <module> from shutil import unpack_archive ImportError: cannot import name unpack_archive - Adermin
@Y.Y.C: do you see unpack_archive() in the documentation for the shutil module? You could use the zipfile module to extract the archive. Use unicode instead of str for filenames on Windows. - Farrica
+1 for unpack_archive and rglob. Did not know of either of those. - Subacute
The pathlib module for Python 2 is not supported anymore. The backport of pathlib for Python 2 is called pathlib2, so you need to pip install pathlib2 if you are still stuck in the 2.x era. - Imbrication
@Davos: note that the script has a python3 shebang. You should run it using a Python 3 interpreter. - Farrica
Agreed, I was just adding new information to your previous comment about the backport: best to use pathlib2, not pathlib. - Imbrication
N
0

To unzip all the files into a temporary folder (Ubuntu):

import tempfile
import zipfile

tmpdirname = tempfile.mkdtemp()

zf = zipfile.ZipFile('/path/to/zipfile.zip')

for fn in zf.namelist():
    temp_file = tmpdirname+"/"+fn
    #print(temp_file)

    f = open(temp_file, 'w')
    f.write(zf.read(fn).decode('utf-8'))
    f.close()
Nyala answered 22/5, 2017 at 18:14 Comment(3)
The decoding assumes the files are text. - Nyala
I would encourage the usage of os.path.join(tmpdirname, fn) instead of tmpdirname+"/"+fn. - Fragonard
You could open temp_file in binary mode so you don't need to decode it, and use a context manager for the file handle so you don't need the f.close(), e.g. with open(temp_file, 'wb') as f: Even better, you could replace the loop over namelist with the zipfile.extractall method and supply tmpdirname as the path. This doesn't answer the question, though. The question mentions nothing about temp directories; it asks "i want to extract each archive to separate folder with the same name as the original zip file name and also in the same place as the original zip file". - Imbrication
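The suggestions in the comments above could look roughly like the sketch below, which lets ZipFile.extractall() do the writing so that binary members and nested folders need no special handling. The function name extract_to_temp and the demo archive are hypothetical and exist only for this example.

```python
import os
import shutil
import tempfile
import zipfile


def extract_to_temp(zip_path):
    """Extract every member of `zip_path` into a fresh temporary folder; return its path."""
    tmpdirname = tempfile.mkdtemp()
    with zipfile.ZipFile(zip_path) as zf:
        # extractall() writes the bytes unchanged and creates subfolders,
        # so no decode() call and no manual open()/close() are needed.
        zf.extractall(tmpdirname)
    return tmpdirname


if __name__ == "__main__":
    # Hypothetical demo archive, created on the fly to keep the example self-contained.
    workdir = tempfile.mkdtemp()
    demo_zip = os.path.join(workdir, "demo.zip")
    with zipfile.ZipFile(demo_zip, "w") as zf:
        zf.writestr("sub/data.bin", b"\x00\x01\x02")
    out = extract_to_temp(demo_zip)
    print(os.listdir(out))
    # mkdtemp() does not clean up after itself, so remove both folders explicitly.
    shutil.rmtree(out)
    shutil.rmtree(workdir)
```

As the last comment notes, this still extracts into a temporary folder rather than next to the original archive, so it does not address the layout the question asks for.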
