Unzip all zipped files in a folder to that same folder using Python 2.7.5
P

5

43

I would like to write a simple script that iterates through all the files in a folder and unzips those that are zipped (.zip) into that same folder. For this project, I have a folder with nearly 100 zipped .las files, and I'm hoping for an easy way to batch unzip them. I tried the following script:

import os, zipfile

folder = 'D:/GISData/LiDAR/SomeFolder'
extension = ".zip"

for item in os.listdir(folder):
    if item.endswith(extension):
        zipfile.ZipFile.extract(item)

However, when I run the script, I get the following error:

Traceback (most recent call last):
  File "D:/GISData/Tools/MO_Tools/BatchUnzip.py", line 10, in <module>
    extract = zipfile.ZipFile.extract(item)
TypeError: unbound method extract() must be called with ZipFile instance as first argument (got str instance instead)

I am using the Python 2.7.5 interpreter. I looked at the documentation for the zipfile module (https://docs.python.org/2/library/zipfile.html#module-zipfile) and I would like to understand what I'm doing incorrectly.

I guess in my mind, the process would go something like this:

  1. Get folder name
  2. Loop through folder and find zip files
  3. Extract zip files to folder

Thanks, Marcus. However, when I implement the suggestion, I get another error:

Traceback (most recent call last):
  File "D:/GISData/Tools/MO_Tools/BatchUnzip.py", line 12, in <module>
    zipfile.ZipFile(item).extract()
  File "C:\Python27\ArcGIS10.2\lib\zipfile.py", line 752, in __init__
    self.fp = open(file, modeDict[mode])
IOError: [Errno 2] No such file or directory: 'JeffCity_0752.las.zip'

When I use print statements, I can see that the files are in there. For example:

for item in os.listdir(folder):
    if item.endswith(extension):
        print os.path.abspath(item)
        filename = os.path.basename(item)
        print filename

yields:

D:\GISData\Tools\MO_Tools\JeffCity_0752.las.zip
JeffCity_0752.las.zip
D:\GISData\Tools\MO_Tools\JeffCity_0753.las.zip
JeffCity_0753.las.zip

As I understand the documentation,

zipfile.ZipFile(file[, mode[, compression[, allowZip64]]])

Open a ZIP file, where file can be either a path to a file (a string) or a file-like object

It appears to me like everything is present and accounted for. I just don't understand what I'm doing wrong.

Any suggestions?

Thank You

Parlous answered 10/7, 2015 at 17:23 Comment(0)
P
78

Below is the code that worked for me:

import os, zipfile

dir_name = 'C:\\SomeDirectory'
extension = ".zip"

os.chdir(dir_name) # change directory from working dir to dir with files

for item in os.listdir(dir_name): # loop through items in dir
    if item.endswith(extension): # check for ".zip" extension
        file_name = os.path.abspath(item) # get full path of files
        zip_ref = zipfile.ZipFile(file_name) # create zipfile object
        zip_ref.extractall(dir_name) # extract file to dir
        zip_ref.close() # close file
        os.remove(file_name) # delete zipped file

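Looking back at the code I had amended, the target directory was getting confused with the directory of the script: os.listdir returns bare file names, and those names were being resolved against the working directory (the script's folder) rather than the folder that holds the zip files.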
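To illustrate the directory mix-up, here is a minimal sketch. It assumes a throwaway temporary folder and a made-up file name as stand-ins for the real data folder; neither comes from the original post. It compares what os.path.abspath does with a bare name against joining the name to the listed folder:

```python
import os
import tempfile

# Hypothetical stand-in for the data folder; the file name is made up.
folder = tempfile.mkdtemp()
open(os.path.join(folder, "example.las.zip"), "w").close()

for item in os.listdir(folder):
    # abspath() resolves a bare name against the current working directory.
    resolved_against_cwd = os.path.abspath(item)
    # Joining with the listed folder gives the path of the actual file.
    resolved_against_folder = os.path.join(folder, item)
    print("cwd-based path:    %s" % resolved_against_cwd)
    print("folder-based path: %s" % resolved_against_folder)
```

Unless the script happens to run from inside the data folder, only the folder-based path points at a file that exists.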

The following also works without changing the working directory. First remove the line

os.chdir(dir_name) # change directory from working dir to dir with files

Then assign file_name as

file_name = dir_name + "/" + item
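Putting the second variant together, a sketch might look like the following. The function name unzip_all and its delete_archive flag are hypothetical names introduced here for illustration; the sketch also swaps the string concatenation for os.path.join and uses a with block so the archive is closed automatically:

```python
import os
import zipfile


def unzip_all(dir_name, extension=".zip", delete_archive=False):
    """Extract every archive in dir_name into dir_name.

    Hypothetical helper: returns the names of the archives it handled.
    """
    handled = []
    for item in os.listdir(dir_name):  # bare file names, not full paths
        if not item.endswith(extension):
            continue
        # Build the full path so the result does not depend on the cwd.
        file_name = os.path.join(dir_name, item)
        with zipfile.ZipFile(file_name) as zip_ref:  # closed on exit
            zip_ref.extractall(dir_name)
        if delete_archive:
            os.remove(file_name)  # optional: delete the zipped file
        handled.append(item)
    return handled
```

Calling unzip_all('C:\\SomeDirectory', delete_archive=True) would mirror the behaviour of the original loop, while the default leaves the archives in place.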
Parlous answered 11/7, 2015 at 9:21 Comment(3)
Thanks for the explanations mate!! My problem is that all the archives I extract contain a file with the same filename, so when I use extractall each extraction overwrites the previous one, leaving just the last file. I should change the name of it, but I do not know how. @Chenlu - Tailspin
@Tailspin I would recommend creating a count variable, then adding that to the file name on extract. Inside the loop, append the count variable to the dir name. - Parlous
What if I want to unzip zip files in folders and subfolders? - Vaclav
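One way to read the count-variable suggestion in the comments above is to give each archive its own numbered output folder, so identically named members never collide. This is only a sketch of that idea; the function name unzip_to_numbered_dirs and the "extracted_N" folder naming are assumptions made for illustration:

```python
import os
import zipfile


def unzip_to_numbered_dirs(dir_name, extension=".zip"):
    """Extract the n-th archive in dir_name into dir_name/extracted_n.

    Hypothetical helper: returns the list of output folders.
    """
    targets = []
    # Sort so the numbering is stable from run to run.
    archives = sorted(f for f in os.listdir(dir_name) if f.endswith(extension))
    for count, item in enumerate(archives, start=1):
        # Append the counter to the output directory name.
        target = os.path.join(dir_name, "extracted_%d" % count)
        with zipfile.ZipFile(os.path.join(dir_name, item)) as zip_ref:
            zip_ref.extractall(target)  # missing folders are created
        targets.append(target)
    return targets
```

Each archive then unpacks into its own folder, and the files can be renamed or moved afterwards if a flat layout is needed.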
I
26

I think this is shorter and worked fine for me. First import the modules required:

import zipfile, os

Then, I define the working directory:

working_directory = 'my_directory'
os.chdir(working_directory)

After that you can use a combination of the os and zipfile to get where you want:

for file in os.listdir(os.getcwd()):   # list the directory we just changed into
    if zipfile.is_zipfile(file): # if it is a zipfile, extract it
        with zipfile.ZipFile(file) as item: # treat the file as a zip
            item.extractall()  # extract it in the working directory
Ithnan answered 2/7, 2019 at 10:14 Comment(3)
This solution worked for me. It's also more pythonic than the accepted answer. - Sykes
Short and easy answers are the best! - Quondam
Haven't tested this but it looks to me like you forgot to close the zipfile with item.close() at the end - Epoxy
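On the question of closing the file: a ZipFile used in a with block is closed when the block exits, so no explicit item.close() is needed. A small sketch to check this, using a throwaway archive whose path and member name are made up for the example:

```python
import os
import tempfile
import zipfile

# Build a throwaway archive; the path and member name are made up.
path = os.path.join(tempfile.mkdtemp(), "sample.zip")
with zipfile.ZipFile(path, "w") as archive:
    archive.writestr("hello.txt", "hello")

with zipfile.ZipFile(path) as item:
    names = item.namelist()

# The with block has already closed the archive, so reading now fails.
try:
    item.read("hello.txt")
    closed_by_with = False
except (ValueError, RuntimeError):
    # ValueError on recent Python 3 releases, RuntimeError on older versions.
    closed_by_with = True

print("members: %s, closed by with: %s" % (names, closed_by_with))
```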
H
9

The accepted answer works great!

To extend the idea to every file with a .zip extension in all the subdirectories of a directory, the following code seems to work well:

import os
import zipfile

dir_path = '.'  # placeholder (assumption): set this to the top-level folder to search

for path, dir_list, file_list in os.walk(dir_path):
    for file_name in file_list:
        if file_name.endswith(".zip"):
            abs_file_path = os.path.join(path, file_name)

            # Extract into a folder named after the zip file (minus ".zip"),
            # created next to the zip file. To extract straight into the
            # zip file's parent folder instead, use output_path = path.
            output_path = os.path.splitext(abs_file_path)[0]

            zip_obj = zipfile.ZipFile(abs_file_path, 'r')
            zip_obj.extractall(output_path)
            zip_obj.close()
Honestly answered 25/10, 2017 at 17:20 Comment(0)
J
4

You need to construct a ZipFile object with the filename and then extract from that object. Calling the method on the class itself, as in

    zipfile.ZipFile.extract(item)

is wrong.

    zipfile.ZipFile(item).extractall()

will extract all files from the zip file with the name contained in item.
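As a rough illustration of the difference between calling methods on the class and on an instance, the sketch below builds a small throwaway archive (the folder layout and member names are made up) and then uses extract for a single member and extractall for everything:

```python
import os
import tempfile
import zipfile

# Throwaway archive with two members; all names here are made up.
folder = tempfile.mkdtemp()
item = os.path.join(folder, "sample.zip")  # full path to the archive
with zipfile.ZipFile(item, "w") as archive:
    archive.writestr("a.txt", "first")
    archive.writestr("b.txt", "second")

one_dir = os.path.join(folder, "one")
all_dir = os.path.join(folder, "all")

zip_ref = zipfile.ZipFile(item)    # an instance, not the ZipFile class
zip_ref.extract("a.txt", one_dir)  # extract a single named member
zip_ref.extractall(all_dir)        # extract every member
zip_ref.close()

print(sorted(os.listdir(one_dir)))
print(sorted(os.listdir(all_dir)))
```

Passing a full path as item, rather than a bare file name, also avoids depending on the current working directory.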

I think you should read the zipfile documentation more closely :) but you're on the right track!

Jugate answered 10/7, 2015 at 17:27 Comment(1)
The docs for extractall are at: python.readthedocs.io/en/latest/library/… - Irredeemable
R
2

Recursive version of @tpdance's answer.

Use this for folders and subfolders. Tested on Python 3.8.

import os
import zipfile

base_dir = '/Users/john/data' # absolute path to the data folder
extension = ".zip"

os.chdir(base_dir)  # change directory from working dir to dir with files


def unpack_all_in_dir(_dir):
    for item in os.listdir(_dir):  # loop through items in dir
        abs_path = os.path.join(_dir, item)  # absolute path of dir or file
        if item.endswith(extension):  # check for ".zip" extension
            file_name = os.path.abspath(abs_path)  # get full path of file
            zip_ref = zipfile.ZipFile(file_name)  # create zipfile object
            zip_ref.extractall(_dir)  # extract file to dir
            zip_ref.close()  # close file
            os.remove(file_name)  # delete zipped file
        elif os.path.isdir(abs_path):
            unpack_all_in_dir(abs_path)  # recurse this function with inner folder


unpack_all_in_dir(base_dir)
Raama answered 8/9, 2021 at 11:8 Comment(0)
