How can I write a Python script to extract .tar.gz files?
C

7

108

I am trying to write a script that extracts all the .tar.gz files from the folders in one directory. For example, I have a file called testing.tar.gz. If I do it manually, I click "Extract here" and the .tar.gz file produces a new file called testing.tar. If I then click "Extract here" again, the .tar file gives me all the .pdf files.

I wonder how I can do this from Python. Here is my code, but it doesn't really work.

import os
import tarfile
import zipfile

def extract_file(path, to_directory='.'):
    if path.endswith('.zip'):
        opener, mode = zipfile.ZipFile, 'r'
    elif path.endswith('.tar.gz') or path.endswith('.tgz'):
        opener, mode = tarfile.open, 'r:gz'
    elif path.endswith('.tar.bz2') or path.endswith('.tbz'):
        opener, mode = tarfile.open, 'r:bz2'
    else: 
        raise ValueError, "Could not extract `%s` as no appropriate extractor is found" % path

    cwd = os.getcwd()
    os.chdir(to_directory)

    try:
        file = opener(path, mode)
        try: file.extractall()
        finally: file.close()
    finally:
        os.chdir(cwd)
Center answered 17/6, 2015 at 9:45 Comment(3)
Unless there is a reason to use Python, this sounds like a job best suited to a shell script. - Mcshane
extractall takes the target directory as a parameter, so there is no need to chdir back and forth. - Purveyance
If you chdir, a relative path to the compressed file may no longer be valid. - Mintz
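Following the comments above, here is a minimal sketch of the same extension-based dispatch that passes the target directory to extractall instead of changing the working directory. The function name extract_archive is hypothetical; everything else is standard library.

```python
import tarfile
import zipfile


def extract_archive(path, to_directory='.'):
    """Extract a .zip, .tar.gz/.tgz or .tar.bz2/.tbz archive into to_directory."""
    if path.endswith('.zip'):
        opener, mode = zipfile.ZipFile, 'r'
    elif path.endswith(('.tar.gz', '.tgz')):
        opener, mode = tarfile.open, 'r:gz'
    elif path.endswith(('.tar.bz2', '.tbz')):
        opener, mode = tarfile.open, 'r:bz2'
    else:
        raise ValueError("Could not extract `%s` as no appropriate extractor is found" % path)

    # Both ZipFile and TarFile work as context managers and accept a target
    # directory in extractall, so the working directory never changes and a
    # relative archive path stays valid.
    with opener(path, mode) as archive:
        archive.extractall(to_directory)
```

Because nothing calls os.chdir, the function can be given relative paths for both the archive and the destination.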
J
170

Why do you want to "press" twice to extract a .tar.gz when you can easily do it in one step? Here is some simple code that extracts both .tar and .tar.gz files in one go:

import tarfile

if fname.endswith("tar.gz"):
    tar = tarfile.open(fname, "r:gz")
    tar.extractall()
    tar.close()
elif fname.endswith("tar"):
    tar = tarfile.open(fname, "r:")
    tar.extractall()
    tar.close()
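The snippet above assumes fname already holds a filename. As a rough sketch of one way to supply it, the loop below walks a directory tree with os.walk and applies the same if/elif to every file it finds, since the question mentions archives spread across folders. The function name extract_all_tarballs and the choice to extract each archive next to itself are assumptions, not part of the answer.

```python
import os
import tarfile


def extract_all_tarballs(root='.'):
    """Extract every .tar.gz and .tar file found under root, next to the archive."""
    extracted = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            if fname.endswith('.tar.gz'):
                mode = 'r:gz'
            elif fname.endswith('.tar'):
                mode = 'r:'
            else:
                continue  # not a tar archive, skip it
            full_path = os.path.join(dirpath, fname)
            # The with-block closes the archive even if extraction fails.
            with tarfile.open(full_path, mode) as tar:
                tar.extractall(path=dirpath)
            extracted.append(full_path)
    return extracted
```

The function returns the list of archives it extracted, which makes it easy to log or double-check what was processed.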
Jarlathus answered 17/6, 2015 at 10:1 Comment(10)
It is because the file I have appears to be a .tar.gz. In the unzipping process, it has to be extracted from .tar.gz to .gz, and then extracting once more gives the content I need, such as the .pdf files. - Center
And your code isn't working: if (fname.endswith("tar.gz")): NameError: name 'fname' is not defined - Center
@Center fname would be a string that is your filename. - Nat
@Center fname is the string of the filename that you are trying to un-tar. files = [f for f in os.listdir('.') if os.path.isfile(f)] for fname in files: # do something, e.g. the above "if-elif" code. - Jarlathus
Sorry, it looks like the inline code does not show up as multiple lines; all the lines are merged into a single line. I hope you get the idea. If not, please drop a comment and I will explain further. - Jarlathus
How do you extract to another location? - Amylose
@Amylose You can use the path parameter of extractall(), e.g. tar.extractall(path="/new/dir/location"). You can have more control too, e.g. if you need to extract only a few files inside the tar file, use extract(). For more details, please take a look at the documentation: docs.python.org/3/library/tarfile.html - Jarlathus
The specific link to the extract() method: docs.python.org/3/library/tarfile.html#tarfile.TarFile.extract - Jarlathus
Does the extractall method require root permission? I'm running as non-root and got PermissionError: [Errno 1] Operation not permitted. - Renarenado
@LeiYang No, you don't need root permission. Check that your directory is writable. - Jarlathus
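The comments above mention the path parameter and extracting only some of the files. A minimal sketch of that idea, assuming only the .pdf members are wanted (as in the question), might look like this. The function name extract_pdfs is hypothetical.

```python
import tarfile


def extract_pdfs(archive_path, dest='.'):
    """Extract only the .pdf members of a tar archive into dest."""
    # Mode "r:*" lets tarfile detect the compression (gzip, bz2 or none) itself.
    with tarfile.open(archive_path, 'r:*') as tar:
        pdf_members = [m for m in tar.getmembers()
                       if m.isfile() and m.name.lower().endswith('.pdf')]
        # extractall accepts a subset of members via the members argument.
        tar.extractall(path=dest, members=pdf_members)
    return [m.name for m in pdf_members]
```

Filtering the member list and passing it to extractall does the same job as calling extract() once per file.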
L
61

If you are using Python 3, you can use shutil.unpack_archive, which works for most of the common archive formats.

shutil.unpack_archive(filename[, extract_dir[, format]])

Unpack an archive. filename is the full path of the archive. extract_dir is the name of the target directory where the archive is unpacked. If not provided, the current working directory is used.

For example:

import shutil

def extract_all(archives, extract_path):
    for filename in archives:
        shutil.unpack_archive(filename, extract_path)
Luminary answered 17/5, 2019 at 8:57 Comment(4)
Is there any way to control the name of the extracted file? - Tangible
In my case, without root permission, tarfile failed but shutil worked. - Renarenado
Finding the one line of Python code that does what I need with minimum fuss sparks joy, thanks! I predict Python will be the last programming language. - Howarth
@suraj-subramanian, the extract path will contain the new name. For example, if filename was "hello.tar.gz", extract_path might be "/tmp/my_name_here". - Jadajadd
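To illustrate the last comment, here is a rough sketch that unpacks each archive into a folder named after the archive itself, so "hello.tar.gz" ends up in a folder called "hello". The function name unpack_into_named_dir and the suffix list are assumptions made for this example.

```python
import os
import shutil


def unpack_into_named_dir(filename, parent_dir):
    """Unpack filename into parent_dir/<archive name without its extension>."""
    base = os.path.basename(filename)
    # Strip a known archive suffix; this list is illustrative, not exhaustive.
    for suffix in ('.tar.gz', '.tgz', '.tar.bz2', '.tar', '.zip'):
        if base.endswith(suffix):
            base = base[:-len(suffix)]
            break
    target = os.path.join(parent_dir, base)
    os.makedirs(target, exist_ok=True)
    shutil.unpack_archive(filename, target)
    return target
```

The return value is the directory that now holds the extracted files.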
C
8

Using a context manager:

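import os
import tarfile

# <another code>
# Assumes this runs inside a method of a class that defines self.batch_id
# and that the BACKUP_DIR environment variable is set.
backup_dir = os.environ['BACKUP_DIR']
with tarfile.open(os.path.join(backup_dir, f'Backup_{self.batch_id}.tar.gz'), "r:gz") as so:
    so.extractall(path=backup_dir)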
Carnes answered 17/7, 2019 at 11:37 Comment(0)
G
4

If you are using Python in a Jupyter notebook on a Linux machine, the following will do:

!tar -xvzf /path/to/file.tar.gz -C /path/to/save_directory

The leading ! tells the notebook to run the command in the system shell.

Galantine answered 17/2, 2021 at 13:31 Comment(0)
K
1

The following worked for me for a .tar.gz file. It extracts the files into the destination you specify:

import tarfile
from os import makedirs

src_path = 'path/to/my/source_file.tar.gz'
dst_path = 'path/to/my/destination'

# create the destination dir (and any missing parents) if it does not exist
makedirs(dst_path, exist_ok=True)

if src_path.endswith('tar.gz'):
    # the with-block closes the archive even if extraction fails
    with tarfile.open(src_path, 'r:gz') as tar:
        tar.extractall(dst_path)
Kerman answered 3/6, 2022 at 2:40 Comment(0)
A
0

You can run a shell command from Python using envoy:

import envoy # pip install envoy

if (file.endswith("tar.gz")):
    envoy.run("tar xzf %s -C %s" % (file, to_directory))

elif (file.endswith("tar")):
    envoy.run("tar xf %s -C %s" % (file, to_directory))
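As an alternative sketch that needs no third-party package, the same tar invocation can be made with the standard-library subprocess module. Passing the arguments as a list avoids shell quoting problems with paths that contain spaces. The helper names are hypothetical, and a tar executable on the PATH is assumed.

```python
import subprocess


def build_tar_command(archive, to_directory):
    """Return the tar argument list for a .tar.gz or .tar archive."""
    if archive.endswith('tar.gz'):
        flags = 'xzf'
    elif archive.endswith('tar'):
        flags = 'xf'
    else:
        raise ValueError('unsupported archive: %s' % archive)
    return ['tar', flags, archive, '-C', to_directory]


def extract_with_tar(archive, to_directory):
    """Run tar; check=True raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_tar_command(archive, to_directory), check=True)
```

Note that tar's -C option expects the target directory to exist already.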
Alack answered 17/10, 2018 at 16:12 Comment(0)
I
-3

When I ran your program, it worked perfectly for a .tar.gz and a .tgz file. For the .zip file it didn't give me the correct items: it returned a __MACOSX folder with nothing inside it, even though I entered the path correctly. The .tbz file was the only one that raised an error. The error said I had an incorrect file type, which I didn't, so I think you used the wrong method to unpack a .tbz. One way you could solve the .zip issue is to use os.system() and unzip it with a command-line tool (depending on your OS). The only other error I encountered was that you used improper syntax for raising an error.
This is what you should have used:

raise ValueError("Error message here")

You used a comma and no parentheses. Hope this helps!

Indiscrete answered 11/3, 2018 at 17:25 Comment(0)
