How do I recursively unzip nested ZIP files?

There is a secret file deep inside a nested ZIP file, i.e. a zip file inside a zip file inside a zip file, and so on.

The zip files are named 1.zip, 2.zip, 3.zip, etc.

We don't know how deeply the zip files are nested, but it may be thousands of levels.

What would be the easiest way to loop through all of them up until the last one to read the secret file?

My initial approach would have been to call unzip recursively, but my Bash skills are limited. What are your ideas to solve this?

Dialysis answered 10/12, 2015 at 18:10 Comment(4)
So 1.zip is inside 2.zip, which in turn is inside 3.zip, and so on? Does a zip file have more than one file inside? If yes, could it be a non-zip file? Do all the zip files have the .zip extension? - Georgeta
Take a look: unix.stackexchange.com/q/4367/74329 - Lafrance
@CristiFati: Each zip file contains exactly one zip file: 1.zip contains 2.zip, 2.zip contains 3.zip, and so on. - Dialysis
Did the link help, or should I go on and work on a script? - Georgeta

Thanks Cyrus! The master wizard Shawn J. Goff had the perfect script for this:

while [ "$(find . -type f -name '*.zip' | wc -l)" -gt 0 ]; do
  find . -type f -name '*.zip' -exec unzip -- '{}' \; -exec rm -- '{}' \;
done
Dialysis answered 10/12, 2015 at 22:1 Comment(2)
Great solution, except in the pathological case of zip quines like droste.zip - Mauve
Well, I guess the zip quine is a special case, but not that relevant when the goal is ultimately to hide some data inside the zip queue. - Dialysis
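
If zip quines (or a corrupt archive that unzip keeps refusing) are a concern, one possible guard is to cap the number of passes. The following is a minimal sketch rather than part of the answer above: it assumes Info-ZIP unzip, and the function name extract_nested and the default limit of 10000 passes are made up for illustration.

```shell
#!/bin/bash
# extract_nested [MAX_PASSES]  (hypothetical helper; name and default limit
# are assumptions, not part of the answer above)
extract_nested() {
  local max_passes="${1:-10000}" pass=0
  # Keep going while at least one zip file exists below the current directory
  while [ -n "$(find . -type f -name '*.zip')" ]; do
    if [ "$pass" -ge "$max_passes" ]; then
      echo "giving up after $max_passes passes (zip quine?)" >&2
      return 1
    fi
    # Same pass as the one-liner: extract every zip found, then delete it.
    # -o overwrites without prompting, -q keeps unzip quiet.
    find . -type f -name '*.zip' -exec unzip -o -q '{}' \; -exec rm -- '{}' \;
    pass=$((pass + 1))
  done
}
```

Run it from the directory that holds 1.zip, for example extract_nested 5000. A non-zero return value means the limit was reached while zip files were still present.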

Here's my 2 cents.

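#!/bin/bash

function extract(){
  # Unzip into a directory named after the archive (1.zip -> 1/)
  unzip "$1" -d "${1%.zip}" || return
  eval "$2"                  # optional cleanup command, e.g. 'rm "$1"'
  cd "${1%.zip}" || return
  # Recurse into every zip extracted at this level, deleting each one afterwards
  for zip in $(find . -maxdepth 1 -iname '*.zip'); do
    extract "$zip" 'rm "$1"'
  done
  cd ..                      # return to the parent before the caller continues
}

extract '1.zip'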
Abmho answered 10/12, 2015 at 21:31 Comment(1)
This helped me with my nested zip case. Though I should note that I needed to (1) modify unzip $1 -d ${1/.zip/} && eval $2 && cd ${1/.zip/} to be three separate lines, (2) put *.zip in single quotes, (3) add cd .. after the for loop. Also, this doesn't handle zip files that are in subdirectories of the parent zip. - Waterless

I know that this question is a bit old, but if someone stumbles upon a similar problem then this bash script might be useful.
This script unzips recursively and retains the original folder hierarchy inside the zip file instead of unzipping everything into the current directory.
This script also handles somewhat pathological cases in which there are many zip files within zips, or folders and zips alternating within one zip file.

Iterating over files and folders is based on this tutorial, to avoid problems with white space, NULs, newline delimiters, etc.:
http://mywiki.wooledge.org/BashFAQ/001

#!/bin/bash

function extractZipsInCurrentDirLevel() {
  find . -mindepth 1 -maxdepth 1 -type f -iname '*.zip' -print0 | 
  while IFS= read -r -d '' zipfile; 
  do
    # %.* removes the file extension; delete the archive only if unzip succeeded
    unzip "${zipfile}" -d "${zipfile%.*}" && rm "${zipfile}";
  done
}

function extractZipsRecursively() {
  extractZipsInCurrentDirLevel; # this can generate new folders after unzipping
  find . -mindepth 1 -maxdepth 1 -type d -print0 | 
  while IFS= read -r -d '' folder; 
  do # call this function recursively for every child subdirectory
    cd "${folder}" || continue
    extractZipsRecursively;
    cd ..
  done
}

extractZipsRecursively; # main entry function call of shell script
Boabdil answered 23/9, 2023 at 8:47 Comment(0)

Probably not the cleanest way, but that should do the trick:

#!/bin/bash
IDX=1 # ID of your first zip file
while true
do
    # Extract; quit if unzip failed (no more files)
    if ! unzip "$IDX.zip"
    then
        break
    fi
    if [ "$IDX" -ne 1 ]
    then
        rm "$IDX.zip" # Remove zip to leave your directory clean
    fi
    (( IDX++ )) # Next file
done
Vallejo answered 10/12, 2015 at 19:53 Comment(0)
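
The loop above relies on the archives being named 1.zip, 2.zip, and so on. If the names were not sequential, a variation could ask each archive what it contains and follow that name instead. This is only a sketch under a few assumptions: every archive holds exactly one entry, Info-ZIP unzip is installed (its -Z1 option lists entry names), and the function name unwrap is made up for illustration.

```shell
#!/bin/bash
# unwrap START.zip  (hypothetical helper)
# Follows a chain of single-entry archives: extracts the current archive,
# deletes every archive except the first one, and continues with the entry
# it contained for as long as that entry ends in .zip.
unwrap() {
  local current="$1" first="$1" inner
  while :; do
    case "$current" in
      *.zip) ;;          # still an archive, keep unwrapping
      *) break ;;        # reached the innermost, non-zip file
    esac
    # -Z1 lists the entry names only; take the first (assumed only) one
    inner="$(unzip -Z1 "$current" | head -n 1)"
    # Stop on an unreadable archive or one that contains itself by name
    if [ -z "$inner" ] || [ "$inner" = "$current" ]; then
      return 1
    fi
    unzip -o -q "$current" || return 1
    [ "$current" = "$first" ] || rm -- "$current"
    current="$inner"
  done
  echo "$current"        # name of the innermost file
}
```

For example, cat "$(unwrap 1.zip)" would print the secret file once the chain has been unwrapped.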

Check out this Java-based utility, nzip, for nested zips.

Extracting and compressing nested zips can be done easily using the following commands:

java -jar nzip.jar -c list -s readme.zip 

java -jar nzip.jar -c extract -s "C:\project\readme.zip" -t readme 

java -jar nzip.jar -c compress -s readme -t "C:\project\readme.zip" 

PS. I am the author and will be happy to fix any bugs quickly.
Mcmorris answered 2/5, 2018 at 17:53 Comment(0)

Here is a solution for Windows, assuming 7-Zip is installed in the default location.

@echo off
Setlocal EnableDelayedExpansion
Set source=%1
Set SELF=%~dpnx0
For %%Z in (!source!) do (
    set FILENAME=%%~nxZ
)
set FILENAME=%FILENAME:"=%

"%PROGRAMFILES%\7-zip\7z.exe" x -o* -y "%FILENAME%"

REM DEL "%FILENAME%"
rem " This is just to satisfy stackoverflow code formatting!


For %%Z in (!source!) do (
    set FILENAME=%%~nZ
)
for %%a in (zip rar jar z bz2 gz gzip tgz tar lha iso wim cab rpm deb) do (
    
    forfiles /P ^"%FILENAME%^" /S /M *.%%a /C "cmd /c if @isdir==FALSE \"%SELF%\" @path"
)

This has been adapted from here https://social.technet.microsoft.com/Forums/ie/en-US/ccd7172b-85e3-4b4a-ad93-5902e0abd903/batch-file-extracting-all-files-from-nested-archives?forum=ITCG

Notes:

  1. The ~ modifiers work only on for-loop variables and batch parameters (such as %0 and %1), not on ordinary variables, so a dummy for..in loop is used to apply them. If there is a better way please edit.
  2. ~nx reduces the variable to its file name and extension, without the drive or path.
  3. ~dpnx expands %0 to drive, path, name and extension, i.e. the full path and filename of the script.
  4. -o* in the 7-Zip command line makes 7-Zip create folder names without the .zip extension, as it does when extracting with a right click in the GUI.
  5. ~n reduces the variable to a filename without an extension, i.e. drops the .zip.
  6. Note that the escape character (for quotes) in FORFILES /P is ^ (caret) while for the CMD /C it is \. This ensures that it handles path and filenames with spaces also recursively without any problem.
  7. You can remove the REM from the DEL statement if you want the zip file to be deleted after unzipping.
Musette answered 8/11, 2021 at 10:31 Comment(0)
