How to unzip files recursively in Java?
C

10

35

I have a zip file that contains some other zip files.

For example, the main file is abc.zip and it contains xyz.zip, class1.java, and class2.java. And xyz.zip contains the files class3.java and class4.java.

So I need to extract the zip file using Java into a folder that should contain class1.java, class2.java, class3.java, and class4.java.

Caliginous answered 11/6, 2009 at 14:52 Comment(3)
This will really mess up this task: aioobe.org/zip-quine (an infinitely recursive zip-in-a-zip-in-a-zip-in-a-...). - Bondstone
Just a thought: none of the answers presented here actually work, because you have not considered nested zips. - Symposiac
Beware of the code in the answers to this question, as it contains a security vulnerability at the moment. See here for more. - Kieger
T
86

Warning: the code here is fine for trusted zip files, but it does not validate entry paths before writing. That can lead to the security vulnerability known as zip slip if you use it to extract a zip file uploaded by an unknown client.
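As a rough illustration of the path validation the warning refers to, here is a minimal sketch. The class name ZipSlipGuard and the method resolveSafely are hypothetical (they do not appear in any answer here); the idea is to compare canonical paths and reject any entry that would land outside the extraction directory.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Path;

// Hypothetical helper, not part of the original answer.
public final class ZipSlipGuard {
    /**
     * Resolves an entry name against the extraction directory and rejects
     * names such as "../../etc/passwd" that would end up outside of it.
     */
    public static File resolveSafely(File destinationDir, String entryName) throws IOException {
        // Canonical paths have "." and ".." segments (and symlinks) resolved.
        Path base = destinationDir.getCanonicalFile().toPath();
        File target = new File(destinationDir, entryName).getCanonicalFile();

        // Path.startsWith compares whole path components, not raw characters.
        if (!target.toPath().startsWith(base)) {
            throw new IOException("Entry is outside of the target directory: " + entryName);
        }
        return target;
    }
}
```

In the code of this answer, one could call such a helper instead of new File(newPath, currentEntry) before anything is written to disk.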


This solution is very similar to the previous solutions already posted, but this one recreates the proper folder structure on unzip.

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public static void extractFolder(String zipFile) throws IOException {
  int buffer = 2048;
  File file = new File(zipFile);

  try (ZipFile zip = new ZipFile(file)) {
    // extract next to the archive, into a folder named after it (minus ".zip")
    String newPath = zipFile.substring(0, zipFile.length() - 4);

    new File(newPath).mkdir();
    Enumeration<? extends ZipEntry> zipFileEntries = zip.entries();

    // Process each entry
    while (zipFileEntries.hasMoreElements()) {
      // grab a zip file entry
      ZipEntry entry = zipFileEntries.nextElement();
      String currentEntry = entry.getName();
      File destFile = new File(newPath, currentEntry);
      File destinationParent = destFile.getParentFile();

      // create the parent directory structure if needed
      destinationParent.mkdirs();

      if (!entry.isDirectory()) {
        // establish buffer for writing file
        byte[] data = new byte[buffer];
        int currentByte;

        // both streams are closed automatically, even if copying fails
        try (BufferedInputStream is = new BufferedInputStream(zip.getInputStream(entry));
             BufferedOutputStream dest =
                 new BufferedOutputStream(new FileOutputStream(destFile), buffer)) {
          // read and write until last byte is encountered
          while ((currentByte = is.read(data, 0, buffer)) != -1) {
            dest.write(data, 0, currentByte);
          }
        }
      }

      if (currentEntry.endsWith(".zip")) {
        // found a zip file, try to open
        extractFolder(destFile.getAbsolutePath());
      }
    }
  }
}

Thinkable answered 18/8, 2011 at 14:12 Comment(6)
I know this is old, but what is the significance, if any, of the commented-out line //destFile = new File(newPath, destFile.getName()); being left in? - Nedrud
@Liam, there is no significance. I think I was just trying out different ways of getting the current file name. I decided on using currentEntry instead of destFile.getName(). - Thinkable
The code here adds a security vulnerability! The paths in the zip file need to be validated beforehand. See here. - Kieger
Hmm, I am afraid I cannot see how this fixes the vulnerability. Should one not check against the canonical path of the extraction directory? I cannot find this in this example. Why do you think this is (more) secure? - Jackinthepulpit
Instead of writing the zip file and then loading it, you could directly use the class ZipInputStream. - Raman
For writing the content of the zip entry input streams you could use Files.copy(InputStream, Path, CopyOption...). - Raman
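The last two comments can be sketched roughly as follows. This is only an illustration under some assumptions: the class name StreamingUnzip and the maxDepth parameter are made up for this example, and nested archives are unpacked into the directory they would otherwise have been written to, so the inner .zip files never touch the disk. The depth limit is one simple way to stop on self-containing archives like the zip quine mentioned under the question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical sketch, not taken from any answer in this thread.
public final class StreamingUnzip {
    /** Extracts zipFile into targetDir, descending into nested archives up to maxDepth levels. */
    public static void extract(Path zipFile, Path targetDir, int maxDepth) throws IOException {
        try (InputStream in = Files.newInputStream(zipFile)) {
            extract(new ZipInputStream(in), targetDir.toAbsolutePath().normalize(), maxDepth);
        }
    }

    private static void extract(ZipInputStream zin, Path targetDir, int depthLeft) throws IOException {
        ZipEntry entry;
        while ((entry = zin.getNextEntry()) != null) {
            Path out = targetDir.resolve(entry.getName()).normalize();
            // Reject entries that would be written outside of the target directory.
            if (!out.startsWith(targetDir)) {
                throw new IOException("Entry escapes target directory: " + entry.getName());
            }
            if (entry.isDirectory()) {
                Files.createDirectories(out);
            } else if (entry.getName().toLowerCase(Locale.ROOT).endsWith(".zip")) {
                if (depthLeft <= 0) {
                    throw new IOException("Archives are nested too deeply: " + entry.getName());
                }
                // The nested stream reads the bytes of the current entry. It is
                // deliberately not closed: closing it would close the outer stream too.
                extract(new ZipInputStream(zin), out.getParent(), depthLeft - 1);
            } else {
                Files.createDirectories(out.getParent());
                // Files.copy reads until the end of the current entry and leaves zin open.
                Files.copy(zin, out, StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }
}
```

With the archive from the question, StreamingUnzip.extract(Path.of("abc.zip"), Path.of("out"), 5) should leave class1.java through class4.java directly in out.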
M
9

Here's some untested code based on some old code I had that unzipped files.

public void doUnzip(String inputZip, String destinationDirectory)
        throws IOException {
    int BUFFER = 2048;
    List zipFiles = new ArrayList();
    File sourceZipFile = new File(inputZip);
    File unzipDestinationDirectory = new File(destinationDirectory);
    unzipDestinationDirectory.mkdir();

    ZipFile zipFile;
    // Open Zip file for reading
    zipFile = new ZipFile(sourceZipFile, ZipFile.OPEN_READ);

    // Create an enumeration of the entries in the zip file
    Enumeration zipFileEntries = zipFile.entries();

    // Process each entry
    while (zipFileEntries.hasMoreElements()) {
        // grab a zip file entry
        ZipEntry entry = (ZipEntry) zipFileEntries.nextElement();

        String currentEntry = entry.getName();

        File destFile = new File(unzipDestinationDirectory, currentEntry);
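        // NOTE: the next line drops the entry's directory part, so every file
        // lands directly in the destination directory; remove it to keep the
        // archive's folder structure
        destFile = new File(unzipDestinationDirectory, destFile.getName());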

        if (currentEntry.endsWith(".zip")) {
            zipFiles.add(destFile.getAbsolutePath());
        }

        // grab file's parent directory structure
        File destinationParent = destFile.getParentFile();

        // create the parent directory structure if needed
        destinationParent.mkdirs();

        try {
            // extract file if not a directory
            if (!entry.isDirectory()) {
                BufferedInputStream is =
                        new BufferedInputStream(zipFile.getInputStream(entry));
                int currentByte;
                // establish buffer for writing file
                byte data[] = new byte[BUFFER];

                // write the current file to disk
                FileOutputStream fos = new FileOutputStream(destFile);
                BufferedOutputStream dest =
                        new BufferedOutputStream(fos, BUFFER);

                // read and write until last byte is encountered
                while ((currentByte = is.read(data, 0, BUFFER)) != -1) {
                    dest.write(data, 0, currentByte);
                }
                dest.flush();
                dest.close();
                is.close();
            }
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
    zipFile.close();

    for (Iterator iter = zipFiles.iterator(); iter.hasNext();) {
        String zipName = (String)iter.next();
        // zipName is already an absolute path, so stripping ".zip" yields the
        // folder to extract the nested archive into
        doUnzip(
            zipName,
            zipName.substring(0,zipName.lastIndexOf(".zip"))
        );
    }

}
Meryl answered 11/6, 2009 at 15:14 Comment(6)
Why do you have a throws declaration, and yet actually catch the exception and log it? Doesn't that mean callers will be expecting IOExceptions that never get thrown? - Muttonhead
What if the entry is a directory? - Lavation
I believe zips only store files, not directories. - Meryl
@Meryl I don't think this is true. If I use PeaZip to extract a zip that has a directory structure inside of it, the resulting directory has the folder structure properly recreated. This method, it seems, gets all files no matter what their directory structure is and just puts them in the base destination directory. - Thinkable
The above code works perfectly fine after removing line No 47, destFile = new File(unzipDestinationDirectory, destFile.getName()); Thanks. - Novotny
@Novotny, you are great. Such a simple solution, awesome. And also Charlie, great work on the pretty fast extraction. - Anlace
C
7

I took ca.anderson4's code, removed the List zipFiles, and rewrote it a little. This is what I got:

public class Unzip {

public void unzip(String zipFile) throws ZipException,
        IOException {

    System.out.println(zipFile);
    int BUFFER = 2048;
    File file = new File(zipFile);

    ZipFile zip = new ZipFile(file);
    String newPath = zipFile.substring(0, zipFile.length() - 4);

    new File(newPath).mkdir();
    Enumeration zipFileEntries = zip.entries();

    // Process each entry
    while (zipFileEntries.hasMoreElements()) {
        // grab a zip file entry
        ZipEntry entry = (ZipEntry) zipFileEntries.nextElement();

        String currentEntry = entry.getName();

        File destFile = new File(newPath, currentEntry);
        // NOTE: this keeps only the file name, so files from all subfolders
        // end up directly in newPath
        destFile = new File(newPath, destFile.getName());
        File destinationParent = destFile.getParentFile();

        // create the parent directory structure if needed
        destinationParent.mkdirs();
        if (!entry.isDirectory()) {
            BufferedInputStream is = new BufferedInputStream(zip
                    .getInputStream(entry));
            int currentByte;
            // establish buffer for writing file
            byte data[] = new byte[BUFFER];

            // write the current file to disk
            FileOutputStream fos = new FileOutputStream(destFile);
            BufferedOutputStream dest = new BufferedOutputStream(fos,
                    BUFFER);

            // read and write until last byte is encountered
            while ((currentByte = is.read(data, 0, BUFFER)) != -1) {
                dest.write(data, 0, currentByte);
            }
            dest.flush();
            dest.close();
            is.close();
        }
        if (currentEntry.endsWith(".zip")) {
            // found a zip file, try to open
            unzip(destFile.getAbsolutePath());
        }
    }
}

public static void main(String[] args) {
    Unzip unzipper=new Unzip();
    try {
        unzipper.unzip("test/test.zip");
    } catch (ZipException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}

}

I tested it and it works.

Ceremonial answered 12/6, 2009 at 22:3 Comment(3)
That code was actually written by ca.anderson4; I merely edited it (I don't like having to scroll side-to-side). - Coopt
Ah, okay, I didn't see that :-). So I give credit to ca.anderson4 and to you, the editor ;-) - Ceremonial
This one just dumps all the files in all the subfolders into one folder, thus destroying the whole archive structure. - Schoonmaker
P
2

In testing I noticed File.mkdirs() does not work under Windows...

    /**
     * For a given full path name, recreate all parent directories.
     */

    private void createParentHierarchy(String parentName) throws IOException {
        File parent = new File(parentName);
        String[] parentsStrArr = parent.getAbsolutePath().split(File.separator.equals("/") ? "/" : "\\\\");

        //create the parents of the parent
        for(int i=0; i < parentsStrArr.length; i++){
            StringBuffer currParentPath = new StringBuffer();
            for(int j = 0; j < i; j++){
                currParentPath.append(parentsStrArr[j]+File.separator);
            }
            File currParent = new File(currParentPath.toString());
            if(!currParent.isDirectory()){
                boolean created = currParent.mkdir();
                if(isVerbose)log("creating directory "+currParent.getAbsolutePath());
            }
        }

        //create the parent itself
        if(!parent.isDirectory()){
            boolean success = parent.mkdir();
        }
    }
Preternatural answered 13/6, 2009 at 14:38 Comment(1)
Don't use == to compare strings. - Ambiguous
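For what it's worth, on Java 7 and later the same parent hierarchy can usually be created without splitting the path by hand. The following is a minimal sketch; the class name ParentDirs and the method ensureParentExists are hypothetical. Unlike File.mkdirs(), which only returns false when it fails, Files.createDirectories throws an IOException, which may help to find out why directory creation fails on a particular system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, shown only as an alternative to manual path splitting.
public final class ParentDirs {
    /** Creates every missing directory on the way to the given file's parent. */
    public static Path ensureParentExists(Path file) throws IOException {
        Path parent = file.toAbsolutePath().getParent();
        if (parent != null) {
            // Does nothing if the directory already exists; throws on failure.
            Files.createDirectories(parent);
        }
        return parent;
    }
}
```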
E
2

I modified this as I needed and mixed in parts of the best answers. This version will:

  • Recursively extract a zip to a given location

  • Create empty directories

  • Close the zip properly


public static void unZipAll(File source, File destination) throws IOException 
{
    System.out.println("Unzipping - " + source.getName());
    int BUFFER = 2048;

    ZipFile zip = new ZipFile(source);
    try{
        destination.getParentFile().mkdirs();
        Enumeration zipFileEntries = zip.entries();

        // Process each entry
        while (zipFileEntries.hasMoreElements())
        {
            // grab a zip file entry
            ZipEntry entry = (ZipEntry) zipFileEntries.nextElement();
            String currentEntry = entry.getName();
            File destFile = new File(destination, currentEntry);
            //destFile = new File(newPath, destFile.getName());
            File destinationParent = destFile.getParentFile();

            // create the parent directory structure if needed
            destinationParent.mkdirs();

            if (!entry.isDirectory())
            {
                BufferedInputStream is = null;
                FileOutputStream fos = null;
                BufferedOutputStream dest = null;
                try{
                    is = new BufferedInputStream(zip.getInputStream(entry));
                    int currentByte;
                    // establish buffer for writing file
                    byte data[] = new byte[BUFFER];

                    // write the current file to disk
                    fos = new FileOutputStream(destFile);
                    dest = new BufferedOutputStream(fos, BUFFER);

                    // read and write until last byte is encountered
                    while ((currentByte = is.read(data, 0, BUFFER)) != -1) {
                        dest.write(data, 0, currentByte);
                    }
                } catch (Exception e){
                    System.out.println("unable to extract entry:" + entry.getName());
                    throw e;
                } finally{
                    if (dest != null){
                        dest.close();
                    }
                    if (fos != null){
                        fos.close();
                    }
                    if (is != null){
                        is.close();
                    }
                }
            }else{
                //Create directory
                destFile.mkdirs();
            }

            if (currentEntry.endsWith(".zip"))
            {
                // found a zip file, try to extract
                unZipAll(destFile, destinationParent);
                if(!destFile.delete()){
                    System.out.println("Could not delete zip");
                }
            }
        }
    } catch(Exception e){
        e.printStackTrace();
        System.out.println("Failed to successfully unzip:" + source.getName());
    } finally {
        zip.close();
    }
    System.out.println("Done Unzipping:" + source.getName());
}
Elouise answered 8/7, 2014 at 21:44 Comment(0)
A
1

You should close the zip file after unzipping.

static public void extractFolder(String zipFile) throws ZipException, IOException 
{
    System.out.println(zipFile);
    int BUFFER = 2048;
    File file = new File(zipFile);

    ZipFile zip = new ZipFile(file);
    try
    { 
       // ... code from other answers (e.g. NeilMonday's) ...
    }
    finally
    {
        zip.close();
    }
}
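On Java 7 and later, try-with-resources gives the same guarantee with less code. A minimal sketch follows; the class CloseExample and its method countEntries are made up purely to show the pattern, with counting entries standing in for the real extraction work.

```java
import java.io.File;
import java.io.IOException;
import java.util.zip.ZipFile;

// Hypothetical example; only the try-with-resources pattern matters here.
public final class CloseExample {
    /** Counts the entries of an archive; the ZipFile is closed even if an exception is thrown. */
    public static int countEntries(String zipFile) throws IOException {
        try (ZipFile zip = new ZipFile(new File(zipFile))) {
            return zip.size();
        }
    }
}
```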
Admittance answered 3/5, 2012 at 14:19 Comment(0)
W
1

No third-party dependencies, guards against zip slip, fully commented, recreates directory structure recursively, ignores empty directories, sane source code nesting, extracts to zip file's directory, and uses UTF-8. Usage:

Path zipFile = Path.of( "/path/to/filename.zip" );
Zip.extract( zipFile );

Here's the code:

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

import static java.nio.file.Files.createDirectories;
import static java.nio.file.StandardCopyOption.REPLACE_EXISTING;

/**
 * Responsible for managing zipped archive files.
 */
public final class Zip {
  /**
   * Extracts the contents of the zip archive into its current directory. The
   * contents of the archive must be {@link StandardCharsets#UTF_8}. For
   * example, if the {@link Path} is <code>/tmp/filename.zip</code>, then
   * the contents of the file will be extracted into <code>/tmp</code>.
   *
   * @param zipPath The {@link Path} to the zip file to extract.
   * @throws IOException Could not extract the zip file, zip entries, or find
   *                     the parent directory that contains the path to the
   *                     zip archive.
   */
  public static void extract( final Path zipPath ) throws IOException {
    assert !zipPath.toFile().isDirectory();

    try( final var zipFile = new ZipFile( zipPath.toFile() ) ) {
      iterate( zipFile );
    }
  }

  /**
   * Extracts each entry in the zip archive file.
   *
   * @param zipFile The archive to extract.
   * @throws IOException Could not extract the zip file entry.
   */
  private static void iterate( final ZipFile zipFile )
    throws IOException {
    // Determine the directory name where the zip archive resides. Files will
    // be extracted relative to that directory.
    final var path = getDirectory( zipFile );
    final var entries = zipFile.entries();

    while( entries.hasMoreElements() ) {
      final var zipEntry = entries.nextElement();
      final var zipEntryPath = path.resolve( zipEntry.getName() );

      // Guard against zip slip.
      if( zipEntryPath.normalize().startsWith( path ) ) {
        extract( zipFile, zipEntry, zipEntryPath );
      }
    }
  }

  /**
   * Extracts a single entry of a zip file to a given directory. This will
   * create the necessary directory path if it doesn't exist. Empty
   * directories are not re-created.
   *
   * @param zipFile      The zip archive to extract.
   * @param zipEntry     An entry in the zip archive.
   * @param zipEntryPath The file location to write the zip entry.
   * @throws IOException Could not extract the zip file entry.
   */
  private static void extract(
    final ZipFile zipFile,
    final ZipEntry zipEntry,
    final Path zipEntryPath ) throws IOException {
    // Only attempt to extract files, skipping empty directories.
    if( !zipEntry.isDirectory() ) {
      createDirectories( zipEntryPath.getParent() );

      try( final var in = zipFile.getInputStream( zipEntry ) ) {
        Files.copy( in, zipEntryPath, REPLACE_EXISTING );
      }
    }
  }

  /**
   * Helper method to return the normalized directory where the given archive
   * resides.
   *
   * @param zipFile The {@link ZipFile} having a path to normalize.
   * @return The directory containing the given {@link ZipFile}.
   * @throws IOException The zip file has no parent directory.
   */
  private static Path getDirectory( final ZipFile zipFile ) throws IOException {
    final var zipPath = Path.of( zipFile.getName() );
    final var parent = zipPath.getParent();

    if( parent == null ) {
      throw new IOException( zipFile.getName() + " has no parent directory." );
    }

    return parent.normalize();
  }
}

Now that you have the core algorithm in place, you need to check the file extension for ".zip" and, if present, recursively call Zip.extract( ... ) on that file.
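That last step could look roughly like the sketch below. To keep it self-contained it does not reuse the Zip class above; RecursiveZip, extractAll, and maxDepth are hypothetical names for this illustration. It extracts next to the archive as above, remembers every extracted ".zip" file, and recurses once the outer archive is closed. The depth limit is an assumption added to avoid endless recursion on self-containing archives.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical sketch of the recursive step described above.
public final class RecursiveZip {
    /** Extracts zipPath into its own directory, then extracts nested archives up to maxDepth levels. */
    public static void extractAll(Path zipPath, int maxDepth) throws IOException {
        Path self = zipPath.toAbsolutePath().normalize();
        Path dir = self.getParent();
        if (dir == null) {
            throw new IOException(zipPath + " has no parent directory.");
        }

        List<Path> nested = new ArrayList<>();
        try (ZipFile zipFile = new ZipFile(self.toFile())) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                Path target = dir.resolve(entry.getName()).normalize();
                // Skip directories, zip-slip entries, and anything that would
                // overwrite the directory itself or the archive being read.
                if (entry.isDirectory() || !target.startsWith(dir)
                        || target.equals(dir) || target.equals(self)) {
                    continue;
                }
                Files.createDirectories(target.getParent());
                try (InputStream in = zipFile.getInputStream(entry)) {
                    Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                }
                if (target.getFileName().toString().toLowerCase(Locale.ROOT).endsWith(".zip")) {
                    nested.add(target);
                }
            }
        }

        // Recurse only after the outer archive has been closed.
        if (maxDepth > 0) {
            for (Path inner : nested) {
                extractAll(inner, maxDepth - 1);
            }
        }
    }
}
```

Note that in this sketch the nested .zip files stay on disk next to their extracted contents; delete them afterwards if you do not need them.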

Weinstock answered 29/12, 2022 at 21:16 Comment(0)
R
0

Same as NeilMonday's answer, but extracts empty directories:

static public void extractFolder(String zipFile) throws ZipException, IOException 
{
    System.out.println(zipFile);
    int BUFFER = 2048;
    File file = new File(zipFile);

    ZipFile zip = new ZipFile(file);
    String newPath = zipFile.substring(0, zipFile.length() - 4);

    new File(newPath).mkdir();
    Enumeration zipFileEntries = zip.entries();

    // Process each entry
    while (zipFileEntries.hasMoreElements())
    {
        // grab a zip file entry
        ZipEntry entry = (ZipEntry) zipFileEntries.nextElement();
        String currentEntry = entry.getName();
        File destFile = new File(newPath, currentEntry);
        //destFile = new File(newPath, destFile.getName());
        File destinationParent = destFile.getParentFile();

        // create the parent directory structure if needed
        destinationParent.mkdirs();

        if (!entry.isDirectory())
        {
            BufferedInputStream is = new BufferedInputStream(zip
            .getInputStream(entry));
            int currentByte;
            // establish buffer for writing file
            byte data[] = new byte[BUFFER];

            // write the current file to disk
            FileOutputStream fos = new FileOutputStream(destFile);
            BufferedOutputStream dest = new BufferedOutputStream(fos,
            BUFFER);

            // read and write until last byte is encountered
            while ((currentByte = is.read(data, 0, BUFFER)) != -1) {
                dest.write(data, 0, currentByte);
            }
            dest.flush();
            dest.close();
            is.close();
        }
        else{
            destFile.mkdirs();
        }
        if (currentEntry.endsWith(".zip"))
        {
            // found a zip file, try to open
            extractFolder(destFile.getAbsolutePath());
        }
    }
}
Ribbon answered 29/2, 2012 at 19:30 Comment(0)
S
0

Here is some code that I tested and found to work quite well:

package com.test;

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class Unzipper {  
    private final static int BUFFER_SIZE = 2048;
    private final static String ZIP_FILE = "/home/anton/test/test.zip";
    private final static String DESTINATION_DIRECTORY = "/home/anton/test/";
    private final static String ZIP_EXTENSION = ".zip";
 
    public static void main(String[] args) {
     System.out.println("Trying to unzip file " + ZIP_FILE); 
        Unzipper unzip = new Unzipper();  
        if (unzip.unzipToFile(ZIP_FILE, DESTINATION_DIRECTORY)) {
         System.out.println("Successfully unzipped to the directory " 
             + DESTINATION_DIRECTORY);
        } else {
         System.out.println("There was some error during extracting archive to the directory " 
             + DESTINATION_DIRECTORY);
        }
    } 

 public boolean unzipToFile(String srcZipFileName,
   String destDirectoryName) {
  try {
   BufferedInputStream bufIS = null;
   // create the destination directory structure (if needed)
   File destDirectory = new File(destDirectoryName);
   destDirectory.mkdirs();

   // open archive for reading
   File file = new File(srcZipFileName);
   ZipFile zipFile = new ZipFile(file, ZipFile.OPEN_READ);

   //for every zip archive entry do
   Enumeration<? extends ZipEntry> zipFileEntries = zipFile.entries();
   while (zipFileEntries.hasMoreElements()) {
    ZipEntry entry = (ZipEntry) zipFileEntries.nextElement();
    System.out.println("\tExtracting entry: " + entry);

    //create destination file
    File destFile = new File(destDirectory, entry.getName());

    //create parent directories if needed
    File parentDestFile = destFile.getParentFile();    
    parentDestFile.mkdirs();    
    
    if (!entry.isDirectory()) {
     bufIS = new BufferedInputStream(
       zipFile.getInputStream(entry));
     int currentByte;

     // buffer for writing file
     byte data[] = new byte[BUFFER_SIZE];

     // write the current file to disk
     FileOutputStream fOS = new FileOutputStream(destFile);
     BufferedOutputStream bufOS = new BufferedOutputStream(fOS, BUFFER_SIZE);

     while ((currentByte = bufIS.read(data, 0, BUFFER_SIZE)) != -1) {
      bufOS.write(data, 0, currentByte);
     }

     // close both streams for this entry
     bufOS.flush();
     bufOS.close();
     bufIS.close();

     // recursively unzip files
     if (entry.getName().toLowerCase().endsWith(ZIP_EXTENSION)) {
      String zipFilePath = destDirectory.getPath() + File.separatorChar + entry.getName();

      unzipToFile(zipFilePath, zipFilePath.substring(0, 
              zipFilePath.length() - ZIP_EXTENSION.length()));
     }
    }
   }
   // close the archive itself once all entries are processed
   zipFile.close();
   return true;
  } catch (Exception e) {
   e.printStackTrace();
   return false;
  }
 } 
}  

I tried the top-voted answer here, and it does not unzip the files recursively; it only unzips the first level.

Source: Solution which extracts files into a given directory

Also, check this solution by the same person: Solution which extracts file in memory

Syncope answered 8/7, 2020 at 4:1 Comment(0)
T
-3
File dir = new File("BASE DIRECTORY PATH");
FileFilter ff = new FileFilter() {

    @Override
    public boolean accept(File f) {
        //only want zip files
        return (f.isFile() && f.getName().toLowerCase().endsWith(".zip"));
    }
};

File[] list = null;
while ((list = dir.listFiles(ff)).length > 0) {
    File file1 = list[0];
    //TODO unzip the file to the base directory, then delete or move file1;
    // otherwise listFiles keeps returning it and this loop never ends
}
Toniatonic answered 11/6, 2009 at 15:6 Comment(0)
