Creating zip archive in Java
P

8

37

I have a file that was created with the 7zip program, using the deflate method to compress it. Now I want to create the same archive (with the same MD5 sum) in Java. To create the zip file I used an algorithm I found on the Internet, for example http://www.kodejava.org/examples/119.html, but with this method the compressed size is larger than the size of the uncompressed file. What is going on? This isn't very useful compression. How can I create a zip file that is exactly the same as the one I created with 7zip? If it helps, I have all the information about the zip file that I created in 7zip.

Peden answered 23/1, 2011 at 12:28 Comment(0)
K
67
// simplified code for zip creation in Java

import java.io.*;
import java.util.zip.*;

public class ZipCreateExample {

    public static void main(String[] args) throws Exception {

        // try-with-resources closes both streams even if an exception is thrown
        try (FileInputStream in = new FileInputStream("F:/sometxt.txt");   // input file
             ZipOutputStream out = new ZipOutputStream(new FileOutputStream("F:/tmp.zip"))) { // output file

            // name of the file inside the zip file
            out.putNextEntry(new ZipEntry("zippedjava.txt"));

            // copy buffer
            byte[] b = new byte[1024];
            int count;

            // read() returns -1 at end of stream
            while ((count = in.read(b)) != -1) {
                out.write(b, 0, count);
            }
            out.closeEntry();
        }
    }
}
Klump answered 1/12, 2011 at 7:45 Comment(1)
I am unable to open the ZIP folder; it says "Access to Compressed Folder is denied". - Xerophagy
L
7

Just to clarify, you used the ZIP algorithm in 7zip for your original? Also, 7zip claims to have a 2-10% better compression ratio than other vendors. I would venture a guess that the ZIP algorithm built into Java is not nearly as optimized as the one in 7zip. Your best bet is to invoke 7zip from the command line if you want a similarly compressed file.

Are you trying to unpack a ZIP file, change a file within it, then re-compress it so that it has the same MD5 hash? Hashes are meant to prevent you from doing that.

Lazarus answered 23/1, 2011 at 12:55 Comment(3)
Furthermore, there are lots of options available when compressing with 7zip. Even when directly invoking 7zip, you still have to guess the configuration that was used. - Crumley
I tried a lot of configurations, but the compressed size was always the same as the uncompressed size or lower, never higher. - Peden
Hm, and how can I invoke 7zip from the command line? Can you give me an example? - Peden
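As a rough sketch of what invoking 7zip from Java could look like: the snippet below builds a `7z a -tzip -mm=Deflate -mx=<level>` command line and runs it with `ProcessBuilder`. It assumes the `7z` executable is installed and on the PATH; the class name `SevenZipInvoker` and the file names are hypothetical. Even with the same switches, byte-identical output is not guaranteed, since the 7zip version and stored timestamps also affect the archive.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class SevenZipInvoker {

    /** Builds the argument list for "7z a -tzip -mm=Deflate -mx=<level> <archive> <input>". */
    static List<String> buildCommand(String archive, String input, int level) {
        List<String> cmd = new ArrayList<>();
        cmd.add("7z");              // assumption: the 7z executable is on the PATH
        cmd.add("a");               // "add to archive" command
        cmd.add("-tzip");           // produce a ZIP container rather than .7z
        cmd.add("-mm=Deflate");     // compression method
        cmd.add("-mx=" + level);    // compression level, 0 (store) to 9 (ultra)
        cmd.add(archive);
        cmd.add(input);
        return cmd;
    }

    /** Runs 7z and returns its exit code (0 means success). */
    static int run(String archive, String input, int level)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(buildCommand(archive, input, level))
                .inheritIO()        // show 7z's own output on this process's console
                .start();
        return p.waitFor();
    }

    public static void main(String[] args) {
        // Hypothetical file names, for illustration only; this only prints the command.
        System.out.println(String.join(" ", buildCommand("tmp.zip", "sometxt.txt", 5)));
    }
}
```

Calling `SevenZipInvoker.run("tmp.zip", "sometxt.txt", 5)` would actually execute the command; `main` only prints it so the sketch can be tried without 7zip installed.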
P
6

ZipOutputStream has a few methods to tune compression:

public void setMethod(int method)

Sets the default compression method for subsequent entries. This default will be used whenever the compression method is not specified for an individual ZIP file entry, and is initially set to DEFLATED.

public void setLevel(int level)

Sets the compression level for subsequent entries which are DEFLATED. The default setting is DEFAULT_COMPRESSION. level - the compression level (0-9)

If you add something like the following after creating the stream:

ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(target));
zos.setMethod( ZipOutputStream.DEFLATED );
zos.setLevel( 5 );
...

does it not improve your compression?

Philosophism answered 23/1, 2011 at 13:0 Comment(1)
As I said before, I used the deflate method and I also tried every compression level (-1 .. 9), but nothing changed. - Peden
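One way to check whether the level has any effect, and to see how an archive can end up larger than its input, is to zip two kinds of data in memory. The sketch below (the class name `ZipLevelDemo` and the entry name are made up for illustration) compares highly repetitive bytes with pseudo-random bytes. Data that is already compressed or random cannot be shrunk by deflate, so the ZIP headers make the archive slightly larger than the input at every level.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Random;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class ZipLevelDemo {

    /** Zips data as a single DEFLATED entry in memory and returns the archive size in bytes. */
    static int zippedSize(byte[] data, int level) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.setMethod(ZipOutputStream.DEFLATED);
            zos.setLevel(level);
            zos.putNextEntry(new ZipEntry("data.bin")); // hypothetical entry name
            zos.write(data);
            zos.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        byte[] repetitive = new byte[100_000];   // all zeros: highly compressible
        byte[] random = new byte[100_000];
        new Random(42).nextBytes(random);        // pseudo-random: effectively incompressible

        for (int level : new int[] {1, 5, 9}) {
            System.out.printf("level %d: repetitive -> %d bytes, random -> %d bytes%n",
                    level, zippedSize(repetitive, level), zippedSize(random, level));
        }
    }
}
```

If the input behaves like the random case, changing the level will not help; the file is probably already compressed (for example a PDF, JPEG, or another archive).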
H
4

Here is a function that takes the absolute path of a directory and creates a zip file with the same name as the directory, containing its files and the files of its immediate subfolders. It returns true on success and false if any exception occurs.

import java.io.*;
import java.util.zip.*;
import org.apache.log4j.Logger; // assumption: Logger.getLogger(Class) suggests log4j 1.x

public class FileUtil {
    static final int BUFFER = 2048;
    private static final Logger log = Logger.getLogger(FileUtil.class);

    public static boolean createZipArchive(String srcFolder) {
        // try-with-resources closes the archive even if an exception is thrown
        try (ZipOutputStream out = new ZipOutputStream(
                new BufferedOutputStream(new FileOutputStream(srcFolder + ".zip")))) {
            String[] subdirList = new File(srcFolder).list();
            if (subdirList == null) {
                return false; // srcFolder is not a readable directory
            }
            for (String sd : subdirList) {
                File f = new File(srcFolder, sd);
                if (f.isDirectory()) {
                    // only one level of subdirectories is handled
                    String[] files = f.list();
                    if (files == null) {
                        continue;
                    }
                    for (String name : files) {
                        System.out.println("Adding: " + name);
                        addEntry(out, new File(f, name), sd + "/" + name);
                    }
                } else { // it is just a file
                    addEntry(out, f, sd);
                }
            }
        } catch (Exception e) {
            log.info("createZipArchive threw exception: " + e.getMessage());
            return false;
        }
        return true;
    }

    // copies one file into the archive under the given entry name
    private static void addEntry(ZipOutputStream out, File file, String entryName)
            throws IOException {
        try (InputStream origin = new BufferedInputStream(new FileInputStream(file), BUFFER)) {
            out.putNextEntry(new ZipEntry(entryName));
            byte[] data = new byte[BUFFER];
            int count;
            while ((count = origin.read(data, 0, BUFFER)) != -1) {
                out.write(data, 0, count);
            }
            out.closeEntry();
        }
    }
}
Humming answered 6/5, 2013 at 13:56 Comment(0)
V
1

To generate two identical zip files (including identical md5sum) from the same source file, I would recommend using the same zip utility -- either always use the same Java program, or always use 7zip.

The 7zip utility, for instance, has a lot of options -- many of which are simply defaults that can be customized (or differ between releases?) -- and any Java zip implementation would also have to set these options explicitly. If your Java app can simply invoke an external "7z" program, you'll probably get better performance anyway than with a custom Java zip implementation. (This is also a good example of a map-reduce problem where you can easily scale out the implementation.)

But the main issue you will run into if you have a server-side generated zip file and a client-side generated zip file is that the zip file stores two things in addition to the original file contents: (1) the file name, and (2) the file timestamp. If either of these has changed, then the resulting zip file will have a different md5sum:

$ ls tst1/
foo.tar

$ cp -r tst1 tst2

$ ( cd tst1; zip foo.zip foo.tar )  ; ( cd tst2; zip foo.zip foo.tar )   ; md5sum tst?/foo.zip
updating: foo.tar (deflated 20%)
updating: foo.tar (deflated 20%)
359b82678a2e17c1ddbc795ceeae7b60  tst1/foo.zip
b55c33c0414ff987597d3ef9ad8d1d08  tst2/foo.zip

But, using "cp -p" (preserve timestamp):

$ cp -p -r tst1 tst2

$ ( cd tst1; zip foo.zip foo.tar )  ; ( cd tst2; zip foo.zip foo.tar )   ; md5sum tst?/foo.zip
updating: foo.tar (deflated 20%)
updating: foo.tar (deflated 20%)
359b82678a2e17c1ddbc795ceeae7b60  tst1/foo.zip
359b82678a2e17c1ddbc795ceeae7b60  tst2/foo.zip

You'll find the same problem with differing filenames and paths, even when the files inside the zip are identical.

Valentino answered 14/12, 2011 at 11:54 Comment(0)
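The same effect can be reproduced with `java.util.zip`. The sketch below (the class name `ZipTimestampDemo` and the sample content are assumptions for illustration) zips identical bytes in memory and prints the MD5 of each archive. With the entry name and modification time pinned via `ZipEntry.setTime`, the hashes match; changing either one changes the hash.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class ZipTimestampDemo {

    /** Zips data as one entry with the given name and modification time; returns the archive's MD5 as hex. */
    static String zipMd5(String entryName, byte[] data, long modifiedMillis) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
                ZipEntry entry = new ZipEntry(entryName);
                entry.setTime(modifiedMillis);   // without this, the current time is stored
                zos.putNextEntry(entry);
                zos.write(data);
                zos.closeEntry();
            }
            byte[] digest = MessageDigest.getInstance("MD5").digest(bytes.toByteArray());
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "same content".getBytes(StandardCharsets.UTF_8);
        long t = 1_300_000_000_000L;             // arbitrary fixed instant (March 2011)
        System.out.println(zipMd5("foo.tar", data, t));
        System.out.println(zipMd5("foo.tar", data, t));            // same hash as above
        System.out.println(zipMd5("foo.tar", data, t + 60_000L));  // different timestamp, different hash
        System.out.println(zipMd5("bar.tar", data, t));            // different name, different hash
    }
}
```

This only makes Java's output reproducible against itself; it does not make it match an archive written by a different tool.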
P
0
package comm;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class Zip1 {
      public static void main( String[] args )
        {
            byte[] buffer = new byte[1024];

            try{

                File f= new  File("E:\\");
                f.mkdirs();
                File origFile= new File(f,"MyZipFile2.zip");
                FileOutputStream fos = new FileOutputStream(origFile);

                ZipOutputStream zos = new ZipOutputStream(fos);
                ZipEntry ze= new ZipEntry("test.pdf");
                zos.putNextEntry(ze);
                FileInputStream in = new FileInputStream("D:\\Test.pdf");

                int len;
                while ((len = in.read(buffer)) > 0) {
                    zos.write(buffer, 0, len);
                }

                in.close();
                zos.closeEntry();

                //remember close it
                zos.close();

                System.out.println("Done");

            }catch(IOException ex){
               ex.printStackTrace();
            }
        }
}
Phosphorus answered 31/7, 2013 at 12:14 Comment(1)
Thanks for the code, but this answer should have some explanation of how your code answers the question. - Fluorescence
S
0

The code below provides both zip and unzip functionality. I hope it helps someone.

package com.util;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;


/**
 * @author dinesh.lomte
 *
 */
public class ZipUtil {

    /**
     * 
     * @param source
     * @param destination
     */
    public static void unZip(String source, String destination) {

        String method = "unZip(String source, String destination)";
        ZipInputStream zipInputStream = null;
        try {
            // Creating the ZipInputStream instance from the source file
            zipInputStream = new ZipInputStream(new FileInputStream(source));
            // Getting the zipped file list entry
            ZipEntry zipEntry = zipInputStream.getNextEntry();
            // Iterating through the file list entry
            while (zipEntry != null) {
                String fileName = zipEntry.getName();
                File file = new File(new StringBuilder(destination)
                    .append(File.separator)
                    .append(AppUtil.getFileNameWithoutExtension(
                            AppUtil.getNameFromPath(source)))
                    .append(File.separator).append(fileName).toString());                
                // Creating non existing folders to avoid any FileNotFoundException 
                // for compressed folder
                new File(file.getParent()).mkdirs();
                FileOutputStream fileOutputStream = new FileOutputStream(file);
                byte[] buffer = new byte[1024];
                int length;
                while ((length = zipInputStream.read(buffer)) > 0) {
                    fileOutputStream.write(buffer, 0, length);
                }
                fileOutputStream.close();
                zipEntry = zipInputStream.getNextEntry();
            }
        } catch (IOException iOException) {
            System.out.println("Failed to unzip the '" + source + "' file into the '" + destination + "' folder: " + iOException.getMessage());

        } finally {
            // Checking that the zipInputStream instance is not null
            if (zipInputStream != null) {
                try {
                    zipInputStream.closeEntry();
                    zipInputStream.close();
                } catch (IOException iOException) {                 
                }
            }
        }
    }

    /**
     * Traverse a directory from the source folder location and get all files,
     * and add the file into files list.
     *
     * @param node
     */
    public static void generateFileList(
            String source, File node, List<String> files) {     
        // Validating if the node is a file
        if (node.isFile()) {
            files.add(generateZipEntry(
                    source, node.getPath().toString()));
        }
        // Validating if the node is a directory
        if (node.isDirectory()) {
            String[] subNote = node.list();
            for (String filename : subNote) {
                generateFileList(source, new File(node, filename), files);
            }
        }
    }

    /**
     * Format the file path to zip
     * @param source
     * @param file
     * @return
     */
    private static String generateZipEntry(String source, String file) {
        return file.substring(source.length(), file.length());
    }

    /**
     * 
     * @param source
     * @param destination
     */
    public static void zip(String source, String destination) {

        String method = "zip(String source, String destination)";
        ZipOutputStream zipOutputStream = null;        
        try {            
            // Creating the zipOutputStream instance
            zipOutputStream = new ZipOutputStream(
                    new FileOutputStream(destination));
            List<String> files = new ArrayList<>();
            generateFileList(source, new File(source), files);
            // Iterating the list of file(s) to zip/compress
            for (String file : files) {
                // Adding the file(s) to the zip
                ZipEntry zipEntry = new ZipEntry(file);
                zipOutputStream.putNextEntry(zipEntry);
                FileInputStream fileInputStream = new FileInputStream(
                        new StringBuilder(source).append(File.separator)
                        .append(file).toString());
                int length;
                byte[] buffer = new byte[1024];
                while ((length = fileInputStream.read(buffer)) > 0) {
                    zipOutputStream.write(buffer, 0, length);
                }                
                // Closing the fileInputStream instance
                fileInputStream.close();
                // De-allocating the memory by assigning the null value
                fileInputStream = null;
            }
        } catch (IOException iOException) {
            System.out.println("Failed to zip the file(s) located in the '" + source + "' folder: " + iOException.getMessage());
        } finally {
            // Checking that the zipOutputStream instance is not null
            if (zipOutputStream != null) {
                try {
                    zipOutputStream.closeEntry();
                    zipOutputStream.close();
                } catch (IOException iOException) {
                }
            }
        }
    }
}
Sherise answered 30/3, 2017 at 13:43 Comment(0)
F
0

how I can create zip file that is exactly same as zip file that I created with 7zip program

This is an old question, but I've been working on my SimpleZip Java package over the past month specifically to do what the OP is asking for -- to have full control over the Zip output. I wrote the library because I could not find a Zip replacement that gave me fine-grained control over the metadata in the file headers or the central-directory entries. Specifically, I was seeing problems with rewriting jars within jar files causing classpath loading issues.

My library has a ZipFileCopy.java example program that reads in a Zip and writes it out again as a series of objects without changing a byte. It:

  1. reads in ZipFileHeader
  2. reads in file bytes
  3. reads in optional ZipDataDescriptor
  4. repeat until null returned by ZipInfo.readFileHeader()
  5. reads in ZipCentralDirectoryFileEntry until null
  6. reads in ZipCentralDirectoryEnd

when I created zip file with this method the compressed size is higher than size of the uncompressed file

With my library, the hardest part is going to be determining what compression-level was used and configuring the deflater algorithm being used to generate the same bytes. I'm just delegating to the JDK internal java.util.zip.Deflater class and I'm not sure if the window sizes and the like match the 7zip program. Although there are Zip per-file flags that my library uses to determine the compression level of each file entry, they don't seem to always be assigned by Zip implementations. Without them my library would use the default level (I think 6).

Although the copy code has some comments, there is also some online documentation for SimpleZip.

Fluorescence answered 23/5 at 21:47 Comment(0)
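To illustrate the level-guessing problem with the JDK's own `java.util.zip.Deflater`, here is a minimal sketch that recompresses the data at every level and compares the result with a target deflate stream. The class name `DeflateLevelProbe` and both method names are hypothetical and not part of SimpleZip. A match is only plausible when the target was produced by the same zlib-based encoder; as far as I know, 7zip ships its own deflate encoder, so no level may reproduce its bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;

final class DeflateLevelProbe {

    /** Compresses data as a raw deflate stream (no zlib header), the form stored inside a ZIP entry. */
    static byte[] rawDeflate(byte[] data, int level) {
        Deflater deflater = new Deflater(level, true); // true = "nowrap", i.e. raw deflate
        try {
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib memory
        }
    }

    /** Returns the first level in 0..9 whose output equals target, or -1 if none matches. */
    static int findLevel(byte[] data, byte[] target) {
        for (int level = 0; level <= 9; level++) {
            if (Arrays.equals(rawDeflate(data, level), target)) {
                return level;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = "the quick brown fox jumps over the lazy dog, again and again and again"
                .getBytes(StandardCharsets.UTF_8);
        byte[] target = rawDeflate(data, 9);
        // Several levels can yield identical bytes for small inputs, so the first match is reported.
        System.out.println("first matching level: " + findLevel(data, target));
    }
}
```

A result of -1 means the original encoder or its settings differ from what the JDK can reproduce, in which case copying the compressed bytes unchanged is the only way to keep the archive identical.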
