Reading from a ZipInputStream into a ByteArrayOutputStream
M

10

20

I am trying to read a single file from a java.util.zip.ZipInputStream, and copy it into a java.io.ByteArrayOutputStream (so that I can then create a java.io.ByteArrayInputStream and hand that to a 3rd party library that will end up closing the stream, and I don't want my ZipInputStream getting closed).

I'm probably missing something basic here, but I never enter the while loop here:

ByteArrayOutputStream streamBuilder = new ByteArrayOutputStream();
int bytesRead;
byte[] tempBuffer = new byte[8192*2];
try {
    while ((bytesRead = zipStream.read(tempBuffer)) != -1) {
        streamBuilder.write(tempBuffer, 0, bytesRead);
    }
} catch (IOException e) {
    // ...
}

What am I missing that will allow me to copy the stream?

Edit:

I should have mentioned earlier that this ZipInputStream is not coming from a file, so I don't think I can use a ZipFile. It is coming from a file uploaded through a servlet.

Also, I have already called getNextEntry() on the ZipInputStream before getting to this snippet of code. If I don't try copying the file into another InputStream (via the OutputStream mentioned above), and just pass the ZipInputStream to my 3rd party library, the library closes the stream, and I can't do anything more, like dealing with the remaining files in the stream.

Mcdaniels answered 15/9, 2008 at 21:41 Comment(4)
So what does zipEntry.getSize() return?Finished
zipEntry.getSize() returns a reasonable number, 28689, in this case.Mcdaniels
Maybe you don't care now, but you can avoid copying all the data and avoid the third-party library closing the stream if you wrap the original input stream (zipStream) and override the close method. 1) Make a public class DontCloseInputStream extends FilterInputStream. 2) Create a constructor (InputStream in) that calls super(in). 3) Override the close method to do nothing. 4) Create a new DontCloseInputStream(zipStream). 5) Pass it to the library. And voilà.Symptomatic
And for copying an InputStream onto an OutputStream there's a utility class called Streams in the commons-fileupload library (Apache). You do Streams.copy(in, out, close?) and it's done.Symptomatic
C
9

Your loop looks valid - what does the following code (just on its own) return?

zipStream.read(tempBuffer)

If it's returning -1, then the zipStream is already exhausted (it has no current entry, or the entry has been fully read) before you get it, and all bets are off; a stream that had actually been closed would throw an IOException instead. It's time to use your debugger and make sure what's being passed to you is actually valid.

When you call getNextEntry(), does it return a value, and is the data in the entry meaningful (i.e. does getCompressedSize() return a valid value)? If you are just reading a Zip file that doesn't have read-ahead zip entries embedded, then ZipInputStream isn't going to work for you.

Some useful tidbits about the Zip format:

Each file embedded in a zip file has a header. This header can contain useful information (such as the compressed length of the stream, its offset in the file, CRC) - or it can contain some magic values that basically say 'The information isn't in the stream header, you have to check the Zip post-amble'.

Each zip file then has a table (the central directory) attached to the end of the file that lists all of the zip entries, along with their real values. The table at the end is mandatory, and the values in it must be correct. In contrast, the values embedded in the stream do not have to be provided.

If you use ZipFile, it reads the table at the end of the zip. If you use ZipInputStream, I suspect that getNextEntry() attempts to use the entries embedded in the stream. If those values aren't specified, then ZipInputStream has no idea how long the stream might be. The inflate algorithm is self-terminating (you don't actually need to know the uncompressed length of the output stream in order to fully recover the output), but it's possible that the Java version of this reader doesn't handle this situation very well.
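
To see what this looks like from Java, here is a minimal sketch (the class and method names are made up for illustration). It builds a small archive in memory with ZipOutputStream, which for deflated entries of unknown size normally writes the sizes after the data rather than in the per-entry header, and then reads it back with ZipInputStream. Under that assumption getSize() may report -1 right after getNextEntry(), yet looping until read() returns -1 still recovers the whole entry.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; not part of the answer above.
class StreamedZipDemo {

    // Builds a one-entry zip archive in memory.
    static byte[] buildZip(String name, byte[] content) throws IOException {
        ByteArrayOutputStream archive = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(archive)) {
            zos.putNextEntry(new ZipEntry(name));
            zos.write(content);
            zos.closeEntry();
        }
        return archive.toByteArray();
    }

    // Reads the first entry by looping until read() reports the end of the
    // entry, without relying on the size stored in the entry header.
    static byte[] readFirstEntry(byte[] zipBytes) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry = zis.getNextEntry();
            if (entry == null) {
                throw new IOException("archive has no entries");
            }
            // May print -1 when the sizes are stored after the entry data.
            System.out.println("size reported by header: " + entry.getSize());
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = zis.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello zip".getBytes(StandardCharsets.UTF_8);
        byte[] copy = readFirstEntry(buildZip("a.txt", original));
        System.out.println(new String(copy, StandardCharsets.UTF_8));
    }
}
```

If the loop in the question never runs against an archive like this one, the cause is more likely the state of the stream that was handed over than the copy loop itself.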

I will say that it's fairly unusual to have a servlet returning a ZipInputStream (it's much more common to receive an InflaterInputStream if you are going to be receiving compressed content).

Conflict answered 16/9, 2008 at 4:54 Comment(1)
ZipInputStream in java DOES NOT HANDLE THIS WELL. Thank you for posting this.Holly
D
7

You probably tried reading from a FileInputStream like this:

ZipInputStream in = new ZipInputStream(new FileInputStream(...));

This won’t work since a zip archive can contain multiple files and you need to specify which file to read.

You could use java.util.zip.ZipFile and a library such as IOUtils from Apache Commons IO or ByteStreams from Guava that assist you in copying the stream.

Example:

ByteArrayOutputStream out = new ByteArrayOutputStream();
try (ZipFile zipFile = new ZipFile("foo.zip")) {
    ZipEntry zipEntry = zipFile.getEntry("fileInTheZip.txt");

    try (InputStream in = zipFile.getInputStream(zipEntry)) {
        IOUtils.copy(in, out);
    }
}
Decorticate answered 15/9, 2008 at 22:47 Comment(0)
S
4

I'd use IOUtils from the commons io project.

IOUtils.copy(zipStream, byteArrayOutputStream);
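
If adding Commons IO is not an option, the same copy can be sketched with the JDK alone, assuming Java 9 or later, where InputStream.transferTo is available. The class and method names below are invented for illustration, and, as in the question, getNextEntry() is assumed to have been called already.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipInputStream;

// Hypothetical helper; assumes Java 9+ for InputStream.transferTo.
class EntryCopier {

    // Copies the rest of the current entry into memory and returns a fresh
    // stream over the copy. Closing the returned stream does not touch zipStream.
    static InputStream copyCurrentEntry(ZipInputStream zipStream) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        zipStream.transferTo(out); // reads until the end of the current entry
        return new ByteArrayInputStream(out.toByteArray());
    }
}
```

The returned stream can be handed to the third-party library, and the caller can then move on with zipStream.getNextEntry().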
Skit answered 15/9, 2008 at 21:56 Comment(1)
This looks like it might work. I will try it when I get to work tomorrow. Thanks.Mcdaniels
T
4

You're missing a call to

ZipEntry entry = zipStream.getNextEntry();

to position the stream at the first decompressed byte of the first entry.

ByteArrayOutputStream streamBuilder = new ByteArrayOutputStream();
int bytesRead;
byte[] tempBuffer = new byte[8192*2];
try {
    // getNextEntry() can throw IOException too, so it belongs inside the try.
    ZipEntry entry = zipStream.getNextEntry();
    while ((bytesRead = zipStream.read(tempBuffer)) != -1) {
        streamBuilder.write(tempBuffer, 0, bytesRead);
    }
} catch (IOException e) {
    // ...
}
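
Extending the same idea to every entry in the upload, a minimal sketch might look like the following. The class and method names are hypothetical, and it assumes the entries are small enough to hold in memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not from the answer above.
class ZipEntriesReader {

    // Reads every file entry of a zip stream into memory, keyed by entry name.
    static Map<String, byte[]> readAllEntries(InputStream uploaded) throws IOException {
        Map<String, byte[]> contents = new LinkedHashMap<>();
        try (ZipInputStream zipStream = new ZipInputStream(uploaded)) {
            byte[] tempBuffer = new byte[8192 * 2];
            ZipEntry entry;
            // getNextEntry() must be called before each entry's data can be read.
            while ((entry = zipStream.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                ByteArrayOutputStream streamBuilder = new ByteArrayOutputStream();
                int bytesRead;
                while ((bytesRead = zipStream.read(tempBuffer)) != -1) {
                    streamBuilder.write(tempBuffer, 0, bytesRead);
                }
                contents.put(entry.getName(), streamBuilder.toByteArray());
            }
        }
        return contents;
    }
}
```

Each byte[] can then be wrapped in a new ByteArrayInputStream and passed to the library, which is free to close it.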
Topgallant answered 3/4, 2012 at 18:44 Comment(0)
R
3

You could implement your own wrapper around the ZipInputStream that ignores close() and hand that off to the third-party library.

thirdPartyLib.handleZipData(new CloseIgnoringInputStream(zipStream));


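class CloseIgnoringInputStream extends InputStream
{
    private final ZipInputStream stream;

    public CloseIgnoringInputStream(ZipInputStream inStream)
    {
        stream = inStream;
    }

    @Override
    public int read() throws IOException {
        return stream.read();
    }

    // Delegate bulk reads too; otherwise InputStream falls back to
    // calling read() once per byte.
    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return stream.read(b, off, len);
    }

    @Override
    public void close()
    {
        // ignore, so the third-party library cannot close the zip stream
    }

    // Call this yourself once every entry has been processed.
    public void reallyClose() throws IOException
    {
        stream.close();
    }
}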
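
A shorter variant, along the lines suggested in a comment on the question, is to extend FilterInputStream, which already delegates every read method to the wrapped stream. This is only a sketch and the class name is illustrative.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch of the FilterInputStream variant; the class name is illustrative.
class NonClosingInputStream extends FilterInputStream {

    NonClosingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public void close() throws IOException {
        // Intentionally empty: the owner of the wrapped stream closes it.
    }
}
```

The caller keeps its own reference to the ZipInputStream and closes that directly once all entries have been handled.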
Reinke answered 16/9, 2008 at 3:31 Comment(0)
F
1

I would call getNextEntry() on the ZipInputStream until it is at the entry you want (use ZipEntry.getName() etc.). Calling getNextEntry() will advance the "cursor" to the beginning of the entry that it returns. Then, use ZipEntry.getSize() to determine how many bytes you should read using zipInputStream.read().
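
A minimal sketch of that lookup, with hypothetical class and method names. One assumption worth flagging: getSize() returns -1 when the size is not known, so this version uses it only as a capacity hint and still reads until read() returns -1.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper illustrating the approach; names are not from the answer.
class ZipEntryFinder {

    // Advances through the stream until an entry named wantedName is found and
    // returns its bytes, or returns null if the stream has no such entry.
    static byte[] readEntry(ZipInputStream zipStream, String wantedName) throws IOException {
        ZipEntry entry;
        while ((entry = zipStream.getNextEntry()) != null) {
            if (!entry.getName().equals(wantedName)) {
                continue; // the next getNextEntry() call skips the rest of this entry
            }
            // getSize() is -1 when unknown, so treat it only as a capped capacity hint.
            long size = entry.getSize();
            int capacity = size > 0 ? (int) Math.min(size, 1 << 20) : 8192;
            ByteArrayOutputStream out = new ByteArrayOutputStream(capacity);
            byte[] buffer = new byte[8192];
            int n;
            while ((n = zipStream.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
        return null;
    }
}
```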

Finished answered 16/9, 2008 at 0:19 Comment(1)
I actually have called getNextEntry() before getting to this snippet. I just added some clarification to the question.Mcdaniels
F
0

It is unclear how you got the zipStream. It should work when you get it like this:

  zipStream = zipFile.getInputStream(zipEntry)
Finished answered 15/9, 2008 at 21:53 Comment(1)
I just added a clarification about this, but it isn't coming from a file.Mcdaniels
S
0

It is unclear how you got the zipStream. It should work when you get it like this:

  zipStream = zipFile.getInputStream(zipEntry)

If you are obtaining the ZipInputStream from a ZipFile, you can get one stream for the third-party library, let the library use it, and then obtain another input stream with the same call.

Remember, an input stream is a cursor. If you have the entire data (like a ZipFile) you can ask for N cursors over it.

A different case is if you only have a "GZip" input stream, that is, only a zipped byte stream. In that case your ByteArrayOutputStream buffer makes perfect sense.

Symptomatic answered 15/9, 2008 at 22:56 Comment(0)
L
0

Please try the code below

private static byte[] getZipArchiveContent(File zipName) throws WorkflowServiceBusinessException {
    // BUFFER (the read-buffer size) and WorkflowServiceBusinessException
    // are defined elsewhere in the poster's project.
    byte[] data = new byte[BUFFER];
    ByteArrayOutputStream byteOut = new ByteArrayOutputStream();

    // try-with-resources closes both file streams even if reading fails.
    try (InputStream buffer = new BufferedInputStream(new FileInputStream(zipName))) {
        int count;
        while ((count = buffer.read(data, 0, BUFFER)) != -1) {
            byteOut.write(data, 0, count);
        }
    } catch (Exception e) {
        throw new WorkflowServiceBusinessException(e.getMessage(), e);
    }
    return byteOut.toByteArray();
}
Larcener answered 19/1, 2010 at 11:49 Comment(0)
K
-1

Check whether the input stream is positioned at the beginning.

Otherwise, regarding the implementation: I do not think that you need to write to the result stream while you are reading, unless you process this exact stream in another thread.

Just create a byte array, read the input stream, then create the output stream.

Kwon answered 15/9, 2008 at 21:51 Comment(0)
