How can I unzip a file to a .NET memory stream?
H

7

68

I have files (from 3rd parties) that are being FTP'd to a directory on our server. I download them and process them every 'x' minutes. Works great.

Now, some of the files are .zip files, which means I can't process them until I unzip them.

FTP has no concept of zipping/unzipping, so I'll need to grab the zip file, unzip it, then process it.

Looking at the MSDN zip API, there seems to be no way I can unzip to a memory stream.

So is the only way to do this...

  1. Unzip to a file (what directory? need some -very- temp location ...)
  2. Read the file contents
  3. Delete file.

NOTE: The contents of the file are small - say 4k <-> 1000k.

Homogamy answered 24/3, 2014 at 9:4 Comment(0)
C
125

Zip compression support is built in:

using System.IO;
using System.IO.Compression;
// ^^^ requires a reference to System.IO.Compression.dll
static class Program
{
    const string path = ...
    static void Main()
    {
        using(var file = File.OpenRead(path))
        using(var zip = new ZipArchive(file, ZipArchiveMode.Read))
        {
            foreach(var entry in zip.Entries)
            {
                using(var stream = entry.Open())
                {
                    // do whatever we want with stream
                    // ...
                }
            }
        }
    }
}

Normally you should avoid copying it into another stream and just use it "as is". However, if you absolutely need it in a MemoryStream, you could do:

using(var ms = new MemoryStream())
{
    stream.CopyTo(ms);
    ms.Position = 0; // rewind
    // do something with ms
}
Catchpole answered 24/3, 2014 at 9:14 Comment(9)
Is there any particular reason why you create a file stream and then pass it to the ZipArchive constructor, instead of using ZipFile.OpenRead? - Gunpaper
@Gunpaper well, firstly I'm not sure why that class even exists: all of the methods on ZipFile are actually about the ZipArchive class - to me, they should all be static members on ZipArchive! But more specifically, because the OP is talking about taking data from an existing source - in this case FTP. In that scenario, you can't guarantee that you have a file, but you can usually assume you have a stream. So showing how to do it from a stream is more re-usable and applicable to any context, not just files. But sure: you could use ZipFile.OpenRead here. - Catchpole
@Gunpaper also, ZipFile requires an extra assembly reference (System.IO.Compression.FileSystem.dll), just to avoid a simple File.OpenRead - doesn't seem worth it - Catchpole
Only in .NET 4.5 and later; XP is not supported. - Brython
@Brython as professionals, we would do well ourselves to not support XP: doing so would put our customers/clients at risk (by offering implicit approval). That OS is officially dead. The very last EOL date is about 2 weeks away. "After 8th April 2014, support and security updates for Windows XP will no longer be available." - Catchpole
What about Server 2003? One more year to go! - Brython
@Brython that doesn't mean we should tacitly encourage people to keep using it ;p - Catchpole
entry.Open() returns a DeflateStream whose Length and Position properties will throw an exception if you try to access them. Copying to a new stream solves the problem. - Hallowmas
@JasonBaley at the cost of requiring us to pre-emptively deflate everything into memory; pros and cons - Catchpole
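The trade-off in the last two comments can be sketched as follows. This is a minimal illustration, not the answerer's code: `ZipToMemory`, `BuildSampleZip` and `ReadEntry` are hypothetical names, and the sample archive is built in memory only so that the sketch runs without a file on disk. It copies one entry into a `MemoryStream`, which is seekable and reports `Length`, at the price of holding the whole decompressed entry in memory.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class ZipToMemory
{
    // Hypothetical helper: builds a one-entry zip in memory (stands in for the FTP download).
    public static byte[] BuildSampleZip(string entryName, string text)
    {
        using (var buffer = new MemoryStream())
        {
            // leaveOpen: true keeps 'buffer' usable after the archive is disposed.
            using (var zip = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
            using (var writer = new StreamWriter(zip.CreateEntry(entryName).Open(), new UTF8Encoding(false)))
            {
                writer.Write(text);
            }
            return buffer.ToArray();
        }
    }

    // Hypothetical helper: copies the named entry into a rewound MemoryStream.
    public static MemoryStream ReadEntry(Stream zipSource, string entryName)
    {
        using (var zip = new ZipArchive(zipSource, ZipArchiveMode.Read, leaveOpen: true))
        {
            var entry = zip.GetEntry(entryName);
            if (entry == null)
                throw new FileNotFoundException("Entry not found in archive.", entryName);

            var ms = new MemoryStream();
            using (Stream entryStream = entry.Open())
            {
                entryStream.CopyTo(ms); // decompresses the whole entry into memory
            }
            ms.Position = 0; // rewind so the caller reads from the start
            return ms;
        }
    }

    static void Main()
    {
        byte[] zipBytes = BuildSampleZip("hello.txt", "hello zip");
        using (var source = new MemoryStream(zipBytes))
        using (MemoryStream ms = ReadEntry(source, "hello.txt"))
        {
            Console.WriteLine(ms.Length); // 9: Length is available on the copy
            Console.WriteLine(Encoding.UTF8.GetString(ms.ToArray())); // hello zip
        }
    }
}
```

For entries in the 4k to 1000k range mentioned in the question, the extra copy is probably cheap; for large entries, reading the entry stream directly avoids it.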
C
28

You can use ZipArchiveEntry.Open to get a stream.

This code assumes the zip archive has one text file.

using (FileStream fs = new FileStream(path, FileMode.Open))
using (ZipArchive zip = new ZipArchive(fs) )
{
    var entry = zip.Entries.First();

    using (StreamReader sr = new StreamReader(entry.Open()))
    {
        Console.WriteLine(sr.ReadToEnd());
    }
}
Crocked answered 24/3, 2014 at 9:13 Comment(8)
obvious comment: this will break in nasty ways if the data isn't text, or if the data is in an unusual encoding but lacks a BOM - Catchpole
@Haematin why that edit? IMO the original was preferable and actively better, but either way this doesn't seem edit-worthy - Catchpole
@MarcGravell I felt it made the code more explicit for readers who may not appreciate the behavior of omitted braces. - Haematin
@MarcGravell Yeah, I added the StreamReader just to show the simplest use case possible. Of course, if it's not text you're reading, then StreamReader.ReadToEnd is not what you're looking for. (I reverted Gusdor's edit). - Crocked
@Crocked I said it was an obvious comment ;p It is my experience, however, that the "obvious" is often anything but, depending on the reader... - Catchpole
@MarcGravell Agreed, thanks for pointing it out ^^ I've added a comment to the code, to avoid misleading the readers. - Crocked
@MarcGravell I can understand preferable but I'm still struggling with actively better. Can you elaborate? - Haematin
@Haematin easier to read, more obvious to understand (IMO), and prevents the code exploding to the right. Code readability is a feature. - Catchpole
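The encoding caveat from the first comment can be illustrated with a short sketch. It assumes you know (or can agree with the sender on) the encoding of the text inside the archive; `ZipTextReader`, `BuildZip` and `ReadFirstEntryAsText` are hypothetical names, and ISO-8859-1 is only an example of an encoding that carries no BOM.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

static class ZipTextReader
{
    // Hypothetical helper: builds a one-entry zip in memory from raw bytes.
    public static MemoryStream BuildZip(string entryName, byte[] content)
    {
        var buffer = new MemoryStream();
        using (var zip = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
        using (Stream entryStream = zip.CreateEntry(entryName).Open())
        {
            entryStream.Write(content, 0, content.Length);
        }
        buffer.Position = 0;
        return buffer;
    }

    // Hypothetical helper: reads the first entry as text with a caller-supplied fallback encoding.
    public static string ReadFirstEntryAsText(Stream zipSource, Encoding fallback)
    {
        using (var zip = new ZipArchive(zipSource, ZipArchiveMode.Read, leaveOpen: true))
        {
            var entry = zip.Entries.First();
            // BOM detection stays on; 'fallback' is used only when no BOM is found.
            using (var reader = new StreamReader(entry.Open(), fallback, detectEncodingFromByteOrderMarks: true))
            {
                return reader.ReadToEnd();
            }
        }
    }

    static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        byte[] raw = latin1.GetBytes("caf\u00e9"); // no BOM is written
        using (MemoryStream zip = BuildZip("note.txt", raw))
        {
            Console.WriteLine(ReadFirstEntryAsText(zip, latin1));
        }
    }
}
```

Without the explicit encoding, `StreamReader` would fall back to UTF-8 and the last character would not decode correctly.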
F
15
using (ZipArchive archive = new ZipArchive(webResponse.GetResponseStream()))
{
     foreach (ZipArchiveEntry entry in archive.Entries)
     {
        Stream s = entry.Open();
        var sr = new StreamReader(s);
        var myStr = sr.ReadToEnd();
     }
} 
Fatback answered 20/10, 2016 at 4:37 Comment(1)
You need using statements for both Stream s and StreamReader sr to close them automatically. - Blase
G
10

It looks like this is what you need:

using (var za = ZipFile.OpenRead(path))
{
    foreach (var entry in za.Entries)
    {
        using (var r = new StreamReader(entry.Open()))
        {
            //your code here
        }
    }
}
Gunpaper answered 24/3, 2014 at 9:11 Comment(0)
T
0

You can use SharpZipLib among a variety of other libraries to achieve this.

Their wiki has the following example. Note that it goes in the opposite direction: it compresses a MemoryStream into a zip that is returned as a MemoryStream.

using System;
using System.IO;
using ICSharpCode.SharpZipLib.Core; // StreamUtils
using ICSharpCode.SharpZipLib.Zip;

// Compresses the supplied memory stream, naming it as zipEntryName, into a zip,
// which is returned as a memory stream or a byte array.
//
public MemoryStream CreateToMemoryStream(MemoryStream memStreamIn, string zipEntryName) {

    MemoryStream outputMemStream = new MemoryStream();
    ZipOutputStream zipStream = new ZipOutputStream(outputMemStream);

    zipStream.SetLevel(3); //0-9, 9 being the highest level of compression

    ZipEntry newEntry = new ZipEntry(zipEntryName);
    newEntry.DateTime = DateTime.Now;

    zipStream.PutNextEntry(newEntry);

    StreamUtils.Copy(memStreamIn, zipStream, new byte[4096]);
    zipStream.CloseEntry();

    zipStream.IsStreamOwner = false;    // False stops the Close also Closing the underlying stream.
    zipStream.Close();          // Must finish the ZipOutputStream before using outputMemStream.

    outputMemStream.Position = 0;
    return outputMemStream;

    // Alternative outputs (use one of these instead of returning the stream):
    // ToArray is the cleanest and easiest to use correctly, with the penalty of duplicating allocated memory.
    // byte[] byteArrayOut = outputMemStream.ToArray();

    // GetBuffer returns the raw buffer, so you need to account for the true length yourself.
    // byte[] byteArrayOut = outputMemStream.GetBuffer();
    // long len = outputMemStream.Length;
}
Tynes answered 24/3, 2014 at 9:8 Comment(1)
note: you don't need external libraries - zip support is actually present multiple times in the BCL - Catchpole
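As a rough illustration of that comment, the same "compress a stream into an in-memory zip" operation can be sketched with `System.IO.Compression` alone. This is an assumed equivalent, not code from the SharpZipLib wiki; the class name `BclZip` is hypothetical, and the method name is reused only to make the comparison easy.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class BclZip
{
    // Hypothetical BCL counterpart of the sample above: zips 'input' as a single entry.
    public static MemoryStream CreateToMemoryStream(Stream input, string zipEntryName)
    {
        var output = new MemoryStream();
        // leaveOpen: true stops Dispose from closing 'output'. Disposing the archive
        // finishes the zip, so it must happen before 'output' is used.
        using (var zip = new ZipArchive(output, ZipArchiveMode.Create, leaveOpen: true))
        {
            ZipArchiveEntry entry = zip.CreateEntry(zipEntryName, CompressionLevel.Optimal);
            using (Stream entryStream = entry.Open())
            {
                input.CopyTo(entryStream);
            }
        }
        output.Position = 0; // rewind for the caller
        return output;
    }

    static void Main()
    {
        using (var input = new MemoryStream(new byte[] { 1, 2, 3, 4 }))
        using (MemoryStream zipped = CreateToMemoryStream(input, "data.bin"))
        using (var check = new ZipArchive(zipped, ZipArchiveMode.Read))
        {
            Console.WriteLine(check.Entries[0].FullName); // data.bin
            Console.WriteLine(check.Entries[0].Length);   // 4 (uncompressed size)
        }
    }
}
```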
C
0

OK, so combining all of the above: suppose you want a very simple way to take a zip file called "file.zip" and extract it to the "C:\temp" folder. (Note: this example was only tested with compressed text files; you may need to make some modifications for binary files.)

        using System.IO;
        using System.IO.Compression;

        static void Main(string[] args)
        {
            //Call it like this:
            Unzip("file.zip",@"C:\temp");
        }

        static void Unzip(string sourceZip, string targetPath)
        {
            using (var z = ZipFile.OpenRead(sourceZip))
            {
                foreach (var entry in z.Entries)
                {                    
                    using (var r = new StreamReader(entry.Open()))
                    {
                        string uncompressedFile = Path.Combine(targetPath, entry.Name);
                        File.WriteAllText(uncompressedFile,r.ReadToEnd());
                    }
                }
            }

        }
Copt answered 15/6, 2020 at 3:31 Comment(0)
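One possible modification for binary files, sketched under the assumption that a byte-for-byte copy is what you want: copy the entry stream to the target file instead of going through `StreamReader` and `File.WriteAllText`, which decode and re-encode the data as text. `BinarySafeUnzip` is a hypothetical name, and the sample archive and temp folder exist only to make the sketch runnable.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class BinarySafeUnzip
{
    // Hypothetical variant of Unzip: copies raw bytes, so binary entries survive intact.
    public static void Unzip(Stream zipSource, string targetPath)
    {
        Directory.CreateDirectory(targetPath);
        using (var zip = new ZipArchive(zipSource, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (var entry in zip.Entries)
            {
                if (string.IsNullOrEmpty(entry.Name))
                    continue; // directory entries have an empty Name

                // Like the answer above, this flattens folders by using only entry.Name.
                string destination = Path.Combine(targetPath, entry.Name);
                using (Stream source = entry.Open())
                using (FileStream file = File.Create(destination))
                {
                    source.CopyTo(file);
                }
            }
        }
    }

    static void Main()
    {
        string target = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        byte[] payload = { 0x00, 0xFF, 0x10, 0x80 }; // bytes that are not valid text

        var buffer = new MemoryStream();
        using (var zip = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
        using (Stream entryStream = zip.CreateEntry("blob.bin").Open())
        {
            entryStream.Write(payload, 0, payload.Length);
        }
        buffer.Position = 0;

        Unzip(buffer, target);
        Console.WriteLine(File.ReadAllBytes(Path.Combine(target, "blob.bin")).Length); // 4
        Directory.Delete(target, recursive: true);
    }
}
```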
H
0

It appears that this is what you require:

var stream = file.OpenReadStream();
var archive = new ZipArchive(stream);

var filesPath = Directory.GetCurrentDirectory() + "/TempFile";

var result = from currEntry in archive.Entries
             where !String.IsNullOrEmpty(currEntry.Name)
             select currEntry;

foreach (ZipArchiveEntry entry in result)
{
    entry.ExtractToFile(Path.Combine(filesPath, entry.Name));
}
return HttpStatusCode.OK;
Huffman answered 14/6 at 12:59 Comment(1)
Stream and ZipArchive both implement IDisposable, so you should declare them with a using declaration or statement to ensure their resources are properly freed. Also, the question asks "How can I unzip a file to a .NET memory stream?", but you are showing how to extract to a file stream, aren't you? Could you please edit your answer to clarify and/or update it for use with MemoryStream? Thanks! - Squab
