.NET 4.5 added new classes for working with zip archives. Now you can do something like this:
    using (ZipArchive archive = ZipFile.OpenRead(zipFilePath))
    {
        foreach (ZipArchiveEntry entry in archive.Entries)
        {
            // Extract it to the file
            entry.ExtractToFile(entry.Name);

            // or do whatever you want
            using (Stream stream = entry.Open())
            {
                ...
            }
        }
    }
Obviously, if you work with large archives, it may take seconds or even minutes to read the files from the archive. So if you were writing a GUI app (WinForms or WPF), you would probably run such code on a separate thread; otherwise you would block the UI thread and make your users very upset.
However, all I/O operations in this code are executed in blocking mode, which is considered "not cool" in 2016. So there are two questions:
- Is it possible to get async I/O with the System.IO.Compression classes (or maybe with some other third-party .NET library)?
- Does it even make sense to do that? I mean, compression/extraction algorithms are very CPU-intensive anyway, so even if we switch from blocking I/O to async I/O, the performance gain may be relatively small (in percentage terms, of course, not in absolute values).
UPDATE:
To reply to the answer from Peter Duniho: yes, you're right. For some reason I didn't think about this option:
    using (Stream zipStream = entry.Open())
    using (FileStream fileStream = new FileStream(...))
    {
        await zipStream.CopyToAsync(fileStream);
    }
which definitely works. Thanks!
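For reference, here is a minimal sketch of how that snippet could be wrapped into a reusable helper. The extension method name `ExtractToFileAsync`, the class name, and the 80 KB buffer size are my own assumptions for illustration; none of them is part of System.IO.Compression. Unlike `ExtractToFile`, this sketch does not restore the entry's last-write time, and whether the reads from the archive side are truly asynchronous depends on how the underlying archive stream was opened.

```csharp
using System.IO;
using System.IO.Compression;
using System.Threading;
using System.Threading.Tasks;

public static class ZipArchiveEntryAsyncExtensions
{
    // Hypothetical helper (not a framework API): copies one entry to disk
    // using awaited reads and writes instead of blocking calls.
    public static async Task ExtractToFileAsync(
        this ZipArchiveEntry entry,
        string destinationPath,
        CancellationToken cancellationToken = default(CancellationToken))
    {
        const int BufferSize = 81920; // 80 KB, an arbitrary choice for this sketch

        using (Stream zipStream = entry.Open())
        using (var fileStream = new FileStream(
            destinationPath,
            FileMode.CreateNew,   // throw if the destination file already exists
            FileAccess.Write,
            FileShare.None,
            BufferSize,
            useAsync: true))      // ask the OS for asynchronous file I/O
        {
            await zipStream.CopyToAsync(fileStream, BufferSize, cancellationToken);
        }
    }
}
```

Inside the `foreach` loop from the question it would then be called as `await entry.ExtractToFileAsync(entry.Name);`, assuming the enclosing method is marked `async`.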
By the way,
    await Task.Run(() => entry.ExtractToFile(entry.Name));
will still be a blocking I/O operation; it just consumes a thread from the thread pool for the duration of the I/O.
However, as far as I can see, the .NET developers still use blocking I/O for some archive operations (for example, the code that enumerates the entries in an archive: ZipArchive.cs on dotnet@github). I also found an open issue about the lack of an asynchronous API for the ZipFile APIs.
I guess that, at this time, we have partial async support, but it is far from complete.
ExtractToFile() is easy enough to implement as a true, non-thread-consuming async I/O method (as you've already shown in your update above). – Proudfoot