.NET GZipStream decompress producing empty stream

3

12

I'm trying to serialize and compress a WPF FlowDocument, and then do the reverse - decompress the byte array and deserialize to recreate the FlowDocument - using the .NET GZipStream class. I'm following the example described on MSDN and I have the following test program:

var flowDocumentIn = new FlowDocument();
flowDocumentIn.Blocks.Add(new Paragraph(new Run("Hello")));
Debug.WriteLine("Compress");
byte[] compressedData;
using (var uncompressed = new MemoryStream())
{
    XamlWriter.Save(flowDocumentIn, uncompressed);
    uncompressed.Position = 0;
    using (var compressed = new MemoryStream())
    using (var compressor = new GZipStream(compressed, CompressionMode.Compress))
    {
        Debug.WriteLine(" uncompressed.Length: " + uncompressed.Length);
        uncompressed.CopyTo(compressor);
        Debug.WriteLine(" compressed.Length: " + compressed.Length);
        compressedData = compressed.ToArray();
    }
}

Debug.WriteLine("Decompress");
FlowDocument flowDocumentOut;
using (var compressed = new MemoryStream(compressedData))
using (var uncompressed = new MemoryStream())
using (var decompressor = new GZipStream(compressed, CompressionMode.Decompress))
{
    Debug.WriteLine(" compressed.Length: " + compressed.Length);
    decompressor.CopyTo(uncompressed);
    Debug.WriteLine(" uncompressed.Length: " + uncompressed.Length);
    flowDocumentOut = (FlowDocument) XamlReader.Load(uncompressed);
}

Assert.AreEqual(flowDocumentIn, flowDocumentOut);

However, I get an exception at the XamlReader.Load line, which is expected, since the debug output shows that the uncompressed stream has zero length.

Compress
 uncompressed.Length: 123
 compressed.Length: 202
Decompress
 compressed.Length: 202
 uncompressed.Length: 0

Why doesn't the final uncompressed stream contain the original 123 bytes?

(Please ignore the fact that the "compressed" byte array is bigger than the "uncompressed" byte array - I'll normally be working with much bigger flow documents)

Plug answered 11/8, 2012 at 14:41 Comment(1)
Though you may resolve this, you should consider whether you want to use that class in the first place. See my comments here: #11435700 - Ripple
S
16

You need to close the GZipStream before getting the compressed bytes from the memory stream. In the code below, the closing is handled by the Dispose call that the inner using block triggers.

using (var compressed = new MemoryStream())
{
    using (var compressor = new GZipStream(compressed, CompressionMode.Compress))
    {
        uncompressed.CopyTo(compressor);
    }
    // Get the compressed bytes only after closing the GZipStream
    // (compressedData is the byte[] declared in the question's code)
    compressedData = compressed.ToArray();
}

This works. You could even remove the using for the MemoryStream, since the GZipStream disposes it unless you use the constructor overload that lets you specify that the underlying stream should be left open. That means the code above calls ToArray on an already disposed stream, but this is allowed because the bytes are still available. It makes disposing memory streams a bit weird, but if you don't do it, FxCop will annoy you.
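
To make the effect of closing visible, here is a minimal sketch that is not part of the original answer. It uses the GZipStream(Stream, CompressionMode, bool) overload mentioned above so that the MemoryStream stays open, and reports how many bytes the MemoryStream holds before and after the GZipStream is disposed. The class and method names (GZipCloseDemo, Measure) are made up for illustration, and the exact byte counts depend on the runtime version, so none are claimed here.

```csharp
using System.IO;
using System.IO.Compression;

public static class GZipCloseDemo
{
    // Hypothetical helper: reports the length of the compressed MemoryStream
    // before and after the GZipStream has been disposed.
    public static void Measure(byte[] input, out long before, out long after)
    {
        using (var compressed = new MemoryStream())
        {
            // Third argument (leaveOpen) = true: disposing the GZipStream
            // does not dispose the underlying MemoryStream.
            using (var gzip = new GZipStream(compressed, CompressionMode.Compress, true))
            {
                gzip.Write(input, 0, input.Length);
                // The final deflate block and the gzip footer are not written yet.
                before = compressed.Length;
            }
            // Dispose has flushed the remaining data and appended the footer.
            after = compressed.Length;
        }
    }
}
```

Calling Measure with any input should give after greater than before, because the gzip footer (CRC-32 and uncompressed size) is only written when the stream is closed.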

Smallsword answered 11/8, 2012 at 18:47 Comment(2)
"You need to close the GZipStream before getting the compressed bytes from the memory stream" - why? And why do you get a different number of bytes if you call .ToArray() before vs. after closing? - Plug
Besides the fact that the output is written in blocks, the GZipStream adds a header before the compressed data and a footer after it. The footer can only be added at the moment you close the stream. - Whim
P
5

Joao's answer did the trick. I've copied the full working example below, with an added line that outputs compressedData.Length. Interestingly, this outputs 218 bytes, whereas compressedStream.Length (logged before the GZipStream is closed) outputs only 202 bytes. If you don't close the GZipStream before reading the byte array, compressedData.Length is also 202. I'm not sure why closing the GZipStream gives you an extra 16 bytes.

var flowDocumentIn = new FlowDocument();
flowDocumentIn.Blocks.Add(new Paragraph(new Run("Hello")));

Debug.WriteLine("Compress");

byte[] compressedData;

using (var uncompressedStream = new MemoryStream())
{
    XamlWriter.Save(flowDocumentIn, uncompressedStream);
    uncompressedStream.Position = 0;
    using (var compressedStream = new MemoryStream())
    {
        using (var gZipCompressor = new GZipStream(compressedStream, CompressionMode.Compress))
        {
            Debug.WriteLine(" uncompressedStream.Length: " + uncompressedStream.Length);
            uncompressedStream.CopyTo(gZipCompressor);
            Debug.WriteLine(" compressedStream.Length: " + compressedStream.Length);
        }
        compressedData = compressedStream.ToArray();
    }
}

Debug.WriteLine(" compressedData.Length: " + compressedData.Length);

Debug.WriteLine("Decompress");

FlowDocument flowDocumentOut;

using (var compressedStream = new MemoryStream(compressedData))
using (var uncompressedStream = new MemoryStream())
{
    using (var gZipDecompressor = new GZipStream(compressedStream, CompressionMode.Decompress))
    {
        Debug.WriteLine(" compressedStream.Length: " + compressedStream.Length);
        gZipDecompressor.CopyTo(uncompressedStream);
        Debug.WriteLine(" uncompressedStream.Length: " + uncompressedStream.Length);
    }
    uncompressedStream.Position = 0;
    flowDocumentOut = (FlowDocument)XamlReader.Load(uncompressedStream);
}

Debug output:

Compress
 uncompressedStream.Length: 123
 compressedStream.Length: 202
 compressedData.Length: 218
Decompress
 compressedStream.Length: 218
 uncompressedStream.Length: 123

Note also the additional uncompressedStream.Position = 0; before the call to XamlReader.Load.
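
The same two fixes can be shown without the WPF types. The following is only a sketch under that simplification: it round-trips a plain byte array, and the names GZipRoundTrip, Compress and Decompress are hypothetical helpers, not something from the answers above.

```csharp
using System.IO;
using System.IO.Compression;

public static class GZipRoundTrip
{
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            } // Dispose flushes the last block and writes the gzip footer.

            // Read the bytes only after the GZipStream is closed.
            // ToArray still works on a closed MemoryStream.
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] compressedData)
    {
        using (var input = new MemoryStream(compressedData))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            // ToArray returns the whole buffer regardless of Position,
            // so no rewind is needed on this path.
            return output.ToArray();
        }
    }
}
```

If the decompressed stream is handed to a reader such as XamlReader.Load instead of being converted with ToArray, its Position still has to be reset to 0 first, as in the example above.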

Plug answered 11/8, 2012 at 20:59 Comment(2)
A deflate compressor (as required for the gzip format) produces compressed output in blocks. The compressor needs to accumulate data to build up a block and generate statistics on it before it is emitted. When you get to the end of your input, you need to tell the deflator to finish up the last block and send it out. Otherwise that data will just sit there waiting for more data to fill a block. - Ripple
By the way, the amount by which the compressed data size exceeds the uncompressed data size is one of several bugs in GZipStream. - Ripple
P
0

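After you copy the decompressed bytes to your stream, you need to set its Position back to zero so that you can read it properly. CopyTo leaves the target stream positioned at its end, so a reader that starts from there finds no data.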
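
A small sketch of that behaviour, using plain MemoryStreams so it runs without any compression involved. PositionDemo and ReadAfterCopy are illustrative names, not part of any answer here.

```csharp
using System.IO;

public static class PositionDemo
{
    // Copies 'data' into a fresh MemoryStream, optionally rewinds it,
    // and returns how many bytes a subsequent Read call delivers.
    public static int ReadAfterCopy(byte[] data, bool rewind)
    {
        using (var source = new MemoryStream(data))
        using (var target = new MemoryStream())
        {
            source.CopyTo(target);   // target.Position is now at the end
            if (rewind)
            {
                target.Position = 0; // start reading from the beginning
            }
            var buffer = new byte[data.Length];
            return target.Read(buffer, 0, buffer.Length);
        }
    }
}
```

Without the rewind the read returns 0 bytes; with it, the read returns the full length of the data.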

Philan answered 23/12, 2015 at 12:59 Comment(0)
