DeflateStream not decompressing data (the first time)

So here's a strange one. I have this method to take a Base64-encoded deflated string and return the original data:

public static string Base64Decompress(string base64data)
{
    byte[] b = Convert.FromBase64String(base64data);
    using (var orig = new MemoryStream(b))
    {
        using (var inflate = new MemoryStream())
        {
            using (var ds = new DeflateStream(orig, CompressionMode.Decompress))
            {
                ds.CopyTo(inflate);
                return Encoding.ASCII.GetString(inflate.ToArray());
            }
        }
    }
}

This returns an empty string unless I add a second call to ds.CopyTo(inflate). (WTF?)

   ...
            using (var ds = new DeflateStream(orig, CompressionMode.Decompress))
            {
                ds.CopyTo(inflate);
                ds.CopyTo(inflate);
                return Encoding.ASCII.GetString(inflate.ToArray());
            }
   ...

(Flush/Close/Dispose on ds have no effect.)

Why does the DeflateStream copy 0 bytes on the first call? I've also tried looping with Read(), but it also returns zero on the first call, then works on the second.


Update: here's the method I'm using to compress data:

public static string Base64Compress(string data, Encoding enc)
{
    using (var ms = new MemoryStream())
    {
        using (var ds = new DeflateStream(ms, CompressionMode.Compress))
        {
            byte[] b = enc.GetBytes(data);
            ds.Write(b, 0, b.Length);
            ds.Flush();
            return Convert.ToBase64String(ms.ToArray());
        }
    }
}
Decision answered 11/11, 2010 at 20:27 Comment(7)
This is very interesting. What happens when you replace the first of the two ds.CopyTo() calls with a ds.Read(...)? The first CopyTo() triggers reading over the footer of the stream. Read() should do the same. Just wondering. - Obovoid
Are you sure it's deflate-compressed and not gzip-compressed? And are you sure there's no other data in front of the deflate (or gzip) data? - Jojo
@Pieter: a .Read() has the same effect: it returns 0, but causes the next call to CopyTo() to work. - Decision
@nos: Yep. I generated the data with DeflateStream. I also used an external tool to test the data generated by my Compress method and it had no complaints. I'll post the compression method as well. - Decision
I have seen this before when the last block of the compression stream was not written out fully (i.e., incomplete); the first call to Read/CopyTo will fail and subsequent calls will access the data. I will see if I can dig up some reference material... - Tame
The DeflateStream must be closed to write the final block; see updated answer. - Tame
@josh3736: I face the same problem. After a CopyTo of the input file stream into DeflateCompress, the memory stream size is 0 KB if the input file size is less than 100 KB. - Jibe

This happens when the compressed bytes are incomplete (i.e., not all blocks are written out).

If I use your Base64Compress with the following Decompress method, I get an InvalidDataException with the message 'Unknown block type. Stream might be corrupted.'

Decompress

public static string Decompress(Byte[] bytes)
{
  using (var uncompressed = new MemoryStream())
  using (var compressed = new MemoryStream(bytes))
  using (var ds = new DeflateStream(compressed, CompressionMode.Decompress))
  {
    ds.CopyTo(uncompressed);
    return Encoding.ASCII.GetString(uncompressed.ToArray());
  }
}

Note that everything works as expected when using the following Compress method:

public static Byte[] Compress(Byte[] bytes)
{
  using (var memoryStream = new MemoryStream())
  {
    using (var deflateStream = new DeflateStream(memoryStream, CompressionMode.Compress))
      deflateStream.Write(bytes, 0, bytes.Length);

    return memoryStream.ToArray();
  }
}

Update

Oops, foolish me... you cannot call ToArray on the memory stream until you have disposed the DeflateStream. Flush is not actually implemented, and because Deflate/GZip compress data in blocks, the final block is only written on close/dispose.

Re-write compress as:

public static string Base64Compress(string data, Encoding enc)
{
  using (var ms = new MemoryStream())
  {
    using (var ds = new DeflateStream(ms, CompressionMode.Compress))
    {
      byte[] b = enc.GetBytes(data);
      ds.Write(b, 0, b.Length);
    }

    return Convert.ToBase64String(ms.ToArray());
  }
} 
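
To see the effect for yourself, here is a minimal sketch (not part of the original answer) that records how many bytes have reached the MemoryStream before the DeflateStream is disposed and compares that with the final size. The class and method names (DeflateDisposeDemo, Compress, Decompress) and the sample text are made up for illustration. The exact byte counts depend on the runtime version, so none are shown here.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class DeflateDisposeDemo
{
    // Compresses input and reports how many bytes were in the MemoryStream
    // just before the DeflateStream was disposed.
    public static byte[] Compress(byte[] input, out long bytesBeforeDispose)
    {
        using (var ms = new MemoryStream())
        {
            using (var ds = new DeflateStream(ms, CompressionMode.Compress))
            {
                ds.Write(input, 0, input.Length);
                ds.Flush();
                // The final deflate block has not been written at this point.
                bytesBeforeDispose = ms.Length;
            }
            // ToArray still works after the MemoryStream has been closed.
            return ms.ToArray();
        }
    }

    // Inflates a complete deflate stream back into the original bytes.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var source = new MemoryStream(compressed))
        using (var ds = new DeflateStream(source, CompressionMode.Decompress))
        using (var result = new MemoryStream())
        {
            ds.CopyTo(result);
            return result.ToArray();
        }
    }

    public static void Main()
    {
        byte[] input = Encoding.UTF8.GetBytes("hello hello hello hello");
        long before;
        byte[] packed = Compress(input, out before);

        Console.WriteLine("bytes before dispose: " + before);
        Console.WriteLine("bytes after dispose:  " + packed.Length);
        // Decode with the same encoding that was used to produce the bytes.
        Console.WriteLine(Encoding.UTF8.GetString(Decompress(packed)));
    }
}
```

Note that the sketch decodes with the same encoding it used when compressing; the Base64Decompress in the question always decodes as ASCII, which only round-trips cleanly for ASCII input.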
Tame answered 11/11, 2010 at 20:56 Comment(3)
Yes, that's the problem. Technically you should use the DeflateStream() overload that takes the leaveOpen argument and pass true. Without it, closing/disposing the DeflateStream will also dispose the MemoryStream. That this doesn't cause a problem right now is an accident. - Trophic
@Hans, definitely not a bad idea, although disposing a MemoryStream does not actually clear the buffer; it only prevents any further reads/writes from taking place on the MemoryStream. So technically there is a duplicate dispose on the MemoryStream, but the bytes returned by ToArray are still accessible regardless. - Tame
Yup. I've got a massive amount of cr*p at SO for pointing out that disposing a MemoryStream is silly. Glad to give some of it back :) - Trophic
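
For reference, the leaveOpen variant suggested in the comments might look like the sketch below. It uses the DeflateStream(Stream, CompressionMode, bool) constructor; the class name LeaveOpenSketch is invented for illustration, and the method body otherwise follows the rewritten Base64Compress from the answer.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class LeaveOpenSketch
{
    public static string Base64Compress(string data, Encoding enc)
    {
        using (var ms = new MemoryStream())
        {
            // Passing true for leaveOpen keeps ms open when ds is disposed.
            using (var ds = new DeflateStream(ms, CompressionMode.Compress, true))
            {
                byte[] b = enc.GetBytes(data);
                ds.Write(b, 0, b.Length);
            }
            // ds has written its final block, and ms is still open here,
            // so it is disposed exactly once by the outer using.
            return Convert.ToBase64String(ms.ToArray());
        }
    }
}
```

The result is the same as with the two-argument constructor; the difference is only that the MemoryStream is not disposed twice and remains fully usable (for example, for reading Length or Position) after the inner using block.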
