.NET GZipStream compress and decompress

4

16

What is wrong with the code below? I always get FALSE, meaning that after compression the decompressed data does not match the original value.

public static bool Test()
        {
            string sample = "This is a compression test of microsoft .net gzip compression method and decompression methods";
            System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();
            byte[] data = encoding.GetBytes(sample);
            bool result = false;

            //Compress
            MemoryStream cmpStream;
            cmpStream = new MemoryStream();
            GZipStream hgs = new GZipStream(cmpStream, CompressionMode.Compress);
            hgs.Write(data, 0, data.Length);
            byte[] cmpData = cmpStream.ToArray();

            MemoryStream decomStream;
            decomStream = new MemoryStream(cmpData);
            hgs = new GZipStream(decomStream, CompressionMode.Decompress);
            hgs.Read(data, 0, data.Length);

            string sampleOut = System.BitConverter.ToString(data);

            result = String.Equals(sample, sampleOut) ;
            return result;
        }

I would really appreciate it if you could point out where I am making a mistake.

Majolica answered 19/10, 2009 at 20:14 Comment(0)
15

Try this code:

public static bool Test()
        {
            string sample = "This is a compression test of microsoft .net gzip compression method and decompression methods";

            System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();

            byte[] data = encoding.GetBytes(sample);
            bool result = false;

            // Compress
            MemoryStream cmpStream = new MemoryStream();

            GZipStream hgs = new GZipStream(cmpStream, CompressionMode.Compress);

            hgs.Write(data, 0, data.Length);

            byte[] cmpData = cmpStream.ToArray();

            MemoryStream decomStream = new MemoryStream(cmpData);

            hgs = new GZipStream(decomStream, CompressionMode.Decompress);
            hgs.Read(data, 0, data.Length);

            string sampleOut = encoding.GetString(data);

            result = String.Equals(sample, sampleOut);
            return result;
        }

The problem was that you were not using the ASCIIEncoding to get the string back for sampleOut.

EDIT: Here's a cleaned up version of the code to help with Closing/Disposing:

public static bool Test()
        {
            string sample = "This is a compression test of microsoft .net gzip compression method and decompression methods";

            System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();

            byte[] data = encoding.GetBytes(sample);

            // Compress.
            GZipStream hgs;
            byte[] cmpData;

            using(MemoryStream cmpStream = new MemoryStream())
            using(hgs = new GZipStream(cmpStream, CompressionMode.Compress))
            {
                hgs.Write(data, 0, data.Length);
                hgs.Close();

                // Do this AFTER the stream is closed. It sounds counterintuitive,
                // but if you do it before, the stream will not have been flushed
                // (even if you call Flush, which has a null implementation).
                cmpData = cmpStream.ToArray();
            }  

            using(MemoryStream decomStream = new MemoryStream(cmpData))
            using(hgs = new GZipStream(decomStream, CompressionMode.Decompress))
            {
                hgs.Read(data, 0, data.Length);
            }

            string sampleOut = encoding.GetString(data);

            bool result = String.Equals(sample, sampleOut);
            return result;
        }
Homologize answered 19/10, 2009 at 20:20 Comment(5)
YES! It works! But the actual problem still remains. It only works because the values in data were not changed. If I reset data[] using "data = new byte[data.Length];" right before calling hgs.Read(...), result=false, because hgs.Read fails completely: it doesn't read anything. If you put "readCount=hgs.Read(...)" you will see readCount=0, meaning nothing was read. That's the problem I am facing. I hope you can shed some light. Many thanks to all for the quick responses. - Majolica
Sorry if I've misunderstood here, but are you saying that if you put 'data = new byte[data.Length];' before the 'hgs.Read()' call, then the result is false? This is what I would expect, since the data[] array is being wiped of its value at that point. Not sure I'm understanding things here, I must get some more coffee! :) - Homologize
I know this is an old question, but @Majolica is correct. This code doesn't work. Observe the return value of hgs.Read(data, 0, data.Length) and you will see that it is zero. - Unnamed
You MUST Close() the compression GZipStream BEFORE copying the compressed bytes to an array. - Unnamed
@JasonEvans I altered the timing of the ToArray() call; otherwise, as per the other comments, it just doesn't work. It was either that or a down vote, and I see no reason to down vote an otherwise perfectly good answer. - Morula
20

Close the GZipStream after the Write call.

Without calling Close, there's a possibility that some data is buffered and is not written to the underlying stream yet.
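
The effect can be observed directly. The following is only a minimal sketch, not code from the question: the class name GZipCloseDemo and the method MeasureLengths are hypothetical, and the leaveOpen argument is set to true so the MemoryStream can still be inspected after the GZipStream is closed. It captures the length of the underlying buffer before and after Close; the exact numbers depend on the runtime, but the second is larger because Close writes any buffered data plus the gzip trailer.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GZipCloseDemo
{
    // Returns { bytes in the buffer before Close, bytes in the buffer after Close }.
    public static int[] MeasureLengths(byte[] data)
    {
        using (var buffer = new MemoryStream())
        {
            // leaveOpen: true keeps 'buffer' usable after the GZipStream is closed.
            var gzip = new GZipStream(buffer, CompressionMode.Compress, true);
            gzip.Write(data, 0, data.Length);

            // At this point the compressed output is incomplete.
            int beforeClose = buffer.ToArray().Length;

            // Close flushes pending compressed data and writes the gzip trailer.
            gzip.Close();
            int afterClose = buffer.ToArray().Length;

            return new[] { beforeClose, afterClose };
        }
    }

    public static void Main()
    {
        byte[] data = Encoding.ASCII.GetBytes("This is a compression test");
        int[] lengths = MeasureLengths(data);
        Console.WriteLine("Before Close: " + lengths[0] + " bytes");
        Console.WriteLine("After Close:  " + lengths[1] + " bytes");
    }
}
```

Calling ToArray before Close therefore hands the decompressor a truncated stream, which is consistent with Read returning 0 as reported in the comments.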

Hilaryhilbert answered 19/10, 2009 at 20:17 Comment(4)
@Blindy: I just checked with Reflector. Only Close is an option. Flush does nothing. - Hilaryhilbert
Yes, I tried Close after both compress and decompress, and it still doesn't work. For some reason hgs.Read( ... ) doesn't read anything; that's the problem. The problem remains regardless of whether hgs.Close() is used or not. If you are so sure about Close() or Flush(), can you please paste your WORKING code? Many thanks. =-- Mehdi Anis --= - Majolica
Correct - DO NOT FLUSH - Close it. Debugged for an hour until I figured this out. - Harte
Exactly what I figured out after a long time. - Ryurik
10

Three issues had to be fixed to solve the problem:

  1. After the write, the GZipStream needed to be closed :: hgs.Close();

  2. The GZipStream read needed to run in a WHILE loop, writing each smaller buffer of uncompressed data to a MemoryStream :: outStream.Write( ... );

  3. Converting the decompressed byte[] array back to a string needed to go through the encoding :: string sampleOut = encoding.GetString(data);

Here is the final code:-

public static bool Test()
        {
            string sample = "This is a compression test of microsoft .net gzip compression method and decompression methods";
            System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();
            byte[] data = encoding.GetBytes(sample);
            bool result = false;

            // Compress 
            MemoryStream cmpStream = new MemoryStream();
            GZipStream hgs = new GZipStream(cmpStream, CompressionMode.Compress, true);

            hgs.Write(data, 0, data.Length);
            hgs.Close();


            //DeCompress
            byte[] cmpData = cmpStream.ToArray();
            MemoryStream decomStream = new MemoryStream(cmpData);

            data = new byte[data.Length];
            hgs = new GZipStream(decomStream, CompressionMode.Decompress, true);

            byte[] step = new byte[16]; //Instead of 16 can put any 2^x
            MemoryStream outStream = new MemoryStream();
            int readCount;

            do
            {
                readCount = hgs.Read(step, 0, step.Length);
                outStream.Write(step, 0, readCount);
            } while (readCount > 0);
            hgs.Close();

            string sampleOut = encoding.GetString(outStream.ToArray());
            result = String.Equals(sample, sampleOut);
            return result; 
        }

I had real trouble getting compress/decompress to work with the Microsoft .NET GZipStream object. Finally, I think I got it working the right way. Many thanks to all, as the solution came from all of you.
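
As a possible alternative to the manual read loop, the sketch below lets Stream.CopyTo drain the GZipStream. It assumes .NET Framework 4.0 or later, where CopyTo is available. The class name GZipRoundTrip and its Compress and Decompress helpers are hypothetical names introduced for illustration, not part of the code above.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GZipRoundTrip
{
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            } // Disposing the GZipStream completes the compressed output.

            // MemoryStream.ToArray still works after the stream has been closed.
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            // CopyTo keeps reading until Read returns 0, like the do/while loop.
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }

    public static void Main()
    {
        string sample = "This is a compression test of microsoft .net gzip";
        byte[] compressed = Compress(Encoding.ASCII.GetBytes(sample));
        string sampleOut = Encoding.ASCII.GetString(Decompress(compressed));
        Console.WriteLine(String.Equals(sample, sampleOut));
    }
}
```

This avoids sizing the destination array up front, so the decompressed length does not need to be known in advance.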

Majolica answered 19/10, 2009 at 21:31 Comment(0)
4

Here's my cleaned up version of the final solution:


  [Test]
  public void Test_zipping_with_memorystream()
  {
   const string sample = "This is a compression test of microsoft .net gzip compression method and decompression methods";
   var encoding = new ASCIIEncoding();
   var data = encoding.GetBytes(sample);
   string sampleOut;
   byte[] cmpData;

   // Compress 
   using (var cmpStream = new MemoryStream())
   {
    using (var hgs = new GZipStream(cmpStream, CompressionMode.Compress))
    {
     hgs.Write(data, 0, data.Length);
    }
    cmpData = cmpStream.ToArray();
   }

   using (var decomStream = new MemoryStream(cmpData))
   {
    using (var hgs = new GZipStream(decomStream, CompressionMode.Decompress))
    {
     using (var reader = new StreamReader(hgs))
     {
      sampleOut = reader.ReadToEnd();
     }
    }
   }

   Assert.IsNotNullOrEmpty(sampleOut);
   Assert.AreEqual(sample, sampleOut);
  }
Chemisorption answered 29/11, 2010 at 12:15 Comment(0)
