GZipStream machine dependence

I'm running into some strange machine/OS-dependent GZipStream behavior in .NET 4.0. This is the relevant code:

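using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

public static string Compress(string input) {
    using(var ms = new MemoryStream(Encoding.UTF8.GetBytes(input)))
    using(var os = new MemoryStream()) {
        // leaveOpen: true keeps os usable after the GZipStream is disposed
        using(var gz = new GZipStream(os,CompressionMode.Compress,true)) {
            ms.CopyTo(gz);
        }
        // gz has been disposed here, so all compressed data is in os; hex-encode it
        return string.Join("",os.ToArray().Select(b=>b.ToString("X2")));
    }
}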

Running Compress("freek") gives me

1F8B08000000000004004B2B4A4DCD06001E33909D05000000

on Windows 7 and

1F8B0800000000000400ECBD07601C499625262F6DCA7B7F4AF54AD7E074A10880601324D8904010ECC188CDE692EC1D69472329AB2A81CA6556655D661640CCED9DBCF7DE7BEFBDF7DE7BEFBDF7BA3B9D4E27F7DFFF3F5C6664016CF6CE4ADAC99E2180AAC81F3F7E7C1F3F22CEEB3C7FFBFF040000FFFF1E33909D05000000

on Windows Server 2008 R2. Both are 64-bit. I would expect the results to be the same.

Both machines give the correct result when I decompress either result. I already found out that on W7 ms.Length == 25 while on W2K8R2 ms.Length==128, but I have no clue why.

What's going on?

Greeting answered 21/3, 2012 at 14:14 Comment(5)
This could just as easily be about the MemoryStream. Tried it w/o gzip? - Yablon
@Henk why would you think that? Is MemoryStream making up the other 123 bytes on W2K8R2? - Greeting
Have you checked in the task manager if both machines show this as a 64 bit process or not? How are your build settings? - Hatfield
128 bytes looks like a block size for a stream. Compression is more likely to use 128 bits. - Yablon
Lossless compression guarantees to recover the same result after a compression-decompression cycle but I see no reason to expect the compressed form to always be identical. Why do you expect the answers to be identical? - Dreadfully
F
18

It was announced that .NET 4.5 Beta includes zip compression improvements to reduce the size:

Starting with the .NET Framework 4.5 RC, the DeflateStream class uses the zlib library for compression. As a result, it provides a better compression algorithm and, in most cases, a smaller compressed file than it provides in earlier versions of the .NET Framework.

Do you perhaps have .NET 4.5+ installed on the Win7 machine?
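One way to check this on each machine is to print the runtime and core library versions. The sketch below is only an illustration: the class name RuntimeInfo is made up here, and it assumes the program runs on the .NET Framework, where 4.5 installs in place over 4.0, so both report a 4.0.30319.x version and differ only in the last component.

```csharp
using System;
using System.Diagnostics;

public static class RuntimeInfo
{
    // File version of the assembly that defines System.Object
    // (mscorlib.dll on the .NET Framework).
    public static string CoreLibraryFileVersion()
    {
        string path = typeof(object).Assembly.Location;
        return FileVersionInfo.GetVersionInfo(path).FileVersion;
    }

    public static void Main()
    {
        // Compare these two lines between the machines that disagree.
        Console.WriteLine("CLR version:       " + Environment.Version);
        Console.WriteLine("Core library file: " + CoreLibraryFileVersion());
    }
}
```

If the last version component differs between the two machines, they are most likely not running the same framework build, even though both nominally target .NET 4.0.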

Frazzled answered 21/3, 2012 at 15:42 Comment(2)
This bit us here on Stack Overflow because of running a mixed 4.0/4.5 tier and gzipping content before shoving it into (what should be) unique sets in redis for caching. If you rely on the same result of compression (e.g. to remove an item from a set), beware of running a mixed environment of 4.0 and 4.5 servers. - Chasten
Great! Now I know how to explain during customer interview why my program isn't working! - Misdirect
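One possible way around the mixed-environment pitfall described in the first comment is to derive set membership from the uncompressed content rather than from the compressed bytes. This is a minimal sketch under that assumption, not what Stack Overflow actually did; ContentKey and its method are hypothetical names.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ContentKey
{
    // Hypothetical helper: builds a key from the uncompressed text, so the key
    // does not depend on which deflate implementation compressed the payload.
    public static string For(string input)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(input));
            // Hex-encode the 32-byte digest (64 hex characters).
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }

    public static void Main()
    {
        Console.WriteLine(ContentKey.For("freek"));
    }
}
```

The compressed payload can then be stored as a value under that key, so servers on 4.0 and 4.5 agree on identity even when their gzip output differs.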
G
5

It seems there is a change in the algorithm used by DeflateStream in .NET 4.5:

Starting with the .NET Framework 4.5 Beta, the DeflateStream class uses the zlib library for compression. As a result, it provides a better compression algorithm and, in most cases, a smaller compressed file than it provides in earlier versions of the .NET Framework.

Since I had 4.5 installed, this was causing the problem.

Greeting answered 21/3, 2012 at 15:57 Comment(2)
I wouldn't consider this to be a breaking change. It's merely a performance improvement. Apps shouldn't have any expectations on the result of this API other than that it is valid per the gzip spec. - Frazzled
It is a breaking change, but it's worse than breaking changes in previous releases. .NET 4.5 replaces .NET 4.0, which means installing an app that uses .NET 4.5 may break an already installed app on the same computer that was using .NET 4.0 and will now start using .NET 4.5. - Unarm
C
1

I ran your code on my Windows 7 64 bit machine and got the following, which resembles your Windows Server 2008 R2 result:

1F8B0800000000000400ECBD07601C499625262F6DCA7B7F4AF54AD7E074A10880601324D8904010ECC188CDE692EC1D69472329AB2A81CA6556655D661640CCED9DBCF7DE7BEFBDF7DE7BEFBDF7BA3B9D4E27F7DFFF3F5C6664016CF6CE4ADAC99E2180AAC81F3F7E7C1F3F229ED579FEF6FF090000FFFF1A1C515C05000000

Essentially, I think the result has to do with the word length of the machine. I.e., your Windows 7 machine is perhaps 32 bit?

NOTE: I wrote a little decompress for your strings and I have to second that they indeed decompress well. I ran my version in both 32 bit and 64 bit and the outcome was equal. Only possible difference remains: different runtimes?

EDIT:

different runtimes?

Apparently, as Henk Holterman suggested below and Robert Levy formalized in his answer, this was indeed the non-obvious case here.
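A decompress helper along the lines of the one mentioned in the note might look like the sketch below. It is not the code actually used for this answer; GzipHex and its method names are made up for illustration, and Compress simply mirrors the method from the question.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipHex
{
    // Turns a hex string such as "1F8B08..." into raw bytes.
    public static byte[] FromHex(string hex)
    {
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] = Convert.ToByte(hex.Substring(2 * i, 2), 16);
        return bytes;
    }

    // Same behavior as Compress in the question: gzip the UTF-8 bytes, hex-encode.
    public static string Compress(string input)
    {
        using (var os = new MemoryStream())
        {
            using (var gz = new GZipStream(os, CompressionMode.Compress, true))
            {
                byte[] raw = Encoding.UTF8.GetBytes(input);
                gz.Write(raw, 0, raw.Length);
            }
            return BitConverter.ToString(os.ToArray()).Replace("-", "");
        }
    }

    // Inverse of Compress: hex-decode, gunzip, interpret as UTF-8.
    public static string Decompress(string hex)
    {
        using (var input = new MemoryStream(FromHex(hex)))
        using (var gz = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gz.CopyTo(output);
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }

    public static void Main()
    {
        // The short output quoted in the question.
        Console.WriteLine(Decompress("1F8B08000000000004004B2B4A4DCD06001E33909D05000000"));
        // Round trip on whatever runtime this happens to run on.
        Console.WriteLine(Decompress(Compress("freek")));
    }
}
```

Comparing the decompressed text instead of the hex strings is the check that stays valid across runtimes: the compressed bytes may differ, the round trip should not.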

Callow answered 21/3, 2012 at 14:23 Comment(4)
Both are 64-bit, forgot to mention that. - Greeting
Which Visual Studio are you running? - Greeting
I get the same output as Abel. Are we sure the VS version is the same? - Cloud
@Henk that seemed to be the issue, verifying now. From MSDN: "Starting with the .NET Framework 4.5 Beta, the DeflateStream class uses the zlib library for compression. As a result, it provides a better compression algorithm and, in most cases, a smaller compressed file than it provides in earlier versions of the .NET Framework." - Greeting
D
1

As opposed to Abel's answer, I get the result of

1F8B08000000000004004B2B4A4DCD06001E33909D05000000

on my Windows 7 x64 Ultimate SP1. Perhaps there's a .NET Framework update you don't have on one of the boxes? The version of my mscorlib.dll is 4.0.30319.17379.

ETA: If I retarget to .NET 2 (and change the .NET 4-specific constructs to their .NET 2 equivalents), I do get the result of

1F8B0800000000000400EDBD07601C499625262F6DCA7B7F4AF54AD7E074A10880601324D8904010ECC188CDE692EC1D69472329AB2A81CA6556655D661640CCED9DBCF7DE7BEFBDF7DE7BEFBDF7BA3B9D4E27F7DFFF3F5C6664016CF6CE4ADAC99E2180AAC81F3F7E7C1F3F22CEEB3C7FFBFF001E33909D05000000

on the same machine/OS.
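Both outputs in this answer start with the same ten bytes (1F8B0800000000000400) and end with the same eight (1E33909D05000000); only the deflate payload in between differs. Under the gzip format (RFC 1952) those are the fixed header and the trailer holding the CRC-32 and length of the uncompressed data. The sketch below reads those fields from a hex string; GzipFrame and its members are hypothetical names, and it assumes a single gzip member with no optional header fields.

```csharp
using System;

public static class GzipFrame
{
    public static byte[] FromHex(string hex)
    {
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] = Convert.ToByte(hex.Substring(2 * i, 2), 16);
        return bytes;
    }

    // gzip stores multi-byte integers in little-endian order.
    static uint ReadUInt32LE(byte[] b, int offset)
    {
        return (uint)b[offset]
             | ((uint)b[offset + 1] << 8)
             | ((uint)b[offset + 2] << 16)
             | ((uint)b[offset + 3] << 24);
    }

    // 10-byte header plus 8-byte trailer is the minimum; ID1 = 0x1F, ID2 = 0x8B.
    public static bool HasGzipMagic(byte[] gz)
    {
        return gz.Length >= 18 && gz[0] == 0x1F && gz[1] == 0x8B;
    }

    // Compression method byte; 8 means deflate.
    public static byte Method(byte[] gz) { return gz[2]; }

    // Trailer: CRC-32 of the uncompressed data, then its length modulo 2^32.
    public static uint Crc32(byte[] gz) { return ReadUInt32LE(gz, gz.Length - 8); }
    public static uint OriginalSize(byte[] gz) { return ReadUInt32LE(gz, gz.Length - 4); }

    public static void Main()
    {
        byte[] gz = FromHex("1F8B08000000000004004B2B4A4DCD06001E33909D05000000");
        Console.WriteLine("magic ok: " + HasGzipMagic(gz));
        Console.WriteLine("method:   " + Method(gz));                 // 8
        Console.WriteLine("crc32:    " + Crc32(gz).ToString("X8"));   // 9D90331E
        Console.WriteLine("size:     " + OriginalSize(gz));           // 5
    }
}
```

Matching CRC-32 and size fields in two different outputs are a quick hint that both encode the same input, even before decompressing either one.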

Doering answered 21/3, 2012 at 15:06 Comment(2)
I'm suspecting it's because of the SP1 update. I'm testing that now. - Greeting
In fact, I do, on this box. I read the comment you made re: the increased deflatability in 4.5 and that would hit the nail on the head. Nice find! - Doering
S
0

I suspect one of the operating systems is 32 bit and the other is 64 bit.

Sextant answered 21/3, 2012 at 14:21 Comment(0)
