string.GetHashCode() returns different values in debug vs release, how do I avoid this?

Asked 23/9, 2011 at 19:59 Answered 21/8, 2013 at 8:6

Solved c# string debugging release gethashcode

To my surprise, the following call produces a different result in debug vs release:

int result = "test".GetHashCode();

Is there any way to avoid this?

I need a reliable way to hash a string and I need the value to be consistent in debug and release mode. I would like to avoid writing my own hashing function if possible.

Why does this happen?

FYI, reflector gives me:

[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail), SecuritySafeCritical]
public override unsafe int GetHashCode()
{
    fixed (char* str = ((char*) this))
    {
        char* chPtr = str;
        int num = 0x15051505;
        int num2 = num;
        int* numPtr = (int*) chPtr;
        for (int i = this.Length; i > 0; i -= 4)
        {
            num = (((num << 5) + num) + (num >> 0x1b)) ^ numPtr[0];
            if (i <= 2)
            {
                break;
            }
            num2 = (((num2 << 5) + num2) + (num2 >> 0x1b)) ^ numPtr[1];
            numPtr += 2;
        }
        return (num + (num2 * 0x5d588b65));
    }
}

Tiemroth answered 23/9, 2011 at 19:59 Comment(3)

If you need the hash code to remain consistent, you're using it wrong. If I recall correctly, they explicitly force it to be inconsistent in debug mode so that nobody relies on it internally at Microsoft. – Gang 23/9, 2011 at 20:5

For more, see: Eric Lippert's post on guidelines for GetHashCode – Gang 23/9, 2011 at 20:8

Also GetHashCode returns different values on .NET 32bit vs .NET 64bit. – Brannen 23/9, 2011 at 20:13

GetHashCode() is not what you should be using to hash a string, almost 100% of the time. Without knowing what you're doing, I recommend that you use an actual hash algorithm, like SHA-1:

using (System.Security.Cryptography.SHA1Managed hp = new System.Security.Cryptography.SHA1Managed()) {
    // Pick one encoding (ASCII, Unicode, UTF8, UTF32, ...) and use it everywhere: the bytes, and therefore the hash, depend on it.
    byte[] hash = hp.ComputeHash(System.Text.Encoding.UTF8.GetBytes(theString));
}

Update: If you need more speed, there's also SHA1Cng, which is significantly faster than SHA1Managed.
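If the calling code expects an int, the digest can be reduced to one. The following is a minimal sketch, not part of the original answer: the class and method names are hypothetical, UTF-8 is an assumed encoding choice, and taking the first four bytes is just one way to truncate the 20-byte SHA-1 digest.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class StableStringHash
{
    // Hypothetical helper, not a framework API: SHA-1 over the UTF-8 bytes,
    // truncated to 32 bits.
    public static int Sha1ToInt32(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] digest = sha1.ComputeHash(Encoding.UTF8.GetBytes(text));

            // Combine the first four bytes by hand (big-endian) so the result
            // does not depend on the machine's byte order.
            return (digest[0] << 24) | (digest[1] << 16) | (digest[2] << 8) | digest[3];
        }
    }
}
```

Because the value depends only on the bytes of the string, it should be the same in debug and release builds and on 32-bit and 64-bit runtimes, as long as the same encoding is used everywhere.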

Berar answered 23/9, 2011 at 20:4 Comment(7)

I already have a lot of code expecting an int, and it's also performance critical, which is why I wanted to use the internal method. Can you create a fast hash that returns an int? I will package it into an extension method such as GetHashCodeStable() – Tiemroth 25/9, 2011 at 4:17

@Joe: It's performance-critical? What exactly is your situation? If it just needs to be somewhat speedy, hashing functions are still pretty fast. Maybe try MD5. (Anyways, the result can easily be converted to an int, just take the last 4 bytes or something.) – Berar 25/9, 2011 at 16:13

Somewhat speedy is OK, I was always under the assumption that SHA1, MD5, etc were slow relative to some simple loop like the decompiled GetHashCode – Tiemroth 26/9, 2011 at 20:37

@Joe: It's essentially one more loop :) But you can also create your own method that hashes to an int if you test it and performance is unacceptable; there are several algorithms online. One I just found is the last post in: linuxquestions.org/questions/programming-9/… – Berar 27/9, 2011 at 15:44

I ended up using a modified version of the release GetHashCode implementation and called it GetHashCodeStable(). I'm giving you the correct answer because I think your solution is really the right way to go; I only used a different method because of performance requirements, although, as stated, this method is not very slow – Tiemroth 8/3, 2012 at 21:35

What say you to the comments in #16840 to the effect that it is overkill to use a cryptographic hash? – Symmetrize 27/7, 2012 at 2:34

@JasonPlutext: That's a different context; in this case, it sounded like the OP wanted a hashing solution that's consistent across platforms. Of course, if performance matters and you want a consistent hash, by all means do what Joe ended up doing :) I just used a cryptographic hash for convenience. – Berar 27/7, 2012 at 4:36

Here's a better approach that is much faster than SHA, and you can replace the modified GetHashCode with it: C# fast hash murmur2

There are several implementations with different levels of "unmanaged" code, so if you need fully managed it's there and if you can use unsafe it's there too.
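For illustration, a fully managed variant might look like the sketch below. It is not the linked implementation: the class name, the default seed of 0, and hashing the UTF-8 bytes of the string are all assumptions made here. The mixing steps follow the 32-bit MurmurHash2 algorithm.

```csharp
using System;
using System.Text;

public static class Murmur2Sketch
{
    // Hypothetical helper: 32-bit MurmurHash2 over the UTF-8 bytes of a string.
    public static int Hash(string text, uint seed = 0)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        const uint m = 0x5bd1e995;
        const int r = 24;

        byte[] data = Encoding.UTF8.GetBytes(text);
        int length = data.Length;
        int i = 0;

        unchecked
        {
            uint h = seed ^ (uint)length;

            // Mix four bytes at a time, assembled little-endian by hand so the
            // result does not depend on the machine's byte order.
            for (; length - i >= 4; i += 4)
            {
                uint k = (uint)(data[i] | (data[i + 1] << 8) | (data[i + 2] << 16) | (data[i + 3] << 24));
                k *= m;
                k ^= k >> r;
                k *= m;
                h *= m;
                h ^= k;
            }

            // Fold in the remaining 0 to 3 bytes.
            int tail = length - i;
            if (tail == 3) h ^= (uint)data[i + 2] << 16;
            if (tail >= 2) h ^= (uint)data[i + 1] << 8;
            if (tail >= 1) { h ^= data[i]; h *= m; }

            // Final avalanche.
            h ^= h >> 13;
            h *= m;
            h ^= h >> 15;

            return (int)h;
        }
    }
}
```

This version avoids unsafe code at the cost of allocating a byte array per call; an unsafe variant could read the characters in place instead.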

Rochellerochemont answered 1/8, 2012 at 17:55 Comment(0)

    /// <summary>
    /// Default implementation of string.GetHashCode is not consistent across platforms (x86/x64, which is our case) and frameworks.
    /// FNV-1a - (Fowler/Noll/Vo) is a fast, consistent, non-cryptographic hash algorithm with good dispersion. (see http://isthe.com/chongo/tech/comp/fnv/#FNV-1a)
    /// </summary>
    private static int GetFNV1aHashCode(string str)
    {
        if (str == null)
            return 0;
        var length = str.Length;
        // original FNV-1a has 32 bit offset_basis = 2166136261 but length gives a bit better dispersion (2%) for our case where all the strings are equal length, for example: "3EC0FFFF01ECD9C4001B01E2A707"
        int hash = length;
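        for (int i = 0; i != length; ++i)
            hash = unchecked((hash ^ str[i]) * 16777619); // wraparound is intended; unchecked keeps this from throwing when compiled with /checked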
        return hash;
    }

I guess this implementation is slower than the unsafe one posted here, but it's much simpler and safe. It works well when maximum speed is not needed.
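For comparison, here is a sketch of the unmodified 32-bit FNV-1a, using the standard offset basis and hashing bytes rather than UTF-16 chars. The class name and the choice of UTF-8 are assumptions; the method above deliberately deviates from this by seeding with the length and mixing whole chars, so the two do not produce the same values.

```csharp
using System;
using System.Text;

public static class Fnv1aSketch
{
    // Hypothetical helper: standard 32-bit FNV-1a over the UTF-8 bytes of a string.
    public static int Hash(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        const uint offsetBasis = 2166136261;
        const uint prime = 16777619;

        uint hash = offsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(text))
        {
            // XOR the byte in first, then multiply: that order is what makes it FNV-1a.
            hash = unchecked((hash ^ b) * prime);
        }

        // Reinterpret the 32 bits as a signed int for callers that expect one.
        return unchecked((int)hash);
    }
}
```

Sticking to the published parameters makes the output comparable with FNV-1a implementations in other languages, which can help if hashes are ever exchanged between systems.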

Kobold answered 21/8, 2013 at 8:6 Comment(0)
