How is GetHashCode() implemented for Int32?
Asked Answered
L

2

46

I've been looking all over the place, but I can't find anything. Can anyone shed some light on this?

Literature answered 8/10, 2010 at 19:39 Comment(0)
B
77

According to Reflector:

public override int GetHashCode()
{
    return this;
}

Makes sense, does it?

Bionics answered 8/10, 2010 at 19:41 Comment(3)
I suppose. I was thinking all value types would have a common implementation.Literature
No, each Value type has it's own. UInt32 casts itself to Int32, this simply turning itself to a Signed Int. Int16 and Int64 do some funky bit shifting to generate a 32-Bit Value. System.Boolean returns 0 or 1 depending on it's state.Bionics
Interesing, why then int a = 10; and int b = 10.GetHashCode(); provides different x86 instructions. We see that this one-line method will be inlined so it should be the same but it isn'tAlkene
Q
3

Best way to hash 32 bit value to 32 bit is not to invent wheel, use value itself. Very fast, no collisions, indeed a perfect way.

Quirk answered 8/10, 2010 at 19:43 Comment(3)
Yes, but that doesn't necessarily mean it was implemented that way. :)Literature
Actually it is an AWFUL way to implement it. Per MS "For the best performance, a hash function should generate an even distribution for all input, including input that is heavily clustered. An implication is that small modifications to object state should result in large modifications to the resulting hash code for best hash table performance." (msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx) This implementation while simple causes bad clustering and can lead to horrible performance when ints are used as hashtable keys.Syphon
@JeffWalkerCodeRanger That is generally true. But most hash tables are implemented by doing hashCode % bucketLength. Therefore the common sequence of 0, 1, 2, 3, ... will have perfect hash codes. However, if your input is 0, 32, 64, 96, 128, ... any power-of-two bucket size array <= 32 will have 100% hash collisions. If MS would change it to do some bit shuffling, there'll be another sequence that'll give 100% hash collisions. My point is, there is no context for Int32, and without context, you cannot make a good hash function. If you need to hash data, write your own hasher.Adamec

© 2022 - 2024 — McMap. All rights reserved.