C#: How would you unit test GetHashCode?

Testing the Equals method is pretty straightforward (as far as I know). But how on earth do you test the GetHashCode method?

Neilneila answered 8/11, 2009 at 15:34 Comment(0)
C
44

Test that two distinct objects which are equal have the same hash code (for various values). Check that non-equal objects give different hash codes, varying one aspect/property at a time. While the hash codes don't have to be different, you'd be really unlucky to pick different values for properties which happen to give the same hash code unless you've got a bug.
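
A minimal sketch of both checks, assuming a hypothetical Person class with two properties (the class, its hash function and the helper names are illustrative, not from the question). The helpers return plain booleans so they can be wrapped in the assertions of whichever test framework is in use:

```csharp
using System;

// Hypothetical value-like class used only for this illustration.
public sealed class Person : IEquatable<Person>
{
    public string Name { get; }
    public int Age { get; }

    public Person(string name, int age)
    {
        Name = name;
        Age = age;
    }

    public bool Equals(Person other) =>
        other != null && Name == other.Name && Age == other.Age;

    public override bool Equals(object obj) => Equals(obj as Person);

    public override int GetHashCode()
    {
        unchecked // let the arithmetic wrap instead of throwing on overflow
        {
            int hash = 17;
            hash = hash * 31 + (Name == null ? 0 : Name.GetHashCode());
            hash = hash * 31 + Age;
            return hash;
        }
    }
}

public static class PersonHashChecks
{
    // Two distinct instances that are equal must produce the same hash code.
    public static bool EqualObjectsShareHash()
    {
        var a = new Person("Ann", 30);
        var b = new Person("Ann", 30);
        return !ReferenceEquals(a, b) && a.Equals(b) && a.GetHashCode() == b.GetHashCode();
    }

    // Varying one property at a time should, barring bad luck, change the hash code.
    public static bool VaryingOnePropertyChangesHash()
    {
        int baseline = new Person("Ann", 30).GetHashCode();
        return baseline != new Person("Bob", 30).GetHashCode()
            && baseline != new Person("Ann", 31).GetHashCode();
    }
}
```

The first check is a hard requirement; the second is only a sanity check, since a collision between unequal objects is legal and merely unlikely.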

Coppersmith answered 8/11, 2009 at 15:37 Comment(7)
It's quite easy to get the same hash code for some built-in types. For example, new Point(1,1).GetHashCode() and new Point(2,2).GetHashCode() give the same value...Knitwear
Just because someone else's code fails to produce well distributed hash values doesn't mean it's not a good test for your code.Wotan
@Tony: What do you usually use then?Neilneila
@Svish: I can't remember the name, but repeated multiplication and addition - find answers by me about GetHashCode and I'm sure you'll see plenty of examples :)Coppersmith
Yeah. Point pretty obviously has a bad implementation of the hash function ;)Panettone
As far as I know, the purpose of unit tests is to verify that a method works as documented. The only requirement for GetHashCode is: if two objects compare as equal, the GetHashCode method for each object must return the same value. Returning different values for different objects is a matter of performance: a GetHashCode that always returns 0 is a serious performance flaw, but it is still a valid hash function. That should be a subject for some kind of code review, not unit tests. Am I right?Fragmentation
What I mean is that, when GetHashCode is overridden by the user, the only test that should be implemented is one checking that GetHashCode returns the same value for equal objects.Fragmentation
M
10

Gallio/MbUnit v3.2 comes with convenient contract verifiers that can test your implementation of GetHashCode() and IEquatable<T>. More specifically, you may be interested in the EqualityContract and the HashCodeAcceptanceContract. For example, given a class like this:

public class Spot
{
  private readonly int x;
  private readonly int y;

  public Spot(int x, int y)
  {
    this.x = x;
    this.y = y;
  }

  public override int GetHashCode()
  {
    // FNV-1-style mixing; unchecked so overflow wraps even in a checked build
    unchecked
    {
      int h = -2128831035;
      h = (h * 16777619) ^ x;
      h = (h * 16777619) ^ y;
      return h;
    }
  }
}

Then you declare your contract verifier like this:

[TestFixture]
public class SpotTest
{
  [VerifyContract]
  public readonly IContract HashCodeAcceptanceTests = new HashCodeAcceptanceContract<Spot>()
  {
    CollisionProbabilityLimit = CollisionProbability.VeryLow,
    UniformDistributionQuality = UniformDistributionQuality.Excellent,
    DistinctInstances = DataGenerators.Join(Enumerable.Range(0, 1000), Enumerable.Range(0, 1000)).Select(o => new Spot(o.First, o.Second))
  };
}
Mckown answered 19/5, 2010 at 12:31 Comment(0)
V
5

It would be fairly similar to Equals(). You'd want to make sure two objects which were the "same" at least had the same hash code. That means if .Equals() returns true, the hash codes should be identical as well. As for what the proper hash code values are, that depends on how you're hashing.

Veliz answered 8/11, 2009 at 15:38 Comment(1)
+1 - that is definitely one thing to test. Forget distribution, but same objects MUST have the same hash code.Panettone
R
4

From personal experience: aside from obvious things like equal objects giving the same hash code, you need to create a large enough array of unique objects and count the unique hash codes among them. If the unique hash codes make up less than, say, 50% of the overall object count, you are in trouble, as your hash function is not good.

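        // Collect the hash code of every test object (the list must not be empty).
        List<int> hashList = new List<int>(testObjectList.Count);
        for (int i = 0; i < testObjectList.Count; i++)
        {
            hashList.Add(testObjectList[i].GetHashCode());
        }

        // Sort so that equal hash codes are adjacent, then count the distinct values.
        hashList.Sort();
        int differentValues = 1; // the first element is itself a distinct value
        int curValue = hashList[0];
        for (int i = 1; i < hashList.Count; i++)
        {
            if (hashList[i] != curValue)
            {
                differentValues++;
                curValue = hashList[i];
            }
        }

        // More than half of the hash codes should be unique.
        Assert.Greater(differentValues, hashList.Count/2);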
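
The same count can probably be written more compactly with LINQ. The following is a sketch only: the HashDistribution class and DistinctHashRatio method are hypothetical names, and the items are assumed to be non-null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashDistribution
{
    // Fraction of distinct hash codes among the given objects (1.0 means no collisions).
    public static double DistinctHashRatio<T>(IReadOnlyCollection<T> items)
    {
        if (items.Count == 0)
            throw new ArgumentException("At least one item is required.", nameof(items));

        int distinct = items.Select(item => item.GetHashCode()).Distinct().Count();
        return (double)distinct / items.Count;
    }
}
```

With this helper, the assertion above would become something like Assert.Greater(HashDistribution.DistinctHashRatio(testObjectList), 0.5).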
Redon answered 8/11, 2009 at 16:12 Comment(0)
I
1

In addition to checking that object equality implies equality of hash codes, and that the distribution of hashes is fairly flat as suggested by Yann Trevin (if performance is a concern), you may also wish to consider what happens if you change a property of the object.

Suppose your object changes while it's in a dictionary/hashset. Do you want Contains(object) to still return true? If so, then your GetHashCode had better not depend on the mutable property that was changed.
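
To illustrate the problem, here is a small sketch using a hypothetical MutableKey class whose hash code is derived from a settable property. After the property changes, the HashSet looks the object up under its new hash code, while the entry was stored under the old one, so the lookup is expected to fail:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class whose hash code depends on a mutable property.
public sealed class MutableKey
{
    public int Id { get; set; }

    public override bool Equals(object obj) => obj is MutableKey other && other.Id == Id;

    public override int GetHashCode() => Id;
}

public static class MutableKeyDemo
{
    // Returns whether the set still finds the key after its Id was changed.
    public static bool FoundAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var set = new HashSet<MutableKey> { key };

        key.Id = 2; // the set stored the entry under the hash code for Id = 1

        return set.Contains(key);
    }
}
```

A unit test for this scenario would assert whichever behaviour you actually want: false is what this class gives, and true would require a hash code that ignores Id.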

Iridescence answered 10/5, 2016 at 17:8 Comment(0)
T
0

I would pre-supply a known/expected hash and compare it with the result of GetHashCode.

Typhoid answered 8/11, 2009 at 15:36 Comment(1)
That makes the test very fragile. For example, you should be able to make GetHashCode return the negated value of what it would have given in the previous version, and the method is still valid. Test what you care about - which is comparing hash codes of equal and non-equal values.Coppersmith
K
0

You create separate instances with the same value and check that GetHashCode returns the same value for each instance, and that repeated calls on the same instance return the same value.

That is the only requirement for a hash code to work. To work well the hash codes should of course have a good distribution, but testing for that requires a lot of testing...

Knitwear answered 8/11, 2009 at 15:40 Comment(0)
