What is the best way to calculate the hash code of a class with string properties? [duplicate]

I have a class with string properties and I need to override the GetHashCode() method.

class A
{
    public string Prop1 { get; set; }
    public string Prop2 { get; set; }
    public string Prop3 { get; set; }
}

The first idea is to do something like this:

public override int GetHashCode()
{
    return Prop1.GetHashCode() ^ Prop2.GetHashCode() ^ Prop3.GetHashCode();
}

The second idea is:

public override int GetHashCode()
{
    return String.Join(";", new[] {Prop1, Prop2, Prop3}).GetHashCode();
}

What is the best way?

Dodiedodo answered 28/11, 2012 at 5:39 Comment(3)
May be helpful: Guidelines and rules for GetHashCode, by Eric Lippert. - Danczyk
@Danczyk Thank you, it's a really helpful resource, but the question still stands. As I understand from the article, GetHashCode() should be as fast as possible on the one hand and produce as few collisions as possible on the other (although uniqueness is not required). So the first idea seems to be faster, but the second should produce fewer collisions. I believe both ideas are applicable, but it would be great to know what other developers think. - Dodiedodo
There are generic hash implementations in #263900, which should help with string properties. - Derisive

You shouldn't just XOR them together, because this doesn't account for ordering. Imagine you have two objects:

"foo", "bar", "baz"

and

"bar", "foo", "baz"

With a simple XOR, both of these will have the same hash. Luckily it's pretty easy to work around. This is the code I use to combine hashes:

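// Requires: using System.Collections.Generic; using System.Diagnostics.Contracts;
static int MultiHash(IEnumerable<object> items)
{
    Contract.Requires(items != null);

    int h = 0;

    foreach (object item in items)
    {
        // Null items contribute a hash of 0 instead of throwing.
        h = Combine(h, item != null ? item.GetHashCode() : 0);
    }

    return h;
}

static int Combine(int x, int y)
{
    unchecked
    {
        // This isn't a particularly strong way to combine hashes, but it's
        // cheap, respects ordering, and should work for the majority of cases.
        // In C#, '+' binds tighter than '^', so the parentheses only make the
        // original evaluation order explicit.
        return ((x << 5) + 3 + x) ^ y;
    }
}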

There are a lot of ways to combine hashes, but usually something very simple like this will do. If for some reason it doesn't work for your situation, MurmurHash has pretty robust hash combining you can pull.
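As a minimal sketch of how the combining step above might be wired into the question's class, the following folds the three properties in a fixed order. The `Equals` overrides, the null handling, and the `Demo` class are assumptions added for illustration; they are not part of the original posts.

```csharp
using System;

// Hypothetical completion of the question's class A. The Equals overrides
// and the null handling are assumptions added for this sketch.
sealed class A : IEquatable<A>
{
    public string Prop1 { get; set; }
    public string Prop2 { get; set; }
    public string Prop3 { get; set; }

    public bool Equals(A other)
    {
        return !ReferenceEquals(other, null)
            && Prop1 == other.Prop1
            && Prop2 == other.Prop2
            && Prop3 == other.Prop3;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as A);
    }

    public override int GetHashCode()
    {
        // Fold the properties in a fixed order; a null property hashes as 0.
        int h = 0;
        h = Combine(h, Prop1 != null ? Prop1.GetHashCode() : 0);
        h = Combine(h, Prop2 != null ? Prop2.GetHashCode() : 0);
        h = Combine(h, Prop3 != null ? Prop3.GetHashCode() : 0);
        return h;
    }

    // Same mixing step as the Combine function shown above.
    private static int Combine(int x, int y)
    {
        unchecked
        {
            return ((x << 5) + 3 + x) ^ y;
        }
    }
}

static class Demo
{
    static void Main()
    {
        var first = new A { Prop1 = "foo", Prop2 = "bar", Prop3 = "baz" };
        var same = new A { Prop1 = "foo", Prop2 = "bar", Prop3 = "baz" };
        var swapped = new A { Prop1 = "bar", Prop2 = "foo", Prop3 = "baz" };

        // Equal objects must report equal hash codes.
        Console.WriteLine(first.Equals(same) && first.GetHashCode() == same.GetHashCode());

        // Swapped values are not equal; their hashes will normally differ too,
        // although a collision can never be ruled out entirely.
        Console.WriteLine(first.Equals(swapped));
    }
}
```

Because the properties are mutable, an instance's hash code changes if a property is reassigned, so such objects should not be modified while they are used as dictionary or hash-set keys. On .NET Core 2.1 and later, `System.HashCode.Combine(Prop1, Prop2, Prop3)` is a built-in, order-sensitive alternative to a hand-written combiner.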

Chino answered 10/12, 2012 at 16:54 Comment(1)
Isn't a negative hash code a problem (given the unchecked clause)? - Darbee

Just XOR the hashes of each string together. It is cheaper (performance-wise) than string concatenation, and as far as I can see, it is not more prone to collisions. Let's assume that each string is 5 characters long and that each character takes up 1 byte. In the first approach, you are hashing 15 bytes to 4 bytes (an int). In the second, you are concatenating all 3 strings (an expensive operation) to end up with one string of 15 bytes, and then you are hashing it to 4 bytes. Both transform 15 bytes to 4, so in theory both are quite similar in terms of collisions.

In reality there is a bit of a difference in the probabilities of collisions, but in practice it may not always matter. It depends on the data the strings will hold. Suppose all 3 strings are equal and each hashes to 0001 (I am using a simple number just for the sake of the example). XORing the first two gives 0000, and XORing the third with that result gives 0001 again, so the combined hash is no different from the hash of a single string. Concatenating the strings avoids this at the cost of some performance (if you are writing a performance-critical program, I wouldn't concatenate strings in the inner loop).
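The cancellation described above can be checked with fixed integers standing in for string hashes. This is only a sketch: `XorDemo`, `XorAll`, and `OrderedAll` are hypothetical names, and `OrderedAll` uses the same shift-and-add step as the `Combine` function from the other answer for comparison.

```csharp
using System;

static class XorDemo
{
    // Order-insensitive: plain XOR of all inputs.
    public static int XorAll(params int[] hashes)
    {
        int h = 0;
        foreach (int x in hashes)
        {
            h ^= x;
        }
        return h;
    }

    // Order-sensitive fold using a shift-and-add step before each XOR.
    public static int OrderedAll(params int[] hashes)
    {
        int h = 0;
        foreach (int x in hashes)
        {
            unchecked
            {
                h = ((h << 5) + 3 + h) ^ x;
            }
        }
        return h;
    }

    static void Main()
    {
        // Two equal hashes cancel out completely.
        Console.WriteLine(XorAll(1, 1) == 0);                           // True
        // Three equal hashes collapse back to the single hash.
        Console.WriteLine(XorAll(1, 1, 1) == 1);                        // True
        // Swapping inputs does not change a plain XOR.
        Console.WriteLine(XorAll(1, 2, 3) == XorAll(2, 1, 3));          // True
        // The ordered fold distinguishes the two orderings here.
        Console.WriteLine(OrderedAll(1, 2, 3) == OrderedAll(2, 1, 3));  // False
    }
}
```

Whether these collisions matter depends on the data: if the properties rarely hold equal or swapped values, plain XOR may be good enough.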

So in the end, I haven't really given an answer after all, for the simple reason that there really isn't one. It all depends on where and how it will be used.

Harborage answered 10/12, 2012 at 15:48 Comment(1)
Another way to phrase this is that XOR is commutative (and associative), meaning order does not matter. That means you'll get the same hash code for any given set of strings, regardless of which property each string was assigned to. That doesn't make this answer bad at all, but it is a caveat to consider. - Killoran
