C# GetHashCode question
Asked Answered
T

3

8

What would be the best way to override the GetHashCode function for the case, when my objects are considered equal if there is at least ONE field match in them.

In the case of generic Equals method the example might look like this:

    public bool Equals(Whatever other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;

        // Considering that the values can't be 'null' here.
        return other.Id.Equals(Id) || Equals(other.Money, Money) ||
               Equals(other.Code, Code);
    }

Still, I'm confused about making a good GetHashCode implementation for this case.

How should this be done?

Thank you.

Thitherto answered 5/4, 2011 at 15:53 Comment(1)
It doesn't make sense to put these objects into a hash table.Grain
P
18

This is a terrible definition of Equals because it is not transitive.

Consider

x = { Id = 1, Money = 0.1, Code = "X" }
y = { Id = 1, Money = 0.2, Code = "Y" }
z = { Id = 3, Money = 0.2, Code = "Z" }

Then x == y and y == z but x != z.

Additionally, we can establish that the only reasonable implementation of GetHashCode is a constant map.

Suppose that x and y are distinct objects. Let z be the object

z = { Id = x.Id, Money = y.Money, Code = "Z" }

Then x == z and y == z so that x.GetHashCode() == z.GetHashCode() and y.GetHashCode() == z.GetHashCode() establishing that x.GetHashCode() == y.GetHashCode(). Since x and y were arbitrary we have established that GetHashCode is constant.

Thus, we have shown that the only possible implementation of GetHashCode is

private readonly int constant = 17;
public override int GetHashCode() {
    return constant;
}

All of this put together makes it clear that you need to rethink the concept you are trying model, and come up with a different definition of Equals.

Pilau answered 5/4, 2011 at 16:1 Comment(1)
Good point. More generally, equality should be an equivalence relation. That is, it should be reflexive (x equals x is always true), symmetric (x equals y is the same as y equals x) and transitive (if x equals y and y equals z then x equals z). Note that in C# equality across types is not an equivalence relation; two strings compared as strings may give a different result when compared as objects. And in a few cases equality is not an equivalence relation within a type (Double.NaN != Double.NaN, for instance.) But it is best to strive to make equality an equivalence relation.Feld
W
7

I don't think you should be using Equals for this. People have a very explicit notion of what equals means, and if the Ids are different but the code or name are the same, I would not consider those "Equal". Maybe you need a different method like "IsCompatible".

If you want to be able to group them, you could use the extension method ToLookup() on a list of these objects, to use a predicate which would be your IsCompatible method. Then they would be grouped.

Waxen answered 5/4, 2011 at 15:57 Comment(6)
I'd still want to know if the answer to the original question is "it's not possible to implement GetHashCode for this". At this point I believe it's not possible :)Indistinct
I'm not sure I agree. I see your point, generally, but the particular use of his object may very well be that they are functionally equal in these situations, and he would definitely want to use hash code for things like avoiding duplicates in a HashSet where another method wouldn't easily serve the purpose.Inlet
@František Žiačik: It's possible, but the only possible map is a constant map. Please see my answer for a proof: https://mcmap.net/q/1241568/-c-gethashcode-question/…Pilau
@jamietre: I meant "impossible" by means of "impossible to make it relevant", but you're right as I already saw in Jon's answer; thanksIndistinct
@jamietre, they are not unique but if the objects are equal the hash code must be equal otherwise objects will be grouped by hashcode and then compared for equality. If an object A which is Equal to object B hve different hashcodes then they may end up in different buckets and so would never be compared with equals, which would result in incorrect behaviour, as you may search for an object and although there is one in the hashset which is equal it would not be compared to the given object as only those with the same hashcode would be comparedGav
Yeah - totally missed the complexity of the question (and deserved my downvotes). Slaps forehead.Inlet
A
6

The golden rule is: if the objects compare equal, they must produce the same hash code.

Therefore a conforming (but let's say, undesirable) implementation would be

public override int GetHashCode()
{
    return 0;
}

Frankly, if Id, Name and Code are independent of each other then I don't know if you can do any better. Putting objects of this type in a hash table is going to be painful.

Amaty answered 5/4, 2011 at 15:58 Comment(7)
I would appreciate a reason for the downvote so that I can understand how this answer is wrong and how to improve it.Amaty
I initially downvoted you since I felt your answer was not at all useful; obviously "return 0" is a "conforming" implementation, if by "conforming" you mean "useless." However, I changed my mind, since your final sentence is accurate and a real answer.Grain
Nice, didn't think of this kind of implementation.Indistinct
@mquander: It's going to make your hashtable degenerate into an array performance-wise, but I wouldn't call that useless. Arrays aren't useless, and O(n) is OK for some values of n.Amaty
@mquander: Don't get me wrong: it would be silly to put these in a hashtable. But better to have bad perf than to have non-deterministic bugs.Amaty
@Sam: Yes, this is correct. But read carefully Jason's answer (in particular the first part) and you'll see that any Equality implementation would be wrong here...Hype
@Jon: If you're putting something into a hash table, one would have to expect that it's because you want fast lookup. O(n) may be OK, but putting it in a sorted list or binary tree for O(log n) would be better in almost any conceivable circumstance. If you want to put your things into an array and look them up by linear search, then you ought to be clear about it and use an array or a vector (List<T>), not a hash table with meaningless hashing.Grain

© 2022 - 2024 — McMap. All rights reserved.