Generic IEqualityComparer<T> and GetHashCode
Asked Answered
C

6

15

Being somewhat lazy about implementing lots of IEqualityComparers, and given that I couldn't easily edit class implementations of the objects being compared, I went with the following, meant to be used with Distinct() and Except() extension methods:

public class GenericEqualityComparer<T> : IEqualityComparer<T>
{
    Func<T, T, bool> compareFunction;
    Func<T, int> hashFunction;

    public GenericEqualityComparer(Func<T, T, bool> compareFunction, Func<T, int> hashFunction)
    {
        this.compareFunction = compareFunction;
        this.hashFunction = hashFunction;
    }

    public bool Equals(T x, T y)
    {
        return compareFunction(x, y);
    }

    public int GetHashCode(T obj)
    {
        return hashFunction(obj);
    }
}

This seems nice, but is giving a hash function every time REALLY necessary?

I understand that the hash code is used to put objects in buckets. If in different buckets, objects are not equal, and Equal is not called.

If GetHashCode returns the same value, Equals is called. (From: Why is it important to override GetHashCode when Equals method is overridden?)

So what could go wrong, if for example (and I hear a lot of programmers screaming in horror), GetHashCode returns a constant, to force the call to Equal?

Cloister answered 6/5, 2011 at 9:15 Comment(1)
I'd guess that if you did that and used the class in question in a dictionary you would end up with some pretty poor performance when doing any lookups.Kant
B
15

Nothing would go wrong, but in hash-table based containers, you're going from approx O(1) to O(n) performance when doing a lookup. You'd be better off simply storing everything in a List and brute force searching it for items that fulfil equality.

Brunette answered 6/5, 2011 at 9:18 Comment(1)
Ok, that's what I thought. It's not really an issue as I'm Except'ing enumerables with like 1000's of objects, and not very often. I'm a bit disappointed, I hoped for more troublesome issues ;)Cloister
H
12

If a common use-case is comparing objects according to one of their properties, you could add an additional constructor and implement and call it like this:

public GenericEqualityComparer(Func<T, object> projection)
{
    compareFunction = (t1, t2) => projection(t1).Equals(projection(t2));
    hashFunction = t => projection(t).GetHashCode();
}

var comaparer = new GenericEqualityComparer( o => o.PropertyToCompare);

This will automatically use the hash as implemented by the property.

EDIT: a more efficient and robust implementation inspired my Marc's comments:

public static GenericEqualityComparer<T> Create<TValue>(Func<T, TValue> projection)
{
    return new GenericEqualityComparer<T>(
        (t1, t2) => EqualityComparer<TValue>.Default.Equals( projection(t1), projection(t2)),
        t => EqualityComparer<TValue>.Default.GetHashCode(projection(t)));
}

var comparer = GenericEqualityComparer<YourObjectType>.Create( o => o.PropertyToCompare); 
Horrocks answered 6/5, 2011 at 9:27 Comment(5)
this one is nice, but I have to compare 2 propertiesCloister
@mathieu: Maybe htis would work: new GenericEqualityComparer( o => new Tuple<T1,T2>(o.P1, o.P2));Horrocks
You would do better to use a second generic (for the projected value) and use EqualityComparer<TValue>.Default to run Equals and GetHashCode; otherwise you get boxing and issues with null.Blackford
@Marc: thanks for the valuable input. I understand the boxing issue, but what issues with null are you seeing? If the objects to be compared can be null, this should be handled by the lambda which is passed to the constructor.Horrocks
@Horrocks - calling .GetHashCode(), for example ;p i.e. foo => foo.Name is risky. The inbuilt equality comparer avoids this (and avoids boxing, and supports IEquatable<T>, and supports Nullable<T>)Blackford
S
2

Your performance will go down the drain. Distinct and Except are efficient operations when implemented on set data structures. By providing a constant hash value you essentially destroy this characteristic and force naive algorithm using a linear search.

You need to see whether this is acceptable for your data volume. But for somewhat larger data sets, the difference will be pronounced. For example, Except will increase from expected time O(n) to O(n²), which can be a big deal.

Rather than providing a constant, why not just call the object’s own GetHashCode method? It may not give a particularly good value but it cannot be worse than using a constant, and correctness will still be preserved unless the GetHashCode method of the object is overridden to return wrong values.

Spense answered 6/5, 2011 at 9:22 Comment(3)
Context : lists with 1000 elements, no necessity to implement lots of IEquality comparers, and giving an hash function seemed overkill, for this precise case ;)Cloister
@Cloister well, go for it. But I’d really use the default GetHashCode before a constant.Spense
of course. But I comparing objects that dont override IEquatable. And apart from this precise use case, they don't need too. And this comparer is implemented privately inside my class, so it shouldnt be misused.Cloister
E
1

Found this one on CodeProject - A Generic IEqualityComparer for Linq Distinct() nicely done.

Use case:

IEqualityComparer<Contact> c =  new PropertyComparer<Contact>("Name");
IEnumerable<Contact> distinctEmails = collection.Distinct(c); 

Generic IEqualityComparer

public class PropertyComparer<T> : IEqualityComparer<T>
{
    private PropertyInfo _PropertyInfo;

    /// <summary>
    /// Creates a new instance of PropertyComparer.
    /// </summary>
    /// <param name="propertyName">The name of the property on type T 
    /// to perform the comparison on.</param>
    public PropertyComparer(string propertyName)
    {
        //store a reference to the property info object for use during the comparison
        _PropertyInfo = typeof(T).GetProperty(propertyName, 
    BindingFlags.GetProperty | BindingFlags.Instance | BindingFlags.Public);
        if (_PropertyInfo == null)
        {
            throw new ArgumentException(string.Format("{0} 
        is not a property of type {1}.", propertyName, typeof(T)));
        }
    }

    #region IEqualityComparer<T> Members

    public bool Equals(T x, T y)
    {
        //get the current value of the comparison property of x and of y
        object xValue = _PropertyInfo.GetValue(x, null);
        object yValue = _PropertyInfo.GetValue(y, null);

        //if the xValue is null then we consider them equal if and only if yValue is null
        if (xValue == null)
            return yValue == null;

        //use the default comparer for whatever type the comparison property is.
        return xValue.Equals(yValue);
    }

    public int GetHashCode(T obj)
    {
        //get the value of the comparison property out of obj
        object propertyValue = _PropertyInfo.GetValue(obj, null);

        if (propertyValue == null)
            return 0;

        else
            return propertyValue.GetHashCode();
    }

    #endregion
}  
Elma answered 5/3, 2014 at 13:12 Comment(2)
This was what I was looking for!Somite
There is going to be a performance hit with the use of reflection.Hanser
P
0

I needed to rewrite Henrik solution as a class implementing IEqualityComparer which gives this:

    public class GenericEqualityComparer<T,TKey> : IEqualityComparer<T>
    {
        private readonly Func<T, TKey> _keyFunction;

        public GenericEqualityComparer(Func<T, TKey> keyFunction)
        {
            _keyFunction = keyFunction;
        }

        public bool Equals(T x, T y) => EqualityComparer<TKey>.Default.Equals(_keyFunction(x), _keyFunction(y));

        public int GetHashCode(T obj)=> EqualityComparer<TKey>.Default.GetHashCode(_keyFunction(obj));
    }
Phrensy answered 10/3, 2020 at 10:46 Comment(0)
C
-1

Try this code:

public class GenericCompare<T> : IEqualityComparer<T> where T : class
{
    private Func<T, object> _expr { get; set; }
    public GenericCompare(Func<T, object> expr)
    {
        this._expr = expr;
    }
    public bool Equals(T x, T y)
    {
        var first = _expr.Invoke(x);
        var sec = _expr.Invoke(y);
        if (first != null && first.Equals(sec))
            return true;
        else
            return false;
    }
    public int GetHashCode(T obj)
    {
        return obj.GetHashCode();
    }
}

Example: collection = collection.Except(ExistedDataEles, new GenericCompare(x=>x.Id)).ToList();

Cavan answered 13/5, 2014 at 6:12 Comment(1)
What's so bad about it? (Except the if (true) return true; else return false; thing)Rubiaceous

© 2022 - 2024 — McMap. All rights reserved.