This is something I had not noticed until today. Apparently, the .NET implementation of the much used tuple classes (Tuple<T>
, Tuple<T1, T2>
etc) causes boxing penalties for value types when equality based operations are performed.
Here is how the class is kind of implemented in the framework (source via ILSpy):
public class Tuple<T1, T2> : IStructuralEquatable
{
public T1 Item1 { get; private set; }
public T2 Item2 { get; private set; }
public Tuple(T1 item1, T2 item2)
{
this.Item1 = item1;
this.Item2 = item2;
}
public override bool Equals(object obj)
{
return this.Equals(obj, EqualityComparer<object>.Default);
}
public override int GetHashCode()
{
return this.GetHashCode(EqualityComparer<object>.Default);
}
public bool Equals(object obj, IEqualityComparer comparer)
{
if (obj == null)
{
return false;
}
var tuple = obj as Tuple<T1, T2>;
return tuple != null
&& comparer.Equals(this.Item1, tuple.Item1)
&& comparer.Equals(this.Item2, tuple.Item2);
}
public int GetHashCode(IEqualityComparer comparer)
{
int h1 = comparer.GetHashCode(this.Item1);
int h2 = comparer.GetHashCode(this.Item2);
return (h1 << 5) + h1 ^ h2;
}
}
The problem I see is it causes a two stage boxing-unboxing, say for Equals
calls, one, at the comparer.Equals
which boxes the item, two, the EqualityComparer<object>
calls the non-generic Equals
which in turn will internally have to unbox the item to orginal type.
Instead why wouldn't they do something like:
public override bool Equals(object obj)
{
var tuple = obj as Tuple<T1, T2>;
return tuple != null
&& EqualityComparer<T1>.Default.Equals(this.Item1, tuple.Item1)
&& EqualityComparer<T2>.Default.Equals(this.Item2, tuple.Item2);
}
public override int GetHashCode()
{
int h1 = EqualityComparer<T1>.Default.GetHashCode(this.Item1);
int h2 = EqualityComparer<T2>.Default.GetHashCode(this.Item2);
return (h1 << 5) + h1 ^ h2;
}
public bool Equals(object obj, IEqualityComparer comparer)
{
var tuple = obj as Tuple<T1, T2>;
return tuple != null
&& comparer.Equals(this.Item1, tuple.Item1)
&& comparer.Equals(this.Item2, tuple.Item2);
}
public int GetHashCode(IEqualityComparer comparer)
{
int h1 = comparer.GetHashCode(this.Item1);
int h2 = comparer.GetHashCode(this.Item2);
return (h1 << 5) + h1 ^ h2;
}
I was surprised to see equality implemented this way in .NET tuple class. I was using tuple type as a key in one of the dictionaries.
Is there any reason why this has to be implemented as shown in the first code? Its a bit discouraging to make use of this class in that case.
I dont think code refactoring and non-duplicating data should have been the major concerns. The same non-generic/boxing implementation has gone behind IStructuralComparable
too, but since IStructuralComparable.CompareTo
is less used its not a problem often.
I benchmarked the above two approaches with a third approach which is still less taxing, like this (only the essentials):
public override bool Equals(object obj)
{
return this.Equals(obj, EqualityComparer<T1>.Default, EqualityComparer<T2>.Default);
}
public bool Equals(object obj, IEqualityComparer comparer)
{
return this.Equals(obj, comparer, comparer);
}
private bool Equals(object obj, IEqualityComparer comparer1, IEqualityComparer comparer2)
{
var tuple = obj as Tuple<T1, T2>;
return tuple != null
&& comparer1.Equals(this.Item1, tuple.Item1)
&& comparer2.Equals(this.Item2, tuple.Item2);
}
for a couple of Tuple<DateTime, DateTime>
fields a 1000000 Equals
calls. This is the result:
1st approach (original .NET implementation) - 310 ms
2nd approach - 60 ms
3rd approach - 130 ms
The default implementation is about 4-5 times slower than the optimal solution.
IEqualityComparer<object>
was used, but I'm not saying there is none. But you can still make it a little better: pastebin.com/tNA2FYjq – BurbleTuple<,>
implementsIStructuralEquatable
which has these definitionsbool Equals(object other, IEqualityComparer comparer); int GetHashCode(IEqualityComparer comparer);
– LiqueurTuple
types are different. Personally if you are using a tuple for a key, it is likely a candidate for creating a properly named class, with a definitive equality implementation. This would improve the readability. Also, can you provide the code you used to test the performance? I've encountered poor hash-code related performance issues like this in the past, and especially onstruct
keys, providing a custom type with equality implemented, or a custom equality comparer drastically improves performance as you've seen. – JailhouseTuple
on .NET 2.0) and reviewed all cases where composite dictionaries were used. We landed on two approaches. 1) A composite key in the dictionary, made from a custom class or struct with custom equality on the type itself, or on the dictionary. 2) A composite dictionary structure of nested dictionaries. Creates a lot of dictionaries depending on the key breadth, but very fast access. Performance jumped. – Jailhousestruct
which has a strange custom implementation of hash-code, providing a custom comparer for that improved performance a lot also. – JailhouseTuple
default code might be less than optimal, but unless the person who coded it / approved the code turns up, you are unlikely to understand the drive behind the resulting implementation. – Jailhouse