Because it gives us something that's useful. Consider the following:
    var countSameName = from p in PersonInfoStore
                        group p.Id by new { p.FirstName, p.SecondName } into grp
                        select new { grp.Key.FirstName, grp.Key.SecondName, Count = grp.Count() };
This works because the implementations of Equals() and GetHashCode() for anonymous types are based on field-by-field equality.
- This means the above will be closer to the same query when run against a PersonInfoStore that isn't linq-to-objects. (Still not the same: it'll match what an XML source will do, but not what most databases' collations would result in.)
- It means we don't have to define an IEqualityComparer for every call to GroupBy, which would make group by really hard with anonymous objects - it's possible but not easy to define an IEqualityComparer for anonymous objects - and far from the most natural meaning.
- Above all, it doesn't cause problems in most cases.
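To make the points above concrete, here is a minimal sketch of the grouping query run against in-memory data. The personInfoStore array and its contents are made up for illustration, standing in for whatever PersonInfoStore really is, and the code uses top-level statements, so it assumes C# 9 or later. The projected count is given an explicit name, Count, because an anonymous type member initialised from a method call needs one.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for PersonInfoStore; any source with these properties would do.
var personInfoStore = new[]
{
    new { Id = 1, FirstName = "Ann", SecondName = "Lee" },
    new { Id = 2, FirstName = "Ann", SecondName = "Lee" },
    new { Id = 3, FirstName = "Bob", SecondName = "Ray" },
};

// Two separately created anonymous objects with the same property values.
var a = new { FirstName = "Ann", SecondName = "Lee" };
var b = new { FirstName = "Ann", SecondName = "Lee" };

Console.WriteLine(a.Equals(b));                        // True: compared field by field
Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True: hash is built from the fields

// No IEqualityComparer is supplied; the anonymous key's own Equals/GetHashCode do the work.
var countSameName =
    (from p in personInfoStore
     group p.Id by new { p.FirstName, p.SecondName } into grp
     select new { grp.Key.FirstName, grp.Key.SecondName, Count = grp.Count() })
    .ToList();

foreach (var row in countSameName)
    Console.WriteLine($"{row.FirstName} {row.SecondName}: {row.Count}");
// Ann Lee: 2
// Bob Ray: 1
```

If the anonymous key used reference equality instead, every row would land in its own group of one.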
The third point is worth examining.
When we define a value type, we naturally want a value-based concept of equality. While we may have a different idea of that value-based equality than the default, such as matching a given field case-insensitively, the default is naturally sensible (if poor in performance and buggy in one case*). (Also, reference equality is meaningless in this case).
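As a rough illustration of the value-type case, the sketch below contrasts the default struct equality with a hand-written case-insensitive one. PlainName and LooseName are hypothetical types invented for this example, and the code assumes C# 9 or later for the top-level statements.

```csharp
using System;

var x = new PlainName { First = "Ann", Last = "Lee" };
var y = new PlainName { First = "Ann", Last = "Lee" };
var z = new PlainName { First = "ANN", Last = "Lee" };

Console.WriteLine(x.Equals(y)); // True: the default compares each field
Console.WriteLine(x.Equals(z)); // False: string fields are compared case-sensitively

var p = new LooseName("Ann", "Lee");
var q = new LooseName("ANN", "lee");

Console.WriteLine(p.Equals(q));                        // True: our own rule ignores case
Console.WriteLine(p.GetHashCode() == q.GetHashCode()); // True: the hash follows the same rule

// Hypothetical struct that relies on the default ValueType.Equals.
struct PlainName
{
    public string First;
    public string Last;
}

// Hypothetical struct with a different idea of value equality than the default.
readonly struct LooseName : IEquatable<LooseName>
{
    public LooseName(string first, string last) { First = first; Last = last; }

    public string First { get; }
    public string Last { get; }

    public bool Equals(LooseName other) =>
        string.Equals(First, other.First, StringComparison.OrdinalIgnoreCase) &&
        string.Equals(Last, other.Last, StringComparison.OrdinalIgnoreCase);

    public override bool Equals(object obj) => obj is LooseName other && Equals(other);

    // Must agree with Equals, so it hashes case-insensitively too.
    public override int GetHashCode() => unchecked(
        StringComparer.OrdinalIgnoreCase.GetHashCode(First ?? "") * 31 ^
        StringComparer.OrdinalIgnoreCase.GetHashCode(Last ?? ""));
}
```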
When we define a reference type, we may or may not want a value-based concept of equality. The default gives us reference equality, but we can easily change that. If we do change it, we can change it for just Equals and GetHashCode, or for them and also ==.
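A small sketch of the first option, changing Equals and GetHashCode but leaving == alone. Point is a hypothetical class made up for the example; the code assumes C# 9 or later for the top-level statements.

```csharp
using System;

var a = new Point(1, 2);
var b = new Point(1, 2);

Console.WriteLine(a.Equals(b));                  // True: the override compares values
Console.WriteLine(a == b);                       // False: == was not overloaded, so references are compared
Console.WriteLine(object.ReferenceEquals(a, b)); // False: two distinct objects

// Hypothetical reference type with value-based Equals/GetHashCode and default ==.
sealed class Point
{
    public Point(int x, int y) { X = x; Y = y; }

    public int X { get; }
    public int Y { get; }

    public override bool Equals(object obj) =>
        obj is Point other && X == other.X && Y == other.Y;

    // Kept consistent with Equals: equal points give equal hashes.
    public override int GetHashCode() => unchecked(X * 397 ^ Y);
}
```

Adding operator == and operator != overloads to Point would be the second option, where both forms of comparison become value-based.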
When we define an anonymous type, oh wait, we didn't define it, that's what anonymous means! Most of the scenarios in which we care about reference equality aren't there any more. If we're going to be holding an object around for long enough to later wonder if it's the same as another one, we're probably not dealing with an anonymous object. The cases where we care about value-based equality come up a lot. Very often with Linq (GroupBy as we saw above, but also Distinct, Union, GroupJoin, Intersect, SequenceEqual, ToDictionary and ToLookup) and often with other uses (it's not like we weren't doing the things Linq does for us with enumerables in 2.0, and to some extent before then; anyone coding in 2.0 would have written half the methods in Enumerable themselves).
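A few of those Linq methods, sketched with anonymous objects as the items or keys. The data is invented for the example, and the code assumes C# 9 or later for the top-level statements. None of the calls is given a comparer; they all fall back on the anonymous type's own Equals and GetHashCode.

```csharp
using System;
using System.Linq;

// Made-up sample data; both arrays share the same anonymous type.
var first = new[]
{
    new { Name = "Ann", City = "Cork" },
    new { Name = "Ann", City = "Cork" },
    new { Name = "Bob", City = "Derry" },
};
var second = new[]
{
    new { Name = "Bob", City = "Derry" },
    new { Name = "Cat", City = "Sligo" },
};

Console.WriteLine(first.Distinct().Count());        // 2: the duplicate Ann is dropped
Console.WriteLine(first.Union(second).Count());     // 3: Ann, Bob, Cat
Console.WriteLine(first.Intersect(second).Count()); // 1: Bob

// A freshly created anonymous object finds the entries keyed by an equal one.
var byCity = first.ToLookup(p => new { p.City });
Console.WriteLine(byCity[new { City = "Cork" }].Count()); // 2

var nameLengths = second.ToDictionary(p => new { p.Name, p.City }, p => p.Name.Length);
Console.WriteLine(nameLengths[new { Name = "Cat", City = "Sligo" }]); // 3
```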
In all, we gain a lot from the way equality works with anonymous classes.
On the off-chance that someone really wants reference equality, == still uses reference equality, so they still have that and we don't lose anything. It's the way to go.
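The split between the two kinds of comparison can be seen in a few lines. This is only a sketch with made-up values, assuming C# 9 or later for the top-level statements.

```csharp
using System;

var a = new { FirstName = "Ann", SecondName = "Lee" };
var b = new { FirstName = "Ann", SecondName = "Lee" };
var c = a; // same object as a, not a copy

Console.WriteLine(a.Equals(b)); // True: value-based, field by field
Console.WriteLine(a == b);      // False: different objects
Console.WriteLine(a == c);      // True: the very same object
```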
*The default implementation of Equals() and GetHashCode() has an optimisation that lets it use a binary match in cases where it's safe to do so. Unfortunately there's a bug that makes it sometimes mis-identify some cases as safe for this faster approach when they aren't (or at least it used to; maybe it was fixed). A common case is a struct with a decimal field: it'll consider some instances with equivalent fields as unequal.
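To see why a binary match isn't safe for decimal, the sketch below compares two decimals that are equal as values but stored differently. It doesn't try to reproduce the struct bug itself, since whether that shows up depends on the runtime version; it only shows the underlying mismatch. It assumes C# 9 or later for the top-level statements.

```csharp
using System;
using System.Linq;

decimal one = 1.0m;      // stored as 10 with a scale of 1
decimal alsoOne = 1.00m; // stored as 100 with a scale of 2

Console.WriteLine(one.Equals(alsoOne)); // True: equal as values

int[] bitsOne = decimal.GetBits(one);
int[] bitsAlsoOne = decimal.GetBits(alsoOne);

Console.WriteLine(string.Join(",", bitsOne));
Console.WriteLine(string.Join(",", bitsAlsoOne));
Console.WriteLine(bitsOne.SequenceEqual(bitsAlsoOne)); // False: the stored bits differ
```

A struct comparison that looks only at the raw bits would therefore call these two unequal, even though decimal's own Equals says they match.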