If I have two collections of type T, and an IEqualityComparer that compares a subset of their properties, which collection would the resulting elements of an Intersect or Union come from?
The tests I've run so far suggest the following:
- the item(s) from col1 win
- if col1 or col2 contain duplicate items (as defined by the comparer) within themselves, the first entry (within col1, then col2) wins.
I'm aware this shouldn't be an issue, as (by definition) I should be viewing the resulting objects as equal. It just occurred to me that using Union with a custom comparer could be a bit neater than the equivalent Join - though this only holds true if the above assumptions are guaranteeed.
class DummyComparer : IEqualityComparer<Dummy>
{
public bool Equals(Dummy x, Dummy y)
{
return x.ID == y.ID;
}
public int GetHashCode(Dummy obj)
{
return obj.ID.GetHashCode();
}
}
class Dummy
{
public int ID { get; set; }
public string Name { get; set; }
}
[Test]
public void UnionTest()
{
var comparer = new DummyComparer();
var d1 = new Dummy { ID = 0, Name = "test0" };
var d2 = new Dummy { ID = 0, Name = "test1" };
var d3 = new Dummy { ID = 1, Name = "test2" };
var d4 = new Dummy { ID = 1, Name = "test3" };
var col1 = new Dummy[] { d1, d3 };
var col2 = new Dummy[] { d2, d4 };
var x1 = col1.Union(col2, comparer).ToList();
var x2 = col2.Union(col1, comparer).ToList();
var y1 = col1.Except(col2, comparer).ToList();
var y2 = col2.Except(col1, comparer).ToList();
var z1 = col1.Intersect(col2, comparer).ToList();
var z2 = col2.Intersect(col1, comparer).ToList();
Assert.AreEqual(2, x1.Count);
Assert.Contains(d1, x1);
Assert.Contains(d3, x1);
Assert.AreEqual(2, x2.Count);
Assert.Contains(d2, x2);
Assert.Contains(d4, x2);
Assert.AreEqual(0, y1.Count);
Assert.AreEqual(0, y2.Count);
Assert.AreEqual(2, z1.Count);
Assert.Contains(d1, z1);
Assert.Contains(d3, z1);
Assert.AreEqual(2, z2.Count);
Assert.Contains(d2, z2);
Assert.Contains(d4, z2);
}
Intersect
. Maybe check out EduLinq, it goes into detail on the original implementations. – CrestingIntersect
with ILSpy and it enumerates the second collection first and then the first ,even if is documented the other way around. What could be the reason? Edit Actually Jon Skeet has also mentioned this "lie": msmvps.com/blogs/jon_skeet/archive/2010/12/30/… (in his words: "This is demonstrably incorrect.") – HutchensIntersect
returns - I read it as "marks elements in the second sequence, and returns them", but you clearly read it the other way around. (Yeah, that's the page I was thinking of when I linked EduLinq.) – Cresting