Compare contents of two IEnumerables [duplicate]
Asked Answered
R

10

59

Is there a built in LINQ method thing I can use to find out if two sequences contains the same items, not taking the order into account?

For example:

{1, 2, 3} == {2, 1, 3}
{1, 2, 3} != {2, 1, 3, 4}
{1, 2, 3} != {1, 2, 4}

You have the SequenceEquals, but then I would have to Order both sequences first, wouldn't I?

Rakel answered 10/3, 2009 at 13:52 Comment(0)
U
51

There are quite a few ways. Assume A and B is IEnumerable.

!A.Except(B).Any() && !B.Except(A).Any()
A.Count() == B.Count() && A.Intersect(B).Count() == B.Count()
etc
Uprear answered 10/3, 2009 at 13:55 Comment(13)
using Intersect there, do you need to do it both ways?Rakel
I think this is much slower than ordering the sequences. Unless you are dealing with expression trees.Arise
@Svish: Yes, you do need to do it both ways A = B iff A Δ B = Φ.Arise
What does "A = B iff A Δ B = Φ" mean?Rakel
Set theory. A, B are two sets. A Δ B means (A - B) U (B - A).Arise
How can it be slower? Except is an O(n) op, where sorting is at least O(n log n).Uprear
@svish: Updated, you just need to make sure the count is the same, then 1 intersect is sufficient. For except, you need both ways, always good to determine differences (added/removed).Uprear
How does Except work if it's only O(n)? In order for it to be O(n) wouldn't the lists need to be sorted?Victualler
By adding the elements of the first list to a dictionary.Uprear
query.Count() == 0 is the same as !query.Any(), but the any call is faster because it doesn't evaluate the entire list.Roodepoortmaraisburg
This code only works if the sequences are sets i.e. there are no duplicate items in themSculpin
The .Any() version: !A.Except(B).Any() && !B.Except(A).Any()Spaniard
Unless I'm missing something these don't work if there are duplicates in the enumerables. The Except version returns true for { 1, 2, 3, 3 } and { 1, 2, 2, 3} The Intersect version returns false if both are { 1, 2, 3, 3 }Zebec
G
12

If you don't care about duplicates (i.e. you'd consider {1, 2, 3} to be equal to {1, 2, 3, 2}) then:

new HashSet<int>(A).SetEquals(B)

(Or whatever type is the element type instead of int).

Otherwise:

public static bool SequenceEqualUnordered<T>(IEnumerable<T> first, IEnumerable<T> second)
{
    if (first == null)
        return second == null; // or throw if that's more appropriate to your use.
    if (second == null)
        return false;   // likewise.
    var dict = new Dictionary<T, int>(); // You could provide a IEqualityComparer<T> here if desired.
    foreach(T element in first)
    {
        int count;
        dict.TryGetValue(element, out count);
        dict[element] = count + 1;
    }
    foreach(T element in second)
    {
        int count;
        if (!dict.TryGetValue(element, out count))
            return false;
        else if (--count == 0)
            dict.Remove(element);
        else
            dict[element] = count;
    }
    return dict.Count == 0;
}

Keep a tally of each element in the first sequence, then check the second against it. The moment you have one too many in the second sequence you can return false, otherwise if you have nothing left in the dictionary of tallies they are equal, or false if there's any elements left.

Rather than the two O(n log n) sorts of using OrderBy() followed by the O(n) comparison, you've an O(n) operation building the set of tallies, and an O(n) check against it.

Gilded answered 11/1, 2016 at 14:49 Comment(0)
S
11

With two IEnumerables (A and B) :

bool equal = (A.Count() == B.Count() && (!A.Except(B).Any() || !B.Except(A).Any()))

I think this is better than Except(A).Count because the entire Excep will not be evaluated. It will stop as soon as one element is found in the Except. With the Count, the entire Except is evaluated. On top of this, we can avoid the evaluation of these costly Except just by checking the Count properties first. If Counts are not Equal, then we check the Excepts.

Scabious answered 3/1, 2013 at 13:41 Comment(2)
Love that Except will yield on first discrepency--nice!Paleography
The || between the Except() should rather be && as this answer suggests. Otherwise the following would be true, too: {1,2,3} == {1,2,1}. I am not sure if it's correct - at least not for my case :).Incumber
B
4

Try the HashSet class:

var enumA = new[] { 1, 2, 3, 4 };
var enumB = new[] { 4, 3, 1, 2 };

var hashSet = new HashSet<int>(enumA);
hashSet.SymmetricExceptWith(enumB);
Console.WriteLine(hashSet.Count == 0); //true => equal

But that does only work correctly if the values are distinct.

For example

var enumA = new[] { 1, 1, 1, 2 };
var enumB = new[] { 1, 2, 2, 2 };

are also considered as "equal" with the mentioned method.

Belfry answered 10/3, 2009 at 16:6 Comment(1)
return hashSet.SetEquals(enumB);Appanage
V
2

Sticking with your example, you can make both of IEnumerable to be of type List and then use SequenceEqual as the example below:

var first = Enumerable.Range(1, 3);
var second = Enumerable.Range(1, 3);
var areTheyEqual = first.ToList().SequenceEqual(second.ToList());
if (areTheyEqual)
{ /* do something... */}
Vanburen answered 8/9, 2019 at 9:36 Comment(1)
this method work without convert to list :)Retrogradation
S
2

To compare the data in the two objects, I simply used this

A.Except(B).Any() || B.Except(A).Any()
Scavenge answered 7/7, 2020 at 7:39 Comment(3)
If my boolean algebra is correct, that's equal to !A.Except(B).Any() && !B.Except(A).Any(), but looks a bit cleaner. 👍Rakel
Correct @Svish. That I believe is completely upto the developer, what purpose you want to use it for.Scavenge
@Rakel To clarify, !A.Except(B).Any() && !B.Except(A).Any() is not equal to A.Except(B).Any() || B.Except(A).Any(). It's the logical inverse. The former checks for item-wise equivalency; the latter checks for item-wise inequivalency.Winonawinonah
H
1

I did this for merging new items into a collection without duplicates, it takes two collections and returns all the items with out any duplicates

List<Campaign> nonMatching = (from n in newCampaigns 
where !(from e in Existing select e.Id).Contains<int>(n.Id) 
select n).ToList<Campaign>();

Now by removing the ! for the contains statement

List<Campaign> nonMatching = (from n in newCampaigns 
where (from e in Existing select e.Id).Contains<int>(n.Id) 
select n).ToList<Campaign>();

it will return the duplicates

Herrle answered 10/3, 2009 at 13:52 Comment(0)
A
0

I think ordering the sequence is the fastest way you can achieve this.

Arise answered 10/3, 2009 at 13:53 Comment(5)
For SequenceEquals it has to be the same order.Uprear
"The SequenceEqual<(Of <(TSource>)>)(IEnumerable<(Of <(TSource>)>), IEnumerable<(Of <(TSource>)>)) method enumerates the two source sequences in parallel and compares corresponding elements by using the default equality comparer for TSource, Default."Rakel
Yeah, I misread the question. This is O(nlogn) which is the fastest asymptotic time an algorithm can achieve for this purpose.Arise
Theoretically, the fastest algorithm is to sort one and binary search every element of the other in the sorted version. This does require writing code manually. The other fast algorithm is to build a dictionary of one of them and look up elements of the other through it.Arise
Except is an O(n) op, where sorting is at least O(n log n)Uprear
A
0

If you're really just testing to see if there are duplicates, then leppie's suggestion should work:

if (A.Except(B).Count == 0 && B.Except(A).Count == 0) {...}

But if you just need to arrive at an IEnumerable with no duplicates:

var result = A.Union(B).Distinct();
Antifouling answered 10/3, 2009 at 15:51 Comment(0)
D
-1

For people finding question that also care about order, here is something that may be useful. A static CompareTo method for enumerables and an IComparer implementation. Tests included afterwards:

    public class EnumerableComparer<T> : IComparer<IEnumerable<T>>
    {
        readonly IComparer<T> comparer;
        
        public EnumerableComparer(IComparer<T> comparer = null)
        {
            this.comparer = comparer;
        }
        
        public int Compare(IEnumerable<T> first, IEnumerable<T> second)
        {
            return first.CompareTo(second, comparer);
        }
    }
    
    public static int CompareTo<T>(this IEnumerable<T> first, IEnumerable<T> second, IComparer<T> comparer = null)
    {
        comparer ??= Comparer<T>.Default;
        
        if (ReferenceEquals(first, second))
            return 0;

        if (first == null)
            return -1;

        if (second == null)
            return 1;
        
        using (var iter1 = first.GetEnumerator())
        using (var iter2 = second.GetEnumerator())
        {
            while (iter1.MoveNext())
            {
                if (iter2.MoveNext())
                {
                    var result = comparer.Compare(iter1.Current, iter2.Current);

                    if (result != 0)
                        return result;
                }
                else
                {
                    return 1;
                }
            }
            while (iter2.MoveNext())
            {
                return -1;
            }

            return 0;
        }
    }
    [Fact]
    public void CompareToWorksForEnumerables()
    {
        Array.Empty<int>().CompareTo(Array.Empty<int>()).Should().Be(0);
        ((IEnumerable<int>)null).CompareTo(null).Should().Be(0);
        Array.Empty<int>().CompareTo(null).Should().Be(1);
        ((IEnumerable<int>)null).CompareTo(Array.Empty<int>()).Should().Be(-1);
        new []{1, 2, 3}.CompareTo(new []{1, 2, 3}).Should().Be(0);
        new []{1, 2}.CompareTo(new []{1, 2, 3}).Should().Be(-1);
        new []{1, 2, 3}.CompareTo(new []{1, 2}).Should().Be(1);
        new []{2, 2}.CompareTo(new []{1, 2, 3}).Should().Be(1);
    }
Declared answered 24/2 at 0:33 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.