Is there a built-in way to compare IEnumerable<T> (by their elements)?
Asked Answered
K

4

4

I would like to compare lists of elements of a given type, to see which list is "bigger".

new BuiltInComparer<IEnumerable<int>>().Compare(
    new[] {3,2,3}, 
    new[] {1,2,3})

...would return 1

new BuiltInComparer<IEnumerable<int>>().Compare(
    new[] {1,2,3}, 
    new[] {1,2,4})

...would return -1 etc

Is there any such built in comparer?

Kithara answered 11/5, 2010 at 14:52 Comment(4)
There is not enough information in the examples to answer the question. Is your comparison solely based on the size of the list? Is the list {2, 2, 0} bigger or smaller than the list {0, 2, 2}? And so on.Philia
updated examples to better explain what I'm after. The comparer should compare element-by-element. The length of the lists should only play a role if all the elements are equal up to the point where the shorter list ends.Kithara
I think your returned values are the wrong way round - normally Compare will return a negative value if the first value is less than the second value.Holily
@JonSkeet: Fixed. I always get those wrong.Kithara
H
11

I don't think there's anything built into the framework - and as Eric says, you haven't provided the comparison criteria. If you mean "compare element-wise in the natural way, and assume a 'missing' element is smaller than any present element" (i.e. a longer sequence beats a shorter subsequence if they're equal where possible) then something like this would do it:

public int SequenceCompare<T>(IEnumerable<T> source1, IEnumerable<T> source2)
{
    // TODO: Parameter validation :)
    // You could add an overload with this as a parameter
    IComparer<T> elementComparer = Comparer<T>.Default;       
    using (IEnumerator<T> iterator1 = source1.GetEnumerator())
    using (IEnumerator<T> iterator2 = source2.GetEnumerator())
    {
        while (true)
        {
            bool next1 = iterator1.MoveNext();
            bool next2 = iterator2.MoveNext();
            if (!next1 && !next2) // Both sequences finished
            {
                return 0;
            }
            if (!next1) // Only the first sequence has finished
            {
                return -1;
            }
            if (!next2) // Only the second sequence has finished
            {
                return 1;
            }
            // Both are still going, compare current elements
            int comparison = elementComparer.Compare(iterator1.Current,
                                                     iterator2.Current);
            // If elements are non-equal, we're done
            if (comparison != 0)
            {
                return comparison;
            }
        }
    }
}
Holily answered 11/5, 2010 at 15:2 Comment(6)
Nice. Offtopic: why use the using's ? What's there to dispose? Is it for the case where, let's say, we're enumerating over something pulled from a database, or does this have value even for plain old lists?Kithara
@Cristi: IEnumerator<T> implements IDisposable. This fact alone is enough to wrap it in a using statement. Whether there is something to dispose or not, but in Jon's example, the function accepts IEnumerable<T> arguments, which are interfaces. There is therefore no way to know whether there is anything to dispose or not. But it is very well possible that the used enumerator calls to the database. Not disposing it could leave the connection open.Irremissible
@Cristi: It's very important to dispose of IEnumerator<T> values. That's the way that finally blocks get to be executed in iterator blocks - for example, if you have an iterator which reads lines from a file, disposing of the iterator should close the file.Holily
It's to support the poorly contrived notion of try { yield return x; } finally { } -- Well once bitten, twice shy, I don't like yield for this very reason.Carola
csharptest.net: I like using for this very reason.Zerk
heads up: there's a bug in the code above: "element" is undefined. You meant elementComparer instead, correct?Revelationist
B
4

More polished version:

public static class EnumerableExtensions
{
    /// <summary>
    /// Performs lexical comparison of 2 IEnumerable collections holding elements of type T. 
    /// </summary>
    /// <typeparam name="T">Type of collection elements.</typeparam>
    /// <param name="first">The first collection to compare.</param>
    /// <param name="second">The second collection to compare.</param>
    /// <returns>A signed integer that indicates the relative values of a and b:
    /// Less than zero: first is less than second;
    /// Zero: first is equal to second;
    /// Greater than zero: first is greater than second.
    /// </returns>
    /// <remarks>
    /// Can be called as either static method: EnumerableExtensions.Compare(a, b) or
    /// extension method: a.Compare(b).
    /// </remarks>
    public static int Compare<T>(this IEnumerable<T> first, IEnumerable<T> second)
    {
        // If one of collection objects is null, use the default Comparer class
        // (null is considered to be less than any other object)
        if (first == null || second == null)
            return Comparer.Default.Compare(first, second);

        var elementComparer = Comparer<T>.Default;
        int compareResult;

        using (var firstEnum = first.GetEnumerator())
        using (var secondEnum = second.GetEnumerator())
        {
            do
            {
                bool gotFirst = firstEnum.MoveNext();
                bool gotSecond = secondEnum.MoveNext();

                // Reached the end of collections => assume equal
                if (!gotFirst && !gotSecond)
                    return 0;

                // Different sizes => treat collection of larger size as "greater"
                if (gotFirst != gotSecond)
                    return gotFirst ? 1 : -1;

                compareResult = elementComparer.Compare(firstEnum.Current, secondEnum.Current);
            } while (compareResult == 0);
        }

        return compareResult;
    }
}
Breccia answered 13/8, 2013 at 14:11 Comment(0)
Z
2

If you're on .NET 4 (and it doesn't sound like you are), I think you might be able to do something clever with Enumerable.Zip. Something like:

var r = x.Zip(y, comparer.Compare).FirstOrDefault(c => c != 0);

though I can't see right now how to efficiently deal with the case where the shorter one is the same as the longer one, as far as it goes.

Edit: If you're only comparing arrays (or otherwise don't care about measuring your collections twice), then I think you can simply add:

if (r == 0) {
    r = int.Compare(x.Count(), y.Count());
}

You could even combine these as:

var r = x.Zip(y, comparer.Compare)
         .Concat(new [] { int.Compare(x.Count(), y.Count()) })
         .FirstOrDefault(c => c != 0)

(And if you're on .NET 3.5, then add a Zip extension method, because it's easy to write and seriously useful all over the place! I don't know why it wasn't included in the initial Linq release.)

Zerk answered 11/5, 2010 at 16:36 Comment(2)
Re: why no zip in the initial release: (1) even small methods have testing, documentation, etc, burden and we were very short on time for that release, (2) we were considering adding a special syntax for zip joins to the query comprehension syntax. Ultimately we decided not to, but it would have looked pretty foolish to write a method and then write a zip join query syntax that was incompatible with that method for some reason. It's a good idea to design these things all at once if you can.Philia
Of course any feature needs work to release. My comment was more to wonder why other methods made it in, and not Zip. There's methods from Linq's initial release I've still never even seen used once, anywhere. :-)Zerk
M
0

There isn't a built-in comparer. However, this is a requirement that arises frequently. I've covered the topic at substantial length in my SequenceComparer<T> article; here is a simplified implementation:

public class SequenceComparer<TElement> : Comparer<IEnumerable<TElement>>
{
    private readonly IComparer<TElement> _elementComparer;

    public SequenceComparer(IComparer<TElement> elementComparer = null)
    {
        _elementComparer = elementComparer ?? Comparer<TElement>.Default;
    }

    public override int Compare(IEnumerable<TElement> x, IEnumerable<TElement> y)
    {
        // Get enumerators to iterate over both sequences in sync.
        using (IEnumerator<TElement> xEnumerator = x.GetEnumerator())
        using (IEnumerator<TElement> yEnumerator = y.GetEnumerator())
        {
            // Advance both enumerators to their next element, 
            // until at least one passes the end of its sequence.
            bool xMove, yMove;
            while ((xMove = xEnumerator.MoveNext()) &&
                   (yMove = yEnumerator.MoveNext()))
            {
                // Compare the current pair of elements across the two sequences,
                // seeking element inequality.
                int elementComparison = _elementComparer.Compare(xEnumerator.Current, yEnumerator.Current);
                if (elementComparison != 0)
                    return elementComparison;
            }

            // Determine the relative length of the two sequences based on the final values of xMove and yMove.
            return xMove.CompareTo(yMove);
        }
    }
}
Michaels answered 24/3, 2016 at 22:0 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.