Handling warning for possible multiple enumeration of IEnumerable
Asked Answered
L

9

432

In my code I need to use an IEnumerable<> several times, resulting in the ReSharper error of "Possible multiple enumeration of IEnumerable".

Sample code:

public List<object> Foo(IEnumerable<object> objects)
{
    if (objects == null || !objects.Any())
        throw new ArgumentException();
        
    var firstObject = objects.First();
    var list = DoSomeThing(firstObject);        
    var secondList = DoSomeThingElse(objects);
    list.AddRange(secondList);
    
    return list;
}
  • I can change the objects parameter to be List and then avoid the possible multiple enumeration but then I don't get the highest object that I can handle.
  • Another thing that I can do is to convert the IEnumerable to List at the beginning of the method:

 public List<object> Foo(IEnumerable<object> objects)
 {
    var objectList = objects.ToList();
    // ...
 }

But this is just awkward.

What would you do in this scenario?

Leary answered 23/11, 2011 at 10:47 Comment(0)
G
557

The problem with taking IEnumerable as a parameter is that it tells callers "I wish to enumerate this". It doesn't tell them how many times you wish to enumerate.

I can change the objects parameter to be List and then avoid the possible multiple enumeration but then I don't get the highest object that I can handle.

The goal of taking the highest object is noble, but it leaves room for too many assumptions. Do you really want someone to pass a LINQ to SQL query to this method, only for you to enumerate it twice (getting potentially different results each time?)

The semantic missing here is that a caller, who perhaps doesn't take time to read the details of the method, may assume you only iterate once - so they pass you an expensive object. Your method signature doesn't indicate either way.

By changing the method signature to IList/ICollection, you will at least make it clearer to the caller what your expectations are, and they can avoid costly mistakes.

Otherwise, most developers looking at the method might assume you only iterate once. If taking an IEnumerable is so important, you should consider doing the .ToList() at the start of the method.

It's a shame .NET doesn't have an interface that is IEnumerable + Count + Indexer, without Add/Remove etc. methods, which is what I suspect would solve this problem.

Gormand answered 23/11, 2011 at 10:53 Comment(10)
Does ReadOnlyCollection<T> meet your desired interface requirements? msdn.microsoft.com/en-us/library/ms132474.aspxOrganist
@DanNeely I would suggest IReadOnlyCollection(T) (new with .net 4.5) as the best interface to convey the idea that it is an IEnumerable(T) which is intended to be enumerated multiple times. As this answer states, IEnumerable(T) itself is so generic that it may even refer to un-resetable content which cannot be enumerated over again without side effects. But IReadOnlyCollection(T) implies re-iterability.Raff
If you do the .ToList() at the start of the method, you first have to check if the parameter is not null. That means you'll get two places where you throw an ArgumentException: if the parameter is null and if it doesn't have any items.Heddi
I read this article on the semantics of IReadOnlyCollection<T> vs IEnumerable<T> and then posted a question about using IReadOnlyCollection<T> as a parameter, and in which cases. I think the conclusion I eventually came to is the logic to follow. That it should be used where enumeration is expected, not necessarily where there could be multiple enumeration.Fadein
Nice idea, except that it messes up other LinQ calls. YourMethod(inMemoryData.Where(Filter)) won't work if YourMethod accepts an IReadOnlyCollection, unless you call .ToList() or .ToArray() on the argument. While it is important not to pass LinQ to SQL queries to your method, I think it is easier to make sure .ToListAsync() is called on all your queries at a few places than adding .ToList() to tons of calls near your methods, which is annoying. The method itself can call .ToList() to avoid multiple enumerations - if needed.Fabriane
Why does taking IList over IEnumerable silence this analysis warning in Resharper? Usually the practice is to take the most open/generic type; so I don't see how IList fixes the potential multiple iterations issue. This whole warning just feels very confusing to me.Cryptoclastic
@Cryptoclastic an IList is supposed to be list-like, meaning an object that can be iterated multiple times. When you use IEnumerable, it doesn't know what concrete type you gave it and assume it may be an object that can be consumed only once. Objects like that won't be okay being cast as IList anyway.Barbabra
@binki: The idea is good, but unfortunately neither IList<T> nor ICollection<T> implement IReadOnlyCollection<T>. This is a big disadvantage for using it. I guess that interface would be used much more often if .Net had implemented a proper inheritance chain of these interfaces. :-(Marilou
@TobiasKnauss I agree that that is annoying. It is an unfortunate result of backwards compatibility. Also, even though ICollection<T> and IReadOnlyCollection<T> have similar-sounding names, they are actually rather unlike each other. For example, ICollection<T>.Contains(T) exists while IReadOnlyCollection<T>.Contains(T) does not. List<T> returned by .ToList<T>() does implement all of those interfaces.Raff
@TobiasKnauss Also, you can provide an adapter class to enable you to consume IList<T> as IReadOnlyList<T> or, if writing extension methods, just write more overloads (including overloads for various concrete types to resolve the resulting ambiguity) which all share the same implementation via adaptation.Raff
V
35

If your data is always going to be repeatable, perhaps don't worry about it. However, you can unroll it too - this is especially useful if the incoming data could be large (for example, reading from disk/network):

if(objects == null) throw new ArgumentException();
using(var iter = objects.GetEnumerator()) {
    if(!iter.MoveNext()) throw new ArgumentException();

    var firstObject = iter.Current;
    var list = DoSomeThing(firstObject);  

    while(iter.MoveNext()) {
        list.Add(DoSomeThingElse(iter.Current));
    }
    return list;
}

Note I changed the semantic of DoSomethingElse a bit, but this is mainly to show unrolled usage. You could re-wrap the iterator, for example. You could make it an iterator block too, which could be nice; then there is no list - and you would yield return the items as you get them, rather than add to a list to be returned.

Vertigo answered 23/11, 2011 at 10:54 Comment(7)
+1 Well that is a very nice code, but- I loose the prettiness and readability of the code.Leary
@Leary not everything is pretty ;p I don't the the above is unreadable, though. Especially if it is made into an iterator block - quite cute, IMO.Vertigo
I can just write: "var objectList = objects.ToList();" and save all the Enumerator handling.Leary
@Leary in the question you explicitly said you didn't really want to do that ;p Also, a ToList() approach may not suit if the data if very large (potentially, infinite - sequences do not need to be finite, but can still be iterated). Ultimately, I'm just presenting an option. Only you know the full context to see if it is warranted. If your data is repeatable, another valid option is: don't change anything! Ignore R# instead... it all depends on the context.Vertigo
I think Marc's solution is quite nice. 1. It is very clear how the 'list' is initialized, it is very clear how the list is iterated, and it is very clear which members of the list are operated on.Cyanohydrin
isn't the objects variable contains repeatable data by the definition since it comes as the parameter? how could it be unrepeatable?Sen
@Sen IEnumerable<T> just means "a sequence"; if that sequence is coming from some in-memory list/array/etc, then sure: it'll probably be repeatable - however - it could be pulling data from a socket, or a PRNG, or a ton of other places; fundamentally, IEnumerable<T> is not guaranteed to be repeatableVertigo
H
28

Using IReadOnlyCollection<T> or IReadOnlyList<T> in the method signature instead of IEnumerable<T>, has the advantage of making explicit that you might need to check the count before iterating, or to iterate multiple times for some other reason.

However they have a huge downside that will cause problems if you try to refactor your code to use interfaces, for instance to make it more testable and friendly to dynamic proxying. The key point is that IList<T> does not inherit from IReadOnlyList<T>, and similarly for other collections and their respective read-only interfaces. (In short, this is because .NET 4.5 wanted to keep ABI compatibility with earlier versions. But they didn't even take the opportunity to change that in .NET core.)

This means that if you get an IList<T> from some part of the program and want to pass it to another part that expects an IReadOnlyList<T>, you can't! You can however pass an IList<T> as an IEnumerable<T>.

In the end, IEnumerable<T> is the only read-only interface supported by all .NET collections including all collection interfaces. Any other alternative will come back to bite you as you realize that you locked yourself out from some architecture choices. So I think it's the proper type to use in function signatures to express that you just want a read-only collection.

(Note that you can always write a IReadOnlyList<T> ToReadOnly<T>(this IList<T> list) extension method that simply casts if the underlying type supports both interfaces, but you have to add it manually everywhere when refactoring, where as IEnumerable<T> is always compatible.)

As always this is not an absolute, if you're writing database-heavy code where accidental multiple enumeration would be a disaster, you might prefer a different trade-off.

Heyday answered 23/7, 2018 at 18:8 Comment(1)
+1 For coming on an old post and adding documentation on new ways to solve this issue with the new features provided by the .NET platform (i.e. IReadOnlyCollection<T>)Allain
M
11

With .NET 6/ C# 10

.. and beyond you can attempt to determine the number of elements in a sequence without forcing an enumeration using Enumerable.TryGetNonEnumeratedCount(IEnumerable, Int32) method.

This method returns true if the count of source can be determined without enumeration; otherwise, false. So you can check if further implementation is needed.

using System;
using System.Collections.Generic;
using System.Linq;
                    
public class Program
{
    public static void Main()
    {
        IEnumerable<int> arrayOne = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

        var canGetCountDirectly = arrayOne.TryGetNonEnumeratedCount(out int theCount);

        Console.WriteLine($"Count can be returned directly = {canGetCountDirectly}");
        Console.WriteLine($"Count = {theCount}");
    }
}
Maley answered 16/11, 2021 at 5:8 Comment(0)
P
2

If the aim is really to prevent multiple enumerations than the answer by Marc Gravell is the one to read, but maintaining the same semantics you could simple remove the redundant Any and First calls and go with:

public List<object> Foo(IEnumerable<object> objects)
{
    if (objects == null)
        throw new ArgumentNullException("objects");

    var first = objects.FirstOrDefault();

    if (first == null)
        throw new ArgumentException(
            "Empty enumerable not supported.", 
            "objects");

    var list = DoSomeThing(first);  

    var secondList = DoSomeThingElse(objects);

    list.AddRange(secondList);

    return list;
}

Note, that this assumes that you IEnumerable is not generic or at least is constrained to be a reference type.

Pagepageant answered 23/11, 2011 at 11:0 Comment(5)
objects is still being passed to DoSomethingElse which is returning a list, so the possibility that you are still enumerating twice is still there. Either way if the collection was a LINQ to SQL query, you hit the DB twice.Gormand
@PaulStovell, Read this: "If the aim is really to prevent multiple enumerations than the answer by Marc Gravell is the one to read" =)Leary
It has been suggested on some other stackoverflow posts on throwing exceptions for empty lists and I'll suggest it here: why throw an exception on an empty list? Get an empty list, give an empty list. As a paradigm, that's pretty straightforward. If the caller doesn't know they are passing an empty list in, that's there problem.Cyanohydrin
@Keith Hoffman, the sample code maintains the same behavior as the code the OP posted, where the use of First will throw an exception for an empty list. I don't think its possible to say never throw exception on empty list or always throw exception on empty list. It depends...Leboeuf
Fair enough Joao. More of food for thought than a criticism of your post.Cyanohydrin
O
2

I usually overload my method with IEnumerable and IList in this situation.

public static IEnumerable<T> Method<T>( this IList<T> source ){... }

public static IEnumerable<T> Method<T>( this IEnumerable<T> source )
{
    /*input checks on source parameter here*/
    return Method( source.ToList() );
}

I take care to explain in the summary comments of the methods that calling IEnumerable will perform a .ToList().

The programmer can choose to .ToList() at a higher level if multiple operations are being concatenated and then call the IList overload or let my IEnumerable overload take care of that.

Outfall answered 22/6, 2015 at 10:10 Comment(4)
This is horrific.Overstep
@MikeChamberlain Yeah, it really is horrific and utterly clean at the same time. Unluckily some heavy scenarios need this approach.Outfall
But then you would be recreating the array each time you pass it to the IEnumerable parameratized method. I agree with @MikeChamberlain in this regard, this is a terrible suggestion.Sleeper
@Sleeper If you don't need the list to be recreated, you perform a ToList somewhere else and use the IList overload. That's the point of this solution.Outfall
S
1

If you only need to check the first element you can peek on it without iterating the whole collection:

public List<object> Foo(IEnumerable<object> objects)
{
    object firstObject;
    if (objects == null || !TryPeek(ref objects, out firstObject))
        throw new ArgumentException();

    var list = DoSomeThing(firstObject);
    var secondList = DoSomeThingElse(objects);
    list.AddRange(secondList);

    return list;
}

public static bool TryPeek<T>(ref IEnumerable<T> source, out T first)
{
    if (source == null)
        throw new ArgumentNullException(nameof(source));

    IEnumerator<T> enumerator = source.GetEnumerator();
    if (!enumerator.MoveNext())
    {
        first = default(T);
        source = Enumerable.Empty<T>();
        return false;
    }

    first = enumerator.Current;
    T firstElement = first;
    source = Iterate();
    return true;

    IEnumerable<T> Iterate()
    {
        yield return firstElement;
        using (enumerator)
        {
            while (enumerator.MoveNext())
            {
                yield return enumerator.Current;
            }
        }
    }
}
Subdue answered 7/11, 2018 at 18:9 Comment(2)
Very nice solution! But you did not call Dispose in the case where the source is empty.Chunky
Furhermore there is no Dispose when there is no iteration over the "new" source.Chunky
C
0

I like the approach of methods taking a IReadOnlyCollection if they may need to enumerate the thing multiple times. I would like to offer a utility method that can be called whenever to turn an IEnumerable into a IReadOnlyCollection. Unlike ToArray() or ToList() it can avoid the cost of extra allocations changing the type of the underlying collection:

    public static class EnumerableUtils
    {
        /* Ensures an IEnumerable is only enumerated once  */
        public static IReadOnlyCollection<T> AsEnumerated<T>(this IEnumerable<T> original)
        {
            if (original is IReadOnlyCollection<T> originalCollection)
            {
                return originalCollection;
            }

            return ImmutableArray.CreateRange(original);
        }
    }
Cafeteria answered 4/11, 2021 at 17:32 Comment(0)
R
-3

First off, that warning does not always mean so much. I usually disabled it after making sure it's not a performance bottle neck. It just means the IEnumerable is evaluated twice, wich is usually not a problem unless the evaluation itself takes a long time. Even if it does take a long time, in this case your only using one element the first time around.

In this scenario you could also exploit the powerful linq extension methods even more.

var firstObject = objects.First();
return DoSomeThing(firstObject).Concat(DoSomeThingElse(objects).ToList();

It is possible to only evaluate the IEnumerable once in this case with some hassle, but profile first and see if it's really a problem.

Regelation answered 23/11, 2011 at 10:59 Comment(7)
I work a lot with IQueryable and turning those kind of warnings off is very dangerous.Leary
Evaluating twice is a bad thing in my books. If the method takes IEnumerable, I may be tempted to pass a LINQ to SQL query to it, which results in multiple DB calls. Since the method signature doesn't indicate that it may enumerate twice, it should either change the signature (accept IList/ICollection) or only iterate onceGormand
optimizing early and ruining readability and thereby the possibility to do optimizations that make a diffrence later is way worse in my book.Regelation
When you've got a database query, making sure it's only evaluated once already makes a difference. It's not just some theoretical future benefit, it's a measurable real benefit right now. Not only that, evaluating only once is not merely beneficial for performance, it's also needed for correctness. Executing a database query twice may produce different results, as someone may have issued an update command between the two query evaluations.Renaissance
@hvd I did not recommend no such thing. Of course you should not enumerate IEnumerables that gives different results each time you iterate it or that take a really long time to iterate. See Marc Gravells answer - He also state first and foremost "perhaps don't worry". The thing is - Lots of the times you see this you've got nothing to worry about.Regelation
I did read Marc Gravell's answer already. :) It starts as "If your data is always going to be repeatable", which is an important distinction to make, and one you didn't make in your answer. If the data is always going to be repeatable, I'd actually go for the top-voted and accepted answer and simply use ICollection<T> (or maybe the new IReadOnlyCollection<T>), at which point it becomes safe to assume that it's not a lazily evaluated enumerable. If it's not guaranteed to be repeatable, I'd go for Marc Gravell's approach of spelling out the GetEnumerator/MoveNext/Current accesses.Renaissance
You are still enumerating it twice here.Skittish

© 2022 - 2024 — McMap. All rights reserved.