List<IEnumerator>.All(e => e.MoveNext()) doesn't move my enumerators on
Asked Answered
S

6

17

I'm trying to track down a bug in our code. I've boiled it down to the snippet below. In the example below I have a grid of ints (a list of rows), but I want to find the indexes of the columns that have a 1. The implementation of this is to create an enumerator for each row and step through each column in turn by keeping the enumerators in step.

class Program
{
    static void Main(string[] args)
    {
        var ints = new List<List<int>> {
            new List<int> {0, 0, 1},    // This row has a 1 at index 2
            new List<int> {0, 1, 0},    // This row has a 1 at index 1
            new List<int> {0, 0, 1}     // This row also has a 1 at index 2
        };
        var result = IndexesWhereThereIsOneInTheColumn(ints);

        Console.WriteLine(string.Join(", ", result)); // Expected: "1, 2"
        Console.ReadKey();
    }


    private static IEnumerable<int> IndexesWhereThereIsOneInTheColumn(
        IEnumerable<List<int>> myIntsGrid)
    {
        var enumerators = myIntsGrid.Select(c => c.GetEnumerator()).ToList();

        short i = 0;
        while (enumerators.All(e => e.MoveNext())) {
            if (enumerators.Any(e => e.Current == 1))
                yield return i;
            i++;

            if (i > 1000)
                throw new Exception("You have gone too far!!!");
        }
    }

}

However I have noticed that MoveNext() is not remembered each time around the while loop. MoveNext() always returns true, and Current is always 0. Is this a purposeful feature of Linq to make it more side effect free?

I noticed that this works:

    private static IEnumerable<int> IndexesWhereThereIsOneInTheColumn(
        IEnumerable<List<int>> myIntsGrid)
    {
        var enumerators = myIntsGrid.Select(c => 
            c.ToArray().GetEnumerator()).ToList(); // added ToArray() 

        short i = 0;
        while (enumerators.All(e => e.MoveNext())) {
            if (enumerators.Any(e => (int)e.Current == 1)) // added cast to int
                yield return i;
            i++;
        }
    }

So is this just a problem with List?

Shoemaker answered 8/5, 2014 at 9:16 Comment(4)
Why don't you simple use foreach loops?Litton
@Litton I could not find a nice looking implementation using foreach. My first attempt had two breaks in it, and I think that the Linq implementation is cleaner.Shoemaker
Functions passed to LINQ should not have side effects.Ivatts
Maddening! I just spent an hour in this exact issue, trying to convince myself that I wasn't crazy...Hest
O
14

As Sriram Sakthivel's answer says the issue is due to lack of boxing and accidentally the list enumerator implementation being a struct, not a reference type. Usually, one would not expect the value-type behavior for an enumerator, as most are either exposed by the IEnumerator/IEnumerator<T> interfaces, or are reference types themselves. A quick way to go around this is to change this line

var enumerators = myIntsGrid.Select(c => c.GetEnumerator()).ToList();

to

var enumerators 
    = myIntsGrid.Select(c => (IEnumerator) c.GetEnumerator()).ToList();

instead.

The above code will construct a list of already boxed enumerators, which will be treated as reference type instances, because of the interface cast. From that moment on, they should behave as you expect them to in your later code.


If you need a generic enumerator (to avoid casts when latter using the enumerator.Current property), you can cast to the appropriate generic IEnumerator<T> interface:

c => (IEnumerator<int>) c.GetEnumerator()

or even better

c => c.GetEnumerator() as IEnumerator<int>

The as keyword is said to perform a lot better than direct casts, and in the case of a loop it could bring an essential performance benefit. Just be careful that as returns null if the cast fails As per Flater's request from comments:. In the OP's case, it is guaranteed the enumerator implements IEnumerator<int>, so it is safe to go for an as cast.

Opal answered 8/5, 2014 at 9:30 Comment(8)
Nitpicking: use (IEnumerator<int>) instead of (IEnumerator) so you don't have to cast e.Current inside the while loop.Fastback
@DominicKexel, good point, I will incorporate it as an edit. ThanksOpal
It's not due to boxing. The OP's code mutates a copy, but not a boxed copy.Ivatts
@CodesInChaos, correct, the expression boxing is not appropriately used. The problem is actually the lack of boxing that enables the value type copying behavior. I edited this, thanks!Opal
@IvayloSlavov: It may be important to note that foo as Bar returns null when the cast has failed, whereas (Bar) foo throws an exception. They're both useful in their own ways, depends how you want to handle faulty casting :)Sedate
@Flater, yes, it is important to know that. I had this in a previous version of the post, but I removed it because I thought I was too verbose. In fact, the question's code uses List<int> which is guaranteed to expose a not-null IEnumerator<int>, so the as keyword is safe to use. Thanks anyway for pointing it out!Opal
@IvayloSlavov: Yeah I assumed it was irrelevant in this case. But someone could conclude from your post that foo as Bar is always better. It's for those people that I mentioned it :)Sedate
@Flater, I added this again as a note for devs to have in mind.Opal
R
17

It is because the enumerator of List<T> is a struct whereas the enumerator of Array is a class.

So when you call Enumerable.All with the struct, copy of enumerator is made and passed as a parameter to Func since structs are copied by value. So e.MoveNext is called on the copy, not the original.

Try this:

Console.WriteLine(new List<int>().GetEnumerator().GetType().IsValueType);
Console.WriteLine(new int[]{}.GetEnumerator().GetType().IsValueType);

It prints:

True
False
Reading answered 8/5, 2014 at 9:26 Comment(4)
It's not due to boxing. The OP's code mutates a copy, but not a boxed copy.Ivatts
@Ivatts Good catch, revised my answer. See if that makes sense :)Reading
That was my first idea when I read the question, but I couldn't imagine that enumerators would actually be implemented as value type sometimes. Quite error-prone.Terzas
@Mene: Imagine it; it's true! It is error-prone, but remember 99.9999% of the time the enumerator is only seen by the code generated for the foreach loop; creating your own unboxed enumerators is a rare scenario. The design team felt that the performance win of avoiding collection pressure was worth the very rare cases like this one where the code is confusing.Antho
O
14

As Sriram Sakthivel's answer says the issue is due to lack of boxing and accidentally the list enumerator implementation being a struct, not a reference type. Usually, one would not expect the value-type behavior for an enumerator, as most are either exposed by the IEnumerator/IEnumerator<T> interfaces, or are reference types themselves. A quick way to go around this is to change this line

var enumerators = myIntsGrid.Select(c => c.GetEnumerator()).ToList();

to

var enumerators 
    = myIntsGrid.Select(c => (IEnumerator) c.GetEnumerator()).ToList();

instead.

The above code will construct a list of already boxed enumerators, which will be treated as reference type instances, because of the interface cast. From that moment on, they should behave as you expect them to in your later code.


If you need a generic enumerator (to avoid casts when latter using the enumerator.Current property), you can cast to the appropriate generic IEnumerator<T> interface:

c => (IEnumerator<int>) c.GetEnumerator()

or even better

c => c.GetEnumerator() as IEnumerator<int>

The as keyword is said to perform a lot better than direct casts, and in the case of a loop it could bring an essential performance benefit. Just be careful that as returns null if the cast fails As per Flater's request from comments:. In the OP's case, it is guaranteed the enumerator implements IEnumerator<int>, so it is safe to go for an as cast.

Opal answered 8/5, 2014 at 9:30 Comment(8)
Nitpicking: use (IEnumerator<int>) instead of (IEnumerator) so you don't have to cast e.Current inside the while loop.Fastback
@DominicKexel, good point, I will incorporate it as an edit. ThanksOpal
It's not due to boxing. The OP's code mutates a copy, but not a boxed copy.Ivatts
@CodesInChaos, correct, the expression boxing is not appropriately used. The problem is actually the lack of boxing that enables the value type copying behavior. I edited this, thanks!Opal
@IvayloSlavov: It may be important to note that foo as Bar returns null when the cast has failed, whereas (Bar) foo throws an exception. They're both useful in their own ways, depends how you want to handle faulty casting :)Sedate
@Flater, yes, it is important to know that. I had this in a previous version of the post, but I removed it because I thought I was too verbose. In fact, the question's code uses List<int> which is guaranteed to expose a not-null IEnumerator<int>, so the as keyword is safe to use. Thanks anyway for pointing it out!Opal
@IvayloSlavov: Yeah I assumed it was irrelevant in this case. But someone could conclude from your post that foo as Bar is always better. It's for those people that I mentioned it :)Sedate
@Flater, I added this again as a note for devs to have in mind.Opal
R
2

Alternatively, you could do it with a lambda extension

var ids = Enumerable.Range(0,ints.Max (row => row.Count)).
      Where(col => ints.Any(row => (row.Count>col)? row[col] == (1) : false));

or

var ids = Enumerable.Range(0,ints.Max (row=> row.Count)).
      Where(col => ints.Any (row => row.ElementAtOrDefault(col) == 1));
Riordan answered 8/5, 2014 at 9:26 Comment(0)
L
1

Here's a simple implementation using loops and yield:

private static IEnumerable<int> IndexesWhereThereIsOneInTheColumn(
    IEnumerable<List<int>> myIntsGrid)
{
    for (int i=0; myIntsGrid.Max(l=>l.Count) > i;i++)
    {
        foreach(var row in myIntsGrid)
        {
            if (row.Count > i && row[i] == 1)
            {
                yield return i;
                break;
            }
        }
    }       
}

Alternatively, use this inside the for loop:

if (myIntsGrid.Any(row => row.Count > i && row[i] == 1)) yield return i;
Litton answered 8/5, 2014 at 9:38 Comment(2)
That returns the index of the rows that contain a 1, I needed the columns.Shoemaker
Aha, I misunderstood. Edited.Litton
B
1

Just for fun, here's a neat LINQ query that won't cause hard-to-trace side effects in your code:

IEnumerable<int> IndexesWhereThereIsOneInTheColumn(IEnumerable<IEnumerable<int>> myIntsGrid)
{
    return myIntsGrid
        // Collapse the rows into a single row of the maximum value of all rows
        .Aggregate((acc, x) => acc.Zip(x, Math.Max))
        // Enumerate the row
        .Select((Value,Index) => new { Value, Index })
        .Where(x => x.Value == 1)
        .Select(x => x.Index);
}
Bartz answered 9/5, 2014 at 8:35 Comment(2)
Solves the posted (cut down) problem, but I'm not actually using ints in the grid for the real problem. However, I might be able to create an aggregator that bubbles the value I want to the top.Shoemaker
Sure, you'd just need to change Math.Max to a function that picks your chosen value and then obviously the Where operator tooBartz
W
0

Why can't you just get those indexes like this:

var result = ints.Select (i => i.IndexOf(1)).Distinct().OrderBy(i => i);

Seems to be much easier...

Wellbred answered 8/5, 2014 at 9:30 Comment(1)
That will work for this example, but I wanted to reduce the amount of code that I posted. My actual bug is more involved, but the code I posted demonstrates the crux of the problem.Shoemaker

© 2022 - 2024 — McMap. All rights reserved.