"Unzip" IEnumerable dynamically in C# or best alternative
E

4

8

Let's assume you have a function that returns a lazily-enumerated object:

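struct AnimalCount
{
    public int Chickens;
    public int Goats;

    public AnimalCount(int chickens, int goats)
    {
        Chickens = chickens;
        Goats = goats;
    }
}

IEnumerable<AnimalCount> FarmsInEachPen()
{
    ....
    yield return new AnimalCount(x, y);
    ....
}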

You also have two functions that consume two separate IEnumerables, for example:

ConsumeChicken(IEnumerable<int>);
ConsumeGoat(IEnumerable<int>);

How can you call ConsumeChicken and ConsumeGoat without (a) calling ToList() on FarmsInEachPen() beforehand, because it might have two zillion records, and (b) without multi-threading?

Basically:

ConsumeChicken(FarmsInEachPen().Select(x => x.Chickens));
ConsumeGoat(FarmsInEachPen().Select(x => x.Goats));

But without forcing the double enumeration.

I can solve it with multithreading, but it gets unnecessarily complicated with a buffer queue for each list.

So I'm looking for a way to split the AnimalCount enumerator into two int enumerators without fully evaluating AnimalCount. There is no problem running ConsumeGoat and ConsumeChicken together in lock-step.

I can feel the solution just out of my grasp but I'm not quite there. I'm thinking along the lines of a helper function that returns an IEnumerable being fed into ConsumeChicken, and each time the iterator is used, it internally calls ConsumeGoat, thus executing the two functions in lock-step. Except, of course, I don't want to call ConsumeGoat more than once.

Erickaericksen answered 28/3, 2013 at 19:28 Comment(4)
Stop. Take a deep breath. Arrange your thoughts in a logical manner. You skipped from AnimalCount to IEnumerable<AnimalCount> to IEnumerable<int> (I presume? I can't even tell your intent...). As it stands, this is getting flagged as not a real question. - Copybook
Why are you combining them into AnimalCount in the first place? - Affection
@Daniel let's say it's not me combining them. I have, from an outside source, a zillion-line log file with ChickenCount and GoatCount on each line. And I have two separate 3rd party functions that want their data as an IEnumerable. - Erickaericksen
So we're not allowed to store half the array in temporary storage of some kind, right? - Copybook
E
1

I figured it out, thanks in large part to the path that @Lee put me on.

You need to share a single enumerator between the two zips, and use an adapter function to project the correct element into the sequence.

private static IEnumerable<object> ConsumeChickens(IEnumerable<int> xList)
{
    foreach (var x in xList)
    {
        Console.WriteLine("X: " + x);
        yield return null;
    }
}

private static IEnumerable<object> ConsumeGoats(IEnumerable<int> yList)
{
    foreach (var y in yList)
    {
        Console.WriteLine("Y: " + y);
        yield return null;
    }
}

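private static IEnumerable<int> SelectHelper(IEnumerator<AnimalCount> enumerator, int i)
{
    // Only the chicken adapter (i == 0) advances the shared enumerator;
    // the goat adapter (i != 0) just reads Current and relies on Zip to stop it.
    bool c = i != 0 || enumerator.MoveNext();
    while (c)
    {
        if (i == 0)
        {
            yield return enumerator.Current.Chickens;
            c = enumerator.MoveNext();
        }
        else
        {
            yield return enumerator.Current.Goats;
        }
    }
}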

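private static void Main(string[] args)
{
    // GetAnimals() stands in for the lazy source (FarmsInEachPen() in the question).
    // One enumerator is shared by both adapters, so the source is walked only once.
    var enumerator = GetAnimals().GetEnumerator();

    var chickensList = ConsumeChickens(SelectHelper(enumerator, 0));
    var goatsList = ConsumeGoats(SelectHelper(enumerator, 1));

    // Zip pulls one item from each sequence per step; ToList() drives the whole loop.
    var temp = chickensList.Zip(goatsList, (i, i1) => (object) null);
    var iterations = temp.ToList().Count;

    Console.WriteLine("Total iterations: " + iterations);
}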
Erickaericksen answered 28/3, 2013 at 20:21 Comment(5)
I don't have control over the source function or the two consumer functions. I'm working on making this prettier. - Erickaericksen
This relies on the fact that Enumerable.Zip() calls the enumerator of each sequence in alternation, and that apparently the second enumerator is called before the first. Neither of these facts is guaranteed by the documentation, meaning this solution may not always work, or could suddenly stop working in a future version of .NET. - Corkhill
That is correct. However, I really do need to consume the objects as lists and not as individual objects (batching). - Erickaericksen
If you don't have control over the two consumer functions, how do you make them yield return after each input? - Corkhill
@BlueRaja Exactly the point of my answer, which got downvoted. - Almund
C
4

I don't think there is a way to do what you want, since ConsumeChickens(IEnumerable<int>) and ConsumeGoats(IEnumerable<int>) are being called sequentially, each of them enumerating a list separately - how do you expect that to work without two separate enumerations of the list?

Depending on the situation, a better solution is to have ConsumeChicken(int) and ConsumeGoat(int) methods (which each consume a single item), and call them in alternation. Like this:

foreach (var animal in animals)
{
    ConsumeChicken(animal.Chickens);
    ConsumeGoat(animal.Goats);
}

This will enumerate the animals collection only once.
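To make the "only once" claim concrete, here is a minimal self-contained sketch. The `SinglePassDemo` class, the `Produced` counter, the three-pen stand-in for `FarmsInEachPen()` and the summing consumers are all assumptions for illustration, not anything from the question's real code.

```csharp
using System;
using System.Collections.Generic;

struct AnimalCount
{
    public int Chickens;
    public int Goats;
    public AnimalCount(int chickens, int goats) { Chickens = chickens; Goats = goats; }
}

static class SinglePassDemo
{
    // Counts how many items the lazy source has produced (illustration only).
    public static int Produced;
    public static int ChickenTotal;
    public static int GoatTotal;

    // Hypothetical stand-in for FarmsInEachPen(): lazily yields three pens.
    public static IEnumerable<AnimalCount> FarmsInEachPen()
    {
        for (int i = 1; i <= 3; i++)
        {
            Produced++;
            yield return new AnimalCount(i, i * 10);
        }
    }

    // Hypothetical per-item consumers; here they just keep running totals.
    public static void ConsumeChicken(int chickens) { ChickenTotal += chickens; }
    public static void ConsumeGoat(int goats) { GoatTotal += goats; }

    public static void Run()
    {
        Produced = ChickenTotal = GoatTotal = 0;
        // One foreach, so the source iterator runs exactly once.
        foreach (var animal in FarmsInEachPen())
        {
            ConsumeChicken(animal.Chickens);
            ConsumeGoat(animal.Goats);
        }
    }

    static void Main()
    {
        Run();
        Console.WriteLine("Produced: " + Produced);
        Console.WriteLine("Chickens: " + ChickenTotal + ", Goats: " + GoatTotal);
    }
}
```

With this stand-in source, `Produced` ends at 3, which is one pass over three pens.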


Also, a note: depending on your LINQ provider and what exactly it is you're trying to do, there may be better options. For example, if you're trying to get the total sum of both chickens and goats from a database using LINQ to SQL or LINQ to Entities, the following query

from a in animals
group a by 0 into g
select new 
{
    TotalChickens = g.Sum(x => x.Chickens), 
    TotalGoats = g.Sum(x => x.Goats)
}

will result in a single query, and do the summation on the database-end, which is greatly preferable to pulling the entire table over and doing the summation on the client end.
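For a plain in-memory or lazy sequence (LINQ to Objects) the grouping trick is less attractive, because `GroupBy` buffers every element before producing the group. A rough single-pass equivalent for that case, assuming the `AnimalCount` struct from the question with public fields and a constructor, could use `Aggregate`. The `TotalsDemo` class and its `Animals()` source are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

struct AnimalCount
{
    public int Chickens;
    public int Goats;
    public AnimalCount(int chickens, int goats) { Chickens = chickens; Goats = goats; }
}

static class TotalsDemo
{
    // Hypothetical lazy stand-in for the animals sequence.
    public static IEnumerable<AnimalCount> Animals()
    {
        yield return new AnimalCount(1, 10);
        yield return new AnimalCount(2, 20);
        yield return new AnimalCount(3, 30);
    }

    // Folds both totals in one pass, without buffering the sequence.
    public static AnimalCount Totals(IEnumerable<AnimalCount> animals)
    {
        return animals.Aggregate(
            new AnimalCount(0, 0),
            (acc, a) => new AnimalCount(acc.Chickens + a.Chickens, acc.Goats + a.Goats));
    }

    static void Main()
    {
        var totals = Totals(Animals());
        Console.WriteLine("TotalChickens: " + totals.Chickens + ", TotalGoats: " + totals.Goats);
    }
}
```

This only helps when the consumers are simple folds like sums. It does not help if the consumers must each receive an IEnumerable.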

Corkhill answered 28/3, 2013 at 20:1 Comment(1)
Thanks for your help, but I figured it out. See my answer. - Erickaericksen
A
2

The way you have posed your problem, there is no way to do this. IEnumerable<T> is a pull enumerable - that is, you call GetEnumerator to get a cursor at the front of the sequence and then repeatedly ask "Give me the next item" (MoveNext/Current). You can't, on one thread, have two different things pulling from animals.Select(a => a.Chickens) and animals.Select(a => a.Goats) at the same time. You would have to do one then the other (which would require materializing the second).
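A small sketch of what "one then the other" costs when nothing is materialized: each consumer pulls through its own enumerator, so the lazy source runs twice. The `DoubleEnumerationDemo` class, the `Produced` counter, and the summing consumers are assumptions made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

struct AnimalCount
{
    public int Chickens;
    public int Goats;
    public AnimalCount(int chickens, int goats) { Chickens = chickens; Goats = goats; }
}

static class DoubleEnumerationDemo
{
    // Counts how many items the lazy source has produced (illustration only).
    public static int Produced;

    // Hypothetical stand-in for the lazy source: three pens.
    public static IEnumerable<AnimalCount> FarmsInEachPen()
    {
        for (int i = 1; i <= 3; i++)
        {
            Produced++;
            yield return new AnimalCount(i, i * 10);
        }
    }

    // Hypothetical sequence consumers; each one drains its whole input.
    public static int ConsumeChicken(IEnumerable<int> chickens) { return chickens.Sum(); }
    public static int ConsumeGoat(IEnumerable<int> goats) { return goats.Sum(); }

    public static int Run()
    {
        Produced = 0;
        var animals = FarmsInEachPen();
        // Each call gets a fresh enumerator, so the iterator body runs from the start twice.
        ConsumeChicken(animals.Select(a => a.Chickens));
        ConsumeGoat(animals.Select(a => a.Goats));
        return Produced;
    }

    static void Main()
    {
        Console.WriteLine("Items produced by the source: " + Run());
    }
}
```

Here `Run()` returns 6 for a three-item source, which is the double enumeration the question wants to avoid.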

The suggestion BlueRaja made is one way to change the problem slightly. I would suggest going that route.

The other alternative is to utilize IObservable<T> from Microsoft's Reactive Extensions (Rx), a push enumerable. I won't go into the details of how you would do that, but it's something you could look into.
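To give a feel for the push idea without pulling in Rx, here is a hand-rolled sketch using plain delegates. It is not the Rx API. `PushDemo`, `PushAll` and the two-pen source are hypothetical names invented for the example, and it assumes the consumers can be rewritten as per-item handlers.

```csharp
using System;
using System.Collections.Generic;

struct AnimalCount
{
    public int Chickens;
    public int Goats;
    public AnimalCount(int chickens, int goats) { Chickens = chickens; Goats = goats; }
}

static class PushDemo
{
    // Hypothetical stand-in for the lazy source: two pens.
    public static IEnumerable<AnimalCount> FarmsInEachPen()
    {
        yield return new AnimalCount(1, 10);
        yield return new AnimalCount(2, 20);
    }

    // Enumerates the source once and pushes every item to each handler in turn.
    public static void PushAll(IEnumerable<AnimalCount> source, params Action<AnimalCount>[] handlers)
    {
        foreach (var item in source)
        {
            foreach (var handler in handlers)
            {
                handler(item);
            }
        }
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        PushAll(FarmsInEachPen(),
            a => log.Add("chickens " + a.Chickens),
            a => log.Add("goats " + a.Goats));
        return log;
    }

    static void Main()
    {
        foreach (var line in Run())
        {
            Console.WriteLine(line);
        }
    }
}
```

The handlers run in lock-step per item: chickens, then goats, then on to the next pen. The trade-off is the same as with Rx: the consumers receive items one at a time rather than as an IEnumerable.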

Edit:

The above is assuming that ConsumeChickens and ConsumeGoats are both returning void or are at least not returning IEnumerable<T> themselves - which seems like an obvious assumption. I'd appreciate it if the lame downvoter would actually comment.

Almund answered 28/3, 2013 at 20:11 Comment(0)
B
2

Actually, the simplest way to achieve what you want is to convert the FarmsInEachPen return value to a push collection (IObservable) and use Reactive Extensions to work with it:

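// Requires the Rx package (using System; using System.Reactive.Subjects;).
var observable = new Subject<AnimalCount>();
// Subscribe rather than Do: Do only builds a new observable and runs nothing by itself.
observable.Subscribe(x => DoSomethingWithChicken(x.Chickens));
observable.Subscribe(x => DoSomethingWithGoat(x.Goats));

foreach (var item in FarmsInEachPen())
{
    observable.OnNext(item);
}
observable.OnCompleted();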
Breast answered 28/3, 2013 at 21:21 Comment(3)
Interesting. Will this project IEnumerable<Chicken,Goat> to IEnumerable<Chicken>,IEnumerable<Goat>, or will I have to modify DoSomethingWithChicken/Goat to use it? - Erickaericksen
If you want to do it in a flow-like model (i.e. without creating any other collections to accumulate items), you will need to write a function that accepts each element separately, but there are actually functions in Rx that convert an observable to an enumerable. - Breast
The latter is what I'd be interested in. I'm going to download Rx and see what I can do with it. I'll revisit this question and change the accepted answer if it can do that. - Erickaericksen
