Create batches in LINQ
Can someone suggest a way to create batches of a certain size in LINQ?

Ideally I want to be able to perform operations in chunks of some configurable amount.

Ornithology answered 5/12, 2012 at 20:27 Comment(3)
This question was asked almost 9 years ago, and now there is an Enumerable.Chunk static method in LINQ. Check out the documentation here: learn.microsoft.com/en-us/dotnet/api/…Ornithology
there are still a lot of people sitting on .NET Core 3 or 5, or even the good old .NET FrameworkTigges
@SergeyBerezovskiy that's a really good call. I think the question and answers should remain for that purpose - just wanted to provide context on why I originally asked it :DOrnithology
Score: 127

An Enumerable.Chunk() extension method was added to .NET 6.0.

Example:

var list = new List<int> { 1, 2, 3, 4, 5, 6, 7 };

var chunks = list.Chunk(3);
// returns { { 1, 2, 3 }, { 4, 5, 6 }, { 7 } }

For those who cannot upgrade, the source is available on GitHub.
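For those older targets, a Chunk-like extension can be sketched as follows. This is a simplified stand-in, not the actual BCL source; the ChunkBy name is hypothetical, chosen to avoid clashing with the real Chunk on newer frameworks:

```csharp
using System;
using System.Collections.Generic;

public static class ChunkPolyfill
{
    // Simplified stand-in for Enumerable.Chunk on pre-.NET 6 targets.
    // Argument validation happens eagerly; buffering is deferred to the iterator.
    public static IEnumerable<TSource[]> ChunkBy<TSource>(this IEnumerable<TSource> source, int size)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));
        return Iterator();

        IEnumerable<TSource[]> Iterator()
        {
            var bucket = new List<TSource>(size);
            foreach (var item in source)
            {
                bucket.Add(item);
                if (bucket.Count == size)
                {
                    yield return bucket.ToArray();
                    bucket.Clear();
                }
            }
            if (bucket.Count > 0)
                yield return bucket.ToArray();
        }
    }
}
```

Each yielded array is a fresh copy, so batches can be held onto safely, at the cost of one allocation per batch.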

Chinua answered 21/6, 2021 at 0:34 Comment(1)
I knew I saw something about a new Linq method - thanks for posting this!Ornithology
Score: 149

You don't need to write any code. Use the MoreLINQ Batch method, which batches the source sequence into sized buckets (MoreLINQ is available as a NuGet package you can install):

int size = 10;
var batches = sequence.Batch(size);

Which is implemented as:

public static IEnumerable<IEnumerable<TSource>> Batch<TSource>(
                  this IEnumerable<TSource> source, int size)
{
    TSource[] bucket = null;
    var count = 0;

    foreach (var item in source)
    {
        if (bucket == null)
            bucket = new TSource[size];

        bucket[count++] = item;
        if (count != size)
            continue;

        yield return bucket;

        bucket = null;
        count = 0;
    }

    if (bucket != null && count > 0)
        yield return bucket.Take(count).ToArray();
}
Tigges answered 5/12, 2012 at 20:29 Comment(11)
This performs terribly with large batches or low memory space. See my answer below.Aleutian
4 bytes per item performs terribly? Do you have some tests which show what terribly means? If you are loading millions of items into memory, then I wouldn't do it. Use server-side pagingTigges
I don't mean to offend you, but there are simpler solutions that do not accumulate at all. Furthermore this will allocate space even for non-existent elements: Batch(new int[] { 1, 2 }, 1000000)Aleutian
@NickWhaley well, I agree with you that additional space will be allocated, but in real life you usually have just the opposite situation - a list of 1000 items which should go in batches of 50 :)Tigges
Yes the situation should usually be the other way, but in real life, these may be user inputs.Aleutian
This is a perfectly fine solution. In real life you: validate user input, treat batches as entire collections of items (which accumulates the items anyways), and often process batches in parallel (which is not supported by the iterator approach, and will be a nasty surprise unless you know the implementation details).Pulliam
@SergeyBerezovskiy there's an issue that I find with Nick's code, which I have pointed out in a comment on his answer. Here are the details: if I call Count() on the IEnumerable<IEnumerable<T>> result, it gives the wrong answer: it gives the total number of elements, when the expected result is the total number of batches created. This is not the case with the MoreLinq Batch codeAmoakuh
Added ToArray() there: yield return bucket.Take(count).ToArray();Nalepka
@SergeyNudnov ToList is preferable to ToArray in most situations.Slough
@Slough not really. If you are not going to add/remove items, then the usage of the list makes no sense. Also, ToList will grab extra space for the underlying array. It could use twice more memory depending on batch size. ToArray on the other hand returns an array with exactly the required count of items. Cost is the usage of Array.Copy onceTigges
Actually that depends on the source type. In the general case, ToArray is effectively implemented by ToList().ToArray() because an extra allocation is required to return an array of exactly the right size when the count is unknown. Since you are doing TSource[].Take().ToArray(), the count will pass through the Take and the extra allocation will not occur, in .Net Core. If you were using .Net, you would incur the extra allocation. I prefer to assume ToList is always more efficient.Slough
Score: 124
public static class MyExtensions
{
    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> items,
                                                       int maxItems)
    {
        return items.Select((item, inx) => new { item, inx })
                    .GroupBy(x => x.inx / maxItems)
                    .Select(g => g.Select(x => x.item));
    }
}

and the usage would be:

List<int> list = new List<int>() { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

foreach(var batch in list.Batch(3))
{
    Console.WriteLine(String.Join(",",batch));
}

OUTPUT:

0,1,2
3,4,5
6,7,8
9
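Regarding the laziness concern raised in the comments: GroupBy buffers the whole source before the first group is yielded. A quick way to observe this (my own sketch; the Logged source is a hypothetical stand-in for an expensive sequence) is to batch a sequence that logs each pull:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class GroupByDemo
{
    // A source that reports every element it produces.
    static IEnumerable<int> Logged()
    {
        for (int i = 0; i < 6; i++)
        {
            Console.WriteLine($"pulled {i}");
            yield return i;
        }
    }

    static void Main()
    {
        var batches = Logged()
            .Select((item, inx) => new { item, inx })
            .GroupBy(x => x.inx / 3)
            .Select(g => g.Select(x => x.item));

        // All six "pulled" lines appear before the first "batch:" line,
        // because GroupBy consumes the entire source on the first MoveNext.
        foreach (var batch in batches)
            Console.WriteLine("batch: " + string.Join(",", batch));
    }
}
```

That behavior is fine for in-memory lists, but it defeats streaming over large or deferred sources.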
Lancer answered 5/12, 2012 at 20:31 Comment(10)
Worked perfect for meNeuroticism
Credited here https://mcmap.net/q/82276/-c-filter-list-to-remove-any-double-object for a full implementaion exampleNeuroticism
Once GroupBy starts enumeration, doesn't it have to fully enumerate its source? This loses lazy evaluation of the source and thus, in some cases, all of the benefit of batching!Tartu
Wow, thanks, you saved me from insanity. Works very wellNucleotide
As @Tartu mentions, this method fully enumerates its source so although it looks nice, it defeats the purpose of lazy evaluation / pipeliningSchnorkle
worked perfectly for me - I didn't need lazy evaluation, just needed to batch up items in a listKief
Don't do this. Grouping scans the entire source. Terrible for performance.Rillings
Do this - it's totally appropriate when you need to break up an existing block of things into smaller batches of things for performant processing. The alternative is a gross-looking for loop where you manually break up the batches, and still go through the entire source.Nassi
Prefer readability and elegance over performance. Great answer, thanks.Peers
in this example the execution will fail, because of the foreachHebraist
Score: 49

If you start with a sequence defined as an IEnumerable<T>, and you know that it can safely be enumerated multiple times (e.g. because it is an array or a list), you can use this simple pattern to process the elements in batches:

while (sequence.Any())
{
    var batch = sequence.Take(10);
    sequence = sequence.Skip(10);

    // do whatever you need to do with each batch here
}
Unsteel answered 21/12, 2016 at 21:23 Comment(3)
Nice, simple way for batching w/o much code or need for external libraryGorgonzola
@DevHawk: it is. Note, however, that performance will suffer quadratically on large(r) collections, since each Skip re-scans from the start.Leila
@RobIII: I think the performance problem is the yield, which is not part of this solutionArachnid
Score: 36

This is a fully lazy, low-overhead, one-function implementation of Batch that doesn't do any accumulation and instead forwards iteration steps directly to the source IEnumerable, similar to Python's itertools.groupby.

This design eliminates copying and buffering, which is nice, but has the following consequences:

  • Elements must be enumerated in order.
  • Elements must not be accessed more than once.
  • Elements that aren't consumed in a batch are explicitly discarded when the next batch is requested.

If these conditions are violated, that is, trying to access elements out of order, a second time, or via a saved iterator, .NET will throw an exception such as InvalidOperationException: Enumeration already finished. Note that the design and type, IEnumerable<IEnumerable<T>>, naturally imply these conditions; make up your own mind whether that counts as "exception unsafe" or "misuse".

You can test a complete sample at .NET Fiddle.

public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int size)
{
    if (source == null)
        throw new ArgumentNullException("source");
    if (size <= 0)
        throw new ArgumentOutOfRangeException("size", "Must be greater than zero.");
    using (var enumerator = source.GetEnumerator())
        while (enumerator.MoveNext())
        {
            int i = 0;
            
            // Batch is a local iterator function closing over `i` and
            // `enumerator` that executes the inner batch enumeration
            IEnumerable<T> Batch()
            {
                do yield return enumerator.Current;
                while (++i < size && enumerator.MoveNext());
            }

            yield return Batch();
            
            // discard skipped items
            while (++i < size && enumerator.MoveNext());
        }
}


// Buffer() explicitly buffers the contents of intermediate IEnumerables, lifting the conditions of enumerating in order etc.
// but loses the memory and overhead benefits of not copying/buffering.
public static IEnumerable<IReadOnlyList<T>> Buffer<T>(this IEnumerable<IEnumerable<T>> batched) => batched.Select(e => e.ToList());

This solution is based on (and fixes issues in) Nick Whaley's solution with help from EricRoller.

Flyblown answered 12/6, 2017 at 17:26 Comment(22)
This is the only fully lazy implementation here. Consistent with the python itertools.GroupBy implementation.Kenleigh
You can eliminate the check for done by just always calling e.Count() after yield return e. You would need to rearrange the loop in BatchInner to not invoke the undefined behavior source.Current if i >= size. This will eliminate the need to allocate a new BatchInner for each batch.Kenleigh
@EricRoller what undefined behavior? The loop checks for "i < size" already. Care to flesh out your ideas in another fiddle?Flyblown
You are right, you still need to capture information about the progress of each batch. I did find a bug in your code if you try getting the 2nd item from each batch: bug fiddle. Fixed implementation without a separate class (using C#7) is here: fixed fiddle. Note that I expect the CLR will still create the local function once per loop to capture variable i so this isn't necessarily more efficient than defining a separate class, but it is a little cleaner I think.Kenleigh
Very nice! I tried for a while but gave up, I had forgotten about local functions, it's a great feature. Much cleaner than a separate class imo. Since i isn't really bound to the loop body (as long as you set i=0 each iteration) you can hoist it out with Batch() definition to avoid the extra allocations like this. Compared to a modified version of yours for benchmarking (10k elements, batch size 3, counting only) it appears to reduce memory use from 640 -> 430 kb according to the fiddle stats (a weak metric, admittedly).Flyblown
Actually on second thought, having all the batches share the same i could get into trouble if you keep access previous iterators, it'll let previous iterators access items in later batches. It's probably best to just keep them separate like you said (I feel like this is the fourth time I've 'discovered' this haha).Flyblown
I benchmarked this version using BenchmarkDotNet against System.Reactive.Linq.EnumerableEx.Buffer and your implementation was 3-4 times faster, at the risk of safety. Internally, EnumerableEx.Buffer allocates a Queue of List<T> github.com/dotnet/reactive/blob/…Antipas
@Flyblown Would you guys be interested in teaming up to submit a patch to Microsoft's github?Antipas
If you want a buffered version of this, you can do: public static IEnumerable<IReadOnlyList<T>> BatchBuffered<T>(this IEnumerable<T> source, int size) => Batch(source, size).Select(chunk => (IReadOnlyList<T>)chunk.ToList()); Use of IReadOnlyList<T> is to hint the user the output is cached. You could also keep the IEnumerable<IEnumerable<T>> instead.Audiovisual
This code doesn't work for me. It fails in scenarios like this: var last = a.Batch(3).Last().ToList();Rupp
I think I understand why c# doesn't provide a Batch method. Double enumeration of a batch in this implementation yields the next batch. Not the same items over again... Batching is just hard to solve without some sort of buffering.Titre
@MortenGormMadsen No, enumerating a second time would throw an exception because the batch has already been enumerated (this is mentioned in the comment). (If you found a setup where a second enumeration yields the next batch then it's a bug and I would be mightily interested in a fiddle that demonstrates.) You're exactly right that a batch without buffering is basically impossible if the user assumes that every IEnumerable is just a List that you can re-iterate willy-nilly. The type is an IEnumerable and not a List for a reason. You cannot iterate it again.Flyblown
@Rupp That's a good case, I put this in a fiddle to demonstrate: dotnetfiddle.net/S64FEa You may or may not be surprised when I say that its not really a bug. To find the "last" item, it asks for the next item until it says there are no more items. When you ask for the next batch, the previous batch is fully drained. So by the time Last() decides that it found the final batch the batch is already gone.Flyblown
How do you feel about the warnings on the "enumerator" variable inside the local Batch method - captured variable is disposed in outer scope ?Dashing
@Dashing I'm not sure which warning you're referring to, I built and ran the linked code from .net fiddle with warning level 5 (default is 4) using the dotnet command with sdk version 5.0.300 and got 0 warnings. Are you running a different version? Anyways I would say that the captured variable is never in danger of being disposed from underneath the local function because the local function's execution scope ends before the scope of the captured variable. Though perhaps there is some edge case between the generator function, using, local function, and an exception that I'm not thinking of.Flyblown
@Flyblown sorry yes, I think it's ReSharper showing me the hint/warningDashing
@Flyblown - The discard skipped items line seems redundant. Can you tell me a case when it is not?Multiple
@Multiple Note the description "Iteration comes directly from the underlying IEnumerable", this is not an exaggeration, calls to MoveNext() inside an inner batch enumerable are pretty much forwarded directly to the original enumerable. If you skip any items in the previous batch, and then get the first element of the next batch then that would return the next element in the underlying IEnumerable if the "discard skipped items" line was omitted. This line ensures that the logical semantics of batches of N elements are preserved. See the "Second from each batch:" example in the .NET Fiddle.Flyblown
Yeah ReSharper warns 3 times in this code... the inner references to enumerator the captured variable is disposed in the outer scope and the inner i the captured variable is "modified in the outer scope". I know R# warnings are not be sneezed at though I ignored these until eventually one day, I get a null ref using this code - and I somehow doubt the code I'm using to call it, could cause it null ref at System.Linq.Enumerable.&lt;SelectManyIterator&gt;d__233.MoveNext() at Core.Extensions.LinqExtensions.&lt;&gt;c__DisplayClass0_21.&lt;&lt;Batch&gt;g__Batch|0&gt;d.MoveNext()Dashing
@Dashing can you share the code where you use .Batch()? That is much more useful for analysis than claims about warnings or exceptions.Flyblown
@Flyblown apologies, I discovered it is my fault, it's easy to think the batch() is at fault when the linq calling it gets a nullref while creating the object it's batchingDashing
@Dashing that's fair, I'm glad you figured out the problem. Maybe we should add an extra null check on source? Was that it?Flyblown
Score: 30

All of the above perform terribly with large batches or low memory space. Had to write my own that will pipeline (notice no item accumulation anywhere):

public static class BatchLinq {
    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int size) {
        if (size <= 0)
            throw new ArgumentOutOfRangeException("size", "Must be greater than zero.");

        using (IEnumerator<T> enumerator = source.GetEnumerator())
            while (enumerator.MoveNext())
                yield return TakeIEnumerator(enumerator, size);
    }

    private static IEnumerable<T> TakeIEnumerator<T>(IEnumerator<T> source, int size) {
        int i = 0;
        do
            yield return source.Current;
        while (++i < size && source.MoveNext());
    }
}

Edit: A known issue with this approach is that each batch must be enumerated, and enumerated fully, before moving to the next batch. For example, this doesn't work:

//Select first item of every 100 items
Batch(list, 100).Select(b => b.First())
Aleutian answered 11/7, 2013 at 16:36 Comment(14)
The routine @Lancer posted above doesn't perform item accumulation either.Pregnable
@Pregnable It does. GroupBy internally will accumulate.Aleutian
Do you have a reference about GroupBy doing accumulation? Usually, reduce operations like GroupBy perform a transform that outputs a key/value pair object, which is a subtle difference.Pregnable
@Pregnable Just using logic, GroupBy MUST enumerate the entire collection before moving on to the next group. If you use a reflector you will find that GroupBy will load the entire contents into a lookup table.Aleutian
I can imagine an implementation that doesn't (think of a coin-sorting machine as a physical example), but your comment about reflection answers my question.Pregnable
@Pregnable Still does. A coin sorting machine that gives you nickels first, then dimes, MUST first inspect every single coin before giving you a dime to be sure there are no more nickels.Aleutian
True, but a coin sorting machine groups each coin individually. It doesn't matter that 57 nickels have already been sorted. It puts the nickel in the nickel bin. If you're doing operations like Any() or First(), you don't need to go through all the coins -- you just need one nickel. For others like Where(), Select(), or All(), I agree, you need to examine every item.Pregnable
@Pregnable You are right, GroupBy could always run without accumulation if no more than 1 item of each group is enumerated. Too bad it is implemented with eager-load.Aleutian
Ahhh ahha, missed your edit note when I snagged this code. It took some time to understand why iterating over un-enumerated batches actually enumerated the entire original collection (!!!), providing X batches, each having enumerated 1 item (where X is the number of original collection items).Souza
Your code is almost awesome. Just one tweak is needed: Split the first method into two parts. Right now, the ArgumentOutOfRangeException won't throw until the enumerable is enumerated (it should throw right away at creation time). Also, name the inner method BatchImpl or BatchInner or something like that so it at least has the same name as the method calling it.Tartu
@NickWhaley if I do Count() on the IEnumerable<IEnumerable<T>> result by your code, it gives wrong answer, it gives total number of elements, when expected is total number of batches created. This is not the case with MoreLinq Batch codeAmoakuh
FWIW - When using this code in a threaded environment, I ran into some concurrency issues where certain objects appeared in more than one batch. Because I didn't have the memory constraints mentioned by the author, I replaced it with the version here: https://mcmap.net/q/82277/-how-to-loop-through-ienumerable-in-batches-duplicateUncouth
@MattMurrell Do you have any sample tests that demonstrate this?Antipas
@JohnZabroski - Here's a quick gist: gist.github.com/mmurrell/9225ed7c4d107c2195057f77e07f0f68Uncouth
Score: 14

I wonder why nobody has ever posted an old school for-loop solution. Here is one:

List<int> source = Enumerable.Range(1,23).ToList();
int batchsize = 10;
for (int i = 0; i < source.Count; i+= batchsize)
{
    var batch = source.Skip(i).Take(batchsize);
}

This simplicity is possible because the Take method:

... enumerates source and yields elements until count elements have been yielded or source contains no more elements. If count exceeds the number of elements in source, all elements of source are returned

Disclaimer:

Using Skip and Take inside the loop means that the enumerable will be enumerated multiple times. This is dangerous if the enumerable is deferred: it may result in multiple executions of a database query, a web request, or a file read. This example explicitly uses a List, which is not deferred, so it is less of a problem. It is still a slow solution, since Skip enumerates the collection from the start each time it is called.
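One mitigation when the source is deferred (my suggestion, not part of the original answer): enumerate the expensive source once into a list, then run the Skip/Take loop over the in-memory copy:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class BatchDemo
{
    static void Main()
    {
        // Stand-in for an expensive deferred source (e.g. a database query).
        IEnumerable<int> deferred = Enumerable.Range(1, 23).Select(x =>
        {
            Console.WriteLine($"produced {x}"); // the "expensive" work
            return x;
        });

        List<int> source = deferred.ToList(); // runs the expensive part exactly once

        int batchsize = 10;
        for (int i = 0; i < source.Count; i += batchsize)
        {
            var batch = source.Skip(i).Take(batchsize).ToList();
            Console.WriteLine($"batch of {batch.Count} items");
        }
        // prints batches of 10, 10 and 3 items
    }
}
```

The Skip-from-the-start cost remains, but at least the producer side only runs once.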

This can also be solved using the GetRange method, but it requires an extra calculation to extract a possible rest batch:

for (int i = 0; i < source.Count; i += batchsize)
{
    int remaining = source.Count - i;
    var batch = remaining > batchsize  ? source.GetRange(i, batchsize) : source.GetRange(i, remaining);
}

Here is a third way to handle this, using two loops, which ensures that the collection is enumerated only once:

int batchsize = 10;
List<int> batch = new List<int>(batchsize);

for (int i = 0; i < source.Count; i += batchsize)
{
    // calculate the remaining items to avoid an ArgumentOutOfRangeException
    batchsize = source.Count - i > batchsize ? batchsize : source.Count - i;
    for (int j = i; j < i + batchsize; j++)
    {
        batch.Add(source[j]);
    }
    // process the batch here, then clear it for reuse
    batch.Clear();
}
Omnipotent answered 27/6, 2019 at 8:37 Comment(4)
Very nice solution. People forgot how to use for loopProtection
Using Skip and Take inside the loop means that the enumerable will be enumerated multiple times. This is dangerous if the enumerable is deferred. It may result in multiple executions of a database query, or a web request, or a file read. In your example you have a List which is not deferred, so it is less of a problem.Pyszka
@TheodorZoulias yes I know, this is actually why I posted the second solution today. I posted your comment as a disclaimer, because you formulated it quite well, shall I cite you?Omnipotent
I wrote a third solution with 2 loops so that the collection is enumerated only 1 time. the skip.take thing is a very inefficient solutionOmnipotent
Score: 4

Here is an attempted improvement of Nick Whaley's (link) and infogulch's (link) lazy Batch implementations. This one is strict. You either enumerate the batches in the correct order, or you get an exception.

public static IEnumerable<IEnumerable<TSource>> Batch<TSource>(
    this IEnumerable<TSource> source, int size)
{
    if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
    using (var enumerator = source.GetEnumerator())
    {
        int i = 0;
        while (enumerator.MoveNext())
        {
            if (i % size != 0) throw new InvalidOperationException(
                "The enumeration is out of order.");
            i++;
            yield return GetBatch();
        }
        IEnumerable<TSource> GetBatch()
        {
            while (true)
            {
                yield return enumerator.Current;
                if (i % size == 0 || !enumerator.MoveNext()) break;
                i++;
            }
        }
    }
}

And here is a lazy Batch implementation for sources of type IList<T>. This one imposes no restrictions on the enumeration. The batches can be enumerated partially, in any order, and more than once. The restriction of not modifying the collection during the enumeration is still in place though. This is achieved by making a dummy call to enumerator.MoveNext() before yielding any chunk or element. The downside is that the enumerator is left undisposed, since it is unknown when the enumeration is going to finish.

public static IEnumerable<IEnumerable<TSource>> Batch<TSource>(
    this IList<TSource> source, int size)
{
    if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
    var enumerator = source.GetEnumerator();
    for (int i = 0; i < source.Count; i += size)
    {
        enumerator.MoveNext();
        yield return GetChunk(i, Math.Min(i + size, source.Count));
    }
    IEnumerable<TSource> GetChunk(int from, int toExclusive)
    {
        for (int j = from; j < toExclusive; j++)
        {
            enumerator.MoveNext();
            yield return source[j];
        }
    }
}
Pyszka answered 26/7, 2019 at 16:48 Comment(0)
Score: 3

Same approach as MoreLINQ, but using a List instead of an Array. I haven't done benchmarking, but readability matters more to some people:

    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int size)
    {
        List<T> batch = new List<T>();

        foreach (var item in source)
        {
            batch.Add(item);

            if (batch.Count >= size)
            {
                yield return batch;
                batch.Clear();
            }
        }

        if (batch.Count > 0)
        {
            yield return batch;
        }
    }
Bise answered 4/5, 2016 at 19:4 Comment(3)
You should NOT be reusing the batch variable. Your consumers could be completely screwed up by that. Also, pass in the size parameter to your new List to optimize its size.Tartu
Easy fix: replace batch.Clear(); with batch = new List<T>();Slough
also you can preallocate the list size with batch = new List<T>(size);Bijouterie
Score: 3

Here's the cleanest version of Batch that I can come up with:

public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int count)
{
    if (source == null) throw new System.ArgumentNullException("source");
    if (count <= 0) throw new System.ArgumentOutOfRangeException("count");
    using (var enumerator = source.GetEnumerator())
    {
        IEnumerable<T> BatchInner()
        {
            int counter = 0;
            do
                yield return enumerator.Current;
            while (++counter < count && enumerator.MoveNext());
        }
        while (enumerator.MoveNext())
            yield return BatchInner().ToArray();
    }
}

Using this code:

Console.WriteLine(String.Join(Environment.NewLine,
    Enumerable.Range(0, 20).Batch(8).Select(xs => String.Join(",", xs))));

I get:

0,1,2,3,4,5,6,7
8,9,10,11,12,13,14,15
16,17,18,19

It's important to note that with the answers that yield the inner batch lazily (without the ToArray()), this code fails:

var e = Enumerable.Range(0, 20).Batch(8).ToArray();

Console.WriteLine(String.Join(Environment.NewLine, e.Select(xs => String.Join(",", xs))));
Console.WriteLine();
Console.WriteLine(String.Join(Environment.NewLine, e.Select(xs => String.Join(",", xs))));

On their answers it gives:

19
19
19

19
19
19

This is because the inner enumerable is not materialized as an array.

Multiple answered 24/8, 2021 at 6:42 Comment(4)
There are more answers with lazy evaluation w/o accumulation. Why is this the cleanest? (Not disagreeing, I like the solution anyway, but an explanation would make it a lot more interesting).Torsi
@GertArnold - Well, I think it's a minimal amount of code. Each of the other solutions using a pure enumerator seem to have extra code that isn't necessary.Multiple
I totally agree! One more thing: you added ToArray, but it also seems to work without. Are there case that require this ToArray call?Torsi
@GertArnold - Without the .ToArray() the code only runs once for var e = Enumerable.Range(0, 20).Batch(8).ToArray();. Try iterating over the inner enumerable with that. It effectively captures the enumerator in BatchInner so on subsequent executions the enumerator is done.Multiple
Score: 2

With a functional hat on, this appears trivial... but in C# there are some significant downsides.

You'd probably view this as an unfold of IEnumerable (google it and you'll probably end up in some Haskell docs, but there may be some F# stuff using unfold; if you know F#, squint at the Haskell docs and it will make sense).

Unfold is related to fold ("aggregate"), except rather than iterating through the input IEnumerable, it iterates through the output data structure (it's a similar relationship to the one between IEnumerable and IObservable; in fact I think IObservable does implement an "unfold", called Generate...).

Anyway, first you need an unfold method. I think this works (unfortunately it will eventually blow the stack for large "lists"... you can write this safely in F# using yield! rather than Concat):

    static IEnumerable<T> Unfold<T, U>(Func<U, IEnumerable<Tuple<U, T>>> f, U seed)
    {
        var maybeNewSeedAndElement = f(seed);

        return maybeNewSeedAndElement.SelectMany(x => new[] { x.Item2 }.Concat(Unfold(f, x.Item1)));
    }

This is a bit obtuse because C# doesn't implement some of the things functional languages take for granted... but it basically takes a seed and then generates a "Maybe" answer of the next element in the IEnumerable and the next seed (Maybe doesn't exist in C#, so we've used IEnumerable to fake it), and concatenates the rest of the answer (I can't vouch for the "O(n?)" complexity of this).

Once you've done that, then:

    static IEnumerable<IEnumerable<T>> Batch<T>(IEnumerable<T> xs, int n)
    {
        return Unfold(ys =>
            {
                var head = ys.Take(n);
                var tail = ys.Skip(n);
                return head.Take(1).Select(_ => Tuple.Create(tail, head));
            },
            xs);
    }

It all looks quite clean... you take the "n" elements as the "next" element in the IEnumerable, and the "tail" is the rest of the unprocessed list.

If there is nothing in the head... you're done... you return "Nothing" (faked as an empty IEnumerable)... else you return the head element and the tail to process.

You can probably do this using IObservable; there's probably a "Batch"-like method already there, and you can probably use that.

If the risk of stack overflow worries you (it probably should), then you should implement this in F# (and there's probably some F# library (FSharpX?) already with this).

(I have only done some rudimentary tests of this, so there may be the odd bug in there.)

Valence answered 16/4, 2018 at 10:12 Comment(1)
Maybe can exist in C# - see for example Option in Language Ext.Endostosis
Score: 1

I'm joining this very late, but I found something interesting.

We can use Skip and Take here for better performance.

public static class MyExtensions
    {
        public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> items, int maxItems)
        {
            return items.Select((item, index) => new { item, index })
                        .GroupBy(x => x.index / maxItems)
                        .Select(g => g.Select(x => x.item));
        }

        public static IEnumerable<T> Batch2<T>(this IEnumerable<T> items, int skip, int take)
        {
            return items.Skip(skip).Take(take);
        }

    }

Next I checked with 100,000 records. Only the looping takes more time in the case of Batch.

Code of the console application:

static void Main(string[] args)
{
    List<string> Ids = GetData("First");
    List<string> Ids2 = GetData("tsriF");

    Stopwatch FirstWatch = new Stopwatch();
    FirstWatch.Start();
    foreach (var batch in Ids2.Batch(5000))
    {
        // Console.WriteLine("Batch Ouput:= " + string.Join(",", batch));
    }
    FirstWatch.Stop();
    Console.WriteLine("Done Processing time taken:= "+ FirstWatch.Elapsed.ToString());


    Stopwatch Second = new Stopwatch();

    Second.Start();
    int Length = Ids2.Count;
    int StartIndex = 0;
    int BatchSize = 5000;
    while (Length > 0)
    {
        var SecBatch = Ids2.Batch2(StartIndex, BatchSize);
        // Console.WriteLine("Second Batch Ouput:= " + string.Join(",", SecBatch));
        Length = Length - BatchSize;
        StartIndex += BatchSize;
    }

    Second.Stop();
    Console.WriteLine("Done Processing time taken Second:= " + Second.Elapsed.ToString());
    Console.ReadKey();
}

static List<string> GetData(string name)
{
    List<string> Data = new List<string>();
    for (int i = 0; i < 100000; i++)
    {
        Data.Add(string.Format("{0} {1}", name, i.ToString()));
    }

    return Data;
}

The times taken are:

First - 00:00:00.0708, 00:00:00.0660

Second (the Skip and Take one) - 00:00:00.0008, 00:00:00.0008
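A caveat raised in the comments below: Batch2 only builds a deferred query, and the while loop above never enumerates SecBatch, so the second timing largely measures query construction rather than actual batching. A sketch (variable names invented here) that forces both approaches to materialize every batch makes the comparison fairer:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

List<string> ids = Enumerable.Range(0, 100000).Select(i => "First " + i).ToList();

// GroupBy-based batching: materialize every batch.
int groupByBatches = 0;
foreach (var batch in ids
    .Select((item, index) => new { item, index })
    .GroupBy(x => x.index / 5000)
    .Select(g => g.Select(x => x.item)))
{
    var items = batch.ToList(); // force enumeration of this batch
    groupByBatches++;
}

// Skip/Take batching: materialize every slice.
int skipTakeBatches = 0;
for (int start = 0; start < ids.Count; start += 5000)
{
    var items = ids.Skip(start).Take(5000).ToList(); // force enumeration of this slice
    skipTakeBatches++;
}

Console.WriteLine($"{groupByBatches} batches vs {skipTakeBatches} slices"); // 20 vs 20
```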

Fushih answered 12/4, 2016 at 6:57 Comment(5)
GroupBy fully enumerates before it produces a single row. This is not a good way to do batching.Tartu
@Tartu That depends on what you are trying to achieve. If the batching is not the issue, and you just need to split the items into smaller chunks for processing it might be just the thing. I'm using this for MSCRM where there might be 100 records which is no problem for LAMBDA to batch.. its the saving that takes seconds..Broucek
Sure, there are use cases where the full enumeration doesn't matter. But why write a second-class utility method when you can write a superb one?Tartu
Good alternative but not identical as first returns a list of lists allowing you to loop through.Gottschalk
change foreach (var batch in Ids2.Batch(5000)) to var gourpBatch = Ids2.Batch(5000) and check the timed results. or add tolist to var SecBatch = Ids2.Batch2(StartIndex, BatchSize); i would be interested if your results for timing change.Halfblood
K
1

I wrote a custom IEnumerable implementation that works without LINQ and guarantees a single enumeration over the data. It also does all of this without the backing lists or arrays that cause memory blow-ups over large data sets.

Here are some basic tests:

    [Fact]
    public void ShouldPartition()
    {
        var ints = new List<int> {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        var data = ints.PartitionByMaxGroupSize(3);
        data.Count().Should().Be(4);

        data.Skip(0).First().Count().Should().Be(3);
        data.Skip(0).First().ToList()[0].Should().Be(0);
        data.Skip(0).First().ToList()[1].Should().Be(1);
        data.Skip(0).First().ToList()[2].Should().Be(2);

        data.Skip(1).First().Count().Should().Be(3);
        data.Skip(1).First().ToList()[0].Should().Be(3);
        data.Skip(1).First().ToList()[1].Should().Be(4);
        data.Skip(1).First().ToList()[2].Should().Be(5);

        data.Skip(2).First().Count().Should().Be(3);
        data.Skip(2).First().ToList()[0].Should().Be(6);
        data.Skip(2).First().ToList()[1].Should().Be(7);
        data.Skip(2).First().ToList()[2].Should().Be(8);

        data.Skip(3).First().Count().Should().Be(1);
        data.Skip(3).First().ToList()[0].Should().Be(9);
    }

The Extension Method to partition the data.

/// <summary>
/// A set of extension methods for <see cref="IEnumerable{T}"/>. 
/// </summary>
public static class EnumerableExtender
{
    /// <summary>
    /// Splits an enumerable into chunks, by a maximum group size.
    /// </summary>
    /// <param name="source">The source to split</param>
    /// <param name="maxSize">The maximum number of items per group.</param>
    /// <typeparam name="T">The type of item to split</typeparam>
    /// <returns>A list of lists of the original items.</returns>
    public static IEnumerable<IEnumerable<T>> PartitionByMaxGroupSize<T>(this IEnumerable<T> source, int maxSize)
    {
        return new SplittingEnumerable<T>(source, maxSize);
    }
}

This is the implementing class

    using System.Collections;
    using System.Collections.Generic;

    internal class SplittingEnumerable<T> : IEnumerable<IEnumerable<T>>
    {
        private readonly IEnumerable<T> backing;
        private readonly int maxSize;
        private bool hasCurrent;
        private T lastItem;

        public SplittingEnumerable(IEnumerable<T> backing, int maxSize)
        {
            this.backing = backing;
            this.maxSize = maxSize;
        }

        public IEnumerator<IEnumerable<T>> GetEnumerator()
        {
            return new Enumerator(this, this.backing.GetEnumerator());
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return this.GetEnumerator();
        }

        private class Enumerator : IEnumerator<IEnumerable<T>>
        {
            private readonly SplittingEnumerable<T> parent;
            private readonly IEnumerator<T> backingEnumerator;
            private NextEnumerable current;

            public Enumerator(SplittingEnumerable<T> parent, IEnumerator<T> backingEnumerator)
            {
                this.parent = parent;
                this.backingEnumerator = backingEnumerator;
                this.parent.hasCurrent = this.backingEnumerator.MoveNext();
                if (this.parent.hasCurrent)
                {
                    this.parent.lastItem = this.backingEnumerator.Current;
                }
            }

            public bool MoveNext()
            {
                if (this.current == null)
                {
                    this.current = new NextEnumerable(this.parent, this.backingEnumerator);
                    return true;
                }
                else
                {
                    if (!this.current.IsComplete)
                    {
                        using (var enumerator = this.current.GetEnumerator())
                        {
                            while (enumerator.MoveNext())
                            {
                            }
                        }
                    }
                }

                if (!this.parent.hasCurrent)
                {
                    return false;
                }

                this.current = new NextEnumerable(this.parent, this.backingEnumerator);
                return true;
            }

            public void Reset()
            {
                throw new System.NotImplementedException();
            }

            public IEnumerable<T> Current
            {
                get { return this.current; }
            }

            object IEnumerator.Current
            {
                get { return this.Current; }
            }

            public void Dispose()
            {
            }
        }

        private class NextEnumerable : IEnumerable<T>
        {
            private readonly SplittingEnumerable<T> splitter;
            private readonly IEnumerator<T> backingEnumerator;
            private int currentSize;

            public NextEnumerable(SplittingEnumerable<T> splitter, IEnumerator<T> backingEnumerator)
            {
                this.splitter = splitter;
                this.backingEnumerator = backingEnumerator;
            }

            public bool IsComplete { get; private set; }

            public IEnumerator<T> GetEnumerator()
            {
                return new NextEnumerator(this.splitter, this, this.backingEnumerator);
            }

            IEnumerator IEnumerable.GetEnumerator()
            {
                return this.GetEnumerator();
            }

            private class NextEnumerator : IEnumerator<T>
            {
                private readonly SplittingEnumerable<T> splitter;
                private readonly NextEnumerable parent;
                private readonly IEnumerator<T> enumerator;
                private T currentItem;

                public NextEnumerator(SplittingEnumerable<T> splitter, NextEnumerable parent, IEnumerator<T> enumerator)
                {
                    this.splitter = splitter;
                    this.parent = parent;
                    this.enumerator = enumerator;
                }

                public bool MoveNext()
                {
                    this.parent.currentSize += 1;
                    this.currentItem = this.splitter.lastItem;
                    var hasCurrent = this.splitter.hasCurrent;

                    this.parent.IsComplete = this.parent.currentSize > this.splitter.maxSize;

                    if (this.parent.IsComplete)
                    {
                        return false;
                    }

                    if (hasCurrent)
                    {
                        var result = this.enumerator.MoveNext();

                        this.splitter.lastItem = this.enumerator.Current;
                        this.splitter.hasCurrent = result;
                    }

                    return hasCurrent;
                }

                public void Reset()
                {
                    throw new System.NotImplementedException();
                }

                public T Current
                {
                    get { return this.currentItem; }
                }

                object IEnumerator.Current
                {
                    get { return this.Current; }
                }

                public void Dispose()
                {
                }
            }
        }
    }
Kathrynkathryne answered 2/12, 2017 at 23:10 Comment(0)
W
1

Another way is to use the Rx Buffer operator:

//using System.Linq;
//using System.Reactive.Linq;
//using System.Reactive.Threading.Tasks;

var observableBatches = anAnumerable.ToObservable().Buffer(size);

var batches = aList.ToObservable().Buffer(size).ToList().ToTask().GetAwaiter().GetResult();
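As the comment notes, the blocking GetAwaiter().GetResult() call can be avoided when the caller is already async, because Rx observables are directly awaitable. A sketch (the RxBatching class and BatchAsync name are invented here; assumes the System.Reactive package):

```csharp
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Threading.Tasks;

static class RxBatching
{
    // Awaiting an IObservable<T> (supported by System.Reactive) returns its
    // last value; ToList() emits a single list, so awaiting it yields all batches.
    public static async Task<IList<IList<T>>> BatchAsync<T>(
        this IEnumerable<T> source, int size) =>
        await source.ToObservable().Buffer(size).ToList();
}
```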
Wrapping answered 28/11, 2018 at 7:22 Comment(1)
You should never have to use GetAwaiter().GetResult(). This is a code smell for synchronous code forcefully calling async code.Audiovisual
W
1

Just another one-line implementation. It works even with an empty list; in that case you get a zero-size batch collection.

var aList = Enumerable.Range(1, 100).ToList(); //a given list
var size = 9; //the wanted batch size
//number of batches is: (aList.Count() + size - 1) / size;

var batches = Enumerable.Range(0, (aList.Count() + size - 1) / size).Select(i => aList.GetRange( i * size, Math.Min(size, aList.Count() - i * size)));

Assert.True(batches.Count() == 12);
Assert.AreEqual(batches.ToList().ElementAt(0), new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9 });
Assert.AreEqual(batches.ToList().ElementAt(1), new List<int>() { 10, 11, 12, 13, 14, 15, 16, 17, 18 });
Assert.AreEqual(batches.ToList().ElementAt(11), new List<int>() { 100 });
Wrapping answered 28/11, 2018 at 9:18 Comment(0)
H
1

A version that is easy to use and understand:

    public static List<List<T>> chunkList<T>(List<T> listToChunk, int batchSize)
    {
        List<List<T>> batches = new List<List<T>>();

        if (listToChunk.Count == 0) return batches;

        bool moreRecords = true;
        int fromRecord = 0;
        int countRange = 0;
        if (listToChunk.Count >= batchSize)
        {
            countRange = batchSize;
        }
        else
        {
            countRange = listToChunk.Count;
        }

        while (moreRecords)
        {
            List<T> batch = listToChunk.GetRange(fromRecord, countRange);
            batches.Add(batch);

            if ((fromRecord + batchSize) >= listToChunk.Count)
            {
                moreRecords = false;
            }

            fromRecord = fromRecord + batch.Count;

            if ((fromRecord + batchSize) > listToChunk.Count)
            {
                countRange = listToChunk.Count - fromRecord;
            }
            else
            {
                countRange = batchSize;
            }
        }
        return batches;
    }
Hughhughes answered 28/10, 2020 at 6:19 Comment(0)
H
1

With the new LINQ helper method in .NET 6, you can chunk any IEnumerable into batches:

int chunkNumber = 1;
foreach (int[] chunk in Enumerable.Range(0, 9).Chunk(3))
{
    Console.WriteLine($"Chunk {chunkNumber++}");
    foreach (var item in chunk)
    {
        Console.WriteLine(item);
    }
}
Harned answered 24/8, 2021 at 4:48 Comment(1)
Correct and here's a reference: dotnetcoretutorials.com/2021/08/12/ienumerable-chunk-in-net-6Eparchy
N
1

Here is an implementation that uses async iteration in C# via IAsyncEnumerable: https://learn.microsoft.com/en-us/dotnet/csharp/whats-new/tutorials/generate-consume-asynchronous-stream

public static class EnumerableExtensions
{
    /// <summary>
    /// Chunks a sequence into sub-sequences, each containing maxItemsPerChunk items, except for the last,
    /// which will contain any items left over.
    ///
    /// NOTE: this implements a streaming implementation via <seealso cref="IAsyncEnumerable{T}"/>.
    /// </summary>
    public static async IAsyncEnumerable<IEnumerable<T>> ChunkAsync<T>(this IAsyncEnumerable<T> sequence, int maxItemsPerChunk)
    {
        if (sequence == null) throw new ArgumentNullException(nameof(sequence));
        if (maxItemsPerChunk <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxItemsPerChunk), $"{nameof(maxItemsPerChunk)} must be greater than 0");
        }

        var chunk = new List<T>(maxItemsPerChunk);
        await foreach (var item in sequence)
        {
            chunk.Add(item);

            if (chunk.Count == maxItemsPerChunk)
            {
                yield return chunk.ToArray();
                chunk.Clear();
            }
        }

        // return the "crumbs" that 
        // didn't make it into a full chunk
        if (chunk.Count > 0)
        {
            yield return chunk.ToArray();
        }
    }

    /// <summary>
    /// Chunks a sequence into sub-sequences, each containing maxItemsPerChunk items, except for the last,
    /// which will contain any items left over.
    /// </summary>
    public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> sequence, int maxItemsPerChunk)
    {
        if (sequence == null) throw new ArgumentNullException(nameof(sequence));
        if (maxItemsPerChunk <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxItemsPerChunk), $"{nameof(maxItemsPerChunk)} must be greater than 0");
        }

        var chunk = new List<T>(maxItemsPerChunk);
        foreach (var item in sequence)
        {
            chunk.Add(item);

            if (chunk.Count == maxItemsPerChunk)
            {
                yield return chunk.ToArray();
                chunk.Clear();
            }
        }

        // return the "crumbs" that 
        // didn't make it into a full chunk
        if (chunk.Count > 0)
        {
            yield return chunk.ToArray();
        }
    }
}
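A consumption sketch for the async variant (assuming the ChunkAsync extension above is in scope; the Numbers source is invented here for illustration), using await foreach:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class Demo
{
    // A toy async stream of the numbers 0..6.
    static async IAsyncEnumerable<int> Numbers()
    {
        for (int i = 0; i < 7; i++)
        {
            await Task.Yield();
            yield return i;
        }
    }

    static async Task Main()
    {
        // Chunks arrive as soon as they fill, without buffering the whole stream.
        await foreach (var chunk in Numbers().ChunkAsync(3))
        {
            Console.WriteLine(string.Join(", ", chunk));
        }
        // prints:
        // 0, 1, 2
        // 3, 4, 5
        // 6
    }
}
```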
Nombles answered 20/10, 2021 at 18:26 Comment(0)
C
0

I know everybody has used complex systems to do this work, and I really don't get why. Skip and Take allow all those operations using the common Select overload with a Func<TSource, int, TResult> transform function. Like:

public IEnumerable<IEnumerable<T>> Buffer<T>(IEnumerable<T> source, int size)=>
    source.Select((item, index) => source.Skip(size * index).Take(size)).TakeWhile(bucket => bucket.Any());
Coaction answered 3/10, 2018 at 4:3 Comment(2)
This might be very inefficient, because the given source will be iterated very often.Acyl
This is not only inefficient, but could also produce incorrect results. There is no guarantee that an enumerable will yield the same elements when enumerated twice. Take this enumerable as an example: Enumerable.Range(0, 1).SelectMany(_ => Enumerable.Range(0, new Random().Next())).Pyszka
S
0

Another way to perform batching:

public static class Extensions
{
    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;

                yield return func(v0, v1);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;

                yield return func(v0, v1, v2);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;

                yield return func(v0, v1, v2, v3);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v9 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v9 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v10 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v9 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v10 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v11 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v9 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v10 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v11 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v12 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v9 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v10 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v11 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v12 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v13 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v9 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v10 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v11 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v12 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v13 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v14 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13, v14);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v9 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v10 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v11 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v12 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v13 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v14 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v15 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13, v14, v15);
            }
        }
    }
}

Here's an example usage (note that a trailing batch with fewer items than the lambda's arity is silently discarded, since the loop breaks before calling `func`):

using System;
using System.Linq;


namespace TestProgram
{
    class Program
    {
        static void Main(string[] args)
        {
            foreach (var item in Enumerable.Range(0, 12).ToArray().Batch((R, X1, Y1, X2, Y2) => (R, X1, Y1, X2, Y2)))
            {
                Console.WriteLine($"{item.R}, {item.X1}, {item.Y1}, {item.X2}, {item.Y2}");
            }
        }
    }
}
Sigismundo answered 5/10, 2021 at 22:30 Comment(0)
D
-3
    static IEnumerable<IEnumerable<T>> TakeBatch<T>(IEnumerable<T> ts, int batchSize)
    {
        // Pair each element with its index, then group by index / batchSize.
        // ToLookup preserves key order and element order within each batch.
        return from @group in ts.Select((x, i) => new { x, i }).ToLookup(xi => xi.i / batchSize)
               select @group.Select(xi => xi.x);
    }
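
Worth noting: `ToLookup` enumerates the entire source up front, so this variant is eager rather than streaming, but it preserves order and keeps the trailing partial batch. A self-contained sketch, repeating the method above for context:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Demo
{
    static IEnumerable<IEnumerable<T>> TakeBatch<T>(IEnumerable<T> ts, int batchSize)
    {
        // Pair each element with its index, then group by index / batchSize.
        return from @group in ts.Select((x, i) => new { x, i }).ToLookup(xi => xi.i / batchSize)
               select @group.Select(xi => xi.x);
    }

    static void Main()
    {
        // 7 elements in batches of 3: the last batch has only one item.
        foreach (var batch in TakeBatch(Enumerable.Range(1, 7), 3))
            Console.WriteLine(string.Join(", ", batch));
        // 1, 2, 3
        // 4, 5, 6
        // 7
    }
}
```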
Dynatron answered 2/7, 2015 at 16:33 Comment(1)
Add some description/text to your answer. Posting only code is meaningless most of the time.Riata

© 2022 - 2024 — McMap. All rights reserved.