How to chunkify an IEnumerable<T>, without losing/discarding items in case of failure?

I have a producer-consumer scenario where the producer is an enumerable sequence of items (IEnumerable<Item>). I want to process these items in chunks/batches of 10 items each. So I decided to use the new (.NET 6) Chunk LINQ operator, as suggested in this question: Create batches in LINQ.

My problem is that sometimes the producer fails, and in this case the consumer of the chunkified sequence receives the error without first receiving a chunk with the last items that were produced before the error. So if for example the producer generates 15 items and then fails, the consumer will get a chunk with the items 1-10 and then will get an exception. The items 11-15 will be lost! Here is a minimal example that demonstrates this undesirable behavior:

static IEnumerable<int> Produce()
{
    int i = 0;
    while (true)
    {
        i++;
        Console.WriteLine($"Producing #{i}");
        yield return i;
        if (i == 15) throw new Exception("Oops!");
    }
}

// Consume
foreach (int[] chunk in Produce().Chunk(10))
{
    Console.WriteLine($"Consumed: [{String.Join(", ", chunk)}]");
}

Output:

Producing #1
Producing #2
Producing #3
Producing #4
Producing #5
Producing #6
Producing #7
Producing #8
Producing #9
Producing #10
Consumed: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Producing #11
Producing #12
Producing #13
Producing #14
Producing #15
Unhandled exception. System.Exception: Oops!
   at Program.<Main>g__Produce|0_0()+MoveNext()
   at System.Linq.Enumerable.ChunkIterator[TSource](IEnumerable`1 source, Int32 size)+MoveNext()
   at Program.Main()

Online demo.

The desirable behavior would be to get a chunk with the values [11, 12, 13, 14, 15] before getting the exception.

My question is: Is there any way to configure the Chunk operator so that it prioritizes emitting data instead of exceptions? If not, how can I implement a custom LINQ operator, named for example ChunkNonDestructive, with the desirable behavior?

public static IEnumerable<TSource[]> ChunkNonDestructive<TSource>(
    this IEnumerable<TSource> source, int size);

Note: Apart from the System.Linq.Chunk operator, I also experimented with the Buffer operator from the System.Interactive package, as well as the Batch operator from the MoreLinq package. Apparently they all behave the same (destructively).


Update: Here is the desirable output of the above example:

Producing #1
Producing #2
Producing #3
Producing #4
Producing #5
Producing #6
Producing #7
Producing #8
Producing #9
Producing #10
Consumed: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Producing #11
Producing #12
Producing #13
Producing #14
Producing #15
Consumed: [11, 12, 13, 14, 15]
Unhandled exception. System.Exception: Oops!
   at Program.<Main>g__Produce|0_0()+MoveNext()
   at System.Linq.Enumerable.ChunkIterator[TSource](IEnumerable`1 source, Int32 size)+MoveNext()
   at Program.Main()

The difference is the line Consumed: [11, 12, 13, 14, 15], that is not present in the actual output.

Hydrophone answered 20/7, 2022 at 18:35 Comment(6)
Why does the producer generate an error? Why not design your producer such that an error doesn't break the consumer? Maybe the producer could add something to the IEnumerable to signify an error, but not actually throw an exception.Luhey
@Luhey yes, this could be a possibility. But I would prefer to keep my producer as is. Bypassing the exception-propagation system of C# would complicate my code, and would make it more prone to bugs.Hydrophone
There's another Chunk() implementation here: #52636071Accompanist
@JoelCoehoorn does this solve my problem though?Hydrophone
@TheodorZoulias It might, because it will let you customize the way Chunk() works to your liking.Accompanist
@JoelCoehoorn to be honest, if I were going to implement the ChunkNonDestructive operator myself from an existing implementation, I would probably use Microsoft's implementation as a starting point. No offense, your implementation might be way better, but none of us can beat Microsoft single-handedly on reputation points! 😃Hydrophone
S
2

If you preprocess your source to make it stop when it encounters an exception, then you can use Chunk() as-is.

public static class Extensions
{
    public static IEnumerable<T> UntilFirstException<T>(this IEnumerable<T> source, Action<Exception>? exceptionCallback = null)
    {
        using var enumerator = source.GetEnumerator();
        while (true)
        {
            T current;
            try
            {
                if (!enumerator.MoveNext())
                {
                    break;
                }
                current = enumerator.Current;
            }
            catch (Exception e)
            {
                exceptionCallback?.Invoke(e);
                break;
            }
            yield return current;
        }
    }
}
    Exception? e = null;
    foreach (int[] chunk in Produce().UntilFirstException(thrown => e = thrown).Chunk(10))
    {
        Console.WriteLine($"Consumed: [{String.Join(", ", chunk)}]");
    }

I feel like that keeps responsibilities separated nicely. If you want a helper that throws an exception instead of having to capture it yourself, you can use this as a component to simplify writing that helper:

    public static IEnumerable<T[]> ChunkUntilFirstException<T>(this IEnumerable<T> source, int size)
    {
        Exception? e = null;
        var result = source.UntilFirstException(thrown => e = thrown).Chunk(size);
        foreach (var element in result)
        {
            yield return element;
        }
        if (e != null)
        {
            throw new InvalidOperationException("source threw an exception", e);
        }
    }

Note that this will throw a different exception than the one emitted by the producer. This lets you keep the stack trace associated with the original exception, whereas throw e would overwrite that stack trace.
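For comparison, ExceptionDispatchInfo (in System.Runtime.ExceptionServices) is one way to rethrow a stored exception later without wrapping it. The sketch below only contrasts the two rethrow styles; the names RethrowDemo, FailingProducer and CaptureFailure are hypothetical and are not part of the helper above.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.ExceptionServices;

public static class RethrowDemo
{
    // Hypothetical stand-in for a failing producer. NoInlining keeps its
    // frame visible in the stack trace.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void FailingProducer() => throw new Exception("Oops!");

    // Runs the producer and hands back the exception without rethrowing it.
    private static Exception CaptureFailure()
    {
        try { FailingProducer(); }
        catch (Exception ex) { return ex; }
        throw new InvalidOperationException("The producer was expected to fail.");
    }

    // Rethrows with a plain "throw e": the stack trace restarts at this point.
    public static string TraceAfterPlainThrow()
    {
        Exception captured = CaptureFailure();
        try { throw captured; }
        catch (Exception ex) { return ex.StackTrace ?? ""; }
    }

    // Rethrows through ExceptionDispatchInfo: the original frames are kept
    // and the rethrow location is appended to them.
    public static string TraceAfterDispatchInfoThrow()
    {
        ExceptionDispatchInfo info = ExceptionDispatchInfo.Capture(CaptureFailure());
        try { info.Throw(); }
        catch (Exception ex) { return ex.StackTrace ?? ""; }
        return "";
    }
}
```

Printing both strings should show the FailingProducer frame only in the second one. Whether surfacing the original exception is preferable to wrapping it remains a design choice.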

You can tweak this according to your needs. If you need to catch a specific type of exception that you're expecting your producer to emit, it's easy enough to use the when contextual keyword with some pattern matching.

    try
    {
        foreach (int[] chunk in Produce().ChunkUntilFirstException(10))
        {
            Console.WriteLine($"Consumed: [{String.Join(", ", chunk)}]");
        }
    }
    catch (InvalidOperationException e) when (e.InnerException is {Message: "Oops!"})
    {
        Console.WriteLine(e.InnerException.ToString());
    }
Speller answered 20/7, 2022 at 20:11 Comment(14)
Thanks StriplingWarrior for the answer, but if I understand correctly the UntilFirstException operator suppresses/swallows the exception thrown by the producer. This is not what I want. It is important to notify the consumer about the error that occurred at the producer. It is as important as processing all the produced items before handling the error.Hydrophone
I changed the answer to include the "optional callback" I was talking about.Speller
StriplingWarrior hmm. So the exception will be pushed to the consumer, instead of being pulled by the consumer at the end of enumeration. This is changing my processing model radically from pull to pull+push. TBH I would prefer to avoid this complication.Hydrophone
@TheodorZoulias: That's easily overcome by writing a helper that produces the combined chunking behavior you're looking for. See my update.Speller
Why have you wrapped the exception in an InvalidOperationException?Hydrophone
@TheodorZoulias: I already explained that in my answer. It's to avoid losing the stack trace. I added some more code in another update, in case that's helpful.Speller
BTW, it might be possible to emit the original exception (with its original stack trace), but it would definitely be a lot more complicated (you can't yield return inside of a catch block, and you can't throw; outside of one), and I think it's arguable whether it's even correct behavior (or more correct than wrapping the exception, e.g.).Speller
Ahh, sorry, I missed it. Why did you rename the ChunkNonDestructive to ChunkUntilFirstException? Do you think that the second name is more descriptive than the first? Also I am not sure what's the value of implementing the ChunkNonDestructive on top of another operator (UntilFirstException) that I didn't ask for, and I might never have a need for. It's just going to "pollute" my IntelliSense popup in VS for no reason!Hydrophone
LOL, picky, picky. That name just sounded right to me: two hard things in programming, amiright? I thought UntilFirstException (which maybe should be TakeUntilFirstException) might be generally useful in an environment where you're expecting IEnumerator<>s to throw exceptions. If not, you can certainly hide it, or make it not be an extension method. As I said, tweak it to fit your needs. Either way, separating the concerns into separate methods is good practice and less complex than reinventing Chunk's implementation with a twist.Speller
Ahhh, now I see your point. Your idea is to avoid reimplementing the Chunk from scratch, and propagate the exception around it instead of through it. Clever! It might be possible to create a generic propagator that makes NonDestructive all kind of existing LINQ operators. I am intrigued now. I'll try to make this work!Hydrophone
Your UntilFirstException implementation is way more complex than it needs to be. try { foreach(var item in source) yield return item; } catch(Exception e) { exceptionCallback(e); } If anything this would be better. If the IEnumerator is written such that Current throws, you'd still want to catch that exception in a situation such as is described in the question. Also you don't dispose your enumerator.Iraqi
@Servy: "CS1626 Cannot yield a value in the body of a try block with a catch clause"Speller
Ah, yeah. You could try to work around it with a try/finally instead of a catch, but it wouldn't come out as clean. But with your approach you'd want to make sure to catch exceptions in Current and you'd want to make sure to dispose of the iterator.Iraqi
@Servy: Great feedback, thank you. I updated the code accordingly.Speller
B
2

First off, a matter of semantics. There's nothing destructive in Chunk or Buffer or anything else; each one just reads items from a source enumerable until the source ends or throws an exception. The only destructive thing in your code is you throwing exceptions, which behaves as expected (i.e., unwinds the stack out of your generator, out of the LINQ functions and into a catch in your code, if any exists).

Also, it should be immediately obvious that every LINQ function will behave the same with regard to exceptions. It's in fact how exceptions work, and working around them to support your use case is relatively expensive: you'll need to swallow exceptions for every item you generate. This, in my humble opinion, is incredibly bad design, and you'd be fired on the spot if you worked for me and did that.

With all that out of the way, writing a BadDesignChunk like that is trivial (if expensive):

public static IEnumerable<IEnumerable<TSource>> BadDesignChunk<TSource>(this IEnumerable<TSource> source, int size)
{
    Exception caughtException = default;
    var chunk = new List<TSource>();
    using var enumerator = source.GetEnumerator();
    
    while(true)
    {
        while(chunk.Count < size)
        {
            try
            {
                if(!enumerator.MoveNext())
                {
                    // end of the stream, send what we have and finish
                    goto end;
                }
            }
            catch(Exception ex)
            {
                // exception, send what we have and finish
                caughtException = ex;
                goto end;
            }
            
            chunk.Add(enumerator.Current);
        }
        
        // chunk full, send it
        yield return chunk;
        chunk.Clear();
    }
    
    end:
    if(chunk.Count > 0)
        yield return chunk;
    if(caughtException is not null)
        throw caughtException;
}

See it in action here.
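One caveat with yielding the same List<T> instance for every chunk is that a consumer that keeps the chunks around ends up holding several references to one list. A minimal sketch of that effect, using a hypothetical, simplified chunker (ReusedBufferChunks is an illustration, not the method above):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BufferReuseDemo
{
    // Hypothetical simplified chunker that reuses one List<int> for all chunks.
    public static IEnumerable<List<int>> ReusedBufferChunks(IEnumerable<int> source, int size)
    {
        var chunk = new List<int>();
        foreach (int item in source)
        {
            chunk.Add(item);
            if (chunk.Count == size)
            {
                yield return chunk; // the consumer receives the shared instance
                chunk.Clear();      // ...which is emptied as soon as it resumes us
            }
        }
        if (chunk.Count > 0) yield return chunk;
    }

    // Materializes the chunks first and describes what was stored.
    public static string DescribeStoredChunks()
    {
        List<List<int>> stored = ReusedBufferChunks(Enumerable.Range(1, 5), 2).ToList();
        bool sameInstance = ReferenceEquals(stored[0], stored[2]);
        return $"{stored.Count} chunks, same instance: {sameInstance}, " +
               $"first chunk: [{string.Join(", ", stored[0])}]";
    }
}
```

Consuming each chunk fully inside the foreach body avoids the problem; storing chunks (ToList, ToArray, captured closures) does not, unless each chunk is copied first.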

Bosky answered 20/7, 2022 at 19:27 Comment(13)
Thanks Blindy for the answer, but the BadDesignChunk is not doing what I want. What I want is to get a chunk with the values [11, 12, 13, 14, 15] before getting the exception, not instead of getting the exception. I don't want to swallow exceptions. If I wanted that, you would be right to fire me and everything. 😃Hydrophone
The differences are minimal, you simply save any caught exception and throw it at the end. And I am right regardless, this is not good code. This isn't efficient, correct or scalable. In fact, from your comment, the only reason you even have this producer was laziness on the part of the developers ("would complicate my code").Bosky
Blindy if you think that changing the producer and complicating both the producer and the consumer is the way to go, you can post an implementation as an answer, so that people can evaluate (vote) it accordingly.Hydrophone
Btw I noticed two problems with your revised answer (revision 2). The BadDesignChunk always emits the same List<T> instance, and the stack trace of the exception that is caught by the consuming loop is missing the line at Program.<Main>g__Produce|0_0()+MoveNext(), which is the source of the exception. Also, why did you change the signature from IEnumerable<TSource[]> to IEnumerable<IEnumerable<TSource>>? What's the advantage of this change?Hydrophone
Can you please elaborate a little why is this a bad, or more to the point, fired-on-the-spot-bad design? The exception isn't swallowed, it's just deferred until after yielding to the consumer to make sure that no items are missed in case of an exception.Baras
@TheodorZoulias that's not a problem, it's by design. Allocating objects is expensive, and as long as you iterate over them it's fine. You can change it as you wish however, creating new arrays every time, renting arrays from a pool, etc. Also duh, it's going to be missing it because that stack trace was only valid at the point where it was fired. It's throwing the exception that builds the stack trace.Bosky
@Boris, you're asking me why designing a system that requires you to instantiate and tear down the entirety of the Windows SEH engine for every single item you iterate over in the entire life cycle of your application is bad? Seriously? No, I mean, seriously?Bosky
"Blindy if you think that changing the producer and complicating both the producer and the consumer is the way to go, you can post an implementation as an answer, so that people can evaluate (vote) it accordingly" -- I'm not sure what exactly you're asking. You didn't post your producer, you only posted a mockup of one, which can be easily fixed by deleting the throw line. Oh but Blindy, you're going to say, that won't get me the exception. Good, you don't need it. You have an enumerable, enumerate it and you're done.Bosky
Blindy I am happy with allocating a new array for each chunk. That's what the built-in Chunk does anyway (source code). By emitting the same mutable List<T> each time you are opening the gate for subtle bugs. The consumer might keep a reference to the previous chunk, and compare it with the next chunk. The two chunks will always have the same elements (because they will be the same chunk). You know what they say about premature optimization... 😃Hydrophone
As for your latest response to my previous comment about a previous comment of yours about another one of my comments where I was responding to a comment by Neil, it has become too complicated so let's leave it behind and move on. 😃Hydrophone
@Bosky I don't know what the "Windows SEH Engine" is, but a try/catch does not involve the kind of performance hit you're implying. There's virtually no cost as long as no exception is thrown, and in this case there's at most one exception thrown per IEnumerable, so it's not a cost per item in the source. No exceptions are "swallowed" here. If one of my employees tried to fire a team member on the spot for putting a try/catch in an array... well I'd like to think I wouldn't have hired a manager like that in the first place.Speller
Well, I actually know a bit about SEH, and I'm also pretty sure that it's not initialized and torn down for every try/catch frame. If it is set up and torn down repetitively at all (and not once-per-process), I'd expect it to be once-per-managed-thread. I recall some odd performance things like that decades ago in something like VBScript or VBA, or maybe ASP-"classic" (the ancient precursor of ASP.NET), but I'm not sure if that was it. I would be absolutely shocked if that happened on any target with .NET >= 4.0, moderately shocked with .NET >= 2.0, and somewhat not surprised with the ancient .NET 1.1/1.0.Infuscate
Regarding "Also duh, it's going to be missing it because that stack trace was only valid at the point where it was fired. It's throwing the exception that builds the stack trace" - yes. If you did that like you did, the stack trace is lost. See other answers and ExceptionDispatchInfo.Capture - pure gold at times when you really need it!Infuscate
H
2

I was inspired by StriplingWarrior's answer, which is based on an idea that I didn't initially understand. The idea is to reuse the existing Chunk implementation, and propagate the exception around it instead of through it. Based on this idea I wrote a generic method DeferErrorUntilCompletion that robustifies¹ all kinds of LINQ operators, or combinations of operators, according to this rule:

In case the input sequence fails, the error is propagated after yielding all the elements of the output sequence.

private static IEnumerable<TOutput> DeferErrorUntilCompletion<TInput, TOutput>(
    IEnumerable<TInput> input,
    Func<IEnumerable<TInput>, IEnumerable<TOutput>> conversion)
{
    Task errorContainer = null;
    IEnumerable<TInput> InputIterator()
    {
        using var enumerator = input.GetEnumerator();
        while (true)
        {
            TInput item;
            try
            {
                if (!enumerator.MoveNext()) break;
                item = enumerator.Current;
            }
            catch (Exception ex)
            {
                errorContainer = Task.FromException(ex);
                break;
            }
            yield return item;
        }
    }
    IEnumerable<TOutput> output = conversion(InputIterator());
    foreach (TOutput item in output) yield return item;
    errorContainer?.GetAwaiter().GetResult();
}

Then I used the DeferErrorUntilCompletion method to implement the ChunkNonDestructive operator like this:

/// <summary>
/// Splits the elements of a sequence into chunks of the specified size.
/// In case the sequence fails and there are buffered elements, a last chunk
/// that contains these elements is emitted before propagating the error.
/// </summary>
public static IEnumerable<TSource[]> ChunkNonDestructive<TSource>(
    this IEnumerable<TSource> source, int size)
{
    ArgumentNullException.ThrowIfNull(source);
    if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));
    return DeferErrorUntilCompletion(source, s => s.Chunk(size));
}

Online example.

The implementation uses a Task for capturing the error, which is later rethrown without losing the original stack trace.

Deferring the propagation of the error opens an interesting possibility: the consumer of the deferred sequence might abandon the enumeration prematurely, for example by breaking or by suffering an exception, while an error is already captured inside the errorContainer. This possibility is handled by propagating the unobserved error through the TaskScheduler.UnobservedTaskException event. Other options for handling this scenario could be to rethrow the error during the Dispose of the deferred enumerator, or simply to suppress the error. Suppressing was implemented in the 3rd revision of this answer, by using an ExceptionDispatchInfo instead of a Task as the container for the error. Throwing on Dispose has other problems that are discussed in this question.
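As a rough illustration of the suppressing option, here is a sketch that stores the error in an ExceptionDispatchInfo instead of a Task. It is a reconstruction under assumptions, not the actual earlier revision; the names DeferErrorSuppressIfAbandoned and ProduceThenFail are hypothetical.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.ExceptionServices;

public static class DeferredErrors
{
    // Hypothetical variant: if the consumer abandons the enumeration,
    // a captured error is silently dropped.
    public static IEnumerable<TOutput> DeferErrorSuppressIfAbandoned<TInput, TOutput>(
        IEnumerable<TInput> input,
        Func<IEnumerable<TInput>, IEnumerable<TOutput>> conversion)
    {
        ExceptionDispatchInfo? error = null;
        IEnumerable<TInput> InputIterator()
        {
            using var enumerator = input.GetEnumerator();
            while (true)
            {
                TInput item;
                try
                {
                    if (!enumerator.MoveNext()) break;
                    item = enumerator.Current;
                }
                catch (Exception ex)
                {
                    // Keep the exception together with its original stack trace.
                    error = ExceptionDispatchInfo.Capture(ex);
                    break;
                }
                yield return item;
            }
        }
        foreach (TOutput item in conversion(InputIterator())) yield return item;
        // Reached only when the consumer enumerates to the end.
        error?.Throw();
    }

    // Stand-in for the producer in the question, without the console output.
    public static IEnumerable<int> ProduceThenFail()
    {
        int i = 0;
        while (true)
        {
            i++;
            yield return i;
            if (i == 15) throw new Exception("Oops!");
        }
    }
}
```

With DeferErrorSuppressIfAbandoned(ProduceThenFail(), s => s.Chunk(10)), a consumer that enumerates to the end should receive two chunks and then the original exception, while a consumer that breaks after the second chunk never sees the error at all.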

Although there is some value in reusing an existing built-in implementation (simplicity, consistency, robustness), there are downsides too. Adding two extra enumerations on top of the core functionality could result in non-negligible overhead. Charlieface's implementation is about twice as fast as this implementation at producing chunks. So for a producer-consumer scenario with very high throughput (thousands of chunks per second), I would probably prefer Charlieface's implementation over this one.

¹ The idea that the LINQ operators need to be robustified might sound strange, or even arrogant. Please note that the context of this answer is very specific: it is producer-consumer scenarios. In these scenarios, where multiple producers and consumers might be running in parallel, occasional exceptions are to be expected, and resilience mechanisms are in place in anticipation of such exceptions, losing messages here and there because of errors is generally something to avoid.

Hydrophone answered 21/7, 2022 at 0:45 Comment(1)
A version of the DeferErrorUntilCompletion method that supports asynchronous sequences can be found here.Hydrophone
P
1

You can't catch an exception, yield and then re-throw, because you can't have yield inside a catch. (For obvious reasons: once you have yielded then you are not in a catch any more.)

I think the only solution that will preserve the original exception with the original stack trace is to use ExceptionDispatchInfo.Capture.

private static IEnumerable<IList<TSource>> ChunkIterator<TSource>(this IEnumerable<TSource> source, int size)
{
    using var e = source.GetEnumerator();

    var chunk = new List<TSource>(size);
    ExceptionDispatchInfo exDispatch = null;
    try
    {
        while(true)
        {
            try
            {
                while(e.MoveNext())
                {
                    chunk.Add(e.Current);
                    if (chunk.Count == size)
                        break;
                }
            }
            catch(Exception ex)
            {
                exDispatch = ExceptionDispatchInfo.Capture(ex);
            }

            if(chunk.Count > 0)
                yield return chunk.ToArray();

            var exDispatch2 = exDispatch;
            exDispatch = null;
            exDispatch2?.Throw();

            if(chunk.Count > 0)
                chunk.Clear();
            else
                yield break;
        }
    }
    finally
    {
        exDispatch?.Throw();
    }
}

Your foreach will always receive the last chunk of items, and only throw on the next iteration.
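The outer try/finally also means that an error captured but not yet thrown can surface when the enumerator is disposed. A minimal sketch of one consequence, using a hypothetical stand-in iterator (ThrowsOnDispose) rather than the method above: if the consumer's own code fails inside the loop, the exception thrown during disposal replaces it.

```csharp
using System;
using System.Collections.Generic;

public static class DisposeThrowDemo
{
    // Hypothetical stand-in for an iterator that surfaces a stored error
    // from its finally block, which runs when the enumerator is disposed.
    public static IEnumerable<int> ThrowsOnDispose()
    {
        try { yield return 1; }
        finally { throw new Exception("producer error"); }
    }

    // Returns the message of the exception that the caller actually observes.
    public static string ObservedMessage()
    {
        try
        {
            foreach (int item in ThrowsOnDispose())
            {
                // The consumer fails while processing the item. The implicit
                // finally of foreach then disposes the enumerator.
                throw new InvalidOperationException("consumer error");
            }
            return "no exception";
        }
        catch (Exception ex)
        {
            return ex.Message;
        }
    }
}
```

ObservedMessage() should return "producer error"; the consumer's InvalidOperationException is lost. That trade-off is the argument against throwing on Dispose, while never throwing there risks the opposite problem of an error that goes unnoticed.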

Paragrapher answered 21/7, 2022 at 9:23 Comment(10)
Thanks Charlieface! Your solution works, if I make the ChunkIterator an extension method by adding the this keyword. I guess that you worked on Microsoft's implementation, which currently has a minor issue. :-)Hydrophone
Yes I started off with that. It does have a slight perf penalty compared to their code because it always initializes the list. I couldn't find a neat way to avoid it while still using the try, it ended up with a lot of duplicate code. And yes I think I have the same issue as them.Paragrapher
Yeah, I noticed the duplication. Allocating early a List<T> is perfectly fine IMHO, for the purpose of a StackOverflow answer. We are not supposed to post scientific code here. :-)Hydrophone
Btw yesterday I had a discussion with Stephen Toub about the NoThrow functionality of await, and as a byproduct of the discussion I realized that he would probably disapprove of your implementation (as well as my implementation). In case the consumer abandons the enumeration during the yield return, an exception stored in the exDispatch will not be surfaced. Stephen Toub's opinion is that the exception should be thrown when the consumer disposes our enumerator.Hydrophone
Hmm, true. I think the addition of a try finally around the whole thing would help for that. Obviously still needs a foreach which implicitly implies a Dispose(). See edits.Paragrapher
Charlieface TBH I don't agree with Stephen's opinion. The consensus seems to be that throwing on Dispose should happen only on catastrophic failures. Out of curiosity I checked what PLINQ does when the consumer of a ParallelQuery<T> stops enumerating it, and an error happens afterwards. The PLINQ then throws the error on the Dispose. I think that Stephen's opinion is influenced by this existing behavior.Hydrophone
I can think of a potential problem though. If we don't throw on Dispose, the consumer could check the size of the chunk in each iteration, and interpret a chunk smaller than size as an indication that the enumeration has ended: if (chunk.Count() < 10) break;. This would prevent the surfacing of the exception, without the consumer realizing it. This scenario is a bit disturbing.Hydrophone
On the other hand if we do throw on Dispose, and the code of the consumer fails also in the last iteration, the consumer's exception will be lost because it will be replaced by the exception thrown on the implicit finally of the consumer's foreach. So there are arguments for both options.Hydrophone
I posted a related question here: Exception is lost while consuming a PLINQ query.Hydrophone
Checking the Count is a bit of a leaky abstraction anyway, and I don't think you need to worry about that. If you pass it as an IEnumerable then the consumer shouldn't be trying that.Paragrapher
