The LinkTo method with the PropagateCompletion configuration propagates the completion of the source block to the target block. So if the source block fails, the failure will be propagated to the target block, and eventually both blocks will complete. The same is not true if the target block fails. In that case the source block will not be notified, and will continue accepting and processing messages. If we add the BoundedCapacity configuration to the mix, the internal output buffer of the source block will soon become full, preventing it from accepting more messages. And as you discovered, that can easily result in a deadlock.
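For reference, here is a minimal sketch of that deadlock (it assumes the System.Threading.Tasks.Dataflow and System.Linq namespaces are imported): the target fails on the first message, the source keeps buffering until its bounded capacity is reached, and from that point on SendAsync awaits forever.

var source = new BufferBlock<int>(new() { BoundedCapacity = 1 });
var target = new ActionBlock<int>(
    x => throw new InvalidOperationException(),
    new() { BoundedCapacity = 1 });
source.LinkTo(target, new DataflowLinkOptions { PropagateCompletion = true });

foreach (var i in Enumerable.Range(1, 10))
    await source.SendAsync(i); // Hangs once the target has faulted and the buffers are full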
To prevent a deadlock from happening, the simplest approach would be to ensure that an error in any block of the pipeline causes the timely completion of all its constituent blocks. Other approaches are also possible, as indicated by Stephen Cleary's answer, but in the majority of cases I expect the fail-fast approach to be the desirable behavior. Surprisingly, this simple behavior is not so easy to achieve. No built-in mechanism is readily available for this purpose, and implementing it manually is tricky.
As of .NET 6, the only reliable way to forcefully complete a block that is part of a dataflow pipeline is to Fault the block, and also discard its output buffer by linking it to a NullTarget. Faulting the block alone, or canceling it through the CancellationToken option, is not enough. There are scenarios where a faulted or canceled block will not complete. Here is a demonstration of the first case (faulted and not completed), and here is a demonstration of the second case (canceled and not completed). Both scenarios require that the block has been previously marked as completed, which can happen automatically and non-deterministically for all blocks that participate in a dataflow pipeline and are linked with the PropagateCompletion configuration. A GitHub issue reporting this problematic behavior exists: No way to cancel completing dataflow blocks. As of the time of this writing, no feedback has been provided by the devs.
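In isolation, the Fault-plus-NullTarget technique looks like this (a minimal sketch; the ForceComplete name is just for illustration):

static void ForceComplete<T>(ISourceBlock<T> block, Exception error)
{
    block.Fault(error);                          // Signal the failure
    block.LinkTo(DataflowBlock.NullTarget<T>()); // Drain and discard any buffered output
}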
Armed with this knowledge, we can implement a LinkTo-on-steroids method that can create fail-fast pipelines like this:
/// <summary>
/// Connects two blocks that belong in a simple, straightforward,
/// one-way dataflow pipeline.
/// Completion is propagated in both directions.
/// Failure of the target block causes purging of all buffered messages
/// in the source block, allowing the timely completion of both blocks.
/// </summary>
/// <remarks>
/// This method should be used only if the two blocks participate in an exclusive
/// producer-consumer relationship.
/// The source block should be the only producer for the target block, and
/// the target block should be the only consumer of the source block.
/// </remarks>
public static void ConnectTo<TOutput>(this ISourceBlock<TOutput> source,
    ITargetBlock<TOutput> target)
{
    source.LinkTo(target, new DataflowLinkOptions { PropagateCompletion = true });
    ThreadPool.QueueUserWorkItem(async _ =>
    {
        try { await target.Completion.ConfigureAwait(false); } catch { }
        if (!target.Completion.IsFaulted) return;
        if (source.Completion.IsCompleted) return;
        source.Fault(new Exception("Pipeline error."));
        source.LinkTo(DataflowBlock.NullTarget<TOutput>()); // Discard all output
    });
}
Usage example:
var data_buffer = new BufferBlock<int>(new() { BoundedCapacity = 1 });
var process_block = new ActionBlock<int>(
    x => throw new InvalidOperationException(),
    new() { BoundedCapacity = 2, MaxDegreeOfParallelism = 2 });
data_buffer.ConnectTo(process_block); // Instead of LinkTo
foreach (var k in Enumerable.Range(1, 5))
    if (!await data_buffer.SendAsync(k)) break;
data_buffer.Complete();
await process_block.Completion;
Optionally you could also consider awaiting all the constituent blocks of the pipeline, before awaiting the last one (or afterwards, in a finally region). This offers the advantage that in case of failure you won't risk leaking fire-and-forget operations running unobserved in the background, before the next reincarnation of the pipeline:
try { await Task.WhenAll(data_buffer.Completion, process_block.Completion); } catch { }
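The finally-region variant mentioned above could look like this (a sketch, reusing the blocks from the usage example):

try
{
    await process_block.Completion;
}
finally
{
    // Ensure that all blocks have completed before exiting, observing any errors
    try { await Task.WhenAll(data_buffer.Completion, process_block.Completion); }
    catch { }
}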
You can ignore all the errors that might be thrown by the await Task.WhenAll operation, because awaiting the last block will convey most of the error-related information anyway. You may only miss additional errors that happened in blocks upstream after the failure of a block downstream. You can try to observe all errors if you want, but it will be tricky because of how the errors are propagated downstream: you may observe the same error multiple times. If you want to diligently log every single error, it is probably easier (and more accurate) to do the logging inside the lambdas of the processing blocks, instead of relying on their Completion property.
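Such in-lambda logging could look like this (a sketch; Process and Log are hypothetical placeholders for your own processing and logging methods):

var process_block = new ActionBlock<int>(x =>
{
    try { Process(x); }
    catch (Exception ex)
    {
        Log(ex); // Log the error at the point where it occurred...
        throw;   // ...and rethrow it, so that the block still faults
    }
}, new() { BoundedCapacity = 2, MaxDegreeOfParallelism = 2 });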
Shortcomings: The ConnectTo implementation above propagates the failure backwards one block at a time. The propagation is not instantaneous, because a faulted block does not complete before the processing of any currently processed messages has finished. This can be an issue in case the pipeline is long (5-6 blocks or more) and the workload of each block is chunky. This additional latency is not only a waste of time, but also a waste of resources, spent doing work that is going to be discarded anyway.
I've uploaded a more sophisticated version of the ConnectTo idea in this GitHub repository. It addresses the delayed-completion issue mentioned in the previous paragraph: a failure in any block is propagated instantaneously to all blocks. As a bonus it also propagates all the errors in the pipeline, as a flat AggregateException.
Note: This answer has been rewritten from scratch. The original answer (Revision 4) included some wrong ideas, and a flawed implementation of the ConnectTo method.
A BoundedCapacity smaller than the MaxDegreeOfParallelism will reduce the degree of parallelism to the value of the capacity. In other words, the block cannot process 2 items simultaneously if it is allowed to buffer only one. I believe this happens because after processing the two items it should store the two results in its output buffer, and it has no available space for two results. – Operand

If the block was anything other than an ActionBlock then yes, that would make sense, because this block has only an input queue with no output. But actually even ActionBlocks are governed by the same rule for some reason. Probably for consistency. – Operand
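A small sketch that demonstrates the observation in these comments: with BoundedCapacity = 1 the items are processed one at a time, despite the MaxDegreeOfParallelism = 2 (the printed timestamps are ~500 msec apart, showing no overlap).

var block = new ActionBlock<int>(async x =>
{
    Console.WriteLine($"Started {x} at {DateTime.Now:HH:mm:ss.fff}");
    await Task.Delay(500);
}, new() { BoundedCapacity = 1, MaxDegreeOfParallelism = 2 });

foreach (var i in Enumerable.Range(1, 4)) await block.SendAsync(i);
block.Complete();
await block.Completion;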