The linked article by Samuel Jack provides two overloads of the WithProgressReporting operator, one with an itemCount parameter (second) and one without (first). According to the author's instructions, you have to use the overload that has the itemCount parameter:
The second variant can be used if generating the items in the sequence takes time (so you want to report progress of this part as well), but you know in advance how many there will be: it doesn't attempt to buffer the sequence before passing the items through to the output sequence.
You also have to use the PLINQ operator WithMergeOptions(ParallelMergeOptions.NotBuffered), otherwise the elements produced by the PLINQ query will not be propagated immediately downstream, causing undesirable delays in the progress reporting.
Samuel Jack's implementation reports progress for each and every element produced by the source sequence, which is excessive for long sequences. Below is a more sophisticated implementation that reports progress at most 101 times, once for each integer percentage from 0 to 100:
/// <summary>
/// Reports progress 0-100.
/// </summary>
public static IEnumerable<TSource> WithProgressReporting<TSource>(
    this IEnumerable<TSource> source, long itemsCount,
    IProgress<int> progress)
{
    ArgumentNullException.ThrowIfNull(source);
    ArgumentNullException.ThrowIfNull(progress);
    if (itemsCount < 0)
        throw new ArgumentOutOfRangeException(nameof(itemsCount));
    progress.Report(0);
    long i = 0;
    long next = GetNext(0, itemsCount, out _);
    foreach (TSource item in source)
    {
        i++;
        if (i == next)
        {
            next = GetNext(i, itemsCount, out int percentDone);
            progress.Report(percentDone);
        }
        yield return item;
    }
    if (i == 0 || itemsCount == 0) progress.Report(100);

    static long GetNext(long itemsDone, long itemsCount, out int percentDone)
    {
        if (itemsCount == 0) { percentDone = 0; return long.MaxValue; }
        checked
        {
            // Calculate the percent of the work done, rounding down
            percentDone = (int)(itemsDone * 100 / itemsCount);
            // Calculate the next iteration to report, rounding up
            long next = (((percentDone + 1) * itemsCount - 1) / 100) + 1;
            Debug.Assert(next > itemsDone);
            return next;
        }
    }
}
Usage example:
IProgress<int> progress = new Progress<int>(e => Console.WriteLine(e));
long sum = Enumerable.Range(1, 1_000_000)
    .AsParallel()
    .AsOrdered()
    .WithMergeOptions(ParallelMergeOptions.NotBuffered)
    .Select(n =>
    {
        Thread.SpinWait(100); // Simulate some lightweight computation
        return (long)n;
    })
    .AsSequential() // End of parallelism
    .WithProgressReporting(1_000_000, progress) // <--- the extension method
    .Sum();
I've included the AsSequential operator, although it's not needed, to signify visually the completion of the parallel part of the query. Both WithProgressReporting and Sum are ordinary sequential LINQ operators here, not parallel ones.
Online demo.
The GetNext local function uses exact integer arithmetic to calculate the next iteration that needs to trigger a progress report. I have tested and validated the correctness of the calculations for all itemsCount values in the range 0 to 10,000.
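A brute-force check along these lines could look like the sketch below. It is not the test that was actually run; the class GetNextCheck and the helper IsConsistent are hypothetical names introduced for illustration, and the property checked (a report is triggered exactly when the rounded-down percentage changes) is my reading of what GetNext is meant to guarantee.

```csharp
using System;

static class GetNextCheck
{
    // Copy of the GetNext local function from the answer (Debug.Assert omitted)
    static long GetNext(long itemsDone, long itemsCount, out int percentDone)
    {
        if (itemsCount == 0) { percentDone = 0; return long.MaxValue; }
        checked
        {
            percentDone = (int)(itemsDone * 100 / itemsCount);
            return (((percentDone + 1) * itemsCount - 1) / 100) + 1;
        }
    }

    // Hypothetical helper: true when, for this itemsCount, a report is triggered
    // exactly at the iterations where the rounded-down percentage changes,
    // and the reported value equals that percentage.
    public static bool IsConsistent(long itemsCount)
    {
        long next = GetNext(0, itemsCount, out _);
        int lastReported = 0; // The operator reports 0 before iterating
        for (long i = 1; i <= itemsCount; i++)
        {
            int expected = (int)(i * 100 / itemsCount);
            int previous = (int)((i - 1) * 100 / itemsCount);
            bool shouldReport = expected != previous;
            bool didReport = i == next;
            if (didReport != shouldReport) return false;
            if (didReport)
            {
                next = GetNext(i, itemsCount, out int reported);
                if (reported != expected) return false;
                lastReported = reported;
            }
        }
        // A non-empty sequence must end with a report of 100
        return itemsCount == 0 || lastReported == 100;
    }

    static void Main()
    {
        int failures = 0;
        for (long n = 0; n <= 10_000; n++)
            if (!IsConsistent(n)) failures++;
        Console.WriteLine($"Inconsistent itemsCount values: {failures}");
    }
}
```

The check only covers sources that yield exactly itemsCount elements; it says nothing about sources that are shorter or longer than the declared count.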
The built-in Progress<T> class invokes the handler asynchronously on the captured synchronization context. In case there is no synchronization context, for example in a console application, the handler is invoked on the ThreadPool. This might be undesirable, because it means that the handler might be invoked after the completion of the PLINQ query. You can solve this problem by using a synchronous IProgress<T> implementation, like the SynchronousProgress<T> that I've posted here. Alternatively you could replace the IProgress<int> progress parameter with an Action<int> reportProgress, as in Samuel Jack's article.
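To illustrate what a synchronous IProgress<T> might look like, here is a minimal sketch. It is not the linked SynchronousProgress<T>; the name DirectProgress<T> is hypothetical, and the class simply forwards each value to the handler on the calling thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal synchronous IProgress<T>; an assumption for illustration,
// not the SynchronousProgress<T> implementation referenced in the text.
public sealed class DirectProgress<T> : IProgress<T>
{
    private readonly Action<T> _handler;

    public DirectProgress(Action<T> handler)
    {
        _handler = handler ?? throw new ArgumentNullException(nameof(handler));
    }

    // Invokes the handler on the calling thread, before Report returns
    public void Report(T value) => _handler(value);
}

public static class DirectProgressDemo
{
    public static void Main()
    {
        var seen = new List<int>();
        IProgress<int> progress = new DirectProgress<int>(seen.Add);
        progress.Report(0);
        progress.Report(100);
        // Both values have already been handled at this point
        Console.WriteLine(string.Join(", ", seen));
    }
}
```

With this approach the handler runs on whichever thread enumerates the sequence, so it should be cheap; a slow handler would stall the consumer of the query.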
WithProgressReporting() method serve your purpose adequately? Typically, you'd be starting with IEnumerable<T> anyway...just wrap your source IEnumerable<T> with the call to WithProgressReporting() and call AsParallel() on that, as you've done in your tests. Ultimately the throughput is going to be the same, whether you report progress on the source or result. You need to be more specific: post a minimal reproducible example and explain precisely what output it is you expect, and what you're getting instead. – Landloper