TPL Dataflow block consumes all available memory

Asked 23/6, 2015 at 5:31 Answered 15/6, 2020 at 9:39

Solved c# .net task-parallel-library dataflow tpl-dataflow

I have a TransformManyBlock with the following design:

Input: Path to a file
Output: IEnumerable of the file's contents, one line at a time

I am running this block on a huge file (61GB), which is too large to fit into RAM. In order to avoid unbounded memory growth, I have set BoundedCapacity to a very low value (e.g. 1) for this block, and all downstream blocks. Nonetheless, the block apparently iterates the IEnumerable greedily, which consumes all available memory on the computer, grinding every process to a halt. The OutputCount of the block continues to rise without bound until I kill the process.

What can I do to prevent the block from consuming the IEnumerable in this way?

EDIT: Here's an example program that illustrates the problem:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class Program
{
    static IEnumerable<string> GetSequence(char c)
    {
        for (var i = 0; i < 1024 * 1024; ++i)
            yield return new string(c, 1024 * 1024);
    }

    static void Main(string[] args)
    {
        var options = new ExecutionDataflowBlockOptions() { BoundedCapacity = 1 };
        var firstBlock = new TransformManyBlock<char, string>(c => GetSequence(c), options);
        var secondBlock = new ActionBlock<string>(str =>
            {
                Console.WriteLine(str.Substring(0, 10));
                Thread.Sleep(1000);
            }, options);

        firstBlock.LinkTo(secondBlock);
        firstBlock.Completion.ContinueWith(task =>
            {
                if (task.IsFaulted) ((IDataflowBlock) secondBlock).Fault(task.Exception);
                else secondBlock.Complete();
            });

        firstBlock.Post('A');
        firstBlock.Complete();
        for (; ; )
        {
            Console.WriteLine("OutputCount: {0}", firstBlock.OutputCount);
            Thread.Sleep(3000);
        }
    }
}

If you're on a 64-bit box, make sure to clear the "Prefer 32-bit" option in Visual Studio. I have 16GB of RAM on my computer, and this program immediately consumes every available byte.

Gladdie answered 23/6, 2015 at 5:31 Comment(2)

well TBH: I have no time to argue with you here - good luck – Shizue 23/6, 2015 at 7:4

if you read the rest of the section carefully you will see that it does not work as you think - your firstBlock always offers everything it can produce - if you bound the second one it will just deny the second input and fetch it later – Shizue 23/6, 2015 at 7:9

You seem to misunderstand how TPL Dataflow works.

BoundedCapacity limits the number of items you can post into a block. In your case that means a single char into the TransformManyBlock and a single string into the ActionBlock.

So you post a single item to the TransformManyBlock which then returns 1024*1024 strings and tries to pass them on to the ActionBlock which will only accept a single one at a time. The rest of the strings will just sit there in the TransformManyBlock's output queue.

What you probably want to do is create a single block and post items into it in a streaming fashion, waiting (synchronously or otherwise) whenever its capacity is reached:

private static void Main()
{
    MainAsync().Wait();
}

private static async Task MainAsync()
{
    var block = new ActionBlock<string>(async item =>
    {
        Console.WriteLine(item.Substring(0, 10));
        await Task.Delay(1000);
    }, new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });

    foreach (var item in GetSequence('A'))
    {
        await block.SendAsync(item);
    }

    block.Complete();
    await block.Completion;
}
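To make the effect of BoundedCapacity on the posting side concrete, here is a minimal sketch contrasting Post and SendAsync on a block with BoundedCapacity = 1. The class name PostVersusSendAsync, the RunAsync method and the gate used to hold the first item inside the block are all hypothetical and exist only for illustration.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class PostVersusSendAsync
{
    public static async Task<(bool First, bool Second, bool PendingCompleted, bool Third)> RunAsync()
    {
        // Hypothetical gate: keeps the first item "in progress" so the block stays full.
        var gate = new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        var block = new ActionBlock<int>(
            async _ => { await gate.Task; },
            new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });

        bool first = block.Post(1);              // accepted: the block was empty
        bool second = block.Post(2);             // declined at once: capacity is in use
        Task<bool> pending = block.SendAsync(3); // postponed: the task stays incomplete
        bool pendingCompleted = pending.IsCompleted;

        gate.SetResult(true);                    // let the first item finish
        bool third = await pending;              // accepted once there is room

        block.Complete();
        await block.Completion;
        return (first, second, pendingCompleted, third);
    }

    public static async Task Main()
    {
        var r = await RunAsync();
        Console.WriteLine($"{r.First} {r.Second} {r.PendingCompleted} {r.Third}");
    }
}
```

Post never waits, so a producer that relies on it has to handle the false return value itself, whereas awaiting SendAsync gives the producer backpressure for free.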

Teflon answered 23/6, 2015 at 11:42 Comment(4)

Thanks. I ended up creating a new block that encapsulates a source ActionBlock and a target BufferBlock. The action block uses SendAsync as you suggest to populate the buffer. To the outside world, it behaves like a TransformManyBlock with the behavior I want. – Gladdie 23/6, 2015 at 21:0

@brianberns: Sorry if this is a stupid question, but what's the difference between "await block.SendAsync(item)" and "block.Post(item)" ? – Calesta 15/7, 2015 at 8:24

@Calesta It's not a stupid question at all: https://mcmap.net/q/295793/-tpl-dataflow-whats-the-functional-difference-between-post-and-sendasync – Teflon 15/7, 2015 at 8:32

@i3arnon: Thanks, I did not realize that Post() will return right away no matter what, I thought it would block until the message is consumed. Oops ! – Calesta 15/7, 2015 at 9:9

It seems that to create an output-bounded TransformManyBlock, three internal blocks are needed:

A TransformBlock that receives the input and produces IEnumerables, running potentially in parallel.
A non-parallel ActionBlock that enumerates the produced IEnumerables, and propagates the final results.
A BufferBlock where the final results are stored, respecting the desired BoundedCapacity.

The slightly tricky part is how to propagate the completion of the second block, because it is not directly linked to the third block. In the implementation below, the method PropagateCompletion is written according to the source code of the library.

public static IPropagatorBlock<TInput, TOutput>
    CreateOutputBoundedTransformManyBlock<TInput, TOutput>(
    Func<TInput, Task<IEnumerable<TOutput>>> transform,
    ExecutionDataflowBlockOptions dataflowBlockOptions)
{
    if (transform == null) throw new ArgumentNullException(nameof(transform));
    if (dataflowBlockOptions == null)
        throw new ArgumentNullException(nameof(dataflowBlockOptions));

    var input = new TransformBlock<TInput, IEnumerable<TOutput>>(transform,
        dataflowBlockOptions);
    var output = new BufferBlock<TOutput>(dataflowBlockOptions);
    var middle = new ActionBlock<IEnumerable<TOutput>>(async results =>
    {
        if (results == null) return;
        foreach (var result in results)
        {
            var accepted = await output.SendAsync(result).ConfigureAwait(false);
            if (!accepted) break; // If one is rejected, the rest will be rejected too
        }
    }, new ExecutionDataflowBlockOptions()
    {
        MaxDegreeOfParallelism = 1,
        BoundedCapacity = dataflowBlockOptions.MaxDegreeOfParallelism,
        CancellationToken = dataflowBlockOptions.CancellationToken,
        SingleProducerConstrained = true,
    });

    input.LinkTo(middle, new DataflowLinkOptions() { PropagateCompletion = true });
    PropagateCompletion(middle, output);

    return DataflowBlock.Encapsulate(input, output);

    async void PropagateCompletion(IDataflowBlock source, IDataflowBlock target)
    {
        try
        {
            await source.Completion.ConfigureAwait(false);
        }
        catch { }

        var exception = source.Completion.IsFaulted ? source.Completion.Exception : null;
        if (exception != null) target.Fault(exception); else target.Complete();
    }
}

// Overload with synchronous delegate
public static IPropagatorBlock<TInput, TOutput>
    CreateOutputBoundedTransformManyBlock<TInput, TOutput>(
    Func<TInput, IEnumerable<TOutput>> transform,
    ExecutionDataflowBlockOptions dataflowBlockOptions)
{
    return CreateOutputBoundedTransformManyBlock<TInput, TOutput>(
        item => Task.FromResult(transform(item)), dataflowBlockOptions);
}

Usage example:

var firstBlock = CreateOutputBoundedTransformManyBlock<char, string>(
    c => GetSequence(c), options);

Preachy answered 15/6, 2020 at 9:39 Comment(0)

If the output rate of the pipeline is lower than the post rate, messages will accumulate in the pipeline until memory runs out or some queue limit is reached. If the messages are large, the process will soon be starved of memory.

Setting BoundedCapacity to 1 means the block will not accept a new message while it already holds one: Post returns false, and the task returned by SendAsync stays incomplete until there is room. That is not the desired behavior in cases like batch processing, for example. Check this post for insights.

This working test illustrates my point:

//Change BoundedCapacity to +1 to see it fail
[TestMethod]
public void stackOverflow()
{      
    var total = 1000;
    var processed = 0;
    var block = new ActionBlock<int>(
       (messageUnit) =>
       {
           Thread.Sleep(10);
           Trace.WriteLine($"{messageUnit}");
           processed++;
       },
        new ExecutionDataflowBlockOptions() { BoundedCapacity = -1 } 
   );

    for (int i = 0; i < total; i++)
    {
        var result = block.SendAsync(i);
        Assert.IsTrue(result.IsCompleted, $"failed for {i}");
    }

    block.Complete();
    block.Completion.Wait();

    Assert.AreEqual(total, processed);
}

So my approach is to throttle the posting, so that the pipeline does not accumulate many messages in its queues.

Below is a simple way to do it. Dataflow keeps processing messages at full speed, but they do not accumulate, which avoids excessive memory consumption.

// Should be adjusted for specific use.
public async Task PostAsync(Message message)
{
    // Wait while the blocks hold more than 100 queued messages in total.
    // The "..." stands for the InputCount of every block in between.
    while (block1.InputCount + ... + blockn.InputCount > 100)
    {
        await Task.Delay(200);
        // Note: if each message allocates a large amount of memory, the garbage collector may not keep up with the pace.
        // This is a suitable place to force the garbage collector to release memory.
    }
    await block1.SendAsync(message);
}

Armistead answered 21/3, 2019 at 13:48 Comment(0)
