Reusing Arrays in C#

I'm optimizing a C# program that uses byte arrays very frequently, so I wrote a recycling pool to reuse arrays that would otherwise have to be collected by the GC, like this:

using System;
using System.Collections.Concurrent;

public class ArrayPool<T>
{
    // Maps array length -> bag of pooled arrays of that exact length.
    private readonly ConcurrentDictionary<int, ConcurrentBag<T[]>> _pool;

    public ArrayPool()
    {
        _pool = new ConcurrentDictionary<int, ConcurrentBag<T[]>>();
    }

    public ArrayPool(int capacity)
    {
        _pool = new ConcurrentDictionary<int, ConcurrentBag<T[]>>(4, capacity);
        // Pre-create one bag per array length up to the given capacity.
        for (var i = 1; i <= capacity; i++)
        {
            _pool.TryAdd(i, new ConcurrentBag<T[]>());
        }
    }

    public T[] Alloc(int capacity)
    {
        if (capacity < 1)
        {
            return null;
        }
        // GetOrAdd avoids the check-then-act race of ContainsKey + indexer.
        var subpool = _pool.GetOrAdd(capacity, _ => new ConcurrentBag<T[]>());
        T[] result;
        return subpool.TryTake(out result) ? result : new T[capacity];
    }

    public void Free(T[] array)
    {
        if (array == null || array.Length < 1)
        {
            return;
        }
        Array.Clear(array, 0, array.Length);
        // GetOrAdd stores the bag in the dictionary, so freed arrays of
        // previously unseen lengths are not silently dropped.
        _pool.GetOrAdd(array.Length, _ => new ConcurrentBag<T[]>()).Add(array);
    }
}

and I also wrote some code to test its performance:

const int TestTimes = 100000;
const int PoolCapacity = 1000;
public static ArrayPool<byte> BytePool;
static void Main()
{
    BytePool = new ArrayPool<byte>(PoolCapacity);
    var watch = Stopwatch.StartNew();
    for (var i = 1; i <= TestTimes; i++)
    {
        var len = (i % PoolCapacity) + 1;
        var array = new byte[len];
    }
    watch.Stop();
    Console.WriteLine("Traditional Method: {0} ms.", watch.ElapsedMilliseconds);
    watch = Stopwatch.StartNew();
    for (var i = 1; i <= TestTimes; i++)
    {
        var len = (i % PoolCapacity) + 1;
        var array = BytePool.Alloc(len);
        BytePool.Free(array);
    }
    watch.Stop();
    Console.WriteLine("New Method: {0} ms.", watch.ElapsedMilliseconds);
    Console.ReadKey();
}

I thought it should be faster if the program could reuse memory instead of allocating it every time, but it turns out that my code is about 10 times slower than before:

Traditional Method: 31 ms.
New Method: 283 ms.

So is it true that reusing arrays can improve performance in C#? If so, why is my code so slow? Is there a better way to reuse arrays?

Any advice would be appreciated. Thank you.

Neptunium asked 22/7, 2014 at 9:58
You should read up on memory management in C# before starting off such experiments. malloc in C# has virtually zero cost, so it's not surprising that your complex, manual memory management method performs worse. – Thunder
Pooling is a valid optimization to reduce GC overhead. ConcurrentDictionary is just really slow compared to allocating (small) objects. – Vitelline
@KonradRudolph Zero cost? I must read up on that, thank you. So there is no need to reuse arrays manually, right? – Neptunium
In general, pooling is done to control access to scarce resources whose construction is expensive or limited (connections, files, etc.). Memory is the least expensive resource available to the .NET Framework. My general instinct is to let the GC do its job; it's had millions of hours of work and optimisation thrown at it. However, I do accept that memory fragmentation can be an issue sometimes, but it's quite rare that it will be your biggest performance problem. – Resilience
Allocating usually isn't expensive. What is expensive is GC, which you aren't triggering in your test. Besides, I'm not sure the traditional array creation in your test runs at all: since the array is never used, it's possible that it was never allocated (see the sketch after these comments). – Cristophercristy
@Vitelline Pooling small objects is a valid strategy in conservative systems such as C++, or systems with limited resource capacity. In the context of .NET's GC, which is specifically optimised for the allocation of many small objects, pooling has almost no benefit. – Thunder
@KonradRudolph You are assuming that the OP is in a scenario where pooling is not beneficial. The question has no information about the greater situation he is in. That assumption cannot be made. Simple counterexamples: big buffers (megabytes), or a huge GC heap with an allocation pattern of medium-lived objects. That causes huge amounts of full collections on big heaps. Been there. – Vitelline
@Vitelline Given the OP's test usage, I assumed small objects. Large objects change everything (although .NET 4.5 saw vast improvements in that area as well). – Thunder
There's some interesting information about performance optimisation in Roslyn, centred around pools of small arrays etc., in this blog post: mattwarren.org/2014/06/10/… – Gropius
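
On the dead-code concern above: a minimal sketch of a more robust first benchmark loop, which touches each array and keeps it reachable so the JIT cannot treat the allocation as dead code:

for (var i = 1; i <= TestTimes; i++)
{
    var len = (i % PoolCapacity) + 1;
    var array = new byte[len];
    array[0] = (byte)i;  // use the array so the allocation has a visible effect
    GC.KeepAlive(array); // keep the array reachable up to this point
}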
You should check out the new System.Buffers package.
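
For example, a minimal sketch using the shared pool (this is System.Buffers.ArrayPool<T>, not the hand-rolled class above; note that Rent may return an array longer than requested, so track the length you actually need):

using System.Buffers;

// Rent a buffer of at least 1024 bytes from the process-wide shared pool.
var buffer = ArrayPool<byte>.Shared.Rent(1024);
try
{
    // Use the buffer; only rely on the first 1024 bytes, since the
    // pooled array may be larger than requested.
    buffer[0] = 42;
}
finally
{
    // Return the buffer so it can be reused. clearArray: true wipes the
    // contents first, which matters for sensitive data.
    ArrayPool<byte>.Shared.Return(buffer, clearArray: true);
}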

Racketeer answered 29/12, 2015 at 21:53
The new ArrayPool class will handle this.
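
For example, besides the shared instance, ArrayPool<T>.Create builds a dedicated pool with its own limits; a minimal sketch (the limits shown are illustrative, not recommendations):

using System.Buffers;

// A dedicated pool: arrays up to 1 MB, at most 50 pooled arrays per
// size bucket. (Both limits here are illustrative.)
var pool = ArrayPool<byte>.Create(maxArrayLength: 1024 * 1024,
                                  maxArraysPerBucket: 50);

var buffer = pool.Rent(4096); // returns an array of at least 4096 bytes
try
{
    // ... use the buffer ...
}
finally
{
    pool.Return(buffer);
}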

Cipher answered 8/4, 2019 at 18:25
