Extensive use of the LOH causes significant performance issues

We have a web service using WebApi 2 and .NET 4.5 on Server 2012. We were seeing occasional latency increases of 10-30 ms for no apparent reason. We were able to track the problematic piece of code down to the LOH and the GC.

There is some text which we convert to its UTF-8 byte representation (actually, the serialization library we use does that). As long as the text is shorter than 85,000 bytes, latency is stable and short: ~0.2 ms both on average and at the 99th percentile. As soon as the 85,000-byte boundary is crossed, average latency increases to ~1 ms while the 99th percentile jumps to 16-20 ms. The profiler shows that most of the time is spent in GC. To confirm this, if I put a GC.Collect between iterations, the measured latency goes back to 0.2 ms.

I have two questions:

  1. Where does the latency come from? As far as I understand, the LOH isn't compacted. The SOH is compacted, but doesn't exhibit this latency.
  2. Is there a practical way to work around this? Note that I can’t control the size of the data and make it smaller.

--

public void PerfTestMeasureGetBytes()
{
    var text = File.ReadAllText(@"C:\Temp\ContactsModelsInferences.txt");
    var smallText = text.Substring(0, 85000 + 100);
    int count = 1000;
    List<double> latencies = new List<double>(count);
    for (int i = 0; i < count; i++)
    {
        Stopwatch sw = new Stopwatch();
        sw.Start();
        var bytes = Encoding.UTF8.GetBytes(smallText);
        sw.Stop();
        latencies.Add(sw.Elapsed.TotalMilliseconds);

        //GC.Collect(2, GCCollectionMode.Default, true);
    }

    latencies.Sort();
    Console.WriteLine("Average: {0}", latencies.Average());
    Console.WriteLine("99%: {0}", latencies[(int)(latencies.Count * 0.99)]);
}
Doering answered 9/12, 2014 at 14:55

Comments:

  • Does this help? #687450 (Frederigo)
  • Not really, that talks about fragmentation and my problem is with latency. (Doering)

The performance problems usually come from two areas: allocation and fragmentation.

Allocation

The runtime guarantees clean (zeroed) memory, so it spends cycles clearing it. When you allocate a large object, that is a lot of memory to clear, and it starts to add milliseconds to a single allocation (when, let's be honest, simple allocation in .NET is actually very fast, so we usually never care about this).

Fragmentation

Fragmentation occurs when LOH objects are allocated and then reclaimed. Until recently, the GC could not reorganise the memory to remove these old object "gaps", and thus could only fit the next object into a gap if it was the same size or smaller. Recently, the GC has been given the ability to compact the LOH, which removes this issue, but it costs time during compaction.
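
For reference, that opt-in compaction ability arrived in .NET 4.5.1. The following is only a hedged sketch of how you would request it (the original code does not do this), and the collection it triggers is itself a blocking, expensive operation:

using System;
using System.Runtime;

static class LohMaintenance
{
    // .NET 4.5.1+: ask the next blocking gen-2 collection to also compact the LOH.
    // The setting resets itself to Default once that collection has run.
    public static void CompactLargeObjectHeapOnce()
    {
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect(); // blocking and costly; do this during idle time, not per request
    }
}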

My guess in your case is you are suffering from both issues and triggering GC runs, but it depends on how often your code is attempting to allocate items in the LOH. If you are doing lots of allocations, try the object pooling route. If you cannot control a pool effectively (lumpy object lifetimes or disparate usage patterns), try chunking the data you are working against to avoid it completely.


Your Options

I've encountered two approaches to the LOH:

  • Avoid it.
  • Use it, but realise you are using it and manage it explicitly.

Avoid it

This involves chunking your large object (usually an array of some sort) into, well, chunks that each fall under the LOH barrier. We do this when serialising large object streams. It works well, but an implementation would be specific to your environment, so I'm hesitant to provide a coded example.
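
Still, to make the shape of the idea concrete, here is a minimal sketch under my own assumptions (not the answerer's production code) of chunked UTF-8 encoding: it uses an Encoder so state such as split surrogate pairs survives across chunks, and both scratch buffers stay far below the 85,000-byte threshold. The destination (a Stream here) is an assumption; substitute whatever sink your serializer writes to.

using System;
using System.IO;
using System.Text;

// Encode 'text' to UTF-8 and write it to 'output' in chunks, using small
// reusable buffers so no single allocation ever reaches the LOH.
public static void WriteUtf8Chunked(string text, Stream output)
{
    char[] charBuffer = new char[8 * 1024];       // ~16 KB object, stays in the SOH
    byte[] byteBuffer = new byte[32 * 1024];      // comfortably holds 8K chars at up to 4 bytes each
    Encoder encoder = Encoding.UTF8.GetEncoder(); // carries state across chunks (e.g. split surrogate pairs)

    int position = 0;
    while (position < text.Length)
    {
        int charCount = Math.Min(charBuffer.Length, text.Length - position);
        text.CopyTo(position, charBuffer, 0, charCount);
        bool flush = (position + charCount == text.Length);

        int charIndex = 0;
        while (charIndex < charCount)
        {
            int charsUsed, bytesUsed;
            bool completed;
            encoder.Convert(charBuffer, charIndex, charCount - charIndex,
                            byteBuffer, 0, byteBuffer.Length, flush,
                            out charsUsed, out bytesUsed, out completed);
            output.Write(byteBuffer, 0, bytesUsed);
            charIndex += charsUsed;
        }
        position += charCount;
    }
}

The only per-call allocations here are the two small, reusable arrays, which could themselves be hoisted to class scope and re-used.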

Use it

A simple way to tackle both allocation and fragmentation is long-lived objects. Explicitly make an empty array (or arrays) of a large size to accommodate your large object, and don't get rid of it (or them). Leave it around and re-use it like an object pool. You pay for this allocation once, and you can do it either on first use or during application idle time; after that you pay nothing for re-allocation (because you aren't re-allocating) and you lessen fragmentation issues because you aren't constantly asking to allocate things and you aren't reclaiming items (which is what causes the gaps in the first place).

That said, a halfway house may be in order. Reserve a section of memory up-front for an object pool. Done early, these allocations should be contiguous in memory, so you won't get any gaps, and you can leave the tail end of the available memory for uncontrolled items. Do beware, though, that this obviously has an impact on the working set of your application: an object pool takes space whether it is used or not.
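
As a rough illustration of that halfway house (again a sketch under my own assumptions, not code from the answer), a deliberately simple pool of pre-allocated, LOH-sized buffers might look like this:

using System.Collections.Concurrent;

// All buffers are allocated up front (ideally at start-up, so they sit next to each
// other on the LOH), handed out on demand and returned after use.
class LargeBufferPool
{
    private readonly ConcurrentBag<byte[]> _buffers = new ConcurrentBag<byte[]>();
    private readonly int _bufferSize;

    public LargeBufferPool(int bufferCount, int bufferSize)
    {
        _bufferSize = bufferSize;
        for (int i = 0; i < bufferCount; i++)
            _buffers.Add(new byte[bufferSize]); // lands on the LOH when >= 85,000 bytes
    }

    public byte[] Rent()
    {
        byte[] buffer;
        // Fall back to a fresh allocation if the pool is exhausted; size the pool to avoid this.
        return _buffers.TryTake(out buffer) ? buffer : new byte[_bufferSize];
    }

    public void Return(byte[] buffer)
    {
        _buffers.Add(buffer);
    }
}

The working-set trade-off mentioned above applies directly: bufferCount * bufferSize bytes are taken whether or not the buffers are ever rented.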


Resources

The LOH is covered a lot out on the web, but pay attention to the date of the resource. In the latest .NET versions the LOH has received some love and has improved. That said, if you are on an older version, I think the resources on the net are fairly accurate, as the LOH never really received any serious updates between its inception and .NET 4.5 (ish).

For example, there is this article from 2008 http://msdn.microsoft.com/en-us/magazine/cc534993.aspx

And a summary of improvements in .NET 4.5: http://blogs.msdn.com/b/dotnet/archive/2011/10/04/large-object-heap-improvements-in-net-4-5.aspx

Dijon answered 9/12, 2014 at 15:11

Comments:

  • @downvoter I'd like to know the reason for the downvote. My understanding of the LOH is not complete, so I'd like to fill any gaps. Please contribute and help me improve the answer. I've reworded my answer a lot since the downvote, so hopefully I've rectified the issue, but I wouldn't know. (Dijon)
  • Thanks Adam, the problem with both suggested solutions is that we are dealing with objects created by a 3rd party. One of the fields is a string that just happens to be above the threshold. As soon as that happens, the entire system slows down. The string goes from its serialized form to a string and is copied a couple of times by our own business logic. Since I don't control the object, I can't replace the string with something else that would be more LOH-friendly. (Doering)
  • I will give it some more time, and will accept your answer if I don't find/receive a better one. (Doering)
  • @alon There is very little you can do if you can't get any access to the large objects themselves. Try contacting the vendor. (Dijon)

In addition to the suggestions below, make sure that you're using the server garbage collector. That doesn't affect how the LOH is used, but in my experience it significantly reduces the amount of time spent in GC.
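
For reference, server GC is switched on with <gcServer enabled="true" /> under <runtime> in a self-hosted application's app.config (ASP.NET under IIS already defaults to the server GC). A quick, hedged way to verify which collector the process actually ended up with, for example at start-up, is:

using System;
using System.Runtime;

public static void LogGcMode()
{
    // Hosting environments differ (IIS, self-host, test runners), so log which
    // collector and latency mode the process actually got.
    Console.WriteLine("Server GC: {0}, latency mode: {1}",
                      GCSettings.IsServerGC, GCSettings.LatencyMode);
}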

The best workaround I've found for avoiding large object heap problems is to create a persistent buffer and re-use it. So rather than allocating a new byte array with every call to Encoding.GetBytes, pass the byte array to the method.

In this case, use the GetBytes overload that takes a byte array. Allocate an array that's large enough to hold the bytes for your longest expected string, and keep it around. For example:

// allocate buffer at class scope
private byte[] _theBuffer = new byte[1024*1024];

public void PerfTestMeasureGetBytes()
{
    // ...
    for (...)
    {
        var sw = Stopwatch.StartNew();
        var numberOfBytes = Encoding.UTF8.GetBytes(smallText, 0, smallText.Length, _theBuffer, 0);
        sw.Stop();
        // ...
    }
}

The only problem here is that you have to make sure your buffer is large enough to hold the largest string. What I've done in the past is to allocate the buffer to the largest size I expect, but then check to make sure it's large enough whenever I go to use it. If it's not large enough, then re-allocate it. How you do that depends on how rigorous you want to be. When working with primarily Western European text, I'd just double the string length. For example:

string textToConvert = ...
if (_theBuffer.Length < 2*textToConvert.Length)
{
    // reallocate the buffer
    _theBuffer = new byte[2*textToConvert.Length];
}

Another way to do it is to just try the GetBytes call and reallocate on failure, then retry. For example:

bool good = false;
int numberOfBytes = 0;
while (!good)
{
    try
    {
        numberOfBytes = Encoding.UTF8.GetBytes(textToConvert, 0, textToConvert.Length, _theBuffer, 0);
        good = true;
    }
    catch (ArgumentException)
    {
        // buffer isn't big enough. Find out how much I really need
        var bytesNeeded = Encoding.UTF8.GetByteCount(textToConvert);
        // and reallocate the buffer, then loop round and try again
        _theBuffer = new byte[bytesNeeded];
    }
}

If you make the buffer's initial size large enough to accommodate the largest string you expect, then you probably won't get that exception very often, which means the number of times you have to reallocate the buffer will be very small. You could, of course, add some padding to bytesNeeded so that you allocate a bit more, in case you hit some other outliers.

Leucomaine answered 9/12, 2014 at 16:0

Comments:

  • It's worth noting that when using the buffer you need to ignore the part of it that isn't needed for the object, so your code that iterates (or whatever) needs to pay attention to the size of the actual data inside and cannot rely on Array.Length. (Dijon)
  • @AdamHouldsworth: True. But that's what the return value from GetBytes is for... (Leucomaine)
  • Indeed, I'm not criticising the solution. I've done this in the past, but it is easy to forget that under this solution your little array could be sitting inside a much larger one. This technique can also be used to avoid allocation of many small arrays, either on or off the LOH. (Dijon)
  • Thanks for the answer. I realize I could re-use a buffer to avoid GC. The code sample was a demonstration of the problem; the actual situation is much more complicated. Those long strings can come in different forms and in unexpected places (dependent on 3rd-party code). (Doering)