HttpClient resulting in leaking Node<Object> in mscorlib
Consider the following program, in which HttpRequestMessage, HttpResponseMessage, and HttpClient are all disposed properly. It always ends up holding about 50 MB of memory at the end, after collection. Add a zero to the number of requests, and the unreclaimed memory doubles.

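    using System;
    using System.Collections.Generic;
    using System.Net.Http;
    using System.Threading.Tasks;

    class Program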
    {
        static void Main(string[] args)
        {
            var client = new HttpClient { 
                   BaseAddress = new Uri("http://localhost:5000/")};

            var t = Task.Run(async () =>
            {
                var resps = new List<Task<HttpResponseMessage>>();
                var postProcessing = new List<Task>();

                for (int i = 0; i < 10000; i++)
                {
                    Console.WriteLine("Firing..");
                    var req = new HttpRequestMessage(HttpMethod.Get,
                                                        "test/delay/5");
                    var tsk = client.SendAsync(req);
                    resps.Add(tsk);
                    postProcessing.Add(tsk.ContinueWith(async ts =>
                    {
                        req.Dispose();
                        var resp = ts.Result;
                        var content = await resp.Content.ReadAsStringAsync();
                        resp.Dispose();
                        Console.WriteLine(content);
                    }));
                }

                await Task.WhenAll(resps);
                resps.Clear();
                Console.WriteLine("All requests done.");
                await Task.WhenAll(postProcessing);
                postProcessing.Clear();
                Console.WriteLine("All postprocessing done.");
            });

            t.Wait();
            Console.Clear();

            var t2 = Task.Run(async () =>
            {
                var resps = new List<Task<HttpResponseMessage>>();
                var postProcessing = new List<Task>();

                for (int i = 0; i < 10000; i++)
                {
                    Console.WriteLine("Firing..");
                    var req = new HttpRequestMessage(HttpMethod.Get,
                                                        "test/delay/5");
                    var tsk = client.SendAsync(req);
                    resps.Add(tsk);
                    postProcessing.Add(tsk.ContinueWith(async ts =>
                    {
                        var resp = ts.Result;
                        var content = await resp.Content.ReadAsStringAsync();
                        Console.WriteLine(content);
                    }));
                }

                await Task.WhenAll(resps);
                resps.Clear();
                Console.WriteLine("All requests done.");
                await Task.WhenAll(postProcessing);
                postProcessing.Clear();
                Console.WriteLine("All postprocessing done.");
            });

            t2.Wait();
            Console.Clear();
            client.Dispose();

            GC.Collect();
            Console.WriteLine("Done");
            Console.ReadLine();
        }
    }

On a quick investigation with a memory profiler, it seems that the objects that take up the memory are all of the type Node<Object> inside mscorlib.

My initial thought was that it was some internal dictionary or stack, since those are the types that use Node as an internal structure, but I was unable to turn up any results for a generic Node<T> in the reference source, since this is actually a Node<object> type.

Is this a bug, or some kind of expected optimization (I wouldn't consider memory retention proportional to the request count to be an optimization in any way)? And, purely academically, what is the Node<Object>?

Any help in understanding this would be much appreciated. Thanks :)

Update: To extrapolate the results to a much larger test set, I optimized it slightly by throttling it.

Here's the changed program. Now it seems to stay consistent at 60-70 MB for a set of 1 million requests. I'm still baffled as to what those Node<object>s really are, and why the runtime is allowed to maintain such a high number of irreclaimable objects.

The differences between these two results lead me to guess that this may not really be an issue with HttpClient or WebRequest, but rather something rooted directly in async, since the real variable between these two tests is the number of incomplete async tasks that exist at a given point in time. This is merely speculation from a quick inspection.

static void Main(string[] args)
{

    Console.WriteLine("Ready to start.");
    Console.ReadLine();

    var client = new HttpClient { BaseAddress = 
                    new Uri("http://localhost:5000/") };

    var t = Task.Run(async () =>
    {
        var resps = new List<Task<HttpResponseMessage>>();
        var postProcessing = new List<Task>();

        for (int i = 0; i < 1000000; i++)
        {
            //Console.WriteLine("Firing..");
            var req = new HttpRequestMessage(HttpMethod.Get, "test/delay/5");
            var tsk = client.SendAsync(req);
            resps.Add(tsk);
            var n = i;
            postProcessing.Add(tsk.ContinueWith(async ts =>
            {
                var resp = ts.Result;
                var content = await resp.Content.ReadAsStringAsync();
                if (n%1000 == 0)
                {
                    Console.WriteLine("Requests processed: " + n);
                }

                //Console.WriteLine(content);
            }));

            if (n%20000 == 0)
            {
                await Task.WhenAll(resps);
                resps.Clear();
            }

        }

        await Task.WhenAll(resps);
        resps.Clear();
        Console.WriteLine("All requests done.");
        await Task.WhenAll(postProcessing);
        postProcessing.Clear();
        Console.WriteLine("All postprocessing done.");
    });

    t.Wait();
    Console.Clear();
    client.Dispose();

    GC.Collect();
    Console.WriteLine("Done");
    Console.ReadLine();
}
Trimetallic answered 19/12, 2014 at 6:58 Comment(14)
HttpClient, HttpRequestMessage and HttpResponseMessage are all disposable, but you only dispose HttpClient. Dispose everything that needs disposing (through using), then check again. (It may very well still have allocated Node<> objects, but at least the disposables won't be confusing the issue.) Bring
As I mentioned in the first line, I tried all combinations of dispose. The given example doesn't dispose everything, but every combination does indeed result in the same leak. Regardless, I get your point. Will modify it to make it clear. Trimetallic
Trying different combinations really doesn't make sense. Disposing less than you're supposed to certainly can't help in reducing memory use. Bring
Is the question purely academic or is it an actual problem? 50 MB is a drop in the ocean. The runtime maintains several threads for I/O purposes, for starters, so there's a baseline of memory a .NET application consumes that simply won't be released. Ditto for some static objects. To be a genuine "leak" worth bothering about, the memory has to consistently increase the more requests you do -- are you observing that? Bring
Only the part about the Node<object> is purely academic. The problem is real: the memory usage keeps climbing. I'm aware that a baseline of memory is never reclaimed by the framework, but that is true only as long as there is a "reasonable upper limit" to it. This has a proportional increase, hence the term "leak". On long-running tasks on low-memory systems, it becomes highly noticeable. Add one more '0' to the number of requests, and the unreclaimed memory doubles. Granted, the numbers aren't huge, but on low-memory systems the impact is huge. Trimetallic
Although it is not my actual application, here is a different practical real-life example for the purposes of understanding this: consider a router, a long-running network task on a low-memory system. This quickly sucks up all the memory. Trimetallic
How do you determine the memory used? What metric is used? Did you use a memory profiler? Semiquaver
Yup. Visual Studio Ultimate's memory profiler. Trimetallic
Sorry, I only saw that now :) Try JetBrains' one. It will allow you to drill down and inspect the objects. Semiquaver
Node<T> seems to come from concurrent collections. Semiquaver
Unfortunately, I don't have a license for JetBrains' profiler. I'll try to drill down with VS to find what creates those objects. Collections were my first guess too, but I was unable to find a generic one inside mscorlib in the reference source. It only seems to have a non-generic Node. Odd. Trimetallic
Not really familiar with Tasks, but they are disposable. Does t.Dispose() help? Semiquaver
That's an interesting suggestion. Let me try it out right away. Trimetallic
Okay, I added ts.Dispose, tsk.Dispose, and t.Dispose to make sure almost all the tasks that are created are disposed of. It yielded no change whatsoever. Trimetallic

Let’s investigate the problem with all the tools we have in hand.

First, let’s take a look at what those objects are. To do that, I put the given code in Visual Studio and created a simple console application. Alongside it, I ran a simple HTTP server on Node.js to serve the requests.

After running the client to the end, I attached WinDbg to it and inspected the managed heap, with these results:

0:037> !dumpheap
Address       MT     Size
02471000 00779700       10 Free
0247100c 72482744       84     
...
Statistics:
      MT    Count    TotalSize Class Name
...
72450e88      847        13552 System.Collections.Concurrent.ConcurrentStack`1+Node[[System.Object, mscorlib]]
...

The !dumpheap command dumps all objects on the managed heap. That can include objects that should be freed but have not been yet because the GC has not kicked in. In our case that should be rare, because we called GC.Collect() just before the printout and nothing else should run after it.

The specific line above is worth noticing. That should be the Node<Object> you are referring to in the question.

Next, let’s look at the individual objects of that type. We grab the MT value of that object and invoke !dumpheap again with it, which filters the output down to only the objects we are interested in.

0:037> !dumpheap -mt 72450e88   
 Address       MT     Size
025b9234 72450e88       16     
025b93dc 72450e88       16     
...

Now grab a random one from the list and ask the debugger why this object is still on the heap by invoking the !gcroot command as follows:

0:037> !gcroot 025bbc8c
Thread 6f24:
    0650f13c 79752354 System.Net.TimerThread.ThreadProc()
        edi:  (interior)
            ->  034734c8 System.Object[]
            ->  024915ec System.PinnableBufferCache
            ->  02491750 System.Collections.Concurrent.ConcurrentStack`1[[System.Object, mscorlib]]
            ->  09c2145c System.Collections.Concurrent.ConcurrentStack`1+Node[[System.Object, mscorlib]]
            ->  09c2144c System.Collections.Concurrent.ConcurrentStack`1+Node[[System.Object, mscorlib]]
            ->  025bbc8c System.Collections.Concurrent.ConcurrentStack`1+Node[[System.Object, mscorlib]]

Found 1 unique roots (run '!GCRoot -all' to see all roots).

Now it is quite obvious that we have a cache, that the cache maintains a stack, and that the stack is implemented as a linked list. The reference source shows how that list is used. Before turning to it, let’s inspect the cache object itself, using !DumpObj:

0:037> !DumpObj 024915ec 
Name:        System.PinnableBufferCache
MethodTable: 797c2b44
EEClass:     795e5bc4
Size:        52(0x34) bytes
File:        C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\System\v4.0_4.0.0.0__b77a5c561934e089\System.dll
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
724825fc  40004f6        4        System.String  0 instance 024914a0 m_CacheName
7248c170  40004f7        8 ...bject, mscorlib]]  0 instance 0249162c m_factory
71fe994c  40004f8        c ...bject, mscorlib]]  0 instance 02491750 m_FreeList
71fed558  40004f9       10 ...bject, mscorlib]]  0 instance 025b93b8 m_NotGen2
72484544  40004fa       14         System.Int32  1 instance        0 m_gen1CountAtLastRestock
72484544  40004fb       18         System.Int32  1 instance 605289781 m_msecNoUseBeyondFreeListSinceThisTime
7248fc58  40004fc       2c       System.Boolean  1 instance        0 m_moreThanFreeListNeeded
72484544  40004fd       1c         System.Int32  1 instance      244 m_buffersUnderManagement
72484544  40004fe       20         System.Int32  1 instance      128 m_restockSize
7248fc58  40004ff       2d       System.Boolean  1 instance        1 m_trimmingExperimentInProgress
72484544  4000500       24         System.Int32  1 instance        0 m_minBufferCount
72484544  4000501       28         System.Int32  1 instance        0 m_numAllocCalls

Now we see something interesting: the stack is actually used as a free list for the cache. The source code tells us how the free list is used, in particular in the Free() method shown below:

http://referencesource.microsoft.com/#mscorlib/parent/parent/parent/parent/InternalApis/NDP_Common/inc/PinnableBufferCache.cs

/// <summary>
/// Return a buffer back to the buffer manager.
/// </summary>
[System.Security.SecuritySafeCritical]
internal void Free(object buffer)
{
  ...
  m_FreeList.Push(buffer);
}

So that is it: when the caller is done with the buffer, it returns the buffer to the cache, and the cache puts it on the free list. The free list is then used to serve later allocations:

[System.Security.SecuritySafeCritical]
internal object Allocate()
{
  // Fast path, get it from our Gen2 aged m_FreeList.  
  object returnBuffer;
  if (!m_FreeList.TryPop(out returnBuffer))
    Restock(out returnBuffer);
  ...
}
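The Free()/Allocate() pairing above can be illustrated with a minimal sketch. This is not the framework's code: SimpleBufferCache, FreeListDemo, and the 512-byte buffer size are hypothetical names and values chosen for illustration, and the real PinnableBufferCache does considerably more (restocking, pinning, trimming). The sketch only shows why the number of retained stack nodes tracks the peak number of buffers that were in use at the same time.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for PinnableBufferCache; not the framework's implementation.
public sealed class SimpleBufferCache
{
    private readonly ConcurrentStack<object> _freeList = new ConcurrentStack<object>();
    private readonly Func<object> _factory;

    public SimpleBufferCache(Func<object> factory)
    {
        _factory = factory;
    }

    // Number of idle buffers currently parked on the free list.
    public int FreeCount
    {
        get { return _freeList.Count; }
    }

    // Reuse an idle buffer if one exists, otherwise create a new one.
    public object Allocate()
    {
        object buffer;
        return _freeList.TryPop(out buffer) ? buffer : _factory();
    }

    // Park the buffer for reuse; every Push allocates one stack node.
    public void Free(object buffer)
    {
        _freeList.Push(buffer);
    }
}

public static class FreeListDemo
{
    // Simulates 'concurrentRequests' requests that all hold a buffer at once
    // and then all return it. Returns how many buffers stay on the free list.
    public static int Run(int concurrentRequests)
    {
        // 512 bytes is an arbitrary size picked for the sketch.
        var cache = new SimpleBufferCache(() => new byte[512]);
        var inFlight = new object[concurrentRequests];

        for (int i = 0; i < inFlight.Length; i++)
        {
            inFlight[i] = cache.Allocate();
        }
        foreach (var buffer in inFlight)
        {
            cache.Free(buffer);
        }
        return cache.FreeCount;
    }
}
```

Calling FreeListDemo.Run with a larger argument leaves proportionally more nodes on the free list, which matches the observation in the question that more simultaneous in-flight requests leave more Node<Object> instances behind.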

Last but not least, why is the cache itself not freed when we are done with all those HTTP requests? By adding a breakpoint on mscorlib.dll!System.Collections.Concurrent.ConcurrentStack.Push(), we see the following call stack (this could be just one of the cache's use cases, but it is representative):

mscorlib.dll!System.Collections.Concurrent.ConcurrentStack<object>.Push(object item)
System.dll!System.PinnableBufferCache.Free(object buffer)
System.dll!System.Net.HttpWebRequest.FreeWriteBuffer()
System.dll!System.Net.ConnectStream.WriteHeadersCallback(System.IAsyncResult ar)
System.dll!System.Net.LazyAsyncResult.Complete(System.IntPtr userToken)
System.dll!System.Net.ContextAwareResult.Complete(System.IntPtr userToken)
System.dll!System.Net.LazyAsyncResult.ProtectedInvokeCallback(object result, System.IntPtr userToken)
System.dll!System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* nativeOverlapped)
mscorlib.dll!System.Threading._IOCompletionCallback.PerformIOCompletionCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* pOVERLAP)

At WriteHeadersCallback, we are done writing the headers, so we return the buffer to the cache. At this point the buffer is pushed back onto the free list, and therefore a new stack node is allocated. The key thing to notice is that the cache object is a static member of HttpWebRequest.

http://referencesource.microsoft.com/#System/net/System/Net/HttpWebRequest.cs

...
private static PinnableBufferCache _WriteBufferCache = new PinnableBufferCache("System.Net.HttpWebRequest", CachedWriteBufferSize);
...
// Return the buffer to the pinnable cache if it came from there.   
internal void FreeWriteBuffer()
{
  if (_WriteBufferFromPinnableCache)
  {
    _WriteBufferCache.FreeBuffer(_WriteBuffer);
    _WriteBufferFromPinnableCache = false;
  }
  _WriteBufferLength = 0;
  _WriteBuffer = null;
}
...

So there we go: the cache is shared across all requests and is not released when all requests are done.

Walloper answered 30/12, 2015 at 23:55 Comment(1)
So is it a bug? And how to prevent memory overuse?Anvers

We had the same problem when using System.Net.WebRequest for HTTP requests. The size of the w3wp process ranged from 4 to 8 GB because our load is not constant: sometimes we have 10 requests per second and at other times 1000. Of course, the buffers are not reused in such a scenario.

We replaced System.Net.WebRequest with System.Net.Http.HttpClient everywhere it was used, because HttpClient doesn't have any buffer pools.

If you send many requests through your HttpClient, make it a static variable to avoid socket leaks.
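A minimal sketch of the static-instance advice, assuming the test server address and route from the question (http://localhost:5000/ and test/delay/5). SharedHttp and GetDelayAsync are hypothetical names introduced only for this example.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class SharedHttp
{
    // One instance for the whole process. HttpClient is designed to be
    // reused and its request methods can be called concurrently.
    public static readonly HttpClient Client = new HttpClient
    {
        // Assumed address, taken from the question's local test server.
        BaseAddress = new Uri("http://localhost:5000/")
    };

    // Issues one GET against the assumed route and returns the body.
    public static async Task<string> GetDelayAsync()
    {
        // The response is still disposed per request; only the client is shared.
        using (var response = await Client.GetAsync("test/delay/5"))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

Every caller goes through SharedHttp.Client instead of creating and disposing its own HttpClient, so connections are pooled and reused rather than left behind by short-lived instances.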


I think a simpler way to analyze this problem is to use PerfView. This application can show the reference tree, so you can see the root cause of your problem.


Vanden answered 26/10, 2016 at 11:54 Comment(0)

We encountered a similar issue with the PinnableBufferCache becoming too large and leading to OutOfMemoryExceptions.


Andrew Au's analysis stopped at the point that the cache is static "and is not released when all requests are done". But the more interesting question, "Under what conditions is it released?", was still open.

According to the sources, it is trimmed on a Gen2 GC event, subject to some other conditions that are pretty tricky (e.g. no more often than every 10 msec, etc.): https://referencesource.microsoft.com/#System/parent/parent/parent/InternalApis/NDP_Common/inc/PinnableBufferCache.cs,203
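To make the idea of trimming tied to Gen2 collections concrete, here is a rough sketch. It is explicitly not the framework's logic: it polls GC.CollectionCount(2) instead of hooking a GC callback, it ignores the timing conditions mentioned above, and dropping half of the idle buffers is an arbitrary choice made for the example. TrimmingFreeList is a hypothetical name.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical free list that shrinks after a Gen2 collection has occurred.
public sealed class TrimmingFreeList
{
    private readonly ConcurrentStack<byte[]> _freeList = new ConcurrentStack<byte[]>();
    private int _lastGen2Count = GC.CollectionCount(2);

    // Number of idle buffers currently held.
    public int Count
    {
        get { return _freeList.Count; }
    }

    public void Free(byte[] buffer)
    {
        _freeList.Push(buffer);
    }

    // Drops half of the idle buffers if at least one Gen2 collection has
    // happened since the previous check. Returns true if a trim ran.
    public bool TrimIfGen2Happened()
    {
        int current = GC.CollectionCount(2);
        if (current == _lastGen2Count)
        {
            return false;
        }
        _lastGen2Count = current;

        int toDrop = _freeList.Count / 2;
        byte[] dropped;
        for (int i = 0; i < toDrop && _freeList.TryPop(out dropped); i++)
        {
            // The popped buffer is simply released to the GC.
        }
        return true;
    }
}
```

The sketch is meant to show the general shape only: the free list keeps its contents until a full collection gives the cache a reason to reconsider how many idle buffers it needs.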

My experiments have shown that if the process survives the memory usage spike and the load (i.e. the number of HTTP requests) decreases, then the cache size decreases over time as well.

In our case, we found that we can greatly optimize the amount of content loaded via HTTP.

I think alternative solutions might be making more free virtual memory available to the process, or throttling the load when memory usage is too high.
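One way to throttle the load is to cap the number of requests in flight with a SemaphoreSlim. The following is a sketch under assumptions, not a tested production pattern: Throttle and RunAsync are hypothetical names, the operation delegate stands in for the actual HTTP call, and the peak counter exists only to make the cap observable.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class Throttle
{
    // Runs 'count' operations with at most 'maxConcurrent' in flight at once.
    // Returns the highest number of operations observed running simultaneously.
    public static async Task<int> RunAsync(int count, int maxConcurrent, Func<int, Task> operation)
    {
        var tasks = new List<Task>(count);
        var sync = new object();
        int inFlight = 0;
        int peak = 0;

        using (var gate = new SemaphoreSlim(maxConcurrent))
        {
            for (int i = 0; i < count; i++)
            {
                // Wait for a free slot before starting the next operation.
                await gate.WaitAsync();
                int n = i;
                tasks.Add(Task.Run(async () =>
                {
                    lock (sync)
                    {
                        inFlight++;
                        if (inFlight > peak) peak = inFlight;
                    }
                    try
                    {
                        await operation(n);
                    }
                    finally
                    {
                        lock (sync) { inFlight--; }
                        gate.Release();
                    }
                }));
            }

            // All operations finish before the semaphore is disposed.
            await Task.WhenAll(tasks);
        }
        return peak;
    }
}
```

With a cap in place, the number of buffers held at the same time is bounded by maxConcurrent rather than by the total number of requests, which is consistent with the lower steady-state memory reported in the question's throttled variant.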

Dumb answered 23/3, 2018 at 10:52 Comment(0)
