We have received a native (full) crash dump file from a customer. Opening it in the Visual Studio (2005) debugger shows that we had a crash caused by a realloc call that tried to allocate a ~10MB block. The dump file was unusually large (1,5 GB -- normally they are more like 500 MB).
We therefore conclude that we have a memory "leak" or runaway allocations that either fully exhausted the memory of the process or at least fragmented it significantly enough for the realloc to fail. (Note that this realloc was for an operation that allocated a logging buffer and we are not surprised it failed here, because 10MB in one go would be one of the larger allocations that we do apart from some very large pretty unchangeable buffers -- the problem itself likely has nothing to do with this specific allocation.)
Edit: After the comments exchange wit Lex Li below, I should add: This is not reproducible for us (at the moment). It's just one customer dump clearly showing runaway memory consumption.
Main Question:
Now we have a dump file, but how can we locate what caused the excessive memory usage?
What we've done so far:
We have used the DebugDiag tool to analyze the dump file (the so called Memory Pressure Analyzer), and here's what we got:
Report for DumpFM...dmp
Virtual Memory Summary
----------------------
Size of largest free VM block 62,23 MBytes
Free memory fragmentation 81,30%
Free Memory 332,87 MBytes (16,25% of Total Memory)
Reserved Memory 0 Bytes (0,00% of Total Memory)
Committed Memory 1,67 GBytes (83,75% of Total Memory)
Total Memory 2,00 GBytes
Largest free block at 0x00000000`04bc4000
Loaded Module Summary
---------------------
Number of Modules 114 Modules
Total reserved memory 0 Bytes
Total committed memory 3,33 MBytes
Thread Summary
--------------
Number of Threads 56 Thread(s)
Total reserved memory 0 Bytes
Total committed memory 652,00 KBytes
This was just to get a bit context. Whats more interesting I believe is:
Heap Summary
------------
Number of heaps 26 Heaps
Total reserved memory 1,64 GBytes
Total committed memory 1,61 GBytes
Top 10 heaps by reserved memory
-------------------------------
0x01040000 1,55 GBytes
0x00150000 64,06 MBytes
0x010d0000 15,31 MBytes
...
Top 10 heaps by committed memory
--------------------------------
0x01040000 1,54 GBytes
0x00150000 55,17 MBytes
0x010d0000 6,25 MBytes
...
Now, looking at heap 0x01040000
(1,5 GB) we see:
Heap 5 - 0x01040000
-------------------
Heap Name msvcr80!_crtheap
Heap Description This heap is used by msvcr80
Reserved memory 1,55 GBytes
Committed memory 1,54 GBytes (99,46% of reserved)
Uncommitted memory 8,61 MBytes (0,54% of reserved)
Number of heap segments 39 segments
Number of uncommitted ranges 41 range(s)
Size of largest uncommitted range 8,33 MBytes
Calculated heap fragmentation 3,27%
Segment Information
-------------------
Base Address | Reserved Size | Committed Size | Uncommitted Size | Number of uncommitted ranges | Largest uncommitted block | Calculated heap fragmentation
0x01040640 64,00 KBytes 64,00 KBytes 0 Bytes 0 0 Bytes 0,00%
0x01350000 1.024,00 KBytes 1.024,00 KBytes 0 Bytes 0 0 Bytes 0,00%
0x02850000 2,00 MBytes 2,00 MBytes 0 Bytes 0 0 Bytes 0,00%
...
What is this Segment Information anyway?
Looking at the allocations that are listed:
Top 5 allocations by size
-------------------------
Allocation Size - 336 1,18 GBytes
Allocation Size - 1120004 121,77 MBytes
...
Top 5 allocations by count
--------------------------
Allocation Size - 336 3760923 allocation(s)
Allocation Size - 32 1223794 allocation(s)
...
We can see that apparently the MSVCR80 heap holds 3.760.923 allocations at 336 bytes. This makes it pretty clear that we mopped up our memory with lots of small allocations, but how can we get some more info regarding where these allocation came from?
If we somehow could sample some of these allocation addresses and then check where in the process image these addresses are in use, then -- assuming that a large portion of these allocations are responsible for our "leak" -- we could maybe find out where these runaway allocations came from.
Unfortunately, I have really no idea how to get more info out of the dump at the moment.
How could I inspect this heap to see some of the "336" allocation addresses?
How can I search the dump for these addresses and how do I then find out which pointer variable (if any) in the dump hold on tho these addresses?
Any tips regarding usage of DebugDiag, WinDbg or any other tool could really help! Also, if you disagree with any of my analysis above, let us know! Thanks!