Why is memory not reusable after allocating/deallocating a number of small objects?
While investigating a memory leak in one of our projects, I've run into a strange issue. Somehow, the memory allocated for objects (a vector of shared_ptr to objects, see below) is not fully reclaimed when the parent container goes out of scope, and afterwards it can be reused only for small objects.

The minimal example: when the program starts, I can allocate a single contiguous block of 1.5 GB without problems. After I use the memory somewhat (by creating and destroying a number of small objects), I can no longer do the big block allocation.

Test program:

#include <iostream>
#include <memory>
#include <vector>
using namespace std;

class BigClass
{
private:
    double a[10000]; // 80 KB per object
};

void TestMemory() {
    cout << "Performing TestMemory" << endl;
    vector<shared_ptr<BigClass>> list;
    for (int i = 0; i < 10000; i++) {
        shared_ptr<BigClass> p(new BigClass());
        list.push_back(p);
    }
} // all 10000 objects are released here, when 'list' goes out of scope

void TestBigBlock() {
    cout<< "Performing TestBigBlock"<<endl;
    char* bigBlock = new char[1024 * 1024 * 1536]; // 1.5 GiB in one contiguous block
    delete[] bigBlock;
}

int main() {
    TestBigBlock();
    TestMemory();
    TestBigBlock();
}

The problem also reproduces if I use plain pointers with new/delete, or malloc/free, in a loop instead of shared_ptr.
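For completeness, here is the malloc/free variant I mean - a sketch that mirrors TestMemory (the same 10000 objects all alive at once, then all freed):

#include <cstdlib>

void TestMemoryMalloc() {
    static void* blocks[10000];
    for (int i = 0; i < 10000; i++)
        blocks[i] = malloc(sizeof(BigClass)); // 10000 small allocations...
    for (int i = 0; i < 10000; i++)
        free(blocks[i]);                      // ...all freed again
}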

The culprit seems to be that after TestMemory(), the application's virtual memory usage stays at 827125760 bytes (roughly 790 MB), regardless of how many times I call it. As a consequence, there's no free VM region big enough to hold 1.5 GB. But I'm not sure why - since I'm definitely freeing the memory I used. Is it some "performance optimization" the CRT does to minimize OS calls?

Environment: Windows 7 x64 + VS2012 + 32-bit app without LAA (/LARGEADDRESSAWARE).

Rancorous asked 29/10, 2013 at 8:53 Comment(19)
There is no way of returning memory to the OS. Once your program has it, it will keep it forever. But if you free half of the objects and allocate them again, you should see no increase in usage.Responser
How do you measure "memory leakage"? You can't just call some random functions and decide that's a leak. You actually have to prove that a new is not matched by a delete etc.Seating
@Kerrek SB: I measure the "memory leak" by the fact that 800 MB are not freed (see VM usage) when they should be. Proof is that I can't allocate a 1.5 GB piece, while, again, I should be able to.Rancorous
@RedX: the delete and delete[] operators are the exact way to return memory to the OS. If you check the example with TestMemory2, you'll see it works. Problem is, smart pointers should do it for me.Rancorous
Then this is a question for your OS vendor, not about C++! There's no leak as far as C++ is concerned.Seating
Either as Kerrek SB says, or a case for your compiler or library implementation documentation or bug database.Carbo
Now that's more constructive. Yes, it's possible that it's OS/runtime bug - not releasing the memory. From my point of view, it's a memory leak, though, as I can't use that memory anymore.Rancorous
Much more plausible than a leak in vector is that the many smaller allocations in TestMemory are causing fragmentation, which TestMemory2 doesn't, since it allocates one big chunk. The remedy for memory fragmentation is to avoid it.Corticosteroid
@Kerrek SB: As I understand it, the memory is not available in main after TestMemory has returned. Whether or not it has been returned to the OS by that time, it certainly should be usable for the app. So I would say C++ is concerned here. The use of GetUsedMemory might be distracting, but I understand the real question is: why does the allocation in main not work after TestMemory()?Stuff
Well, the question is, what is causing the fragmentation? As you can see from the code, I'm releasing every single object I'm allocating.Rancorous
Just to make sure: BigClass does not by any chance hold a shared_ptr to itself, right? Also, you could maybe try boost::shared_ptr if all else fails.Stuff
BigClass is just a holder of an 80 KB double array (see code). And the issue actually repeats itself if I use plain pointers with new/delete in a loop.Rancorous
Are you on a 32-bit OS?Seating
I'm not too familiar with the auto_ptr implementation, but could it have to do with delayed deallocation (see the bottom recommendation, second to last)? To test this, add an n-second delay loop after TestMemory(). I might be reading this wrong, but there appears to be a hidden pointer tracking the control object managing auto_ptr references, which is garbage-collected (?) with a possible delay.Bobbobb
I'm using std::shared_ptr, not sure it has such features... But even if it did, the problem still reproduces when using malloc/free.Rancorous
The linked description is about shared_ptr, and it applies there; but if the same happens with malloc, then that is probably not the reason. Still, I'd consider adding a wait loop between your functions and seeing what happens.Bobbobb
C++/C++11 is NOT a GC language.Rancorous
With shared_ptr, it is.Bobbobb
You are grossly wrong.Rancorous

Sorry for posting yet another answer since I am unable to comment; I believe many of the others are quite close to the answer really :-)

Anyway, the culprit is most likely address space fragmentation. I gather you are using Visual C++ on Windows.

The C / C++ runtime memory allocator (invoked by malloc or new) uses the Windows heap to allocate memory. The Windows heap manager has an optimization in which it will hold on to blocks under a certain size limit, in order to be able to reuse them if the application requests a block of similar size later. For larger blocks (I can't remember the exact value, but I guess it's around a megabyte) it will use VirtualAlloc outright.
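For illustration, allocating the big block by hand through VirtualAlloc looks roughly like this (a sketch; the exact size threshold and flags the CRT and heap manager use internally are implementation details):

#include <windows.h>

void TestBigBlockVirtualAlloc() {
    // Reserve and commit 1.5 GiB directly from the OS, bypassing the CRT heap.
    char* bigBlock = static_cast<char*>(
        VirtualAlloc(NULL, 1024u * 1024u * 1536u,
                     MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE));
    if (bigBlock != NULL)
        VirtualFree(bigBlock, 0, MEM_RELEASE); // gives the address space back
}

Note that even this direct call needs a single contiguous 1.5 GB hole in the address space, so it fails under the same fragmentation.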

Other long-running 32-bit applications with a pattern of many small allocations have this problem too; the one that made me aware of the issue is MATLAB - I was using the 'cell array' feature to basically allocate millions of 300-400 byte blocks, causing exactly this issue of address space fragmentation even after freeing them.

A workaround is to use the Windows heap functions (HeapCreate() etc.) to create a private heap, allocate your memory through that (passing a custom C++ allocator to your container classes as needed), and then destroy that heap when you want the memory back. This also has the happy side-effect of being very fast compared to delete-ing a zillion blocks in a loop.
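A minimal sketch of that approach (assuming a C++11-conforming standard library; the allocator and names here are illustrative, and older toolsets such as VS2012 may need extra allocator boilerplate):

#include <windows.h>
#include <cstddef>
#include <new>
#include <vector>

// Allocator that routes all requests to a caller-supplied private heap.
template <typename T>
struct PrivateHeapAllocator {
    typedef T value_type;
    HANDLE heap; // not owned; the caller creates and destroys it

    explicit PrivateHeapAllocator(HANDLE h) : heap(h) {}
    template <typename U>
    PrivateHeapAllocator(const PrivateHeapAllocator<U>& other) : heap(other.heap) {}

    T* allocate(std::size_t n) {
        void* p = HeapAlloc(heap, 0, n * sizeof(T));
        if (!p) throw std::bad_alloc();
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) { HeapFree(heap, 0, p); }
};

template <typename T, typename U>
bool operator==(const PrivateHeapAllocator<T>& a, const PrivateHeapAllocator<U>& b) {
    return a.heap == b.heap;
}
template <typename T, typename U>
bool operator!=(const PrivateHeapAllocator<T>& a, const PrivateHeapAllocator<U>& b) {
    return a.heap != b.heap;
}

int main() {
    HANDLE heap = HeapCreate(0, 0, 0); // growable private heap
    {
        PrivateHeapAllocator<int> alloc(heap);
        std::vector<int, PrivateHeapAllocator<int> > v(alloc);
        for (int i = 0; i < 1000000; ++i)
            v.push_back(i);
    } // the vector must be destroyed before its heap is
    HeapDestroy(heap); // releases the whole heap's address space in one call
    return 0;
}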

Re. "what is remaining in memory" to cause the issue in the first place: Nothing is remaining 'in memory' per se, it's more a case of the freed blocks being marked as free but not coalesced. The heap manager has a table/map of the address space, and it won't allow you to allocate anything which would force it to consolidate the free space into one contiguous block (presumably a performance heuristic).

Lexical answered 10/4, 2014 at 13:37 Comment(4)
Yes, I neglected to mention the point about "blocks being marked as free but not coalesced".Phototype
Thanks, the most sensible explanation so far - well explains the pattern I observe. Would you have any links, by chance? This looks like something I'd have to read about in depth.Rancorous
This usenet thread was helpful, along with some close reading of MSDN as I recall. There is also some round-about discussion of the issue in a MATLAB whitepaper on memory allocation, but I can't seem to find that right now: tech-archive.net/Archive/Development/…Lexical
Thanks, that's precisely what I was hunting for! Bounty is yours.Rancorous

There is absolutely no memory leak in your C++ program. The real culprit is memory fragmentation.

Just to be sure (regarding the memory-leak point), I ran this program under Valgrind, and it did not report any memory leaks.

//Valgrind Report
mantosh@mantosh4u:~/practice$ valgrind ./basic
==3227== HEAP SUMMARY:
==3227==     in use at exit: 0 bytes in 0 blocks
==3227==   total heap usage: 20,017 allocs, 20,017 frees, 4,021,989,744 bytes allocated
==3227== 
==3227== All heap blocks were freed -- no leaks are possible

Please find below my responses to the doubts raised in the original question.

The culprit seems to be that after TestMemory(), the application's virtual memory stays at 827125760 (regardless of number of times I call it).

Yes, the real culprit is the hidden fragmentation introduced during the TestMemory() function. To explain fragmentation, I have taken this snippet from Wikipedia:

" when free memory is separated into small blocks and is interspersed by allocated memory. It is a weakness of certain storage allocation algorithms, when they fail to order memory used by programs efficiently. The result is that, although free storage is available, it is effectively unusable because it is divided into pieces that are too small individually to satisfy the demands of the application. For example, consider a situation wherein a program allocates 3 continuous blocks of memory and then frees the middle block. The memory allocator can use this free block of memory for future allocations. However, it cannot use this block if the memory to be allocated is larger in size than this free block."

The above paragraph explains memory fragmentation very nicely. Some allocation patterns (such as frequent allocation and deallocation) lead to memory fragmentation, but the end impact (i.e., the 1.5 GB allocation failing) varies greatly between systems, as different OSes/heap managers have different strategies and implementations. For example, your program ran perfectly fine on my machine (Linux), whereas you encountered the memory allocation failure.
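A toy version of the quoted scenario in code (purely illustrative; a real allocator is not guaranteed to place these three blocks adjacently):

void FragmentationToy() {
    char* a = new char[1000];
    char* b = new char[1000];
    char* c = new char[1000];
    delete[] b;               // leaves a 1000-byte hole between a and c
    // The hole can serve requests of up to 1000 bytes, but a 2000-byte
    // request cannot use it and must be satisfied somewhere else.
    char* d = new char[2000];
    delete[] d;
    delete[] c;
    delete[] a;
}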

Regarding your observation that the VM size remains constant: the VM size shown in Task Manager is not directly proportional to our memory allocation calls. It mainly depends on how many bytes are in the committed state. When you allocate some dynamic memory (using new/malloc) and do not write to or initialize those memory regions, they do not go into the committed state, and hence the VM size is not impacted. The VM size depends on many other factors and is a bit complicated, so we should not rely completely on it when reasoning about the dynamic memory allocation of our program.
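To see the reserve/commit distinction at the OS level (a sketch using the raw Win32 API; the CRT's own commit behaviour is an internal detail and may differ):

#include <windows.h>

void ReserveVsCommit() {
    // Reserve 100 MB of address space; no pages are committed yet.
    void* reserved = VirtualAlloc(NULL, 100 * 1024 * 1024,
                                  MEM_RESERVE, PAGE_NOACCESS);
    // Commit (and thereby charge) only a single page, on demand.
    VirtualAlloc(reserved, 4096, MEM_COMMIT, PAGE_READWRITE);
    VirtualFree(reserved, 0, MEM_RELEASE); // releases the whole range
}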

As a consequence, there's no free VM region big enough to hold 1.5 GB.

Yes, due to fragmentation, there is no contiguous 1.5 GB region of memory. Note that the total remaining (free) memory may well be more than 1.5 GB, but it is in a fragmented state, so there is no single big contiguous block.

But I'm not sure why - since I'm definitely freeing the memory I used. Is it some "performance optimization" CRT does to minimize OS calls?

I have explained above why this may happen even though you have freed all your memory. Now, in order to fulfil the program's request, the OS calls into its virtual memory manager and tries to allocate the memory to be used by the heap memory manager. Whether the additional memory can be grabbed depends on many other complex factors that are not easy to pin down.

Possible Resolution of Memory Fragmentation

We should try to reuse allocated memory rather than allocating and freeing frequently. Certain patterns (such as allocations of particular request sizes in a particular order) can leave the overall memory in a fragmented state, and fixing this may require a substantial design change in your program. This is a complex topic, and understanding the complete root cause requires internal knowledge of the memory manager.
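One common reuse pattern, sketched below (the pool class is illustrative, reusing the BigClass from the question; it recycles objects instead of returning them to the heap after each use):

#include <memory>
#include <utility>
#include <vector>

class BigClassPool {
    std::vector<std::unique_ptr<BigClass> > free_; // recycled objects
public:
    std::unique_ptr<BigClass> acquire() {
        if (free_.empty())
            return std::unique_ptr<BigClass>(new BigClass()); // pool empty: allocate
        std::unique_ptr<BigClass> p = std::move(free_.back());
        free_.pop_back();
        return p; // reuse an existing block: no new heap traffic
    }
    void release(std::unique_ptr<BigClass> p) {
        free_.push_back(std::move(p)); // keep the block for the next acquire()
    }
};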

There are also tools for this on Windows-based systems, although I am not very familiar with them. I did find one excellent SO post about which tools (on Windows) can be used to inspect the fragmentation state of your program yourself:

https://mcmap.net/q/1265655/-heap-fragmentation-and-windows-memory-manager

Phototype answered 9/4, 2014 at 18:55 Comment(4)
It's not memory fragmentation, because (by C++ standards) I return the heap to the precise state it was in previously - exactly 0 bytes on the heap. There's nothing left in the heap which would "fragment" it. See the definition of the types of fragmentation.Rancorous
And no, the Virtual Memory size (as shown by Process Explorer - the Windows Task Manager actually shows Private Bytes in the VM column, incorrectly) shows exactly the memory that is actually allocated in the address space. If you call new char[1024*1024*1024], the VM size grows by exactly 1 GiB. Good attempt, but unfortunately it doesn't give an actual answer to my question.Rancorous
@DarkWanderer: Well, to my understanding it's definitely a memory fragmentation issue, and I do not have anything to add to my post. However, it would really help everyone's understanding if you posted the exact log/message (e.g. the NULL return or bad-allocation message from the heap manager) from the second TestBigBlock() call in your minimal program where you reproduce the problem.Phototype
Try answering this question: what is remaining in memory so it is fragmented?Rancorous

This is not a memory leak. The memory you used was allocated by the C/C++ runtime. The runtime requests a bulk block of memory from the OS once, and each new you call is then allocated from that bulk memory. When you delete an object, the runtime does not return the memory to the OS immediately; it may hold on to that memory for performance.

Mcconaghy answered 29/10, 2013 at 9:13 Comment(4)
In my application, I have to manipulate large containers of shared_ptr. This takes up a large part of the available memory, making it unavailable to me later - hence, from my point of view, it is a memory leak. Is there any way to prevent that behavior? I'm using VS2012 with the vc11 toolset.Rancorous
If that were true, he should have been able to use that memory in main after TestMemory had returned. If the memory is not returned to the OS for performance, as you say, then it certainly should be available to the very same app that allocated and released it.Stuff
@DeVadder: You could easily imagine a garbage collector operating with a delay prompted by when it will be accorded a time slice. This is my hunch too (see my comments above).Bobbobb
@Bobbobb Garbage collection as a term has a pretty clear meaning for most programmers and refers to a feature of a programming language. As such it does not apply here, since C++ has no garbage collection. However, by now I do like the edited answer of edA-qa mort-ora-y, as it explains how the memory strategies of the underlying OS could cause behavior like this. That said, if that were the reason, I would probably call it a bug in that strategy.Stuff

There is nothing here which indicates a genuine "leak". The memory pattern you describe is not unexpected. Here are a few points which might help you understand it. What happens is highly OS-dependent.

  • A program often has a single heap which can be extended or shrunk in length. It is, however, one contiguous memory area, so changing the size just means changing where the end of the heap is. This makes it very difficult to ever "return" memory to the OS, since even one tiny object in that space will prevent it from shrinking. On Linux you can look up the function 'brk' (I know you're on Windows, but I presume it does something similar).

  • Large allocations are often done with a different strategy. Rather than putting them in the general-purpose heap, an extra block of memory is created. When it is deleted, this memory can actually be "returned" to the OS, since it's guaranteed nothing is using it.

  • Large blocks of unused memory don't tend to consume a lot of resources. If you generally aren't using the memory any more they might just get paged to disk. Don't presume that because some API function says you're using memory that you are actually consuming significant resources.

  • APIs don't always report what you think. Due to a variety of optimizations and strategies it may not actually be possible to determine how much memory is in use and/or available on a system at a particular moment. Unless you have intimate details of the OS you won't know for sure what those values mean.

The first two points can explain why a bunch of small blocks and one large block result in different memory patterns. The latter points indicate why this approach to detecting leaks is not useful. To detect genuine object-based "leaks" you generally need a dedicated profiling tool which tracks allocations.


For example, in the code provided:

  1. TestBigBlock allocates and deletes the array; assuming this uses a special memory block, the memory is returned to the OS.
  2. TestMemory extends the heap for all the small objects, and never returns any heap to the OS. Here the heap is entirely available from the application's point of view, but from the OS's point of view it is all assigned to the application.
  3. TestBigBlock now fails: although it would use a special memory block, it shares the overall address space with the heap, and there just isn't enough left after step 2 completes.
Animalist answered 29/10, 2013 at 10:28 Comment(7)
Doesn't explain why the big block allocation fails after the test, but not before.Rancorous
I agree it's not a "leak", though. Will amend the question.Rancorous
@DarkWanderer, 32-bit OS or process? If so I think virtual memory fragmentation would be the cause.Ruelle
32-bit process, 64-bit OS. No, it can't be fragmentation - I don't leave any objects allocated before trying the large block.Rancorous
None of that explains why the allocation in main would fail after TestMemory() and succeed before. You really should remove the calls to GetUsedVirtualMemory() and put more emphasis on the fact that the memory is not available to your program, instead of it just not showing up for the OS.Stuff
@DeVadder: thanks for the advice, everyone seems to jump onto the VM part instead of actual problem. Question edited.Rancorous
I never saw that last edit. I believe that could actually be the reason. But if so, I would argue that the behavior is less than desirable. I would have expected that, in the case of a lack of overall memory, the heap would be shrunk even if that was "difficult" at the time. But all in all, I would not be surprised if this actually was the reason.Stuff
