Understanding Memory Pools
To my understanding, a memory pool is a block, or multiple blocks, of memory allocated on the stack before runtime.
By contrast, to my understanding, dynamic memory is requested from the operating system and then allocated on the heap during runtime.

// EDIT //

  • Memory pools are evidently not necessarily allocated on the stack, i.e. a memory pool can be used with dynamic memory.
  • Non-dynamic memory is evidently also not necessarily allocated on the stack, as per the answer to this question.
  • The topics of 'dynamic vs. static memory' and 'memory pools' are thus not really related, although the answer is still relevant.

From what I can tell, the purpose of a memory pool is to provide manual management of RAM, where the memory must be tracked and reused by the programmer.

This is theoretically advantageous for performance for a number of reasons:

  1. Dynamic memory becomes fragmented over time
  2. The CPU can parse static blocks of memory faster than dynamic blocks
  3. When the programmer has control over memory, they can choose to free and rebuild data when it is best to do so, according to the specific program.

  4. When multithreading, separate pools allow separate threads to operate independently, without waiting on the shared heap (Davislor)

Is my understanding of memory pools correct? If so, why does it seem like memory pools are not used very often?

Bashibazouk answered 30/5, 2017 at 4:52 Comment(7)
One reason you might want separate memory pools is if you have several threads allocating and deallocating memory simultaneously. If they all use the same heap, they have to wait for the other threads to finish modifying the heap before they can proceed. If each thread has its own memory pool, it can use it without any kind of locking or waiting. - Waligore
"The CPU can parse static blocks of memory..." - this is entirely not how CPUs work. In fact, a typical CPU has no idea what types of memory exist in a program - they don't even know the language a program was written in! - Afrikah
@davislor That's a good point also; I'll edit it into the question later. - Bashibazouk
@msalters Perhaps I should change the word 'parse' to 'access'. What I was told is that the CPU can actually loop through static memory faster. Perhaps because it is packed tighter/more likely aligned and thus has fewer cache misses? - Bashibazouk
@davislor Question edited, added the bit about threads. - Bashibazouk
@bigcodeszzer: That's more a locality-of-reference thing, not directly related to the type of memory. - Afrikah
Another way in which systems often use separate memory pools is allocating small blocks of memory from a classical heap, but handling large memory allocations by mapping hardware pages into the address space. This can perform better. - Waligore

It seems this question is fraught with the XY problem and premature optimisation.

You should focus on writing legible code, then using a profiler to perform optimisations if necessary.

Is my understanding of memory pools correct?

Not quite.

... on the stack ...

... on the heap ...

Storage duration is orthogonal to the concept of pools; pools can be allocated to have any of the four storage durations (they are: static, thread, automatic and dynamic storage duration).

The C++ standard doesn't require that any of these go into a stack or a heap; it might be useful to think of all of them as though they go into the same place... after all, they all (commonly) go onto silicon chips!

... allocate ... before runtime ...

What matters is that the allocation of multiple objects occurs before (or at least less often than) those objects are first used; this saves having to allocate each object separately. I assume this is what you meant by "before runtime". When choosing the size of the allocation, the closer you get to the total number of objects required at any given time, the less waste there is from excessive allocation and from excessive resizing.

If your OS isn't prehistoric, however, the advantages of pools will quickly diminish. You'd probably see this if you used a profiler before and after conducting your optimisation!

  1. Dynamic memory becomes fragmented over time

This may be true for a naive operating system such as Windows 1.0. In this day and age, however, objects with dynamic storage duration are commonly stored in virtual memory, which is periodically written to and read back from disk (this is called paging). As a consequence, fragmented memory can be defragmented, and objects, functions and methods that are more commonly used might even end up united onto common pages.

That is, paging forms an implicit pool (and cache prediction) for you!

The CPU can parse static blocks of memory faster than dynamic blocks

While objects with static storage duration are commonly placed in a fixed data segment of the program image, that's not mandated by the C++ standard. It's entirely possible that a C++ implementation may exist whereby static blocks of memory are allocated on the heap, instead.

A cache hit on a dynamic object will be just as fast as a cache hit on a static object. It just so happens that the stack is commonly kept in cache; you should try programming without the stack some time, and you might find that the cache has more room for the heap!

BEFORE you optimise you should ALWAYS use a profiler to measure the most significant bottleneck! Then you should perform the optimisation, and then run the profiler again to make sure the optimisation was a success!

This is not a machine-independent process! You need to optimise per-implementation! An optimisation for one implementation is likely a pessimisation for another.

If so, why does it seem like memory pools are not used very often?

The virtual-memory abstraction described above, combined with eliminating guesswork by using a cache profiler, virtually eliminates the usefulness of pools in all but the most specialised scenarios - scenarios you should identify with a profiler, not assumptions.

Royston answered 30/5, 2017 at 6:20 Comment(1)
Made some very good points. Question edited, answer accepted. - Bashibazouk

A customized allocator can help performance, since the default allocator is optimized for a specific use case: infrequently allocating large chunks of memory.

In a simulator or game, for example, a lot may happen within one frame, with memory being allocated and freed very frequently. In this case the default allocator is not as good a fit.

A simple solution is to allocate one block of memory for all the throwaway data produced during a frame. That block can be overwritten over and over again, and its deletion can be deferred to a later time, e.g. the end of a game level.

Thiazine answered 30/5, 2017 at 5:26 Comment(0)

Memory pools are used to implement custom allocators.

A commonly used one is the linear allocator. It keeps only a pointer separating allocated from free memory. Allocating is just a matter of incrementing the pointer by the N bytes requested and returning its previous value; deallocation is done by resetting the pointer to the start of the pool.

Epidemic answered 30/5, 2017 at 6:51 Comment(0)
