What are the differences between Block, Stack and Scratch Allocators?

In his talk "Solving the Right Problems for Engine Developers", Mike Acton says that

the vast majority of the time, all you're going to need are these three types of allocator: there's the block allocator, the stack allocator and the scratch allocator

However, he doesn't go into detail about what the differences between these types of allocator are.

I would presume a 'stack allocator' is just a stack-based allocator, but all the other types I've heard of (including 'arena') just sound like fancy ways of doing the same thing, that is 'allocate a big block and chunk it up in a nice efficient way, then free it when you're done'

So, what are the differences between these allocators, what are the advantages of each, why do I only need these three 'the vast majority of the time'?

As was pointed out in the comments, the terminology used in the talk is not well established around the industry, so there is some doubt left as to what exact allocation strategies are being referred to here. Taking into account what is commonly mentioned in game programming literature, here is my educated guess what is behind the three mentioned allocators:

Block Allocator

Also known as a pool allocator. This is an allocator that only hands out fized-sized blocks of memory, regardless of how much memory the user actually requested.

Let's say you have a block allocator with a block size of 100 bytes. You want to allocate memory for a single 64 bit integer? It gives you a block of 100 bytes. You want to allocate memory for an array of 20 single precision floats? It gives you a block of 100 bytes. You want to allocate memory for an ASCII string with 101 characters? It gives you an error, as it can't fit your string into 100 bytes.

Block allocators have several advantages. They are relatively easy to implement and they don't suffer from external memory fragmentation. They also usually exhibit a very predictable runtime behavior, which is often essential for video games. They are well suited for problems where most allocations are of roughly the same size and obviously less well suited for when that is not the case.

Apart from the simplest version described here, where each allocator supports only a single block size, extensions exist that are more flexible, supporting multiple block sizes, without compromising too heavily on the aforementioned advantages.

Stack Allocator

A stack allocator works like a stack: You can only deallocate in the inverse order of allocation. If you subsequently allocate objects A and then B, you cannot reclaim the memory for A without also giving up B.

Stack allocators are very easy to implement, as you only need to keep track of a single pointer that marks the separation between the used and unused regions of memory. Allocation moves that pointer into one direction and deallocation moves it the opposite way.

Stack allocators make optimally efficient use of memory and have fully predictable runtime behavior. They obviously work well only for problems where the required order of deallocations is easy to achieve. It is usually not trivial to enforce the correct deallocation order statically, so debugging them can be a pain if they are being used carelessly.

Scratch Allocator

Also known as a monotonic allocator. A scratch allocator works similar to a stack allocator. Allocation works exactly the same. Deallocation is a no-op. That is, once memory has been allocated it cannot be reclaimed.

If you want to get the memory back, you have to destroy the entire scratch allocator, thereby releasing all of its memory at once.

The advantages of the scratch allocator are the same as with the stack allocator. They are well suited for problems where you can naturally identify points at which all allocated objects are no longer needed. Similar to the stack allocator, when used carelessly, they can lead to nasty runtime errors if an allocator is destroyed while there are still active objects alive.

Why do I only need those three?

Experience shows that in a lot of domains, fully dynamic memory management is not required. Allocation lifetimes can either be grouped by common size (block allocator) or by common lifetimes (scratch and stack allocator). If an engineer working in such a domain is willing to go through the troubles of classifying each allocation accordingly, they can probably make due with just these three allocation strategies for the majority of their dynamic memory needs, without introducing unreasonable additional development efforts. As a reward for their efforts, they will benefit from the nice runtime properties of these algorithms, in particular very fast and predictable execution times, and predictable memory consumption.

If you are in a domain where it is harder to classify allocations along those terms; or if you can not or are unwilling to spend the additional engineering effort; or if you are dealing with a special use case that doesn't map well to those three allocators - you will probably still want to use a general purpose allocator, i.e. good old malloc.

The point that was being made in the talk is more that if you do need to worry about custom memory allocation - and especially in the domain of video games with its specific requirements and trade offs - those three types of allocators are very good answers to the specific problems that you may otherwise encounter when naïvely relying on the general purpose allocator alone.

I gave a long talk about allocators in C++ a while back where I explain all this in more detail if you still want to know more.

Recommended topics

Hot tags