How do you determine the size of the nodes created by a 'std::map' for use with 'boost::pool_allocator' (in a cross-platform way)?

Asked 9/4, 2014 at 1:51 Answered 10/4, 2014 at 13:1

c++memory-management c++11 boost pool

UPDATE

Per comments, answer, and additional research, I have come to the conclusion that there is typically no difference between a set and a map in terms of node overhead. My question that follows is really:

How do you determine node overhead for convenient use of boost::pool_allocator as a custom allocator?

And, a further update: The node overhead is probably never going to be more than the size of 4 pointers, so just purging the Boost Pool for sizeof(T), sizeof(T)+sizeof(int), sizeof(T) + 2*sizeof(int), sizeof(T) + 3*sizeof(int) and sizeof(T) + 4*sizeof(int) (or int64_t for 64-bit systems) should be fine. That is what I am actually doing, and it works.

I want to use a boost memory pool to manage tens of millions of tiny, identically-sized objects by avoiding calls to the destructors of these objects, and instead freeing the memory in single swaths containing many instances per swath.

I posted another question about this issue, and the answer to that question has led me to understand that the question I really need to answer is the one I am asking here.

Consider the following code:

class Obj { // ... has an operator<() ... };

typedef std::set<Obj, std::less<Obj>, boost::fast_pool_allocator<Obj>> fast_set_obj;

// Deliberately do not use a managed pointer -
// I will *NOT* delete this object, but instead
// I will manage the memory using the memory pool!!!
fast_set_obj * mset = new fast_set_obj;

// ... add some Obj's to 'mset'
mset->insert(Obj());
mset->insert(Obj());

// Do something desireable with the set ...
...

// All done.
// It's time to release the memory, but do NOT call any Obj destructors.
// The following line of code works exactly as intended.

boost::singleton_pool<boost::fast_pool_allocator_tag, sizeof(Obj const)>::purge_memory();

If you step into the purge_memory() function of the last line of code, you will see that the fast_pool_allocator nicely proceeds to free the memory from the system, just as desired. (No Obj destructors are called, because as noted in the linked question above, the job of the custom allocator is just to allocate and free memory, not to call constructors or destructors.)

It works precisely as desired. Great!

However, here is the problem. If you replace the set with a map, and then try to use the boost::pool_allocator, nothing happens in the call to purge_memory()!

typedef std::map<int, int, std::less<int>,
                 boost::fast_pool_allocator<std::pair<int const, int>>>
        fast_map_obj;

// Ditto above: Deliberately do not use managed pointer
mast_map_obj * mmap = new fast_map_obj;

mmap[5] = Obj();
mmap[6] = Obj();

...

// Uh-oh.  The following line of code DOES NOTHING, because I was using a map, not a set!

boost::singleton_pool<boost::fast_pool_allocator_tag,
                     sizeof(std::pair<int const, int>)>::purge_memory();

As noted, the last line of code does nothing. The reason is that the boost::fast_pool_allocator is a singleton that only responds to, and manages memory for, objects of a given size that is fixed at compile-time. This is why the sizeof argument is used in the expression that calls purge_memory() - it tells the Boost Pool code which of the various different singleton memory pools to purge (assuming the requested one exists due to previously having been instantiated).

Unfortunately, because the memory pool selected to be purged is size-dependent, it is critical that the size of the internal objects managed (i.e., created and destroyed in memory allocated via calls to the custom allocator) is known. Sadly, for std::map, the size of the internal objects managed by the map is neither sizeof(Obj) nor sizeof(std::pair<int const, Obj>).

My question is: How do you rigorously, and in a cross-platform way that works according to the C++11 standard, determine the size of the objects internally managed by std::map for use with boost::fast_pool_allocator?

Is this even possible?

Unanswerable answered 9/4, 2014 at 1:51 Comment(11)

It is kind of weird your example works with std::set, but not with std::map, because internally they use the same node type (_Rb_tree_node<value_type> in libstdc++; quite probably size rounding is at work in the singleton pool). Unfortunately, this type (and the actual tree member) are private, so you can't look at them from outside. – Obligee 9/4, 2014 at 3:9

All examples in the Boost Pool documentation involve lists and sets - none involve maps. Also, I have read elsewhere that map nodes do not stor pairs. – Unanswerable 9/4, 2014 at 3:40

At least in libstdc++ map (Rb_tree) nodes do store map::value_type as is (so it will be pair). And std::list<> is not really that different from std::set or std::map, the only difference is the size of overhead (2 pointers for list, circa 4 pointers for set and map). – Obligee 9/4, 2014 at 3:48

I don't think it's true that the size of objects allocated internally by map when it calls the custom allocator is the size of std::pair<K const, V>, while I think it is true that the size of objects allocated internally by set when it calls the custom allocator is the size of K const. – Unanswerable 9/4, 2014 at 4:16

Why don't you check the source and see for yourself? stl is open source on all platforms I'm aware of. – Obligee 9/4, 2014 at 4:48

I am doing that right now. It is likely to require hours, if not days, for me to understand, to be able to answer the question. The internals of std::map are extremely intricate. Perhaps you can understand it more quickly? – Unanswerable 9/4, 2014 at 4:54

The pool pertains to the internal storage of the container. The container uses the allocator's rebind member to create a new allocator that works for the types used in the implementation. Note that your set or map itself will not be allocated in the pool: only the internal storage. – Construction 9/4, 2014 at 18:59

@Obligee - Apparently, there is a specialization for std::set<int> for which the node size is the size of an int (see Nevin's answer); but for more complex types, there is additional node overhead. I was wrong (and you're almost certainly right) that there's 'typically' no difference in node storage overhead between a set and a map - I was fooled because the set I was testing with stored ints. – Unanswerable 10/4, 2014 at 3:17

On your platform they probably went with "sorted array" approach for integral type sets (btw, which toolchain are you using?). From what I understand, this was ok as long as older C++ standards are concerned, but should not be done in C++11 (if I'm not mistaken). – Obligee 10/4, 2014 at 3:33

@Obligee - If it works for int's, I would think it would work just as well for any other POD to use a sorted array and avoid the node overhead (the C++11 standard notwithstanding)... I wonder if it tests for the trait is_pod? – Unanswerable 10/4, 2014 at 3:37

Suddenly: boost/container/flat_set.hpp and boost/container/flat_map.hpp – Obligee 10/4, 2014 at 3:41

The problem is that you don't know the internal type which set uses for nodes.

While I haven't figured out how to determine this at compile time, you can write a tracing allocator which prints out the sizeof the node type when allocate is called, as in:

template<typename T>
struct SimpleAllocator : private std::allocator<T>
{
    using value_type = T;
    using pointer = typename std::allocator<T>::pointer;
    using size_type = typename std::allocator<T>::size_type;

    pointer allocate(size_type n)
    {   
        std::cout << "Allocator sizeof(T)==" << sizeof(T) << '\n';
        return std::allocator<T>::allocate(n);
    }   

    void deallocate(pointer p, size_type n)
    { return std::allocator<T>::deallocate(p, n); }
};

And a little test program (I'm testing with sets of ints):

std::set<int, std::less<int>, SimpleAllocator<int>> s;
s.insert(2);

On my system, I get output of:

Allocator sizeof(T)==32

Shattuck answered 9/4, 2014 at 18:54 Comment(0)

There isn't a truly cross platform way to deduce what you are after, since every map implementation is unhappy in its own way, but generally it is the map node that will be allocated in the pool.

This looks different for the different implementations of the standard library, so you code will have to #ifdef-ed for the different versions. With a fragility warning in place, here are the main ones for the g++/clang++/msc compilers and their std libs:

// libstdc++
boost::singleton_pool<boost::fast_pool_allocator_tag,
    sizeof(std::_Rb_tree_node<fast_map_obj::value_type>)>::purge_memory()

// libc++
boost::singleton_pool<boost::fast_pool_allocator_tag,
    sizeof(std::__tree_node<fast_map_obj::value_type, void*>)>::purge_memory()

// msvc 2013 (incl. nov ctp)
boost::singleton_pool<boost::fast_pool_allocator_tag,
    sizeof(fast_map_obj::_Node)>::purge_memory()

Here are some useful links for finding the necessary defines:

http://www.boost.org/doc/libs/1_55_0/boost/config/compiler/

http://sourceforge.net/p/predef/wiki/Compilers/

Electrochemistry answered 10/4, 2014 at 13:1 Comment(1)

Might I be right that a good proposal for an enhancement of boost::pool_allocator would be for the allocator to determine for itself the correct underlying pool to use, seeing as it is associated with one and only one container of known K, V? – Unanswerable 11/4, 2014 at 5:27

Recommended topics

Hot tags