Why use base classes for list/tree nodes in the C++ standard library?
Asked Answered
K

1

7

In libstdc++ and libc++ the internal nodes for lists (e.g. list and forward_list) and other trees are constructed in (at least) two parts: a node base class; and the node class itself. For example, rather than:

struct sl_full_node {
  sl_full_node *p;
  int value;
};

...we would see something like:

struct sl_node_base {
  sl_node_base *p;
};

struct sl_node : sl_node_base {
  int value;
};

A simple, self-contained example of this idiom can be found in the singly linked list proposal from 2008 here. What are the benefits of this approach?

Kerley answered 27/5, 2020 at 8:58 Comment(5)
For instance, in case of std::list and libc++, __list_node_base class does not seem to be used anywhere else than just as a base class of __list_node. I guess the reason is mainly modularity here, that is, the single-entity/class-single-responsibility principle. Node base takes care about linking of nodes to other nodes, while the node class itself takes care about stored values.Overdose
It is used, though it's not immediately apparent: there are quite a few typedefs. For example, the __list_iterator has one data member: __ptr_ which is of type __link_pointer, itself ultimately an instantiation of the __list_node_base template.Kerley
I meant that it is not "instantiated" anywhere else (if I looked correctly), that is, only objects of __list_node_base are those (base) subobjects of _list_node. __link_pointer is a pointer.Overdose
Yes, this usage pattern is the same in the example from the 2008 proposal I linked to - which if refactored to use a single class - removes the need for the static_casts. Splitting something like sl_full_node (above) seems a very granular approach - surely there is also a performance benefit?Kerley
@user2023370, there is not a concrete performance benefit. However, there may be a relevant benefit in terms of occupied space where the node base is used as a dummy node. I also assume that there is another reason, but it is necessary to contact those who actively participated in the implementation in order to obtain a definitive answer.Rochette
R
2

All STL containers require to provide an end() sentinel. This means that, if bidirectional iteration is supported, there must be a valid past-the-end position on which a decrement operation can be performed through the iterator interface. In this part, only node-based containers that allow bidirectional iteration will be treated, and therefore the std::forward_list container will be momentarily ignored.

If the end() sentinel were simply represented by any sentinel value, such as a null pointer, it would not be possible to perform the aforementioned decrement operation in order to move to the previous node. One way to solve this issue is the use of a "dummy node".

In the libstdc++ and libc++ implementations, this node is statically allocated in order that it is present in the container throughout the lifetime of the object and allows for a valid past-the-end position even if the sequence is completely empty. It is important to note that the word "sequence" is used to refer to the in-order iteration of all values in the container, regardless of whether the nodes are arranged linearly or structured in a more complex graph (std::set).

If the result were achieved by allocating a complete node, this would result in two side effects:

  • the space that is occupied by the dummy node would not be limited by the number of pointers, which are the only member attributes required to allow linkage with other nodes, but would be proportional to the type T size;
  • if the type T is stored directly in the node, when the object is constructed, a method should be found to avoid the T object construction.

The last issue can be solved by replacing the type T with an aligned storage, but this would not solve the issue of occupied space. This situation makes it necessary to introduce a specific structure for the dummy node, which contain only pointers. However, the creation of such a structure would lead to an inter-operability problem because a pointer to node can not be converted to a pointer to dummy node and vice versa.

A solution consists of deriving the complete node from the dummy node, to which the data attribute, or an aligned storage, is added.

Example:

    struct _List_node_base
    {
      _List_node_base* _M_next;
      _List_node_base* _M_prev;
    };

    template <typename _Tp>
    struct _List_node : public _List_node_base
    {
      _Tp _M_data;
    };

Similar reasoning can be applied to the std::forward_list container, which, however, requires a before_begin() sentinel on which an increment operation can be performed through the iterator interface.

Rochette answered 10/12, 2023 at 14:41 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.