What is the "correct OOP" way to deal with a storage pool of items of mixed types?

Asked 19/8, 2016 at 9:13 Answered 19/8, 2016 at 12:13

This was inspired by a comment to my other question here:

How do you "not repeat yourself" when giving a class an accessible "name" in C++?

nvoight: "RTTI is bad because it's a hint you are not doing good OOP. Doing your own homebrew RTTI does not make it better OOP, it just means you are reinventing the wheel on top of bad OOP."

So what is the "good OOP" solution here? The problem is this. The program is in C++, so there are also C++ specific details mentioned below. I have a "component" class (actually, a struct), which is subclassed into a number of different derived classes containing different kinds of component data. It's part of an "entity component system" design for a game. I'm wondering about the storage of the components. In particular, the current storage system has:

a "component manager" which stores an array, actually a hash map, of a single type of component. The hash map allows for lookup of a component by the entity ID of the entity it belongs to. This component manager is a template which inherits from a base, and the template parameter is the type of component to manage.
a full storage pack which is a collection of these component managers, implemented as an array of pointers to the component manager base class. This has methods to insert and extract an entity (on insertion, the components are taken out and put into the managers, on removal, they are extracted and collected into a new entity object), as well as ones to add new component managers, so if we want to add a new component type to the game, all we have to do is put another command to insert a component manager for it.

It's the full storage pack that prompted this. In particular, it offers no way of accessing a particular type of component. All the components are stored as base class pointers with no type information. What I thought of was using some kind of RTTI and storing the component managers in a map which maps type names and thus allows for lookup and then the proper downcasting of the base class pointer to the appropriate derived class (the user would call a template member on the entity storage pool to do this).

But if this RTTI means bad OOP, what would be the correct way to design this system so no RTTI is required?

Wallywalnut answered 19/8, 2016 at 9:13 Comment(1)

I think the phrase "correct-OOP" has a certain one-true-way smell to it. That's why I prefer @Vittorio Romeo's approach below who left out that part of the question. – Befit 19/8, 2016 at 13:11

Disclaimer/resources: my BCS thesis was about the design and implementation of a C++14 library for compile-time Entity-Component-System pattern generation. You can find the library here on GitHub.

This answer is meant to give you a broad overview of some techniques/ideas you can apply to implement the Entity-Component-System pattern depending on whether or not component/system types are known at compile-time.

If you want to see implementation details, I suggest you to check out my library (linked above) for an entirely compile-time based approach. diana is a very nice C library that can give you an idea of a run-time based approach.

You have several approaches, depending on the scope/scale of your project and on the nature of your entities/components/systems.

All component types and system types are known at compile-time.

This is the case analyzed in my BCS thesis - what you can do is use advanced metaprogramming techniques (e.g. using Boost.Hana) to put all component types and system types in compile-time lists and create data structures that link everything together at compile time. Pseudocode example:

namespace c 
{
    struct position { vec2f _v }; 
    struct velocity { vec2f _v };
    struct acceleration { vec2f _v };
    struct render { sprite _s; };
}

constexpr auto component_types = type_list
{
    component_type<c::position>,
    component_type<c::velocity>,
    component_type<c::acceleration>,
    component_type<c::render>
};

After defining your components, you can define your systems and tell them "what components to use":

namespace s
{
    struct movement 
    {
        template <typename TData>
        void process(TData& data, float ft)
        {
            data.for_entities([&](auto eid)
            {
                auto& p = data.get(eid, component_type<c::position>)._v;
                auto& v = data.get(eid, component_type<c::velocity>)._v;
                auto& a = data.get(eid, component_type<c::acceleration>)._v;

                v += a * ft;
                p += v * ft;
            });
        }
    };

    struct render
    {
        template <typename TData>
        void process(TData& data)
        {
            data.for_entities([&](auto eid)
            {
                auto& p = data.get(eid, component_type<c::position>)._v;
                auto& s = data.get(eid, component_type<c::render>)._s;   

                s.set_position(p);
                some_context::draw(s);
            });
        }
    };
}

constexpr auto system_types = type_list
{
    system_type<s::movement, 
        uses
        (
            component_type<c::position>,
            component_type<c::velocity>,
            component_type<c::acceleration>
        )>,
    system_type<s::render, 
        uses
        (
            component_type<c::render>
        )>
};

All that's left is using some sort of context object and lambda overloading to visit the systems and call their processing methods:

ctx.visit_systems(
    [ft](auto& data, s::movement& s)
    {
        s.process(data, ft);
    },
    [](auto& data, s::render& s)
    {
        s.process(data);
    });

You can use all the compile-time knowledge to generate appropriate data structures for components and systems inside the context object.

This is the approach I used in my thesis and library - I talked about it at C++Now 2016: "Implementation of a multithreaded compile-time ECS in C++14".

All component types and systems types are known at run-time.

This is a completely different situation - you need to use some sort of type-erasure technique to dynamically deal with components and systems. A suitable solution is using a scripting language such as LUA to deal with system logic and/or component structure (a more efficient simple component definition language can also be handwritten, so that it maps one-to-one to C++ types or to your engine's types).

You need some sort of context object where you can register component types and system types at run-time. I suggest either using unique incrementing IDs or some sort of UUIDs to identify component/system types. After mapping system logic and component structures to IDs, you can pass those around in your ECS implementation to retrieve data and process entities. You can store component data in generic resizable buffers (or associative maps, for big containers) that can be modified at run-time thanks to component structure knowledge - here's an example of what I mean:
```
auto c_position_id = ctx.register_component_type("./c_position.txt");

// ...

auto context::register_component_type(const std::string& path)
{
    auto& storage = this->component_storage.create_buffer();
    auto file_contents = get_contents_from_path(path);

    for_parsed_lines_in(file_contents, [&](auto line)
    {
        if(line.type == "int")
        {
            storage.append_data_definition(sizeof(int));
        }
        else if(line.type == "float")
        {
            storage.append_data_definition(sizeof(float));
        }
    }); 

    return next_unique_component_type_id++;
}
```

Some component types and system types are known at compile-time, others are known at run-time.

Use approach (1), and create some sort of "bridge" component and system types that implements any type-erasure technique in order to access component structure or system logic at run-time. An std::map<runtime_system_id, std::function<...>> can work for run-time system logic processing. An std::unique_ptr<runtime_component_data> or an std::aligned_storage_t<some_reasonable_size> can work for run-time component structure.

To answer your question:

But if this RTTI means bad OOP, what would be the correct way to design this system so no RTTI is required?

You need a way of mapping types to values that you can use at run-time: RTTI is an appropriate way of doing that.

If you do not want to use RTTI and you still want to use polymorphic inheritance to define your component types, you need to implement a way to retrieve some sort of run-time type ID from a derived component type. Here's a primitive way of doing that:

namespace impl
{
    auto get_next_type_id() 
    {
        static std::size_t next_type_id{0};
        return next_type_id++;
    }

    template <typename T> 
    struct type_id_storage 
    { 
        static const std::size_t id; 
    };

    template <typename T> 
    const std::size_t type_id_storage<T>::id{get_next_type_id()};
}

template <typename T>
auto get_type_id()
{
    return impl::type_id_storage<T>::id;
}

Explanation: get_next_type_id is a non-static function (shared between translation units) that stores a static incremental counter of type IDs. To retrieve the unique type ID that matches a specific component type you can call:

auto position_id = get_type_id<position_component>();

The get_type_id "public" function will retrieve the unique ID from the corresponding instantiation of impl::type_id_storage, that calls get_next_type_id() on construction, which in turn returns its current next_type_id counter value and increments it for the next type.

Particular care for this kind of approach needs to be taken to make sure it behaves correctly over multiple translation units and to avoid race conditions (in case your ECS is multithreaded). (More info here.)

Now, to solve your issue:

It's the full storage pack that prompted this. In particular, it offers no way of accessing a particular type of component.

// Executes `f` on every component of type `T`.
template <typename T, typename TF>
void storage_pack::for_components(TF&& f)
{
    auto& data = this->_component_map[get_type_id<T>()];
    for(component_base* cb : data)
    {
        f(static_cast<T&>(*cb));
    }
}

You can see this pattern in use in my old and abandoned SSVEntitySystem library. You can see an RTTI-based approach in my old and outdated “Implementation of a component-based entity system in modern C++” CppCon 2015 talk.

Emulation answered 19/8, 2016 at 9:41 Comment(11)

This doesn't really reply to the OP's question. Apart for the fact that the answer is still valuable, it shows alternative approaches, but none of them fits exactly with the points in the question. Am I right? – Dixil 19/8, 2016 at 10:12

@skypjack: I think it's quite hard to objectively answer the OP's doubts, considering that he's asking for a "correct OOP way" of doing something, and that's mostly subjective. I tried to focus on his intent (which is implementing an Entity-Component-System) and on his issue of dealing with component storage. My idea was to give OP several possible solutions on how to store components and how to design its library/application depending on the nature of his/her component types and system types. – Emulation 19/8, 2016 at 10:15

Do not misunderstand me, I find your answer really interesting. Anyway, the OP mentioned a basic (let me say) structure already existent in his/her code and it looks like none of these solutions really fit. That's all. - Side note: I've downloaded your thesis, I'm going to read it during the weekend, looks interesting as well. :-) – Dixil 19/8, 2016 at 10:19

@skypjack: after carefully reading the OP again, I agree with you - I didn't really answer the question. I'll improve the answer ASAP. Thanks! – Emulation 19/8, 2016 at 10:22

@skypjack: I added a possible solution for the OP's issue. Looking forward to hear your thoughts on it – Emulation 19/8, 2016 at 10:59

It looks fine indeed. Thank you. Another way to do that (probably having worst overall performance) would have been possible using double dispatching and type erasure maybe. I like the fact that ECS can be implemented in so many ways. – Dixil 19/8, 2016 at 11:44

See my answer for further details about the above mentioned approach. – Dixil 19/8, 2016 at 12:17

I am interested in the first approach. Questions: 1. What is "data"? 2. How do you pass a type to "get"? What is "get", anyways? 3. What is "component_type"? How is it defined? Where is it defined? – Wallywalnut 20/8, 2016 at 6:27

@mike4ty4 You can find all details in the thesis/talk video. In short: data is a proxy object generated by the context that allows the user to retrieve the component data used by the system. Types are passed to get through the type-value encoding (a.k.a. dependent typing) idiom - this is what Boost.Hana is based upon. data.get is a method in the data proxy that gives a compile-error if the system cannot access the component type you're requesting and that returns a reference to the component data otherwise (needs an entity_id, whoops - fixing answer). – Emulation 20/8, 2016 at 10:30

@mike4ty4 component_type is just a "tag type" that allows users to interact with types as values. It simply wraps the component type specified in the angle brackets and behaves like a normal value - it can be inspected at compile-time to retrieve the stored type. it is conceptually equivalent to boost::hana::type_c – Emulation 20/8, 2016 at 10:34

I just found this digging through my old questions (I'm the asker, with different username now). The project I was working on there died; but i have a new one going with the same kind of architecture and this is quite useful. That said, it's really that non-trivial that it'd take a whole thesis worth of research to derive for oneself such a seemingly very common programming problem?! But also, is there any advantage to really this versus using the C++ RTTI directly and just having a map of typeid()s? – Wallywalnut 21/4, 2022 at 23:16

Despite the good and long answer by @VittorioRomeo, I'd like to show another possible approach to the problem.
Basic concepts involved here are type erasure and double dispatching.
The one below is a minimal, working example:

#include <map>
#include <vector>
#include <cstddef>
#include <iostream>
#include <memory>

struct base_component {
    static std::size_t next() noexcept {
        static std::size_t v = 0;
        return v++;
    }
};

template<typename D>
struct component: base_component {
    static std::size_t type() noexcept {
        static const std::size_t t = base_component::next();
        return t;
    }
};

struct component_x: component<component_x> { };
struct component_y: component<component_y> { };

struct systems {
    void elaborate(std::size_t id, component_x &) { std::cout << id << ": x" << std::endl; }
    void elaborate(std::size_t id, component_y &) { std::cout << id << ": y" << std::endl; }
};

template<typename C>
struct component_manager {
    std::map<std::size_t, C> id_component;
};

struct pack {
    struct base_handler {
        virtual void accept(systems *) = 0;
    };

    template<typename C>
    struct handler: base_handler {
        void accept(systems *s) {
            for(auto &&el: manager.id_component) s->elaborate(el.first, el.second);
        }

        component_manager<C> manager;
    };

    template<typename C>
    void add(std::size_t id) {
        if(handlers.find(C::type()) == handlers.cend()) {
            handlers[C::type()] = std::make_unique<handler<C>>();
        }

        handler<C> &h = static_cast<handler<C>&>(*handlers[C::type()].get());
        h.manager.id_component[id] = C{};
    }

    template<typename C>
    void walk(systems *s) {
        if(handlers.find(C::type()) != handlers.cend()) {
            handlers[C::type()]->accept(s);
        }
    }

private:
    std::map<std::size_t, std::unique_ptr<base_handler>> handlers;
};

int main() {
    pack coll;
    coll.add<component_x>(1);
    coll.add<component_y>(1);
    coll.add<component_x>(2);
    systems sys;
    coll.walk<component_x>(&sys);
    coll.walk<component_y>(&sys);
}

I tried to be true to the few points mentioned by the OP, so as to provide a solution that fits the real problem.

Let me know with a comment if the example is clear enough for itself or if a few more details are required to fully explain how and why it works actually.

Dixil answered 19/8, 2016 at 12:13 Comment(0)

If I understand correctly, you want a collection, such as a map, where the values are of different type, and you want to know what type is each value (so you can downcast it).

Now, a "good OOP" is a design which you don't need to downcast. You just call the mothods (which are common to the base class and the deriveratives) and the derived class performs a different operation than its parent for the same method.

If this is not the case, for example, where you need to use some other data from the child and thus you want to downcast, it means, in most cases, you didn't work hard enough on the design. I don't say it's always possible, but you need to design it in such a way the polymorphism is your only tool. That's a "good OOP".

Anyway, if you really need to downcast, you don't have to use RTTI. You can use a common field (string) in the base class, that marks the class type.

Similitude answered 19/8, 2016 at 9:23 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags