Ad hoc polymorphism and heterogeneous containers with value semantics
Asked Answered
R

5

37

I have a number of unrelated types that all support the same operations through overloaded free functions (ad hoc polymorphism):

struct A {};

void use(int x) { std::cout << "int = " << x << std::endl; }
void use(const std::string& x) { std::cout << "string = " << x << std::endl; }
void use(const A&) { std::cout << "class A" << std::endl; }

As the title of the question implies, I want to store instances of those types in an heterogeneous container so that I can use() them no matter what concrete type they are. The container must have value semantics (ie. an assignment between two containers copies the data, it doesn't share it).

std::vector<???> items;
items.emplace_back(3);
items.emplace_back(std::string{ "hello" });
items.emplace_back(A{});

for (const auto& item: items)
    use(item);
// or better yet
use(items);

And of course this must be fully extensible. Think of a library API that takes a vector<???>, and client code that adds its own types to the already known ones.


The usual solution is to store (smart) pointers to an (abstract) interface (eg. vector<unique_ptr<IUsable>>) but this has a number of drawbacks -- from the top of my head:

  • I have to migrate my current ad hoc polymorphic model to a class hierarchy where every single class inherits from the common interface. Oh snap! Now I have to write wrappers for int and string and what not... Not to mention the decreased reusability/composability due to the free member functions becoming intimately tied to the interface (virtual member functions).
  • The container loses its value semantics: a simple assignment vec1 = vec2 is impossible if we use unique_ptr (forcing me to manually perform deep copies), or both containers end up with shared state if we use shared_ptr (which has its advantages and disadvantages -- but since I want value semantics on the container, again I am forced to manually perform deep copies).
  • To be able to perform deep copies, the interface must support a virtual clone() function which has to be implemented in every single derived class. Can you seriously think of something more boring than that?

To sum it up: this adds a lot of unnecessary coupling and requires tons of (arguably useless) boilerplate code. This is definitely not satisfactory but so far this is the only practical solution I know of.


I have been searching for a viable alternative to subtype polymorphism (aka. interface inheritance) for ages. I play a lot with ad hoc polymorphism (aka. overloaded free functions) but I always hit the same hard wall: containers have to be homogeneous, so I always grudgingly go back to inheritance and smart pointers, with all the drawbacks already listed above (and probably more).

Ideally, I'd like to have a mere vector<IUsable> with proper value semantics, without changing anything to my current (absence of) type hierarchy, and keep ad hoc polymorphism instead of requiring subtype polymorphism.

Is this possible? If so, how?

Ranjiv answered 17/9, 2013 at 18:5 Comment(3)
Will Boost.Any help you?Adverb
@Adverb I'm not really used to boost::any. I can see how it stores unrelated types, but to actually use() the underlying object one has to know its type first, right? If so, that kinda defeats the point (unless, of course, I missed something important along the way).Ranjiv
If you only have the objects to potentially call use on them, you can use std::vector<std::function<void()>> and have template<typename T> std::function<void()> make_use_fn(T&& v) { return [v=std::forward<T>(v)]{ return use(v); }; } to do items.push_back(make_use_fn(3)); items.push_back(make_use_fn(std::string{ "hello" }));Disjointed
C
27

Different alternatives

It is possible. There are several alternative approaches to your problem. Each one has different advantages and drawbacks (I will explain each one):

  1. Create an interface and have a template class which implements this interface for different types. It should support cloning.
  2. Use boost::variant and visitation.

Blending static and dynamic polymorphism

For the first alternative you need to create an interface like this:

class UsableInterface 
{
public:
    virtual ~UsableInterface() {}
    virtual void use() = 0;
    virtual std::unique_ptr<UsableInterface> clone() const = 0;
};

Obviously, you don't want to implement this interface by hand everytime you have a new type having the use() function. Therefore, let's have a template class which does that for you.

template <typename T> class UsableImpl : public UsableInterface
{
public:
    template <typename ...Ts> UsableImpl( Ts&&...ts ) 
        : t( std::forward<Ts>(ts)... ) {}
    virtual void use() override { use( t ); }
    virtual std::unique_ptr<UsableInterface> clone() const override
    {
        return std::make_unique<UsableImpl<T>>( t ); // This is C++14
        // This is the C++11 way to do it:
        // return std::unique_ptr<UsableImpl<T> >( new UsableImpl<T>(t) ); 
    }

private:
    T t;
};

Now you can actually already do everything you need with it. You can put these things in a vector:

std::vector<std::unique_ptr<UsableInterface>> usables;
// fill it

And you can copy that vector preserving the underlying types:

std::vector<std::unique_ptr<UsableInterface>> copies;
std::transform( begin(usables), end(usables), back_inserter(copies), 
    []( const std::unique_ptr<UsableInterface> & p )
    { return p->clone(); } );

You probably don't want to litter your code with stuff like this. What you want to write is

copies = usables;

Well, you can get that convenience by wrapping the std::unique_ptr into a class which supports copying.

class Usable
{
public:
    template <typename T> Usable( T t )
        : p( std::make_unique<UsableImpl<T>>( std::move(t) ) ) {}
    Usable( const Usable & other ) 
        : p( other.clone() ) {}
    Usable( Usable && other ) noexcept 
        : p( std::move(other.p) ) {}
    void swap( Usable & other ) noexcept 
        { p.swap(other.p); }
    Usable & operator=( Usable other ) 
        { swap(other); }
    void use()
        { p->use(); }
private:
    std::unique_ptr<UsableInterface> p;
};

Because of the nice templated contructor you can now write stuff like

Usable u1 = 5;
Usable u2 = std::string("Hello usable!");

And you can assign values with proper value semantics:

u1 = u2;

And you can put Usables in an std::vector

std::vector<Usable> usables;
usables.emplace_back( std::string("Hello!") );
usables.emplace_back( 42 );

and copy that vector

const auto copies = usables;

You can find this idea in Sean Parents talk Value Semantics and Concepts-based Polymorphism. He also gave a very brief version of this talk at Going Native 2013, but I think this is to fast to follow.

Moreover, you can take a more generic approach than writing your own Usable class and forwarding all the member functions (if you want to add other later). The idea is to replace the class Usable with a template class. This template class will not provide a member function use() but an operator T&() and operator const T&() const. This gives you the same functionality, but you don't need to write an extra value class every time you facilitate this pattern.

A safe, generic, stack-based discriminated union container

The template class boost::variant is exactly that and provides something like a C style union but safe and with proper value semantics. The way to use it is this:

using Usable = boost::variant<int,std::string,A>;
Usable usable;

You can assign from objects of any of these types to a Usable.

usable = 1;
usable = "Hello variant!";
usable = A();

If all template types have value semantics, then boost::variant also has value semantics and can be put into STL containers. You can write a use() function for such an object by a pattern that is called the visitor pattern. It calls the correct use() function for the contained object depending on the internal type.

class UseVisitor : public boost::static_visitor<void>
{
public:
    template <typename T>
    void operator()( T && t )
    {
        use( std::forward<T>(t) );
    }
}

void use( const Usable & u )
{
    boost::apply_visitor( UseVisitor(), u );
}

Now you can write

Usable u = "Hello";
use( u );

And, as I already mentioned, you can put these thingies into STL containers.

std::vector<Usable> usables;
usables.emplace_back( 5 );
usables.emplace_back( "Hello world!" );
const auto copies = usables;

The trade-offs

You can grow the functionality in two dimensions:

  • Add new classes which satisfy the static interface.
  • Add new functions which the classes must implement.

In the first approach I presented it is easier to add new classes. The second approach makes it easier to add new functionality.

In the first approach it it impossible (or at least hard) for client code to add new functions. In the second approach it is impossible (or at least hard) for client code to add new classes to the mix. A way out is the so-called acyclic visitor pattern which makes it possible for clients to extend a class hierarchy with new classes and new functionality. The drawback here is that you have to sacrifice a certain amount of static checking at compile-time. Here's a link which describes the visitor pattern including the acyclic visitor pattern along with some other alternatives. If you have questions about this stuff, I'm willing to answer.

Both approaches are super type-safe. There is not trade-off to be made there.

The run-time-costs of the first approach can be much higher, since there is a heap allocation involved for each element you create. The boost::variant approach is stack based and therefore is probably faster. If performance is a problem with the first approach consider to switch to the second.

Cheer answered 18/9, 2013 at 9:16 Comment(4)
Thanks. This is basically my answer and user2790567's answer combined in one post, but with much better explanations in both cases and more food for thought. Really a great job.Ranjiv
This acyclic visitor pattern you just added looks awesome at first glance. Digesting it will take some time though, so I won't bug you with questions right now (even though I'll probably have a couple at some point). Thanks again!Ranjiv
I waited a bit before accepting your answer (in order to see if anyone would come up with something else), but your additional explanations, the whole Trade-offs section and the clean formatting really deserve it. ;)Ranjiv
Is there any possible way for us to compare the underlying contents of the vectors without using dynamic casting? Let say that we have 2 vectors Usable and we are trying to compare if their underlying contents are the same, how could we do that?Motorcycle
R
18

Credit where it's due: When I watched Sean Parent's Going Native 2013 "Inheritance Is The Base Class of Evil" talk, I realized how simple it actually was, in hindsight, to solve this problem. I can only advise you to watch it (there's much more interesting stuff packed in just 20 minutes, this Q/A barely scratches the surface of the whole talk), as well as the other Going Native 2013 talks.


Actually it's so simple it hardly needs any explanation at all, the code speaks for itself:

struct IUsable {
  template<typename T>
  IUsable(T value) : m_intf{ new Impl<T>(std::move(value)) } {}
  IUsable(IUsable&&) noexcept = default;
  IUsable(const IUsable& other) : m_intf{ other.m_intf->clone() } {}
  IUsable& operator =(IUsable&&) noexcept = default;
  IUsable& operator =(const IUsable& other) { m_intf = other.m_intf->clone(); return *this; }

  // actual interface
  friend void use(const IUsable&);

private:
  struct Intf {
    virtual ~Intf() = default;
    virtual std::unique_ptr<Intf> clone() const = 0;
    // actual interface
    virtual void intf_use() const = 0;
  };
  template<typename T>
  struct Impl : Intf {
    Impl(T&& value) : m_value(std::move(value)) {}
    virtual std::unique_ptr<Intf> clone() const override { return std::unique_ptr<Intf>{ new Impl<T>(*this) }; }
    // actual interface
    void intf_use() const override { use(m_value); }
  private:
    T m_value;
  };
  std::unique_ptr<Intf> m_intf;
};

// ad hoc polymorphic interface
void use(const IUsable& intf) { intf.m_intf->intf_use(); }

// could be further generalized for any container but, hey, you get the drift
template<typename... Args>
void use(const std::vector<IUsable, Args...>& c) {
  std::cout << "vector<IUsable>" << std::endl;
  for (const auto& i: c) use(i);
  std::cout << "End of vector" << std::endl;
}

int main() {
  std::vector<IUsable> items;
  items.emplace_back(3);
  items.emplace_back(std::string{ "world" });
  items.emplace_back(items); // copy "items" in its current state
  items[0] = std::string{ "hello" };
  items[1] = 42;
  items.emplace_back(A{});
  use(items);
}

// vector<IUsable>
// string = hello
// int = 42
// vector<IUsable>
// int = 3
// string = world
// End of vector
// class A
// End of vector

As you can see, this is a rather simple wrapper around a unique_ptr<Interface>, with a templated constructor that instantiates a derived Implementation<T>. All the (not quite) gory details are private, the public interface couldn't be any cleaner: the wrapper itself has no member functions except construction/copy/move, the interface is provided as a free use() function that overloads the existing ones.

Obviously, the choice of unique_ptr means that we need to implement a private clone() function that is called whenever we want to make a copy of an IUsable object (which in turn requires a heap allocation). Admittedly one heap allocation per copy is quite suboptimal, but this is a requirement if any function of the public interface can mutate the underlying object (ie. if use() took non-const references and modified them): this way we ensure that every object is unique and thus can freely be mutated.


Now if, as in the question, the objects are completely immutable (not only through the exposed interface, mind you, I really mean the whole objects are always and completely immutable) then we can introduce shared state without nefarious side effects. The most straightforward way to do this is to use a shared_ptr-to-const instead of a unique_ptr:

struct IUsableImmutable {
  template<typename T>
  IUsableImmutable(T value) : m_intf(std::make_shared<const Impl<T>>(std::move(value))) {}
  IUsableImmutable(IUsableImmutable&&) noexcept = default;
  IUsableImmutable(const IUsableImmutable&) noexcept = default;
  IUsableImmutable& operator =(IUsableImmutable&&) noexcept = default;
  IUsableImmutable& operator =(const IUsableImmutable&) noexcept = default;

  // actual interface
  friend void use(const IUsableImmutable&);

private:
  struct Intf {
    virtual ~Intf() = default;
    // actual interface
    virtual void intf_use() const = 0;
  };
  template<typename T>
  struct Impl : Intf {
    Impl(T&& value) : m_value(std::move(value)) {}
    // actual interface
    void intf_use() const override { use(m_value); }
  private:
    const T m_value;
  };
  std::shared_ptr<const Intf> m_intf;
};

// ad hoc polymorphic interface
void use(const IUsableImmutable& intf) { intf.m_intf->intf_use(); }

// could be further generalized for any container but, hey, you get the drift
template<typename... Args>
void use(const std::vector<IUsableImmutable, Args...>& c) {
  std::cout << "vector<IUsableImmutable>" << std::endl;
  for (const auto& i: c) use(i);
  std::cout << "End of vector" << std::endl;
}

Notice how the clone() function has disappeared (we don't need it any more, we just share the underlying object and it's no bother since it's immutable), and how copy is now noexcept thanks to shared_ptr guarantees.

The fun part is, the underlying objects have to be immutable, but you can still mutate their IUsableImmutable wrapper so it's still perfectly OK to do this:

  std::vector<IUsableImmutable> items;
  items.emplace_back(3);
  items[0] = std::string{ "hello" };

(only the shared_ptr is mutated, not the underlying object itself so it doesn't affect the other shared references)

Ranjiv answered 17/9, 2013 at 18:5 Comment(6)
I realize I didn't explain much, only the most important parts, so feel free to ask for clarifications if you don't understand something, and/or edit this Q/A to add more details or correct my poor English.Ranjiv
After all, Copy-on-Write is plain awesome if you never write. ;)Selfcongratulation
@Selfcongratulation Well the thing is, the wrapper (IUsableImmutable) is COW just not the underlying wrapped object. But I do get your point. :pRanjiv
The sound is really bad, but this talk from last year's C++Now could be considered as an extended version of the one from Going Native. Link to the slides. Just thought that it could be interesting.Nanine
@cv_and_he That's very kind of you but I'm already struggling a lot with MS's perfectly clean recordings of GN2013 which I must listen 3 or 4 times before understanding anything... I'm no native English speaker... spoken English is completely alien to me even though I can handle written English without too much suffering. Oh well, knowing Sean his slides alone will be well worth the pain... ;)Ranjiv
For people who want to learn more, the general term for this kind of technique is type erasure.Db
N
6

Maybe boost::variant?

#include <iostream>
#include <string>
#include <vector>
#include "boost/variant.hpp"

struct A {};

void use(int x) { std::cout << "int = " << x << std::endl; }
void use(const std::string& x) { std::cout << "string = " << x << std::endl; }
void use(const A&) { std::cout << "class A" << std::endl; }

typedef boost::variant<int,std::string,A> m_types;

class use_func : public boost::static_visitor<>
{
public:
    template <typename T>
    void operator()( T & operand ) const
    {
        use(operand);
    }
};
int main()
{
    std::vector<m_types> vec;
    vec.push_back(1);
    vec.push_back(2);
    vec.push_back(std::string("hello"));
    vec.push_back(A());
    for (int i=0;i<4;++i)
        boost::apply_visitor( use_func(), vec[i] );
    return 0;
}

Live example: http://coliru.stacked-crooked.com/a/e4f4ccf6d7e6d9d8

Novgorod answered 18/9, 2013 at 8:27 Comment(3)
Nice, I didn't know about boost::apply_visitor. I have one small criticism though: this is not easily extensible. Sure you can modify the definition of m_types to include new types, but that's irrelevant if you have an initial set of types which is part of a library and want to allow client code to extend it. Do you know if boost::any could be used in the same way instead of variant? That would both solve this shortcoming and explain ZijingWu's comment. :)Ranjiv
I slightly modified my question to include extensibility as a requirement (this seemed obvious to me since I was looking for an equivalent to vector<unique_ptr<Interface>> but as always, "obvious" is extremely subjective). Unfortunately this renders your answer irrelevant to the question (it doesn't fit all the requirements any more), I'm sorry about that especially since it's my fault that I didn't write a complete question in the first place. Still, that's an excellent solution when one knows all the possible types in advance, it's still a +1 from me. :)Ranjiv
@syam: You can add more types to the list of types without modifying m_types, using template meta programming. That way you can extend this.Walrus
E
3

The other answers earlier (use vtabled interface base class, use boost::variant, use virtual base class inheritance tricks) are all perfectly good and valid solutions for this problem, each with a difference balance of compile time versus run time costs. I would suggest though that instead of boost::variant, on C++ 11 and later use eggs::variant instead which is a reimplementation of boost::variant using C++ 11/14 and it is enormously superior on design, performance, ease of use, power of abstraction and it even provides a fairly full feature subset on VS2013 (and a full feature set on VS2015). It's also written and maintained by a lead Boost author.

If you are able to redefine the problem a bit though - specifically, that you can lose the type erasing std::vector in favour of something much more powerful - you could use heterogenous type containers instead. These work by returning a new container type for each modification of the container, so the pattern must be:

newtype newcontainer=oldcontainer.push_back(newitem);

These were a pain to use in C++ 03, though Boost.Fusion makes a fair fist of making them potentially useful. Actually useful usability is only possible from C++ 11 onwards, and especially so from C++ 14 onwards thanks to generic lambdas which make working with these heterogenous collections very straightforward to program using constexpr functional programming, and probably the current leading toolkit library for that right now is proposed Boost.Hana which ideally requires clang 3.6 or GCC 5.0.

Heterogeneous type containers are pretty much the 99% compile time 1% run time cost solution. You'll see a lot of compiler optimiser face plants with current compiler technology e.g. I once saw clang 3.5 generate 2500 opcodes for code which should have generated two opcodes, and for the same code GCC 4.9 spat out 15 opcodes 12 of which didn't actually do anything (they loaded memory into registers and did nothing with those registers). All that said, in a few years time you will be able to achieve optimal code generation for heterogeneous type containers, at which point I would expect they'll become the next gen form of C++ metaprogramming where instead of arsing around with templates we'll be able to functionally program the C++ compiler using actual functions!!!

Elysia answered 24/2, 2015 at 16:10 Comment(1)
I have just published constexpr support for eggs::variant a few hours ago, so VS2015 does no longer provide a full feature set. Everything but that should still be supported.Consolidation
C
1

Heres an idea I got recently from std::function implementation in libstdc++:

Create a Handler<T> template class with a static member function that knows how to copy, delete and perform other operations on T.

Then store a function pointer to that static functon in the constructor of your Any class. Your Any class doesn't need to know about T then, it just needs this function pointer to dispatch the T-specific operations. Notice that the signature of the function is independant of T.

Roughly like so:

struct Foo { ... }
struct Bar { ... }
struct Baz { ... }

template<class T>
struct Handler
{
    static void action(Ptr data, EActions eAction)
    {
       switch (eAction)
       {
       case COPY:
           call T::T(...);

       case DELETE:
           call T::~T();

       case OTHER:
           call T::whatever();
       }
    }
}

struct Any
{
    Ptr handler;
    Ptr data;

    template<class T>
    Any(T t)
      : handler(Handler<T>::action)
      , data(handler(t, COPY))
    {}

    Any(const Any& that)
       : handler(that.handler)
       , data(handler(that.data, COPY))
    {}

    ~Any()
    {
       handler(data, DELETE);
    }
};

int main()
{
    vector<Any> V;

    Foo foo; Bar bar; Baz baz;

    v.push_back(foo);
    v.push_back(bar);
    v.push_back(baz);
}

This gives you type erasure while still maintaining value semantics, and does not require modification of the contained classes (Foo, Bar, Baz), and doesn't use dynamic polymorphism at all. It's pretty cool stuff.

Careycarfare answered 18/9, 2013 at 7:1 Comment(5)
Pretty cool stuff indeed. This "handler" approach is a nifty trick.Ranjiv
This amounts to basically the same as a virtual function, since it's stored in a function pointer. It's a "manual" vtable with the dispatch happening inside the function. Boost.Function has also been doing this for quite some time.Selfcongratulation
@Xeo: It is similar to a hand-written vtable yes, except it is stripped-down to be more performant, smaller, and it is more extensible. A virtual call has higher overhead then simply indirecting a function pointer, and a polymorphic class has more in its header than just a pointer.Careycarfare
The performace difference of a virtual function call compared to calling a function through a function poiner is very little. In essence the difference is one assembler instruction called MOV. When passing an extra argument like eAction the advantage is nullified and you'll get the same performance. If you add another action (other than use()) you can easily forget to add a case in the switch statement. With a vtable the compiler does that job for you. Manually writing your own vtable in this style makes your code harder to maintain.Cheer
@RalphTandetzky: Compared to the version where you have a Base* in the Any object and then have a Derived<T> : Base for the type, and then use a virtual clone method and virtual destructor, the above pattern is smaller and faster. Plus for small types you can reuse the data pointer as storage (this is what std::function and boost::function do). If you draw it out you can see why. As for the ease of maintenance and readability, I don't really speak to it - although note that any production compiler will complain if a case isn't handled in a switch.Careycarfare

© 2022 - 2024 — McMap. All rights reserved.