Accessing the common part of a union from a base class

Asked 11/10, 2015 at 19:18 Answered 22/10, 2015 at 9:10

I have a Result<T> template class that holds a union of some error_type and T. I would like to expose the common part (the error) in a base class without resorting to virtual functions.

Here is my attempt:

#include <exception>
#include <new>
#include <stdexcept>
#include <type_traits>

using error_type = std::exception_ptr;

struct ResultBase
{
    error_type error() const
    {
        return *reinterpret_cast<const error_type*>(this);
    }

protected:
    ResultBase() { }
};

template <class T>
struct Result : ResultBase
{
    Result() { new (&mError) error_type(); }

    ~Result() { mError.~error_type(); }

    void setError(error_type error) { mError = error; }

private:
    union { error_type mError; T mValue; };
};

static_assert(std::is_standard_layout<Result<int>>::value, "");

void check(bool condition) { if (!condition) std::terminate(); }

void f(const ResultBase& alias, Result<int>& r)
{
    r.setError(std::make_exception_ptr(std::runtime_error("!")));
    check(alias.error() != nullptr);

    r.setError(std::exception_ptr());
    check(alias.error() == nullptr);
}

int main()
{
    Result<int> r;
    f(r, r);
}

(This is stripped down; see the extended version if unclear.)

The base class takes advantage of standard-layout to find the address of the error field at offset zero. Then it casts the pointer to error_type (assuming this really is the current dynamic type of the union).

Am I right to assume this is portable? Or is it breaking some pointer aliasing rule?

EDIT: My question was 'is this portable', but many commenters are puzzled by the use of inheritance here, so I will clarify.

First, this is a toy example. Please don't take it too literally or assume there is no use for the base class.

The design has three goals:

1. Compactness. Error and result are mutually exclusive, so they should be in a union.
2. No runtime overhead. Virtual functions are excluded (plus, holding a vtable pointer conflicts with goal 1). RTTI is also excluded.
3. Uniformity. The common fields of different Result types should be accessible via homogeneous pointers or wrappers. For example: if instead of Result<T> we were talking about Future<T>, it should be possible to do whenAny(FutureBase& a, FutureBase& b) regardless of the concrete types of a and b.

If willing to sacrifice (1), this becomes trivial. Something like:

struct ResultBase
{
    error_type mError;
};

template <class T>
struct Result : ResultBase
{
    std::aligned_storage_t<sizeof(T), alignof(T)> mValue;
};
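To make goal (3) concrete for this variant, here is a minimal sketch of iterating over mixed result types through the base class. The helper anyHasError and the demo function are hypothetical names used only for illustration, and the C++11 spelling std::aligned_storage<...>::type is used in place of the C++14 alias. Construction and destruction of the value are left out.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <type_traits>
#include <vector>

using error_type = std::exception_ptr;

// Non-template base that owns the common field (costs extra space per object).
struct ResultBase
{
    error_type mError;
};

template <class T>
struct Result : ResultBase
{
    // Raw storage for T; managing the value's lifetime is omitted in this sketch.
    typename std::aligned_storage<sizeof(T), alignof(T)>::type mValue;
};

// Hypothetical helper: true if any result in the list holds an error.
inline bool anyHasError(const std::vector<const ResultBase*>& results)
{
    for (const ResultBase* r : results)
        if (r->mError != nullptr)
            return true;
    return false;
}

// Usage sketch: results of different T are inspected through the same base pointer type.
inline void demoAnyHasError()
{
    Result<int> a;
    Result<double> b;
    std::vector<const ResultBase*> all{&a, &b};
    assert(!anyHasError(all));

    b.mError = std::make_exception_ptr(std::runtime_error("!"));
    assert(anyHasError(all));
}
```

No cast is needed here because the error lives in the base itself; the price is that every object stores both the error and the value storage.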

If instead of goal (1) we sacrifice (2), it might look like this:

struct ResultBase
{
    virtual error_type error() const = 0;
};

template <class T>
struct Result : ResultBase
{
    error_type error() const override { ... }

    union { error_type mError; T mValue; };
};

Again, the justification is not relevant. I just want to make sure the original sample is conforming C++11 code.

Greg answered 11/10, 2015 at 19:18 Comment(31)

Regardless of whether this is defined behavior or not, this is fragile as hell. What you seem to want to achieve looks like a sum type, i.e. Either Error Int, just with more than two possibilities, right? Like "a value is either an error, an int, a string or a MyObject"? – Blondy 11/10, 2015 at 19:24

@DanielJour Only two possibilities: an error or some T, where for the purpose of this question both are assumed standard layout. I need the union for compactness, so I can't put just the common part in base class. I agree it looks fragile, normally I would use virtual functions but here I want max performance. – Greg 11/10, 2015 at 19:29

Then what's wrong with a template<class T> struct Errorneous { union { error_type error; T value;} data; bool is_error; }; ? Why use inheritance here? – Blondy 11/10, 2015 at 19:32

Because I need a limited form of runtime polymorphism, based on common layout instead of a vtable. In reality, my 'results' are similar to Future<R> and I would like to support whenAny(Future<T1>&, Future<T2>&, ...). – Greg 11/10, 2015 at 19:38

I honestly have no idea what inheritance is buying you here. Have you seen Alexandrescu's talk on Expected<T>? It's very similar, his implementation is very clean and does not use inheritance. – Orectic 11/10, 2015 at 20:24

Imagine iterating over a vector<ExpectedBase*> to check if any item has exception. Expected<T> doesn't have a non-template base class, so you can't easily mix ham types :). – Greg 11/10, 2015 at 20:46

I’m pretty sure the reinterpret_cast is not guaranteed to work the way you want. You can get actual polymorphism through multiple inheritance and dynamic_cast. – Jeddy 12/10, 2015 at 0:40

But the objects themselves would contain different things, and you'd have to have another channel to actually extract the type. Honestly if you're going to do this you may as well just stuff them in a boost::any. – Orectic 12/10, 2015 at 1:6

@NirFriedman The objects may contain different result types, but this is irrelevant for certain operations. I have updated the question, hope whenAny example makes more sense. When the actual result needs to be extracted, T is known so we just read it via derived class. Regarding boost::any - I can't use type erasure that relies on virtual methods. – Greg 12/10, 2015 at 7:1

How can you tell if result contains an error or a T value? – Fisherman 12/10, 2015 at 7:5

See extended version. – Greg 12/10, 2015 at 7:11

Why no virtual methods? – Orectic 12/10, 2015 at 13:40

Also, if your goal is to support whenAny(Future<T1> &, Future<T2>&...), none of this is necessary, you realize that? – Orectic 12/10, 2015 at 14:21

I'd like to support whenAny(FutureBase&, FutureBase& ...), and have already detailed the implementation goals. This is not productive. – Greg 12/10, 2015 at 15:20

I think the code you've written is guaranteed to be ok. EBCO is required since C++11, and unions are not allowed to have padding at the beginning. Given those two things, the address of the base class and the error must be identical. – Orectic 12/10, 2015 at 18:4

However, I still think this is an X-Y problem. You should write template <class ... Ts> whenAny(Ts ... ts). You will be able to do whatever you need to do faster if you keep the types around. You started off with a whenAny more like that (earlier in comments), and changed to type erased version. It seems like you are insisting on an implementation independent of whether an alternative meets your goals. – Orectic 12/10, 2015 at 18:4

Thank you for the opinion on conformance. Please expand into an answer if you'd like. – Greg 13/10, 2015 at 6:46

A templated whenAny would work only if the arg count is known at compile time. I'm not sure about the performance benefit (strict aliasing rules might help), though it will affect compile time and binary size. More importantly, what to do when the item count is determined at run-time? whenAny(std::vector<FutureBase*>) works for me. Without a base class, you'd need a wrapper for type erasure and this normally relies on dynamic dispatch. I'm not insisting on a particular implementation, just pointing out that none of the alternatives meet all 3 goals. – Greg 13/10, 2015 at 6:46

It is not possible (I mean literally impossible, without resorting to a mechanism equivalent in assembly to type erasure) without using type erasure. By the way, std::exception_ptr already uses type erasure, so you end up using type erasure twice, while using your own type erasure allows you to use it just once. See my answer; the answer from PiotrNycz is also very good. @ValentinMilea – Lollop 18/10, 2015 at 12:51

Unless you are trying to get rid of mHasError in your complete version, it sounds like you should move the discriminator (mHasError) to the base class, and use something similar to Result<error_type> when you have an error. – Deakin 18/10, 2015 at 13:11

In your real code you are holding a boolean discriminant in the derived class. Why not in the base? – Entwistle 18/10, 2015 at 14:31

std::vector<FutureBase*> this type, like any C++ type, has sizeof determined at compile time. However it is not guaranteed to work as a part of a union. – Entwistle 18/10, 2015 at 15:0

@n.m. The discriminant bool could be placed in the base class, but then the derived class would no longer be (formally) standard-layout. To be standard-layout, all non-static data members have to be declared in the same class. – Greg 18/10, 2015 at 15:33

@n.m. R must be standard-layout, so it can't be std::vector. – Greg 18/10, 2015 at 15:34

That's the point, if you put the bool in the base class, you might not need the standard layout anymore. You would use Result<Thing> for some things, Result<error_type> for others, and could store pointers to both as Base* and access the bool in that Base. – Deakin 19/10, 2015 at 5:34

Result<Thing> holds either Thing or error_type depending on run-time conditions. So it doesn't make sense in my case to have Result<error_type>. – Greg 19/10, 2015 at 6:14

I just realized there's no way to tell if there is an error, and therefore whether it's safe to call the error function, unless there's additional state outside of the union specifying the type. Since additional non-union members are required, then if error_type is small, simply move it outside of the union. (And if you solve this problem, then you also don't need standard layout.) – Impi 14/5 at 19:18

If you don't want additional state, then @Lollop is right: this is not possible without some form of type erasure. Your current implementations of error() (and has_error() in the "real" code) both incorrectly claim there's an error when there actually isn't, for exactly this reason. You must use virtual functions, or additional space, in order to tell if there's an error or not. Or rely on undefined behavior and make assumptions about the byte patterns of T vs std::exception_ptr. Those are the only solutions. – Impi 14/5 at 19:29

coliru.stacked-crooked.com/a/d4c9d6ee9d1d7789 is the best I could whip up in 30 minutes – Impi 14/5 at 20:9

It's worth noting that your solution, and similar solutions, may or may not actually be faster than virtual functions. Virtual functions are predictable branchless jumps. Your version and my version instead use a Boolean to keep track of whether it's an error or not, which means a value read, a comparison, and then a jump, with the exact same predictability as the virtual function. So the virtual functions are probably faster than our hacks. – Impi 14/5 at 20:14

"Virtual functions are slow" is true relative to non-virtual functions. But they are faster than if-checks. – Impi 14/5 at 20:16

Here is my own attempt at an answer focusing strictly on portability.

Standard-layout is defined in §9[class]/7:

A standard-layout class is a class that:

has no non-static data members of type non-standard-layout class (or array of such types) or reference,

has no virtual functions (10.3) and no virtual base classes (10.1),

has the same access control (Clause 11) for all non-static data members,

has no non-standard-layout base classes,

either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and

has no base classes of the same type as the first non-static data member.

By this definition Result<T> is standard-layout provided that:

Both error_type and T are standard-layout. Note that this is not guaranteed for std::exception_ptr, though likely in practice.
T is not ResultBase.
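These two conditions can be checked at compile time. The following is only a sketch: the trait name IsPortableResultValue and the sample types Plain and Split are made up for illustration, and the result for std::exception_ptr depends on the standard library in use.

```cpp
#include <cassert>
#include <exception>
#include <type_traits>

using error_type = std::exception_ptr;

struct ResultBase {};

// Hypothetical trait: true when the formal preconditions listed above hold for T.
template <class T>
struct IsPortableResultValue
    : std::integral_constant<bool,
          std::is_standard_layout<error_type>::value &&
          std::is_standard_layout<T>::value &&
          !std::is_same<typename std::remove_cv<T>::type, ResultBase>::value>
{
};

// Standard-layout: all data members live in one class with the same access.
struct Plain { int x; double y; };

// Not standard-layout: data members are split between base and derived class.
struct SplitBase { int a; };
struct Split : SplitBase { int b; };

// Usage sketch: the trait rejects non-standard-layout types and ResultBase itself.
inline void demoTrait()
{
    assert(!IsPortableResultValue<Split>::value);
    assert(!IsPortableResultValue<ResultBase>::value);
    // For a standard-layout T the answer hinges only on error_type.
    assert(IsPortableResultValue<Plain>::value ==
           std::is_standard_layout<error_type>::value);
}
```

A static_assert on such a trait inside Result<T> would turn a silent layout assumption into a compile error.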

§9.2[class.mem]/20 states that:

A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [ Note: There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment. —end note ]

This implies that empty base class optimization is mandatory for standard-layout types. Assuming Result<T> does have standard-layout, this in ResultBase is guaranteed to point at the first field in Result<T>.
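The address relationship can be observed with a small sketch. Base, Derived and sharesAddress are hypothetical names, not part of the original code, and a passing runtime check on one compiler is an observation rather than a proof of conformance.

```cpp
#include <cassert>
#include <type_traits>

struct Base
{
    // Address of the base subobject, as seen from inside the empty base.
    const void* self() const { return this; }
};

struct Derived : Base
{
    // All data members are declared in the most derived class.
    union { int first; double second; };
};

static_assert(std::is_standard_layout<Derived>::value,
              "the address guarantee applies only to standard-layout classes");

// True if the base subobject, the whole object and the first member share one address.
inline bool sharesAddress(const Derived& d)
{
    const void* whole = &d;
    const void* member = &d.first;
    return d.self() == whole && whole == member;
}

// Usage sketch.
inline void demoSharesAddress()
{
    Derived d;
    d.first = 1;
    assert(sharesAddress(d));
}
```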

§9.5[class.union]/1 states:

In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. [...] Each non-static data member is allocated as if it were the sole member of a struct.

And additionally §3.10[basic.lval]/10:

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined

the dynamic type of the object,

a cv-qualified version of the dynamic type of the object,

a type similar (as defined in 4.4) to the dynamic type of the object,

a type that is the signed or unsigned type corresponding to the dynamic type of the object,

a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,

an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union),

a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,

a char or unsigned char type.

This guarantees reinterpret_cast<const error_type*>(this) will yield a valid pointer to the mError field.

All controversy aside, this technique looks portable. Just keep formal limitations in mind: error_type and T must be standard-layout, and T may not be type ResultBase.

Side note: On most compilers (at least GCC, Clang and MSVC) non-standard-layout types will work as well. As long as Result<T> has predictable layout, error and result types are irrelevant.

Greg answered 22/10, 2015 at 9:10 Comment(0)

To answer the question: Is that portable?

No, it is not even possible.

Details:

This is not possible without at least type erasure (which does not need RTTI/dynamic_cast, but needs at least a virtual function). There are already working solutions for type erasure (Boost.Any).

The reason is the following:

You want to instantiate the class

Result<int> r;

Instantiating a template class means letting the compiler deduce the size of the member variables so it can allocate the object on the stack.

However in your implementation:

private:
union { error_type mError; T mValue; };

You have a variable error_type which it seems you want to use in a polymorphic way. However, if you fix the type at template instantiation you cannot change it later (a different type could have a different size! You could also force yourself to fix the size of the objects, but don't do that: it is ugly and hackish).

So you have 2 solutions: use virtual functions, or use error codes.

It could be possible to do what you want, but you cannot do that:

 Result<int> r;
 r.setError(...);

with the exact interface that you want.

There are many possible solutions as long as you allow virtual functions and error codes. Why exactly don't you want virtual functions here? If performance matters, keep in mind that the cost of "setting" an error is as much as setting a pointer to a virtual class (if you do not have errors you don't need to resolve the vtable, and in any case the vtable in templated code is likely to be optimized away most of the time).

Also if you don't want to "allocate" error codes, you can pre-allocate them.

You can do the following:

template< typename Rtype>
class Result{
     //... your detail here


    ~Result(){
         if(error)
             delete resultOrError.errorInstance;
         else
             delete resultOrError.resultValue;
    }

private:
    union {
        bool error;
        std::max_align_t mAligner;
    };
    union uif 
    { 
        Rtype               *          resultValue;
        PointerToVirtualErrorHandler  errorInstance;
    } resultOrError;
};

Here you have either 1 result type or 1 pointer to a virtual class with the desired error. You check the boolean to see whether you currently have an error or a result, and then you get the corresponding value from the union. The virtual cost is paid only if you have an error, while for a regular result you only pay the penalty of the boolean check.

Of course, in the above solution I used a pointer to the result because that allows a generic result. If you are interested in basic data type results, or POD structs with only basic data types, then you can avoid using a pointer for the result as well.

Note that in your case std::exception_ptr already does type erasure, but you lose some type info. To get the missing type info back you can implement something similar to std::exception_ptr yourself, but with enough virtual methods to allow safe casting to the proper exception type.

Lollop answered 18/10, 2015 at 12:7 Comment(5)

error_type is a fixed standard-layout type, only the result type varies and even R is restricted to being standard-layout. With these limitations in mind, I propose it may be possible to take advantage of common layout and avoid type erasure. I appreciate the attempted answer, but this is saying not possible universally, while the question is more limited in scope. – Greg 18/10, 2015 at 15:24

What I am saying is that a standard-layout type may have a varying size in bytes. Even if you restrict yourself to standard layout, the compiler has to know the size in advance to place it on the stack. Hence you are not only limiting yourself to standard layout, but to standard layout with a fixed size. Actually you do not have any check for a particular size. – Lollop 18/10, 2015 at 22:58

Size of derived objects is irrelevant because they are accessed by reference. See goal (3) example. – Greg 19/10, 2015 at 6:12

This is possible if the base class owns the buffer. It's just a touch fragile to guarantee that the buffer is big enough to hold the data required for every derived class. We only need type erasure for the derived-specific data. – Impi 14/5 at 19:11

I did something similar at coliru.stacked-crooked.com/a/090c0d4c5d0ab42a, and it can be improved now that I know about std::launder – Impi 14/5 at 19:14

There is a common mistake made by C++ programmers in believing that virtual functions cause higher CPU and memory usage. I call it a mistake even though I know that using virtual functions costs memory and CPU. But hand-written replacements for the virtual function mechanism are in most cases much worse.

You already said how to achieve the goal using virtual functions - just to repeat:

class ResultBase
{
public:
    virtual ~ResultBase() {}

    virtual bool hasError() const = 0;

    virtual std::exception_ptr error() const = 0;

protected:
    ResultBase() {}
};

And its implementation:

template <class T>
class Result : public ResultBase
{
public:
    Result(error_type error) { this->construct(error); }
    Result(T value) { this->construct(value); }

    ~Result(); // this does not change
    bool hasError() const override { return mHasError; }
    std::exception_ptr error() const override { return mData.mError; }

    void setError(error_type error); // similar to your original approach
    void setValue(T value); // similar to your original approach
private:
    bool mHasError;
    union Data
    {
        Data() {} // in this way you can use also Non-POD types
        ~Data() {}

        error_type mError;
        T mValue;
    } mData;

    void construct(error_type error)
    {
        mHasError = true;
        new (&mData.mError) error_type(error);
    }
    void construct(T value)
    {
        mHasError = false;
        new (&mData.mValue) T(value);
    }
};

Look at the full example here. As you can see there, the version with virtual functions is 3 times smaller and 7 (!) times faster, so not so bad...

Another benefit is that you might have a "cleaner" design and no "aliasing"/"aligning" problems.

If you really have some reason called compactness (I have no idea what it is), then with this very simple example you might implement virtual functions by hand (but why???!!!). Here you are:

class ResultBase;
struct ResultBaseVtable
{
    bool (*hasError)(const ResultBase&);
    error_type (*error)(const ResultBase&);
};

class ResultBase
{
public:
    bool hasError() const { return vtable->hasError(*this); }

    std::exception_ptr error() const { return vtable->error(*this); }

protected:
    ResultBase(ResultBaseVtable* vtable) : vtable(vtable) {}
private:
    ResultBaseVtable* vtable;
};

And the implementation is identical to the previous version, with the differences shown below:

template <class T>
class Result : public ResultBase
{
public:
    Result(error_type error) : ResultBase(&Result<T>::vtable)
    {
        this->construct(error);
    }
    Result(T value) : ResultBase(&Result<T>::vtable)
    {
        this->construct(value);
    }

private:
    static bool hasErrorVTable(const ResultBase& result)
    {
        return static_cast<const Result&>(result).hasError();
    }
    static error_type errorVTable(const ResultBase& result)
    {
        return static_cast<const Result&>(result).error();
    }
    static ResultBaseVtable vtable;
};

template <typename T>
ResultBaseVtable Result<T>::vtable{
    &Result<T>::hasErrorVTable, 
    &Result<T>::errorVTable,    
};

The above version is identical in CPU/memory usage with "virtual" implementation (surprise)...

Tenebrae answered 18/10, 2015 at 12:34 Comment(6)

Thanks for taking the time to answer, but the question is about correctness not design alternatives. That benchmark is ridiculous - a single iteration, and on ideone? In practice there should be no speed difference in your example because the types are known at compile time and easily optimized. clang -O3 with at 1000000 iterations shows almost identical execution time. – Greg 18/10, 2015 at 15:26

Regarding object size being larger than in polymorphic implementation, it's due to padding -- this depends on the max alignment supported for R, and if relevant data can be placed in base class to avoid padding, the non-polymorphic version will end up smaller. – Greg 18/10, 2015 at 15:26

Finally, I have no idea what implementing vtable by hand is supposed to buy me since the point was to avoid carrying vtable pointer. – Greg 18/10, 2015 at 15:26

Actually you are right, my answer is more of a comment on your question. To the good answer from @DarioOO (+1) I would add one additional drawback: you do not have a virtual destructor in your base class. It goes against many principles of OO; in practice you will not have proper deallocation, but I assume you already know that. And regarding performance: yes, you can achieve slightly better performance without virtual-ism, but in your case I really do not see any reason for that. – Tenebrae 18/10, 2015 at 15:55

Virtual destructor supports polymorphic delete -- irrelevant here, because destruction will always be done through concrete type. e.g: { Result<A> r1; Result<B> r2; anyHasError(r1, r2); } with bool anyHasError(ResultBase& r1, ResultBase& r2). – Greg 19/10, 2015 at 6:13

@ValentinMilea - then to be on the safe side you should prohibit new/delete on your type (e.g. operator delete ... = delete;). How can you be sure that no one will ever use your type in a way you did not want it to be used? – Tenebrae 19/10, 2015 at 8:39

union {
    error_type mError;
    T mValue;
};

Type T is not guaranteed to work with unions; for example, it could have a non-trivial constructor. Some info about unions and constructors: Initializing a union with a non-trivial constructor
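Since C++11 a union may hold a member with a non-trivial constructor or destructor, but the enclosing class then has to manage that member's lifetime by hand. A minimal sketch, assuming a hypothetical IntOrString class with an external flag that records the active member:

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

// Hypothetical holder for either an int or a std::string.
class IntOrString
{
public:
    explicit IntOrString(int v) : mIsString(false) { new (&mInt) int(v); }

    explicit IntOrString(std::string s) : mIsString(true)
    {
        // Placement new starts the lifetime of the non-trivial member.
        new (&mString) std::string(std::move(s));
    }

    ~IntOrString()
    {
        // Only the active member may be destroyed; int needs no cleanup.
        if (mIsString)
            mString.~basic_string();
    }

    // Copying is disabled to keep the sketch short.
    IntOrString(const IntOrString&) = delete;
    IntOrString& operator=(const IntOrString&) = delete;

    bool isString() const { return mIsString; }
    int number() const { return mInt; }              // valid only if !isString()
    const std::string& str() const { return mString; } // valid only if isString()

private:
    bool mIsString;
    union { int mInt; std::string mString; };
};

// Usage sketch.
inline void demoIntOrString()
{
    IntOrString a(7);
    assert(!a.isString() && a.number() == 7);

    IntOrString b(std::string("hello"));
    assert(b.isString() && b.str() == "hello");
}
```

The same pattern applies to a union of error_type and T: whoever switches the active member must run the matching destructor and placement new.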

Underplot answered 18/10, 2015 at 11:50 Comment(0)

Abstract base class, two implementations, for error and data, both with multiple inheritance, and use RTTI or an is_valid() member to tell which it is at runtime.

Jeddy answered 12/10, 2015 at 0:32 Comment(1)

This doesn't answer the question. Also, I cannot rely on RTTI / dynamic_cast. – Greg 12/10, 2015 at 6:57
