Wrapping C++ in C: Derived to base conversions

4

5

I am wrapping a simple C++ inheritance hierarchy into "object-oriented" C. I'm trying to figure out if there are any gotchas in treating the pointers to C++ objects as pointers to opaque C structs. In particular, under what circumstances would the derived-to-base conversion cause problems?

The classes themselves are relatively complex, but the hierarchy is shallow and uses single-inheritance only:

// A base class with lots of important shared functionality
class Base {
    public:
    virtual void someOperation();
    // More operations...

    private:
    // Data...
};

// One of several derived classes
class FirstDerived: public Base {
    public:
    virtual void someOperation();
    // More operations...

    private:
    // More data...
};

// More derived classes of Base..

I am planning on exposing this to C clients via the following, fairly standard object-oriented C:

// Opaque pointer types
typedef struct base_t base_t;
typedef struct first_derived_t first_derived_t;

void base_some_operation(base_t* object) {
     Base* base = (Base*) object;
     base->someOperation();
}

first_derived_t* first_derived_create() {
     return (first_derived_t*) new FirstDerived();
}

void first_derived_destroy(first_derived_t* object) {
     FirstDerived* firstDerived = (FirstDerived*) object;
     delete firstDerived;
}

The C clients only pass around pointers to the underlying C++ objects and can only manipulate them via function calls. So the client can finally do something like:

first_derived_t* object = first_derived_create();
base_some_operation((base_t*) object); // Note the derived-to-base cast here
...

and have the virtual call to FirstDerived::someOperation() succeed as expected.

These classes are not standard-layout but do not use multiple or virtual inheritance. Is this guaranteed to work?

Note that I have control over all the code (C++ and the C wrapper), if that matters.

Flowerless answered 6/2, 2012 at 22:50 Comment(1)
Just use void* everywhere as your opaque type.Nuncle
A
4
// Opaque pointer types
typedef struct base_t base_t;
typedef struct first_derived_t first_derived_t;

// **********************//
// inside C++ stub only. //
// **********************//

// Ensures you always cast to Base* first, then to void*,
// then to the stub type pointer.  This enforces that you'll
// get a consistent address in the presence of inheritance.
template<typename T>
T * get_stub_pointer ( Base * object )
{
     return reinterpret_cast<T*>(static_cast<void*>(object));
}

// Recover (intermediate) Base* pointer from stub type.
Base * get_base_pointer ( void * object )
{
     return reinterpret_cast<Base*>(object);
}

// Get the derived type pointer, validating that it's actually
// the right type.  Returns a null pointer if the type is
// invalid.  This ensures you can detect invalid use of
// the stub functions.
template<typename T>
T * get_derived_pointer ( void * object )
{
    return dynamic_cast<T*>(get_base_pointer(object));
}

// ***********************************//
// public C exports (stub interface). //
// ***********************************//

void base_some_operation(base_t* object)
{
     Base* base = get_base_pointer(object);
     base->someOperation();
}

first_derived_t* first_derived_create()
{
     return get_stub_pointer<first_derived_t>(new FirstDerived());
}

void first_derived_destroy(first_derived_t* object)
{
     FirstDerived * derived = get_derived_pointer<FirstDerived>(object);
     assert(derived != 0);

     delete derived;
}

This means that you can always perform a cast such as the following.

first_derived_t* object = first_derived_create();
base_some_operation((base_t*) object);

This is safe because the base_t* pointer will be cast to void*, then to Base*. This is one step less than what happened before. Notice the order:

  1. FirstDerived*
  2. Base* (via implicit static_cast<Base*>)
  3. void* (via static_cast<void*>)
  4. first_derived_t* (via reinterpret_cast<first_derived_t*>)
  5. base_t* (via (base_t*), which in C++ behaves like reinterpret_cast<base_t*>)
  6. void* (via implicit static_cast<void*>)
  7. Base* (via reinterpret_cast<Base*>)

For calls that wrap a FirstDerived method, you get an extra cast:

  8. FirstDerived* (via dynamic_cast<FirstDerived*>)
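
To make the chain above concrete, here is a minimal sketch of the same idea with a second derived class added, so you can see the dynamic_cast reject a handle of the wrong type. Everything in it is a stand-in: the class bodies, SecondDerived, to_handle/from_handle and the demo_* exports are hypothetical names, not part of the code in the question.

```cpp
#include <cassert>

// Hypothetical stand-ins for the classes in the question.
struct Base {
    virtual ~Base() {}
    virtual int someOperation() { return 1; }
};
struct FirstDerived : Base {
    int someOperation() override { return 2; }
};
struct SecondDerived : Base {
    int someOperation() override { return 3; }
};

// Opaque handle types, as a C client would see them.
typedef struct base_t base_t;
typedef struct first_derived_t first_derived_t;
typedef struct second_derived_t second_derived_t;

// Handles always carry the address of the Base subobject.
template <typename T>
T* to_handle(Base* object) {
    return reinterpret_cast<T*>(static_cast<void*>(object));
}

// Recover the Base* that was stored in the handle.
Base* from_handle(void* handle) {
    return static_cast<Base*>(handle);
}

extern "C" first_derived_t* demo_first_create() {
    return to_handle<first_derived_t>(new FirstDerived());
}

extern "C" second_derived_t* demo_second_create() {
    return to_handle<second_derived_t>(new SecondDerived());
}

// Virtual dispatch works for any handle in the hierarchy.
extern "C" int demo_base_operation(base_t* handle) {
    return from_handle(handle)->someOperation();
}

// Returns 1 if the handle refers to a FirstDerived, 0 otherwise.
extern "C" int demo_is_first_derived(base_t* handle) {
    return dynamic_cast<FirstDerived*>(from_handle(handle)) != nullptr ? 1 : 0;
}

// Deleting through Base* relies on the virtual destructor.
extern "C" void demo_destroy(base_t* handle) {
    delete from_handle(handle);
}
```

A client that casts a second_derived_t* to base_t* and hands it to demo_is_first_derived gets 0 back instead of undefined behaviour, which is the point of routing every handle through Base*.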
Artichoke answered 7/2, 2012 at 0:52 Comment(1)
Nice, this seems to be both safe and clean, even in the face of future maintenance programmers who might not grok the whole end-to-end situation. Thanks!Flowerless
N
4

You can certainly make a C interface to some C++ code. All you need is extern "C", and I recommend a void * as your opaque data type:

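// library.h, for C clients

typedef void * Handle;

// extern "C" is only understood by a C++ compiler, so guard it
#ifdef __cplusplus
extern "C" {
#endif

Handle create_foo(void);
void destroy_foo(Handle);
int magic_foo(Handle, char const *);

#ifdef __cplusplus
}
#endif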

Then implement it in C++:

#include <cerrno>   // errno
#include "library.h"
#include "foo.hpp"

Handle create_foo()
{
    Foo * p = nullptr;

    try { p = new Foo; }
    catch (...) { p = nullptr; }

    return p;
}

void destroy_foo(Handle p)
{
    delete static_cast<Foo*>(p);
}

int magic_foo(Handle p, char const * s)
{
    Foo * const f = static_cast<Foo*>(p);

    try
    {
        f->prepare();
        return f->count_utf8_chars(s);
    }
    catch (...)
    {
        errno = E_FOO_BAR;  // set the error code before returning
        return -1;
    }
}

Remember never to allow any exceptions to propagate through a calling C function!
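
If you have many exported functions, repeating the try/catch in each one gets tedious. One possible way to centralise it is sketched below; guard_call, parse_positive and demo_parse_positive are hypothetical names made up for this example, and the choice of errno values is only an assumption about how you might want to report failures.

```cpp
#include <cerrno>
#include <new>
#include <stdexcept>

// Hypothetical helper: runs `body` and converts any exception into
// `fallback`, setting errno so a C caller can tell that something failed.
template <typename Result, typename Body>
Result guard_call(Result fallback, int error_code, Body body) noexcept {
    try {
        return body();
    } catch (const std::bad_alloc&) {
        errno = ENOMEM;       // out of memory gets its own code
    } catch (...) {
        errno = error_code;   // everything else maps to the caller's code
    }
    return fallback;
}

// Hypothetical stand-in for a C++ function that may throw.
int parse_positive(int value) {
    if (value <= 0) throw std::invalid_argument("value must be positive");
    return value * 2;
}

// The exported function never lets an exception escape into C.
extern "C" int demo_parse_positive(int value) {
    return guard_call<int>(-1, EINVAL, [value] { return parse_positive(value); });
}
```

The exported function stays a one-liner, and the exception policy lives in a single place.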

Nuncle answered 6/2, 2012 at 23:8 Comment(12)
I've done this before, and using void* is really problematic for type safety reasons on the client side (it's easy to pass anything as that argument). Consider the Windows API, where file and mutex objects are both identified by type HANDLE, meaning you can pass a file to ReleaseMutex(). You can use an incomplete struct for each distinct type that should appear on the client side. For example, struct foo_t; typedef struct foo_t* foo_t;. Then, use foo_t create_foo(); and void destroy_foo(foo_t);.Krissykrista
It adds a few nasty casts inside the C++-to-C stub implementation, but at least you get a "type safe" interface on the client side.Krissykrista
@Kerrek SB: Thanks for the answer, but this is simpler than my case. I'm worried about the cast from FirstDerived* to Base* via a C-style cast. For example, if there was multiple inheritance involved, we'd be in trouble, no? I'm looking for other gotchas...Flowerless
@Adrian, I think you might want to consider updating your question to make this issue about Derived-to-Base conversions more explicit. This current answer doesn't involve any inheritance. I think there is no really safe way to allow any conversions in the C code; you might have to implement many functions such as Base * getBaseFromDerived1(Derived1 *) and Base * getBaseFromDerived2(Derived2 *) and so on.Diplomacy
@Adrian: Very simple: You must always cast the exact same thing back that you originally cast to void *. It really doesn't matter if and how the C++ classes are related; you must simply get the original type back, right on the nose.Nuncle
Indeed, you might need to static_cast<void*>(static_cast<Base*>(derived)) if necessary to ensure that the exact address is received.Krissykrista
@AndréCaron, yes, we can use those casts in the C++ code, but the question is about the casts in the C code. As far as I can see, we won't be able to safely allow any casts in the C code.Diplomacy
@AaronMcDaid: yes you can. Always cast all pointers to Base*. In stubs for derived class functions, use a static_cast back to Derived* before calling the real function. You can even use a dynamic_cast to make sure the client has not attempted something foolish.Krissykrista
@AndréCaron, maybe I'm stating the obvious, but the C code itself obviously cannot have static_cast in it. So, we're not wondering how to write the C++ code, we're trying to come up with a good interface to make available to the C code. The challenge is that in the C code there may be occasions when there are two variables of two different 'handle' types (derived1_t and base_t) which refer to the same object. We have to be very careful if the C code has any casts in it between these two handle types.Diplomacy
@AaronMcDaid: Re-read my last comment. Doing what I suggested allows you to write the casts in the C code as in the example in your question. Because you always cast to Base* before casting to the dummy C pointer type, casts to base classes are "safe" on the C side of things.Krissykrista
@AndréCaron, I see what you mean. Your idea is correct. Apologies. I think I dislike the idea of having a Derived* mean something different in the C code than it would mean in the C++ code. But that's just a style thing. My worry is that you might compile the C code with the C++ compiler and suddenly the (Base*) derived1_pointer casts break down.Diplomacy
@AaronMcDaid: I added my idea as an answer. You can check that out for possible flaws.Krissykrista
P
1

This is the approach I've used in the past (perhaps as implied by Aaron's comment). Note that the same type names are used in both C and C++. Casts are all done in C++; this naturally represents good encapsulation irrespective of questions of legality. [Obviously you need delete methods as well.] Note that to call someOperation() with a Derived*, an explicit upcast to Base* is required. If Derived does not provide any new methods such as someOtherOperation, then you do not need to expose Derived* to clients, and avoid the client side casts.

Header file:"BaseDerived.H"

#ifdef __cplusplus
extern "C"
{
#endif
    typedef struct Base Base;
    typedef struct Derived Derived;

    Derived* createDerived();
    Base* createBase();
    Base* upcastToBase(Derived* derived);
    Derived* tryDownCasttoDerived(Base* base);
    void someOperation(Base* base);
    void someOtherOperation(Derived* derived);
#ifdef __cplusplus
}
#endif

Implementation: "BaseDerived.CPP"

#include "BaseDerived.H"
#include <iostream>  // std::cout, std::endl
struct Base 
{
    virtual void someOperation()
    {
        std::cout << "Base" << std::endl;
    }
};
struct Derived : public Base
{
public:
    virtual void someOperation()
    {
        std::cout << "Derived" << std::endl;
    }
private:
};

Derived* createDerived()
{
    return new Derived;
}

Base* createBase()
{
    return new Base;
}

Base* upcastToBase(Derived* derived)
{
    return derived;
}

Derived* tryDownCasttoDerived(Base* base)
{
    return dynamic_cast<Derived*>(base);
}

void someOperation(Base* base)
{
    base->someOperation();
}

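// Named to match the declaration in the header; extern "C"
// functions cannot be overloaded.
void someOtherOperation(Derived* derived)
{
    derived->someOperation();
}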
Psychedelic answered 7/2, 2012 at 0:35 Comment(0)
D
0

I think these two lines are the nub of the question:

first_derived_t* object = first_derived_create();
base_some_operation((base_t*) object); // Note the derived-to-base cast here
...

There is no really safe way to allow this in the C code. In C, such a cast never changes the raw integral value of the pointer, but sometimes C++ casts will do so and therefore you need a design that never has any casts within the C code.
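
Here is a small sketch of the kind of pointer adjustment I mean. It uses multiple inheritance, which the question's hierarchy does not have, purely to make the effect visible; Left, Right, Both and the helper functions are hypothetical names. The standard does not promise a particular layout, so treat the address difference as what mainstream compilers do rather than as a guarantee.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical hierarchy, not the one from the question.
struct Left  { virtual ~Left() {}  int left_data = 1; };
struct Right { virtual ~Right() {} int right_data = 2; };
struct Both : Left, Right {};

// The numeric value of a pointer, as a C client would see it.
std::uintptr_t raw_address(void* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// True when the C++ upcast to Right yields a different address
// than the original Both*.
bool upcast_adjusts_pointer() {
    Both object;
    Both* derived = &object;
    Right* right = derived;  // implicit upcast, the compiler may add an offset
    return raw_address(right) != raw_address(derived);
}

// True when reinterpreting the original address as a Right*
// misses the real Right subobject.
bool reinterpret_misses_subobject() {
    Both object;
    Right* correct = &object;                       // proper upcast
    void* as_handle = &object;                      // what a C client would hold
    Right* wrong = static_cast<Right*>(as_handle);  // same bits, no adjustment
    return wrong != correct;
}
```

A cast written in C can only ever do what reinterpret_misses_subobject does, so the C++ side has to settle on one subobject address before the pointer is handed out.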

Here is one (overly complex?) solution. First, decide on a policy that the C code will always strictly deal with a value which is effectively a Base* - this is a somewhat arbitrary policy to ensure consistency. This means that the C++ code will sometimes have to do a dynamic_cast; we'll come to that later.

(You can make the design work correctly with C code simply by using casts, as has been mentioned by others. But I would be worried that the compiler will allow all sorts of crazy casts, such as (Derived1*) derived2_ptr or even casts to types in a different class hierarchy. My goal here is to enforce the proper object-oriented is-a relationship within the C code.)

Then, the C handle classes could be something like

struct base_t_ptr {
    void * this_; // holds the Base pointer
};
typedef struct {
    struct base_t_ptr get_base;
} derived_t_ptr;

This should make it easy to use something like casts in a concise and safe way: Note how we pass in object.get_base in this code:

first_derived_t_ptr object = first_derived_create();
base_some_operation(object.get_base);

where the declaration of base_some_operation is

extern "C" void base_some_operation(struct base_t_ptr);

This will be quite type safe, as you won't be able to pass a derived1_t_ptr to this function without going via the .get_base data member. It will also help your C code to know a little about the types and which conversions are valid - you don't want to accidentally convert Derived1 to Derived2.

Then, when implementing the non-virtual methods defined only in a derived class, you'll need something like:

extern "C" void derived1_nonvirtual_operation(struct derived1_t_ptr); // The C-style interface. Type safe.

void derived1_nonvirtual_operation(struct derived1_t_ptr d) {
    // we *know* this refers to a Derived1 type, so we can trust these casts:
    Base * bp = reinterpret_cast<Base*>(d.get_base.this_);
    Derived1 *this_ = dynamic_cast<Derived1*>(bp);
    this_ -> some_operation();
}
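
Putting the pieces together, a complete version of this design might look roughly like the sketch below. The class bodies, derived1_create, base_destroy and the int return values are assumptions I've added so that the example can be compiled and checked; only the struct layout and the get_base idea come from the description above.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real classes.
struct Base {
    virtual ~Base() {}
    virtual int some_operation() { return 1; }
};
struct Derived1 : Base {
    int some_operation() override { return 2; }
    int nonvirtual_operation() { return 20; }
};

extern "C" {

// Handle types visible to C. this_ always holds a Base*.
struct base_t_ptr { void* this_; };
struct derived1_t_ptr { struct base_t_ptr get_base; };

struct derived1_t_ptr derived1_create(void) {
    Base* base = new Derived1();  // convert to Base* before erasing the type
    struct derived1_t_ptr handle = { { static_cast<void*>(base) } };
    return handle;
}

int base_some_operation(struct base_t_ptr b) {
    return static_cast<Base*>(b.this_)->some_operation();
}

int derived1_nonvirtual_operation(struct derived1_t_ptr d) {
    Base* bp = static_cast<Base*>(d.get_base.this_);
    Derived1* self = dynamic_cast<Derived1*>(bp);
    return self ? self->nonvirtual_operation() : -1;  // -1 flags a bad handle
}

void base_destroy(struct base_t_ptr b) {
    delete static_cast<Base*>(b.this_);  // relies on the virtual destructor
}

}  // extern "C"
```

The C client then writes base_some_operation(object.get_base) and never needs a cast, so the compiler rejects an attempt to pass a derived1_t_ptr where a different handle type is expected.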
Diplomacy answered 7/2, 2012 at 0:32 Comment(0)
