A colleague and I are trying to achieve a simple polymorphic class hierarchy. We're working on an embedded system and are restricted to using only a C compiler. We have a basic design idea that compiles without warnings (-Wall -Wextra -fstrict-aliasing -pedantic) and runs fine under gcc 4.8.1.
However, we are a bit worried about aliasing issues as we do not fully understand when this becomes a problem.
To demonstrate, we have written a toy example with an 'interface' IHello and two classes implementing this interface, 'Cat' and 'Dog'.
#include <stdio.h>

/* -------- IHello -------- */
struct IHello_;
typedef struct IHello_
{
    void (*SayHello)(const struct IHello_* self, const char* greeting);
} IHello;

/* Helper function */
void SayHello(const IHello* self, const char* greeting)
{
    self->SayHello(self, greeting);
}

/* -------- Cat -------- */
typedef struct Cat_
{
    IHello hello;
    const char* name;
    int age;
} Cat;

void Cat_SayHello(const IHello* self, const char* greeting)
{
    const Cat* cat = (const Cat*) self;
    printf("%s I am a cat! My name is %s and I am %d years old.\n",
           greeting,
           cat->name,
           cat->age);
}

Cat Cat_Create(const char* name, const int age)
{
    static const IHello catHello = { Cat_SayHello };
    Cat cat;
    cat.hello = catHello;
    cat.name = name;
    cat.age = age;
    return cat;
}

/* -------- Dog -------- */
typedef struct Dog_
{
    IHello hello;
    double weight;
    int age;
    const char* sound;
} Dog;

void Dog_SayHello(const IHello* self, const char* greeting)
{
    const Dog* dog = (const Dog*) self;
    printf("%s I am a dog! I can make this sound: %s I am %d years old and weigh %.1f kg.\n",
           greeting,
           dog->sound,
           dog->age,
           dog->weight);
}

Dog Dog_Create(const char* sound, const int age, const double weight)
{
    static const IHello dogHello = { Dog_SayHello };
    Dog dog;
    dog.hello = dogHello;
    dog.sound = sound;
    dog.age = age;
    dog.weight = weight;
    return dog;
}

/* Client code */
int main(void)
{
    const Cat cat = Cat_Create("Mittens", 5);
    const Dog dog = Dog_Create("Woof!", 4, 10.3);
    SayHello((IHello*) &cat, "Good day!");
    SayHello((IHello*) &dog, "Hi there!");
    return 0;
}
Output:
Good day! I am a cat! My name is Mittens and I am 5 years old.
Hi there! I am a dog! I can make this sound: Woof! I am 4 years old and weigh 10.3 kg.
We're pretty sure the 'upcast' from Cat and Dog to IHello is safe, since IHello is the first member of both these structs.
Our real concern is the 'downcast' from IHello to Cat and Dog respectively in the corresponding interface implementations of SayHello. Does this cause any strict aliasing issues? Is our code guaranteed to work by the C standard or are we simply lucky that this works with gcc?
Update
The solution that we eventually decide to use must be standard C and cannot rely on e.g. gcc extensions. The code must be able to compile and run on different processors using various (proprietary) compilers.
The intention with this 'pattern' is that client code shall receive pointers to IHello and thus only be able to call functions in the interface. However, these calls must behave differently depending on which implementation of IHello was received. In short, we want behaviour identical to the OOP concept of interfaces and classes implementing them.
We are aware of the fact that the code only works if the IHello interface struct is placed as the first member of the structs which implement the interface. This is a limitation that we are willing to accept.
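The first-member requirement could be checked by the compiler instead of being left as a convention. A minimal sketch, assuming a hypothetical macro `ASSERT_FIRST_MEMBER` and a hypothetical function `Cat_CheckLayout` (neither is part of our code). It uses only `offsetof` and the negative-array-size trick, so it should not depend on C11 `_Static_assert` or on compiler extensions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the typedef declares an array of size -1, which
 * does not compile, unless `member` sits at offset 0 of `type`. */
#define ASSERT_FIRST_MEMBER(type, member) \
    typedef char type##_##member##_must_be_first \
        [(offsetof(type, member) == 0) ? 1 : -1]

typedef struct IHello_
{
    void (*SayHello)(const struct IHello_* self, const char* greeting);
} IHello;

typedef struct Cat_
{
    IHello hello;      /* must stay first for the casts to be meaningful */
    const char* name;
    int age;
} Cat;

/* Compile-time check: moving `hello` below `name` breaks the build here. */
ASSERT_FIRST_MEMBER(Cat, hello);

/* Hypothetical run-time counterpart of the same check, e.g. for a unit test. */
void Cat_CheckLayout(void)
{
    assert(offsetof(Cat, hello) == 0);
}
```

One such line per implementing struct would turn an accidental reordering of members into a build error rather than a run-time surprise.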
According to "Does accessing the first field of a struct via a C cast violate strict aliasing?", the standard says in
§6.7.2.1/13:
Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
The aliasing rule reads as follows (§6.5/7):
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
- a type compatible with the effective type of the object,
- a qualified version of a type compatible with the effective type of the object,
- a type that is the signed or unsigned type corresponding to the effective type of the object,
- a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
- an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
- a character type.
According to the fifth bullet above and the fact that structures contain no padding at the top we are fairly sure that 'upcasting' a derived struct that implements the interface to a pointer to the interface is safe, i.e.
Cat cat;
const IHello* catPtr = (const IHello*) &cat; /* Upcast */

/* Inside client code */
void Greet(const IHello* interface, const char* greeting)
{
    /* Users do not need to know whether interface points to a Cat or Dog. */
    interface->SayHello(interface, greeting); /* Dereferencing should be safe */
}
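The upcast can also be written without any cast by taking the address of the embedded member, which lets the compiler check the types involved. A minimal sketch, assuming a hypothetical accessor `Cat_AsHello` that is not part of our code above:

```c
#include <assert.h>
#include <stdio.h>

typedef struct IHello_
{
    void (*SayHello)(const struct IHello_* self, const char* greeting);
} IHello;

typedef struct Cat_
{
    IHello hello;
    const char* name;
    int age;
} Cat;

void Cat_SayHello(const IHello* self, const char* greeting)
{
    const Cat* cat = (const Cat*) self;
    printf("%s I am a cat! My name is %s and I am %d years old.\n",
           greeting, cat->name, cat->age);
}

/* Hypothetical accessor: upcast by naming the member, no cast expression.
 * Passing anything other than a Cat pointer draws a compiler diagnostic. */
const IHello* Cat_AsHello(const Cat* cat)
{
    assert(cat != NULL);
    return &cat->hello;
}

/* Client code only ever sees the interface. */
void Greet(const IHello* interface, const char* greeting)
{
    interface->SayHello(interface, greeting);
}
```

Since `hello` is the first member, `Cat_AsHello(&cat)` yields the same address as `(const IHello*) &cat`; the difference is that an explicit cast accepts any pointer type silently, while the accessor does not.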
The big question is whether the 'downcast' used in the implementation of the interface function(s) is safe. As seen above:
void Cat_SayHello(const IHello* hello, const char* greeting)
{
    /* Is the following statement safe if we know for
     * a fact that hello points to a Cat?
     * Does it violate strict aliasing rules? */
    const Cat* cat = (const Cat*) hello;
    /* Access internal state in Cat */
}
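Our reading of the "and vice versa" in §6.7.2.1/13 is that a pointer to the initial member, suitably converted, points back to the enclosing structure. If that reading holds, the downcast could be confined to one small function per class, which gives a single place to document and test the assumption. A minimal sketch; `Cat_FromHello` and `Cat_AgeViaInterface` are hypothetical names, and the conversion is only meaningful when the argument really is the `hello` member of a `Cat` object:

```c
#include <assert.h>
#include <stddef.h>

typedef struct IHello_
{
    void (*SayHello)(const struct IHello_* self, const char* greeting);
} IHello;

typedef struct Cat_
{
    IHello hello;      /* initial member */
    const char* name;
    int age;
} Cat;

/* Hypothetical helper: the only place where the downcast is spelled out.
 * Precondition (not checkable here): hello is the hello member of a Cat. */
const Cat* Cat_FromHello(const IHello* hello)
{
    assert(hello != NULL);
    return (const Cat*) hello;
}

/* Hypothetical example of an implementation reading Cat state via the
 * interface pointer. The object accessed was declared as a Cat, and it is
 * read through a const-qualified Cat lvalue. */
int Cat_AgeViaInterface(const IHello* hello)
{
    return Cat_FromHello(hello)->age;
}
```

With a `Cat cat` whose `age` is 5, `Cat_FromHello(&cat.hello)` compares equal to `&cat`, and `Cat_AgeViaInterface(&cat.hello)` returns 5.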
Also note that changing the signature of the implementation functions to
void Cat_SayHello(const Cat* cat, const char* greeting);
void Dog_SayHello(const Dog* dog, const char* greeting);
and commenting out the 'downcast' also compiles and runs fine. However, this generates a compiler warning for function signature mismatch.
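Calling a function through a pointer whose type is not compatible with the function's actual type is undefined behaviour in C, so casting the function pointer to silence the warning would not be a safe fix. One alternative that keeps the typed signature is to store a thin adapter with the exact interface signature and let it forward to the typed function. A minimal sketch, assuming the hypothetical names `Cat_SayHelloImpl` and `Cat_SayHelloAdapter`:

```c
#include <assert.h>
#include <stdio.h>

typedef struct IHello_
{
    void (*SayHello)(const struct IHello_* self, const char* greeting);
} IHello;

typedef struct Cat_
{
    IHello hello;
    const char* name;
    int age;
} Cat;

/* Typed implementation (hypothetical name): no cast in the body. */
void Cat_SayHelloImpl(const Cat* cat, const char* greeting)
{
    printf("%s I am a cat! My name is %s and I am %d years old.\n",
           greeting, cat->name, cat->age);
}

/* Adapter (hypothetical name) with the exact interface signature.
 * It is the only place that performs the downcast. */
void Cat_SayHelloAdapter(const IHello* self, const char* greeting)
{
    assert(self != NULL);
    Cat_SayHelloImpl((const Cat*) self, greeting);
}

Cat Cat_Create(const char* name, int age)
{
    Cat cat;
    cat.hello.SayHello = Cat_SayHelloAdapter; /* types match, no warning */
    cat.name = name;
    cat.age = age;
    return cat;
}
```

The function pointer stored in the interface then has exactly the declared type, and the downcast question is reduced to the single cast inside the adapter.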
cat.SayHello(&cat, "Good day!"); – Unicellular
void Cat_SayHello(Cat* cat) and void Dog_SayHello(Dog* dog) however this is not sufficient. The idea here is that the client code only receives an IHello* and can then only call functions implemented by this interface but the result will differ depending on which implementation of IHello that was passed to the client code. – Labia
Kitten "class" with SayHello pointing to Kitten_SayHello(), invoking Greet((Cat *)kitten, greeting); will invoke Kitten_SayHello(), not Cat_SayHello() as intended by the programmer. You'd need a helper function-like macro that mutates the type information temporarily to make it work that way. That could obviously get ugly very easily. E.g. IHello->Cat->Kitten means casting Kitten to Cat would involve manipulating kitten->cat.hello.SayHello, rather than a simple cast that allows Cat_SayHello(). – Extramural
SayHello(&cat.hello, "Hi there!"); (compiler check) instead of SayHello((IHello*) &cat, "Good day!"); (no check at all) would be nice – Footton