Complete encapsulation without malloc

Asked 27/8, 2014 at 23:6 Answered 25/6, 2015 at 23:44

c c99 c11 pimpl-idiom variable-length-array

I was experimenting with C11 and VLAs, trying to declare a struct variable on the stack with only an incomplete declaration. The objective is to provide a mechanism to create a variable of some struct type without showing the internals (like the PIMPL idiom) but without the need to create the variable on the heap and return a pointer to it. Also, if the struct layout changes, I don't want to recompile every file that uses the struct.

I have managed to program the following:

private.h:

#ifndef PRIVATE_H_
#define PRIVATE_H_

typedef struct A{
    int value;
}A;

#endif /* PRIVATE_H_ */

public.h:

#ifndef PUBLIC_H_
#define PUBLIC_H_

#include <stddef.h> /* for size_t */

typedef struct A A;

size_t A_getSizeOf(void);

void A_setValue(A * a, int value);

void A_printValue(A * a);

#endif /* PUBLIC_H_ */

implementation.c:

#include "private.h"
#include <stdio.h>

size_t A_getSizeOf(void)
{
    return sizeof(A);
}

void A_setValue(A * a, int value)
{
    a->value = value;
}

void A_printValue(A * a)
{
    printf("%d\n", a->value);
}

main.c:

#include <stdalign.h>
#include <stddef.h>

#include "public.h"

#define createOnStack(type, variable) \
    alignas(max_align_t) char variable ## _stack[type ## _getSizeOf()]; \
    type * variable = (type *)&variable ## _stack

int main(int argc, char *argv[]) {
    createOnStack(A, var);

    A_setValue(var, 5335);
    A_printValue(var);
}

I have tested this code and it seems to work. However I'm not sure if I'm overlooking something (like aliasing, alignment or something like that) that could be dangerous or unportable, or could hurt performance. Also I want to know if there are better (portable) solutions to this problem in C.

Andino answered 27/8, 2014 at 23:6 Comment(3)

You cannot sensibly do this without recompiling when the struct layout changes as sizeof will be optimized to a compile time constant if you use a VLA or alloca – Paleoecology 28/8, 2014 at 0:32

@Vality: look again at the code - that would be a link-time optimization; Mabus' reasoning should be sound – Gallfly 28/8, 2014 at 0:34

@Andino thanks for posting this. I hadn't thought of using a VLA of chars to provide storage for other types. – Brutus 25/6, 2015 at 23:50

This of course violates the effective typing rules (aka strict aliasing) because the C language does not allow an object of type char [] to be accessed through a pointer that does not have that type (or a compatible one).

You could disable strict aliasing analysis via compiler flags like -fno-strict-aliasing or attributes like

#ifdef __GNUC__
#define MAY_ALIAS __attribute__((__may_alias__))
#else
#define MAY_ALIAS
#endif

(thanks go to R.. for pointing out the latter), but even if you do not do so, in practice everything should work just fine as long as you only ever use the variable's proper name to initialize the typed pointer.
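As a rough sketch of the attribute route, the macro can be attached to the struct definition itself (the part that would live in private.h). The names Foo and foo_roundtrip are hypothetical, and this assumes GCC or Clang; on other compilers the macro expands to nothing and the access below remains formally undefined.

```c
#include <stddef.h>

#ifdef __GNUC__
#define MAY_ALIAS __attribute__((__may_alias__))
#else
#define MAY_ALIAS /* no effect: the aliasing caveat still applies here */
#endif

/* Hypothetical type; accesses through pointers to it are exempt from
   type-based alias analysis on GNU-compatible compilers. */
struct MAY_ALIAS Foo { int value; };
typedef struct Foo Foo;

/* Stores and reloads a value through a Foo* that points into char storage. */
int foo_roundtrip(int v)
{
    _Alignas(max_align_t) char buffer[sizeof(Foo)];
    Foo *foo = (void *)buffer;

    foo->value = v;
    return foo->value;
}
```

Only the translation unit that actually dereferences the pointer needs to see the attributed definition; client code that handles an incomplete Foo never accesses the storage itself.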

Personally, I'd simplify your declarations to something along the lines of

#define stackbuffer(NAME, SIZE) \
    _Alignas (max_align_t) char NAME[SIZE]

typedef struct Foo Foo;
extern const size_t SIZEOF_FOO;

stackbuffer(buffer, SIZEOF_FOO);
Foo *foo = (void *)buffer;

The alternative would be using the non-standard alloca(), but that 'function' comes with its own set of issues.
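For comparison, a minimal sketch of the alloca() variant, assuming a platform that ships <alloca.h> (glibc does; this is not standard C). Foo_sizeof, Foo_set, Foo_get and alloca_demo are hypothetical names, and the library and client halves are merged into one file only to keep the example self-contained.

```c
#include <alloca.h> /* non-standard header; availability is an assumption */
#include <stddef.h>

/* --- library side: the only code that knows the layout --- */
struct Foo { int value; };
size_t Foo_sizeof(void) { return sizeof(struct Foo); }
void Foo_set(struct Foo *f, int v) { f->value = v; }
int Foo_get(const struct Foo *f) { return f->value; }

/* --- client side: only needs the size function and the accessors --- */
int alloca_demo(void)
{
    /* The storage is released when alloca_demo returns, so the pointer
       must not escape this function. */
    struct Foo *foo = alloca(Foo_sizeof());

    Foo_set(foo, 42);
    return Foo_get(foo);
}
```

Among the issues alluded to: alloca() has no way to report failure, the memory disappears when the calling function returns (so it cannot be wrapped in a helper function, only in a macro), and calling it in a loop keeps growing the stack frame.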

Gallfly answered 28/8, 2014 at 0:24 Comment(12)

If you're willing to assume a GNU-C-compatible compiler with -fno-strict-aliasing, then rather than ruin optimization of the whole program with that flag, you should put __attribute__((__may_alias__)) on the type Foo. This should achieve the results just for the one type. – Demur 28/8, 2014 at 2:0

Why it violates strict aliasing? I thought that a pointer to char can alias with any other pointer without problems. – Andino 28/8, 2014 at 6:54

@Mabus: a pointer to char may alias anything, but this is the opposite case - a pointer to Foo aliasing a character array; it's better to think in terms of effective types: the memory block buffer has effective type char[], but you're accessing it as Foo – Gallfly 28/8, 2014 at 7:0

@Gallfly I thought that aliasing was commutative. But even if that's not the case, as I'm only going to access the memory through the pointer, can I ignore aliasing? – Andino 28/8, 2014 at 7:4

@Mabus: aliasing is symmetric, effective typing is not - the rules are that all objects (think memory locations) have a real type, and accessing them through an expression with incompatible type is UB; anyway, if you only use buffer to initialize the Foo* and not to read or modify the data, you should be fine – Gallfly 28/8, 2014 at 7:8

It appears OP wants to extend this to use VLA such that A_getSizeOf(void) would not always return the same value in a given executable. This extension appears to not work with the alternative idea of a global extern const size_t SIZEOF_FOO;. BTW: nice Q & A. – Modular 29/8, 2014 at 14:43

@chux: as I understood it, he only wants compatibility at link-time when releasing a new version of his code; a given version only ever comes with a single definition of struct Foo and thus a constant return value of A_getSizeOf(void) – Gallfly 29/8, 2014 at 15:59

"a given version only ever comes with a single definition of struct Foo" is true, but it is an incomplete definition of struct Foo that does not change with new releases. typedef struct A A; is the global part and only pointers to type A are used globally. The not-"implementation.c" code never sees the inner workings of A nor sizeof(A). The size of A is hidden and could be dynamic, hence the global function A_getSizeOf(). OP's scheme looks intriguing as it appears OP can get away with it. – Modular 29/8, 2014 at 16:29

A_getSizeOf(void) must return the same value in the same executable if implementation.c is statically linked (but in that case I expect that I can change the implementation, recompile only the implementation and relink). However, if implementation.c is part of a shared library, I think that changing the implementation won't break the ABI (I haven't tested that yet). AFAIK the same would happen if A_getSizeOf() were an external variable defined in implementation.c, but I'm not 100% sure. – Andino 29/8, 2014 at 17:16

In fact, as @chux says, using a function is more flexible, and allows the size to change, but that wasn't planned and I can't imagine a possible use of that (maybe a daemon that can be updated on the fly, with some locking for changing libraries? I don't know if that's possible XD). – Andino 29/8, 2014 at 17:22

@Andino I now see your "VLA" discussion was only with variable ## _stack[type ## _getSizeOf()]. I was thinking you were prepping for a VLA also in typedef struct A{ int value; char varray[0] }A;. – Modular 29/8, 2014 at 19:23

That's a "flexible array member", not a true VLA. – Andino 29/8, 2014 at 20:9
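To make the distinction in the last two comments concrete, here is a small sketch contrasting the two features. The names packet, vla_bytes and fam_header_bytes are made up for illustration; the standard C99 spelling of a flexible array member is data[] rather than the GNU zero-length form [0].

```c
#include <stddef.h>

/* Flexible array member: must be the last member and has no size of its own;
   the elements only exist if the allocation is made large enough for them. */
struct packet {
    size_t len;
    unsigned char data[];
};

/* Variable length array: an automatic array whose size is computed at run time.
   n must be greater than zero. */
size_t vla_bytes(size_t n)
{
    char scratch[n];
    return sizeof scratch; /* sizeof is evaluated at run time for a VLA */
}

/* sizeof on the struct ignores data[], apart from possible trailing padding. */
size_t fam_header_bytes(void)
{
    return sizeof(struct packet);
}
```

The question's createOnStack macro relies on the second feature: the char array it declares is sized by a function call, which is what makes it a VLA.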

I am considering adopting a strategy similar to the following to solve essentially the same problem. Perhaps it will be of interest despite being a year late.

I wish to prevent clients of a struct from accessing the fields directly, in order to make it easier to reason about their state and easier to write reliable design contracts. I'd also prefer to avoid allocating small structures on the heap. But I can't afford a C11 public interface - much of the joy of C is that almost any code knows how to talk to C89.

To that end, consider the adequate application code:

#include "opaque.h"
int main(void)
{
  opaque on_the_stack = create_opaque(42,3.14); // constructor
  print_opaque(&on_the_stack);
  delete_opaque(&on_the_stack); // destructor
  return 0;
}

The opaque header is fairly nasty, but not completely absurd. Providing both create and delete functions is mostly for the sake of consistency with structs where calling the destructor actually matters.

/* opaque.h */
#ifndef OPAQUE_H
#define OPAQUE_H

/* max_align_t is not reliably available in stddef, esp. in c89 */
typedef union
{
  int foo;
  long long _longlong;
  unsigned long long _ulonglong;
  double _double;
  void * _voidptr;
  void (*_voidfuncptr)(void);
  /* I believe the above types are sufficient */
} alignment_hack;

#define sizeof_opaque 16 /* Tedious to keep up to date */
typedef struct
{
  union
  {
    char state [sizeof_opaque];
    alignment_hack hack;
  } private;
} opaque;
#undef sizeof_opaque /* minimise the scope of the macro */

void print_opaque(opaque * o);
opaque create_opaque(int foo, double bar);
void delete_opaque(opaque *);
#endif

Finally an implementation, which is welcome to use C11 as it's not the interface. _Static_assert(alignof...) is particularly reassuring. Several layers of static functions are used to indicate the obvious refinement of generating the wrap/unwrap layers. Pretty much the entire mess is amenable to code gen.

#include "opaque.h"

#include <stdalign.h>
#include <stdio.h>

typedef struct
{
  int foo;
  double bar;
} opaque_impl;

/* Zero tolerance approach to letting the sizes drift */
_Static_assert(sizeof (opaque) == sizeof (opaque_impl), "Opaque size incorrect");
_Static_assert(alignof (opaque) == alignof (opaque_impl), "Opaque alignment incorrect");

static void print_opaque_impl(opaque_impl *o)
{
  printf("Foo = %d and Bar = %g\n",o->foo,o->bar);
}

static void create_opaque_impl(opaque_impl * o, int foo, double bar)
{
  o->foo = foo;
  o->bar = bar;
}

static void create_opaque_hack(opaque * o, int foo, double bar)
{
   opaque_impl * ptr = (opaque_impl*)o;
   create_opaque_impl(ptr,foo,bar);
}

static void delete_opaque_impl(opaque_impl *o)
{
  o->foo = 0;
  o->bar = 0;
}

static void delete_opaque_hack(opaque * o)
{
   opaque_impl * ptr = (opaque_impl*)o;
   delete_opaque_impl(ptr);
}

void print_opaque(opaque * o)
{
  print_opaque_impl((opaque_impl*)o);
}

opaque create_opaque(int foo, double bar)
{
  opaque tmp;
  unsigned int i;
  /* Useful to zero out padding */
  for (i=0; i < sizeof (opaque_impl); i++)
    {
      tmp.private.state[i] = 0;
    }
  create_opaque_hack(&tmp,foo,bar);
  return tmp;
}

void delete_opaque(opaque *o)
{
  delete_opaque_hack(o);
}

The drawbacks I can see myself:

Changing the size define manually would be irritating
The casting should hinder optimisation (I haven't checked this yet)
This may violate strict pointer aliasing. Need to re-read the spec.
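On the aliasing point, one commonly used way to sidestep the question is to move the data in and out of the opaque storage with memcpy instead of casting the pointer. The following is only a sketch with simplified stand-ins for opaque and opaque_impl (the union member names and opaque_foo are hypothetical), not a drop-in replacement for the code above.

```c
#include <string.h>

/* Simplified stand-ins for the public and private types. */
typedef struct { union { char state[16]; double align; } storage; } opaque;
typedef struct { int foo; double bar; } opaque_impl;

_Static_assert(sizeof(opaque_impl) <= sizeof(opaque), "opaque storage too small");

opaque create_opaque(int foo, double bar)
{
    opaque out;
    opaque_impl impl = { foo, bar };

    memset(&out, 0, sizeof out);                   /* clear any unused bytes */
    memcpy(out.storage.state, &impl, sizeof impl); /* byte copy, no typed access */
    return out;
}

int opaque_foo(const opaque *o)
{
    opaque_impl impl;

    memcpy(&impl, o->storage.state, sizeof impl);  /* copy back out before use */
    return impl.foo;
}
```

Every access to the private struct then goes through a local object of the correct type, so the char array is only ever touched as bytes. Compilers often reduce small fixed-size memcpy calls to plain loads and stores, but that is worth checking in the generated code rather than assuming.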

I am concerned about accidentally invoking undefined behaviour. I would also be interested in general feedback on the above, or whether it looks like a credible alternative to the inventive VLA technique in the question.

Brutus answered 25/6, 2015 at 23:44 Comment(2)

The most obvious problem that I see is that if you change sizeof_opaque you can't have binary compatibility. With the VLA implementation, you can change the size of the struct and the compiled programs still work (supposing that you are programming a shared library). Also note that GCC and other compilers have an attribute similar to alignas (aligned), if you prefer to write C99 code with extensions. Also, when you are creating a struct inside another outside your library, you will have to malloc it, in my case, because you don't know the size at compile time. – Andino 26/6, 2015 at 7:15

@Andino That's true, changing the size would break binary compatibility. VLA or malloc are probably the only workarounds. On the bright side, composition of opaque structures works nicely when the size is fixed, without the heap. Always tradeoffs! Binary compatibility of libraries is something I need to give (much) more thought to. – Brutus 26/6, 2015 at 9:13
