Static allocation of opaque data types
G

11

52

Very often malloc() is absolutely not allowed when programming for embedded systems. Most of the time I can deal with this just fine, but one thing irritates me: it keeps me from using so-called 'opaque types' to enable data hiding. Normally I'd do something like this:

// In file module.h
typedef struct handle_t handle_t;

handle_t *create_handle();
void operation_on_handle(handle_t *handle, int an_argument);
void another_operation_on_handle(handle_t *handle, char etcetera);
void close_handle(handle_t *handle);


// In file module.c
struct handle_t {
    int foo;
    void *something;
    int another_implementation_detail;
};

handle_t *create_handle() {
    handle_t *handle = malloc(sizeof(struct handle_t));
    // other initialization
    return handle;
}

There you go: create_handle() performs a malloc() to create an 'instance'. A construction often used to prevent having to malloc() is to change the prototype of create_handle() like this:

void create_handle(handle_t *handle);

And then the caller could create the handle this way:

// In file caller.c
void i_am_the_caller() {
    handle_t a_handle;    // Allocate a handle on the stack instead of malloc()
    create_handle(&a_handle);
    // ... a_handle is ready to go!
}

But unfortunately this code is invalid: the size of handle_t isn't known in caller.c!

I never really found a proper solution to this. I'd very much like to know if anyone has a proper way of doing this, or maybe a completely different approach to enable data hiding in C (not using static globals in module.c of course; one must be able to create multiple instances).

Gensmer answered 14/12, 2010 at 14:59 Comment(4)
Maybe I'm missing something. Why isn't the size of handle_t known? "create_handle" takes an argument of type "handle_t*" so it should have knowledge about the size of it. I think it would be a different matter if you passed an array though.Strongminded
@Strongminded The size of handle_t isn't known in caller.c, only a pointer to handle_t can be used. The size of handle_t is only known in module.cGensmer
@Strongminded Forward declaration and pointers allow use of opaque types so that only the implementation knows the size, not the client.Seneschal
This may help also: https://mcmap.net/q/354362/-allocate-memory-to-buffer-through-function-callMarileemarilin
E
18

You can use the _alloca function. I believe that it's not exactly Standard, but as far as I know, nearly all common compilers implement it. Because the caller is the one invoking it, the memory comes off the caller's stack.

// Header
#include <stddef.h>                   // size_t
typedef struct something something;   // opaque: never defined for callers
size_t get_size(void);
something* create_something(void* mem);

// Usage
void* mem = _alloca(get_size());        // own statement: alloca inside an argument list is unreliable on many systems
something* ptr = create_something(mem); // or define a macro.

// Implementation
size_t get_size(void) {
    return sizeof(real_handle_type);
}
something* create_something(void* mem) {
    real_handle_type* ptr = (real_handle_type*)mem;
    // Fill out *ptr here
    return (something*)ptr;
}

You could also use some kind of object pool semi-heap: if you have a maximum number of objects that can exist at once, then you could allocate all memory for them statically, and use a bitmask to track which ones are currently in use.

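#define MAX_OBJECTS 32
real_type objects[MAX_OBJECTS];
uint32_t in_use; // One bit per object; needs <stdint.h> and MAX_OBJECTS <= 32
something* create_something(void) {
     for (int i = 0; i < MAX_OBJECTS; i++) {
         if (!(in_use & (UINT32_C(1) << i))) { // unsigned shift: 1 << 31 overflows a signed int
             in_use |= (UINT32_C(1) << i);     // mark slot i as taken
             return (something*)&objects[i];
         }
     }
     return NULL; // pool exhausted
}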

My bit-shifting is a little off, been a long time since I've done it, but I hope that you get the point.

Effectually answered 14/12, 2010 at 15:19 Comment(6)
alloca() doesn't fix the opaque handle problem - the size of the object needs to be known, so the object can't be opaque. The memory pool is often used.Culliton
An object pool of the maximum expected size is the best solution. You basically use it as a custom allocator. If you don't know the maximum number of objects needed, pick a large number (within memory limits) and deal with allocation failures.Seneschal
@Michael The size is acquired with the get_size() which would be just a wrapper around "sizeof( struct handle_t )". If alloca isn't supported you could always use C99 variable length arrays instead.Strongminded
@Strongminded and DeadMG: you're right that I missed the key part of how get_size() lets this work. I'm still not a huge fan of alloca(), but this is a quite workable option to the problem posed in the question.Culliton
I'd never adopt a heap or heap-equivalent memory allocation system only for the sake of making a field opaque; it doesn't seem to be a good trade-off.Agostino
It's probably in_use |= (1 << i); when you want to set the flag.Begrime
A
10

One way would be to add something like

#define MODULE_HANDLE_SIZE (4711)

to the public module.h header. Since that creates a worrying requirement of keeping this in sync with the actual size, the line is of course best auto-generated by the build process.
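
A minimal sketch of how such a size constant might be consumed, assuming a C11 compiler. The value 32, the `handle_storage_t` wrapper and `handle_foo()` are made up for illustration and are not part of the question's API. The cast from the storage type to the real struct is the usual trick here, and it may draw strict aliasing complaints from some compilers.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* max_align_t, NULL */

/* ---- would live in module.h ---- */
#define MODULE_HANDLE_SIZE 32          /* hypothetical value, ideally generated by the build */

typedef struct handle_t handle_t;      /* still opaque to callers */

/* Caller-visible storage: big enough and strictly aligned, but with no usable fields. */
typedef union {
    unsigned char bytes[MODULE_HANDLE_SIZE];
    max_align_t   align_;              /* forces worst-case alignment */
} handle_storage_t;

handle_t *create_handle(handle_storage_t *storage);
int handle_foo(const handle_t *handle);

/* ---- would live in module.c ---- */
struct handle_t {
    int   foo;
    void *something;
    int   another_implementation_detail;
};

/* The build fails if the published size drifts below the real one. */
static_assert(sizeof(struct handle_t) <= MODULE_HANDLE_SIZE,
              "MODULE_HANDLE_SIZE is too small for struct handle_t");

handle_t *create_handle(handle_storage_t *storage)
{
    handle_t *handle = (handle_t *)storage;   /* reinterpret the caller's buffer */
    handle->foo = 42;                         /* arbitrary demo value */
    handle->something = NULL;
    handle->another_implementation_detail = 0;
    return handle;
}

int handle_foo(const handle_t *handle)
{
    return handle->foo;
}
```

The caller then declares a `handle_storage_t` on the stack or as a static and passes its address to `create_handle()`, without ever seeing the members.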

The other option is of course to actually expose the structure, but document it as being opaque and forbid access through any other means than the defined API. This can be made more clear by doing something like:

#include "module_private.h"

typedef struct
{
  handle_private_t private;
} handle_t;

Here, the actual declaration of the module's handle has been moved into a separate header, to make it less obviously visible. A type declared in that header is then simply wrapped in the desired typedef name, making sure to indicate that it is private.

Functions inside the module that take handle_t * can safely access private as a handle_private_t value, since it's the first member of the public struct.
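
As a rough sketch of what that looks like inside the module (the contents of `handle_private_t` and the functions `handle_init()` and `handle_get_foo()` are assumptions for the example, not something from the question):

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, NULL */

/* ---- module_private.h (assumed contents) ---- */
typedef struct {
    int   foo;
    void *something;
} handle_private_t;

/* ---- module.h ---- */
typedef struct {
    handle_private_t private;   /* documented as off-limits to callers */
} handle_t;

/* The wrapper puts nothing in front of the private part. */
static_assert(offsetof(handle_t, private) == 0,
              "private must be the first member");

/* ---- module.c ---- */
void handle_init(handle_t *handle, int foo)
{
    handle_private_t *p = &handle->private;   /* plain member access, no cast */
    p->foo = foo;
    p->something = NULL;
}

int handle_get_foo(const handle_t *handle)
{
    return handle->private.foo;
}
```

Since the full type is visible, callers can put a `handle_t` on the stack; the price is that nothing but convention stops them from touching `private`.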

Aperiodic answered 14/12, 2010 at 15:6 Comment(2)
You can even add some macros to mean that the element "private" is defined with different names according to which .c file includes it; that way it becomes more obvious when code is doing something it shouldn't (eg h->do_not_use_thisfrom_anywhere_ever.num++) and also makes it slightly easier to grep for violations...Bile
I could live with this solution, but it still has the downside that if a header file only used by the implementation changes, the using .c file also has to be recompiled. Also, compiling the using .c file requires the same include path as compiling the implementation.Gensmer
C
7

Unfortunately, I think the typical way to deal with this problem is by simply having the programmer treat the object as opaque - the full structure implementation is in the header and available, it's just the responsibility of the programmer to not use the internals directly, only through the APIs defined for the object.

If this isn't good enough, a few options might be:

  • use C++ as a 'better C' and declare the internals of the structure as private.
  • run some sort of pre-processor on the headers so that the internals of the structure are declared, but with unusable names. The original header, with good names, will be available to the implementation of the APIs that manage the structure. I've never seen this technique used - it's just an idea off the top of my head that might be possible, but seems like far more trouble than it's worth.
  • have your code that uses opaque pointers declare the statically allocated objects as extern (i.e., globals). Then have a special module that has access to the full definition of the object actually define these objects. Since only the 'special' module has access to the full definition, the normal use of the opaque object remains opaque. However, now you have to rely on your programmers not to abuse the fact that these objects are global. You have also increased the chance of naming collisions, so that needs to be managed (probably not a big problem, except that it might occur unintentionally - ouch!).
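
The extern option in the last bullet could look roughly like this. It is only an illustration squeezed into one listing: `motor_handle`, `handle_set()`, `handle_get()`, `caller_demo()` and the file names are made up, and in a real project each marked part would live in its own file.

```c
#include <assert.h>
#include <stddef.h>

/* ---- module.h: callers see only an incomplete type ---- */
typedef struct handle_t handle_t;
void handle_set(handle_t *handle, int value);
int  handle_get(const handle_t *handle);

/* ---- instances.h (hypothetical): objects are declared, not defined ---- */
extern handle_t motor_handle;

/* ---- caller.c: may take the address, cannot see size or members ---- */
int caller_demo(void)
{
    handle_set(&motor_handle, 5);
    return handle_get(&motor_handle);
}

/* ---- instances.c / module.c: the only files that see the definition ---- */
struct handle_t {
    int foo;
};

handle_t motor_handle;   /* the actual statically allocated object */

void handle_set(handle_t *handle, int value)
{
    assert(handle != NULL);
    handle->foo = value;
}

int handle_get(const handle_t *handle)
{
    assert(handle != NULL);
    return handle->foo;
}
```

Declaring an extern object of incomplete type and taking its address is valid C; what the caller cannot do is apply `sizeof` to it or define one itself.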

I think overall, just relying on your programmers to follow the rules for the use of these objects might be the best solution (though using a subset of C++ isn't bad either in my opinion). Depending on your programmers to follow the rules of not using the structure internals isn't perfect, but it's a workable solution that is in common use.

Culliton answered 14/12, 2010 at 15:52 Comment(0)
F
7

One solution is to create a static pool of struct handle_t objects, and hand them out as necessary. There are many ways to achieve that, but a simple illustrative example follows:

// In file module.c
struct handle_t 
{
    int foo;
    void* something;
    int another_implementation_detail;

    int in_use ;
} ;

static struct handle_t handle_pool[MAX_HANDLES] ;

handle_t* create_handle() 
{
    int h ;
    handle_t* handle = 0 ;
    for( h = 0; handle == 0 && h < MAX_HANDLES; h++ )
    {
        if( handle_pool[h].in_use == 0 )
        {
            handle = &handle_pool[h] ;
            handle->in_use = 1 ;   // mark the slot as taken
        }
    }

    // other initialization
    return handle;
}

void release_handle( handle_t* handle ) 
{
    handle->in_use = 0 ;
}

There are faster ways of finding an unused handle. You could, for example, keep a static index that increments each time a handle is allocated and 'wraps around' when it reaches MAX_HANDLES; this would be faster for the typical situation where several handles are allocated before any one is released. For a small number of handles, however, this brute-force search is probably adequate.
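
A sketch of that wrap-around index idea, under the assumption of a pool of four handles and a trimmed-down struct; `next_index` is a name invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLES 4            /* assumed pool size for the example */

typedef struct handle_t handle_t;

struct handle_t {
    int foo;
    int in_use;
};

static struct handle_t handle_pool[MAX_HANDLES];
static int next_index = 0;       /* slot where the next search starts */

handle_t *create_handle(void)
{
    for (int n = 0; n < MAX_HANDLES; n++) {
        int h = (next_index + n) % MAX_HANDLES;   /* wrap around the pool */
        if (!handle_pool[h].in_use) {
            handle_pool[h].in_use = 1;
            next_index = (h + 1) % MAX_HANDLES;   /* resume after this slot */
            return &handle_pool[h];
        }
    }
    return NULL;                                  /* every slot is taken */
}

void release_handle(handle_t *handle)
{
    assert(handle != NULL);
    handle->in_use = 0;
}
```

When handles are mostly allocated in a burst, the first probe usually hits a free slot; the worst case is still a full scan of the pool.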

Of course the handle itself need no longer be a pointer but could be a simple index into the hidden pool. This would enhance data hiding and protection of the pool from external access.

So the header would have:

typedef int handle_t ;

and the code would change as follows:

// In file module.c
struct handle_s 
{
    int foo;
    void* something;
    int another_implementation_detail;

    int in_use ;
} ;

static struct handle_s handle_pool[MAX_HANDLES] ;

handle_t create_handle() 
{
    int h ;
    handle_t handle = -1 ;
    for( h = 0; handle == -1 && h < MAX_HANDLES; h++ )
    {
        if( handle_pool[h].in_use == 0 )
        {
            handle = h ;
            handle_pool[h].in_use = 1 ;   // mark the slot as taken
        }
    }

    // other initialization
    return handle;
}

void release_handle( handle_t handle ) 
{
    handle_pool[handle].in_use = 0 ;
}

Because the handle returned is no longer a pointer to the internal data, an inquisitive or malicious user cannot gain access to it through the handle.

Note that you may need to add some thread-safety mechanisms if you are getting handles in multiple threads.
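
One possible way to make the slot claim safe, sketched under the assumption that the toolchain provides C11 `<stdatomic.h>`. On targets without it, a mutex or briefly disabling interrupts around the search would be the usual substitute. The struct is cut down to the essentials for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_HANDLES 2            /* assumed pool size for the example */

typedef int handle_t;

struct handle_s {
    int         foo;
    atomic_bool in_use;          /* zero-initialized: slot starts out free */
};

static struct handle_s handle_pool[MAX_HANDLES];

handle_t create_handle(void)
{
    for (int h = 0; h < MAX_HANDLES; h++) {
        /* atomic_exchange returns the previous value: false means this
         * caller flipped it and therefore owns the slot. */
        if (!atomic_exchange(&handle_pool[h].in_use, true)) {
            handle_pool[h].foo = 0;
            return h;
        }
    }
    return -1;                   /* pool exhausted */
}

void release_handle(handle_t handle)
{
    assert(handle >= 0 && handle < MAX_HANDLES);
    atomic_store(&handle_pool[handle].in_use, false);
}
```

The test and the set happen in one indivisible step, so two threads can never both see the same slot as free.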

Febri answered 14/12, 2010 at 17:15 Comment(0)
W
2

To expand on some old discussion in comments here, you can do this by providing an allocator function as part of the constructor call.

  • Given some opaque type typedef struct opaque opaque;, then

  • Define a function type for an allocator function typedef void* alloc_t (size_t bytes);. In this case I used the same signature as malloc/alloca for compatibility purposes.

  • The constructor implementation would look something like this:

      struct opaque
      {
        int foo; // some private member
      };
    
      opaque* opaque_construct (alloc_t* alloc, int some_value)
      {
        opaque* obj = alloc(sizeof *obj);
        if(obj == NULL) { return NULL; }
    
        // initialize members
        obj->foo = some_value;
    
        return obj;
      }
    

    That is, the allocator gets provided the size of the opaque object from inside the constructor, where it is known.

  • For static storage allocation like done in embedded systems, we can create a simple static memory pool class like this:

    #define MAX_SIZE 100
    static uint8_t mempool [MAX_SIZE];
    static size_t mempool_size=0;
    
    void* static_alloc (size_t size)
    {
      uint8_t* result;
    
      if(mempool_size + size > MAX_SIZE)
      {
        return NULL;
      }
    
      result = &mempool[mempool_size];
      mempool_size += size;
      return result;
    }
    

    (This might be allocated in .bss or in your own custom section, whatever is preferred.)

  • Now the caller can decide how each object is allocated and all objects in for example a resource-constrained microcontroller can share the same memory pool. Usage:

    opaque* obj1 = opaque_construct(malloc, 123);
    opaque* obj2 = opaque_construct(static_alloc, 123);
    opaque* obj3 = opaque_construct(alloca, 123); // if supported
    

This is useful for saving memory. If you have multiple drivers in a microcontroller application and it makes sense to hide each of them behind a HAL, they can now share the same memory pool without the driver implementer having to speculate how many instances of each opaque type will be needed.

Say for example that we have a generic HAL for the UART, SPI and CAN hardware peripherals. Rather than each driver implementation providing its own memory pool, they can all share a centralized section. Normally I would otherwise solve that by having a constant such as UART_MEMPOOL_SIZE 5 exposed in uart.h so that the user may change it according to how many UART objects they need (like the number of UART hardware peripherals present on some MCU, or the number of CAN bus message objects required for some CAN implementation etc etc). Using #define constants is an unfortunate design since we typically don't want application programmers to mess around with provided standardized HAL headers.
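
Since a shared pool hands out objects of different types back to back, alignment starts to matter. Below is a hedged variant of static_alloc that rounds every allocation up to the strictest fundamental alignment. It assumes C11 (`_Alignas`, `_Alignof`, `max_align_t`); the name `static_alloc_aligned` and the pool size are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SIZE 128             /* assumed pool size in bytes */

static_assert(MAX_SIZE > 0, "pool must not be empty");

/* The buffer itself is aligned for any fundamental type. */
static _Alignas(max_align_t) uint8_t mempool[MAX_SIZE];
static size_t mempool_size = 0;  /* bytes handed out so far */

void *static_alloc_aligned(size_t size)
{
    const size_t align = _Alignof(max_align_t);

    /* Round the current offset up to the next multiple of align. */
    size_t start = (mempool_size + align - 1) / align * align;

    /* Written this way so the check cannot overflow. */
    if (start > MAX_SIZE || size > MAX_SIZE - start) {
        return NULL;
    }

    mempool_size = start + size;
    return &mempool[start];
}
```

It has the same `void* (size_t)` shape as malloc, so it can be passed to opaque_construct like the other allocators. The cost is a few padding bytes between objects.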

Weakness answered 11/1, 2022 at 13:48 Comment(5)
@Marileemarilin "Also sizeof *obj -> sizeof (*obj)". Generally I agree but that's a subjective matter of style.Weakness
I've got that, thanks.Marileemarilin
@Marileemarilin Style, like I said. Linus Torvalds famously had one of his usual childish/psychopathic tantrums about it at one point, claiming that people using the wrong style should be shot or some such.Weakness
why your static_alloc function returns uint8_t*, I think it must be void* too, why not? Again style?Marileemarilin
@Marileemarilin The only reason it returns void* is for compatibility with malloc etc, as described in the answer. Apart from that void* is a pretty useless type, you'll always have to convert it to some other type in order to use it. There's also various advanced type concerns here like alignment and strict aliasing, but that's another story.Weakness
S
1

I faced a similar problem in implementing a data structure in which the header of the data structure, which is opaque, holds all the various data that needs to be carried over from operation to operation.

Since re-initialization might cause a memory leak, I wanted to make sure that the data structure implementation itself never actually overwrites a pointer to heap-allocated memory.

What I did is the following:

/** 
 * In order to allow the client to place the data structure header on the
 * stack we need data structure header size. [1/4]
**/
#define CT_HEADER_SIZE  ( (sizeof(void*) * 2)           \
                        + (sizeof(int) * 2)             \
                        + (sizeof(unsigned long) * 1)   \
                        )

/**
 * After the size has been produced, a type which is a size *alias* of the
 * header can be created. [2/4] 
**/        
struct header { char h_sz[CT_HEADER_SIZE]; };
typedef struct header data_structure_header;

/* In all the public interfaces the size alias is used. [3/4] */
bool ds_init_new(data_structure_header *ds /* , ...*/);

In the implementation file:

struct imp_header {
    void *ptr1, 
         *ptr2;
    int  i, 
         max;
    unsigned long total;
};

/* implementation proper */
static bool imp_init_new(struct imp_header *head /* , ...*/)
{
    return false; 
}

/* public interface */
bool ds_init_new(data_structure_header *ds /* , ...*/) 
{
    int i;

    /* only accept a zero init'ed header */
    for(i = 0; i < CT_HEADER_SIZE; ++i) {
        if(ds->h_sz[i] != 0) {
            return false;
        }
    }

    /* just in case we forgot something */
    assert(sizeof(data_structure_header) == sizeof(struct imp_header));

    /* Explicit conversion is used from the public interface to the
     * implementation proper.  [4/4]
     */
    return imp_init_new( (struct imp_header *)ds /* , ...*/); 
}

client side:

int foo() 
{
    data_structure_header ds = { 0 };

    ds_init_new(&ds /*, ...*/);
    return 0;
}
Supporter answered 22/5, 2013 at 9:40 Comment(2)
+1: But CT_HEADER_SIZE can be less than sizeof(struct imp_header), as padding can occur in the struct. And for me it needs too much redundant, manual work to maintain CT_HEADER_SIZE.Pronunciamento
struct header might not be correctly aligned if allocated statically: it doesn't have the same alignment requirements as struct imp_header. See https://mcmap.net/q/354363/-why-one-should-not-hide-a-structure-implementation-that-wayStokehold
C
1

This is an old question, but since it's also biting me, I wanted to provide here a possible answer (which I'm using).

So here is an example :

// file.h
typedef struct { size_t space[3]; } publicType;
int doSomething(publicType* object);

// file.c
typedef struct { unsigned var1; int var2; size_t var3; } privateType;

int doSomething(publicType* object)
{
    privateType* obPtr  = (privateType*) object;
    (...)
}

Advantages : publicType can be allocated on stack.

Note that correct underlying type must be selected in order to ensure proper alignment (i.e. don't use char). Note also that sizeof(publicType) >= sizeof(privateType). I suggest a static assert to make sure this condition is always checked. As a final note, if you believe your structure may evolve later on, don't hesitate to make the public type a bit bigger, to keep room for future expansions without breaking ABI.
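
The static assert suggested above might look like this, assuming a C11 compiler (older compilers need a negative-array-size trick instead). The body of doSomething() is filled with placeholder assignments just to make the sketch complete.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* file.h */
typedef struct { size_t space[3]; } publicType;
int doSomething(publicType *object);

/* file.c */
typedef struct { unsigned var1; int var2; size_t var3; } privateType;

/* Compilation stops here if privateType ever outgrows its public shell. */
static_assert(sizeof(publicType) >= sizeof(privateType),
              "publicType is too small to hold privateType");

int doSomething(publicType *object)
{
    privateType *obPtr = (privateType *)object;

    /* placeholder work on the private view */
    obPtr->var1 = 1u;
    obPtr->var2 = 2;
    obPtr->var3 = 3u;
    return (int)(obPtr->var1 + obPtr->var3) + obPtr->var2;
}
```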

Disadvantage : The casting from public to private type can trigger strict aliasing warnings.

I discovered later on that this method has similarities with struct sockaddr within BSD socket, which meets basically the same problem with strict aliasing warnings.
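
If the aliasing warnings are a concern, one alternative (not what I use above, just a sketch) is to copy the private view in and out with memcpy instead of casting the pointer. `incrementCounter()` is a hypothetical operation invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { size_t space[3]; } publicType;
typedef struct { unsigned var1; int var2; size_t var3; } privateType;

static_assert(sizeof(publicType) >= sizeof(privateType),
              "publicType is too small to hold privateType");

/* Hypothetical operation: bump a private counter and return its new value. */
int incrementCounter(publicType *object)
{
    privateType priv;

    memcpy(&priv, object, sizeof priv);   /* read the private view out */
    priv.var2 += 1;
    memcpy(object, &priv, sizeof priv);   /* write the result back */

    return priv.var2;
}
```

Compilers typically optimize small fixed-size memcpy calls away, but on a small embedded target it is worth checking the generated code.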

Caribbean answered 25/6, 2015 at 5:1 Comment(0)
T
0

I'm a little confused why you say you can't use malloc(). Obviously on an embedded system you have limited memory and the usual solution is to have your own memory manager which mallocs a large memory pool and then allocates chunks of this out as needed. I've seen various different implementations of this idea in my time.

To answer your question though, why don't you simply statically allocate a fixed-size array of them in module.c, add an "in-use" flag, and then have create_handle() simply return the pointer to the first free element?

As an extension to this idea, the "handle" could then be an integer index rather than the actual pointer which avoids any chance of the user trying to abuse it by casting it to their own definition of the object.
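
A small sketch of the index-as-handle idea with validation on every call. The struct contents, `lookup()` and the bool return value of operation_on_handle() are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HANDLES 4            /* assumed pool size for the example */

static_assert(MAX_HANDLES > 0, "pool must not be empty");

typedef int handle_t;            /* all the caller ever sees */

struct handle_s {
    int  foo;
    bool in_use;
};

static struct handle_s pool[MAX_HANDLES];

/* Turn an index into a pointer, rejecting bad or stale handles. */
static struct handle_s *lookup(handle_t handle)
{
    if (handle < 0 || handle >= MAX_HANDLES || !pool[handle].in_use) {
        return NULL;
    }
    return &pool[handle];
}

handle_t create_handle(void)
{
    for (int h = 0; h < MAX_HANDLES; h++) {
        if (!pool[h].in_use) {
            pool[h].in_use = true;
            pool[h].foo = 0;
            return h;
        }
    }
    return -1;                   /* no free element */
}

bool operation_on_handle(handle_t handle, int an_argument)
{
    struct handle_s *obj = lookup(handle);
    if (obj == NULL) {
        return false;
    }
    obj->foo = an_argument;
    return true;
}

void close_handle(handle_t handle)
{
    struct handle_s *obj = lookup(handle);
    if (obj != NULL) {
        obj->in_use = false;
    }
}
```

A made-up or already closed handle is simply rejected, which a raw pointer could never guarantee.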

Than answered 14/12, 2010 at 15:24 Comment(3)
malloc() is often forbidden on embedded systems in favor of static allocation because it can introduce fragmentation and scenarios that are difficult or impossible to test for. Particularly for systems that have long 'up time' requirements. If your objects are allocated statically, memory allocation cannot fail if the system builds.Culliton
Maybe I should put that as a question just so you can answer it. We have some problems with fragmentation on our system. We have a memory pool type that has some sort of system of moveable blocks (not too sure how it works) so you can defragment the memory but no-one uses it that I know of.Than
Another reason to avoid using malloc() on embedded systems is code size. Typically, the libc malloc implementation is not small and has lots of other code that it pulls in, and if you're up against a code-size boundary, you'd much rather not do that.Gaspar
P
0

The least grim solution I've seen to this has been to provide an opaque struct for the caller's use, which is large enough, plus maybe a bit, along with a mention of the types used in the real struct, to ensure that the opaque struct will be aligned well enough compared to the real one:

struct Thing {
    union {
        char data[16];
        uint32_t b;
        uint8_t a;
    } opaque;
};
typedef struct Thing Thing;

Then functions take a pointer to one of those:

void InitThing(Thing *thing);
void DoThingy(Thing *thing,float whatever);

Internally, not exposed as part of the API, there is a struct that has the true internals:

struct RealThing {
    uint32_t private1,private2,private3;
    uint8_t private4;
};
typedef struct RealThing RealThing;

(This one just has uint32_t and uint8_t -- that's the reason for the appearance of these two types in the union above.)

Plus probably a compile-time assert to make sure that RealThing's size doesn't exceed that of Thing:

typedef char CheckRealThingSize[sizeof(RealThing)<=sizeof(Thing)?1:-1];

Then each function in the library does a cast on its argument when it's going to use it:

void InitThing(Thing *thing) {
    RealThing *t=(RealThing *)thing;

    /* stuff with *t */
}

With this in place, the caller can create objects of the right size on the stack, and call functions against them, the struct is still opaque, and there's some checking that the opaque version is large enough.

One potential issue is that fields could be inserted into the real struct that mean it requires an alignment that the opaque struct doesn't, and this won't necessarily trip the size check. Many such changes will change the struct's size, so they'll get caught, but not all. I'm not sure of any solution to this.
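
With a C11 compiler, one way to catch that case seems to be a second compile-time check on alignment using `_Alignof`. This is a sketch; `ThingPrivate4()` is a hypothetical accessor added only so the example does something observable.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

struct Thing {
    union {
        char     data[16];
        uint32_t b;
        uint8_t  a;
    } opaque;
};
typedef struct Thing Thing;

struct RealThing {
    uint32_t private1, private2, private3;
    uint8_t  private4;
};
typedef struct RealThing RealThing;

/* Size check, as before. */
static_assert(sizeof(RealThing) <= sizeof(Thing), "Thing is too small");

/* Alignment check: fails if RealThing gains a stricter member, e.g. a double. */
static_assert(_Alignof(RealThing) <= _Alignof(Thing), "Thing is under-aligned");

void InitThing(Thing *thing)
{
    RealThing *t = (RealThing *)thing;

    t->private1 = 1;
    t->private2 = 2;
    t->private3 = 3;
    t->private4 = 4;
}

uint8_t ThingPrivate4(const Thing *thing)
{
    const RealThing *t = (const RealThing *)thing;
    return t->private4;
}
```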

Alternatively, if you have a special public-facing header(s) that the library never includes itself, then you can probably (subject to testing against the compilers you support...) just write your public prototypes with one type and your internal ones with the other. It would still be a good idea to structure the headers so that the library sees the public-facing Thing struct somehow, though, so that its size can be checked.

Persis answered 14/12, 2010 at 18:25 Comment(3)
Your approach is buggy due to alignment considerations. The opaque struct needs to be something like long opaque[MAX_SIZE/sizeof(long)]; or better yet, a union containing a char array of the desired size and all the "large" types for alignment purposes.Smash
@R I've posted a question/answer about such alignment problem: #17619515Stokehold
what about strict aliasing warnings ?Caribbean
A
0

It is simple: just put the structs in a privateTypes.h header file. They will not be opaque anymore, but they will still be private to the programmer, since they are inside a private file.

An example here: Hiding members in a C struct

Agostino answered 12/3, 2013 at 16:42 Comment(8)
This isn't a good idea, because the main reason for private encapsulation isn't so much the worry about the programmer doing bad things on purpose, but rather the programmer doing bad things by accident, if the struct declaration is globally visible. This is especially true in the days of IDE code completion where you can type myfoo. and then the IDE is happy to give you some alternatives to pick from.Weakness
@Weakness This is an idea that is defended by books such as "TDD for Embedded C" and other references. I agree with the drawbacks you mention and I believe that true privates will make your software design way harder or force runtime changes such as the adoption of malloc.Agostino
Many answers in this thread, such as the one posted by Clifford, show that it is quite simple to keep the opaque type by implementing a simple, private memory pool - which is ideal for embedded systems. And well, I did briefly read that book at one point and wasn't very impressed, it is hardly a canonical reference.Weakness
We can argue a lot, this is a matter of taste. I'd use Clifford's solution if I actually needed a memory pool due to true requirements, not only for the sake of opaqueness. You see it differently, it's ok, I don't think your view is a bad idea, these are matters of taste. I can argue that you are adding complexity, and you can argue I provide no security. I think we could skip trying to figure out which is better ;)Agostino
What I do in real applications, is to keep the struct public if it is just some simplistic one, but keep it opaque if it is something more intricate like a driver with a HAL. Also, you can use an opaque type implementation with a private header, which you only allow derived classes of the opaque type to access. That way you can achieve polymorphism in C.Weakness
Nice input. I favor in leaving to the client of a module to decide how allocation should be done. So I usually don't adopt opaqueness.Agostino
Well, if you wish to dive deeper into meta programming, you could pass the constructor a C++:ish "functor" in the form of a function pointer, pointing out the allocation method. Example: adt_t* adt_init (alloc_t* alloc, /*parameters*/) { adt_t* result = alloc(sizeof (adt_t)); /* init members */ return result; } Where alloc_t is a typedef'd function that allocates n number of bytes, preferably with a type compatible to void* (*) (size_t) so that you can either pass a pointer to malloc or to a custom, statical allocation function.Weakness
Let us continue this discussion in chat.Agostino
C
0

To expand further, I find this compromise between simplicity and opacity sufficient:

/* header.h */
typedef struct
{
    uint8_t public_value_0;
    uint8_t public_value_1;
    uint8_t opaque_internal_reserved_area[3];
} my_object_t;

void fun(my_object_t* object);
/* module.c */
typedef struct
{
    uint8_t public_external_area[2];
    uint8_t private_member_0;
    uint8_t private_member_1;
    uint8_t private_member_2;
} my_private_object_t;

typedef union
{
    my_object_t public;
    my_private_object_t private;
}my_private_object_tu;

void fun(my_object_t* object)
{
    my_private_object_tu* work_obj = (my_private_object_tu*) object;
    work_obj->private.private_member_1;
    work_obj->public.public_value_1;
}

This has multiple limitations: it relies on the caller to initialise the object, and the size of the reserved area is duplicated in two places that have to be kept in sync...

Croon answered 26/6 at 9:53 Comment(0)
