Opaque struct without definition
Asked Answered
V

3

6

I am designing an iterator interface for my hashmap data structure. The current design looks like this:

// map.h
typedef struct map_iterator map_iterator;

// map.c
struct map_iterator
{
    // Implementation details
};

// client.c
map *m = map_new();
map_iterator *it = map_iterator_new(m);
void *key, *value;
while (map_iterator_next(it, &key, &value)) {
    // Use key, value
}
map_iterator_free(it);

However, this requires a heap allocation for the iterator object, and the client must remember to free the iterator when they're done. If I make map_iterator_new return the iterator on the stack, the code looks like this:

// map.h
typedef struct map_iterator
{
    // Implementation details
};

// client.c
map *m = map_new();
map_iterator it = map_iterator_new(m);
void *key, *value;
while (map_iterator_next(&it, &key, &value)) {
    // Use key, value
}

However, this requires that I provide the definition of the map_iterator struct to client code (otherwise I get an incomplete type error). I would like to hide this definition and provide only the declaration.

Is there any way of achieving this? Essentially, I'm looking for a way to tell the client code "this struct takes up X bytes so you can allocate it on the stack, but I'm not telling you how to access its members".

Edit: Standard C only, please! (no compiler extensions/platform-specific functions)

Virtu answered 27/6, 2016 at 9:32 Comment(10)
What is under 'implementation details' in the iterator?Rucksack
@Rucksack Iterator state, e.g. pointer to hashmap, current bucket index, modification count, etc.Virtu
I don't think providing a pointer to your iterator and freeing it afterwards is an issue here. Many libraries, like curl etc. also provide such an interfaces. I would rather use the first option than the second.Mineralize
Agreeing with @ckruczek. Also, if you provide some sort of re-setting of the iterator, a function using only needs to call new and free once instead of once for every loop.Elberfeld
@Mineralize Thanks, this is the approach I ended up taking. The others were too hacky or required more effort than malloc and free in the first place.Virtu
That's great. Especially because now you are more transparent and leave more space and freedom to the developer. That can be a bad thing, but also a very good thing to do ;)Mineralize
@AndrewSun Did you read my answer?Rucksack
@Rucksack Sorry, I didn't see it earlier. It does work, but next() will be called in a loop and memcpy() on each loop iteration is far worse than a single malloc() + free(), unfortunately. Maybe struct reinterpretation will work, but I'm not sure if that's guaranteed by the C standard.Virtu
@AndrewSun Did you measure it? How much slower (%) is the actual production code? Reinterpreting the structs is usually undefined behavior.Rucksack
Note that it's not actually "struct reinterpretation" if the only code to access the data hidden inside the char array is the implementation code. Otherwise, malloc() wouldn't work!Rectangle
R
2

Use serialization.

You are always allowed to copy an object of the T to an array of unsigned char and then back to an object of type T. That object will be the same as the original object. T can be any object type, and this can be done with automatic storage duration (stack):

int value = 568;
unsigned char store[sizeof( int )];
memcpy( store , &value , sizeof( int ) );
int other;
memcpy( &other , store , sizeof( int ) );
assert( other == value ),

This can be (ab)used to hide the implementation from the user. Define two struct, the one that defines the actual implementation and is not visible to the user, and one that is visible and contains only an array of unsigned char of appropriate size.

implementation.c

#include "implementation.h"

struct invisible
{
    int element1;
    char element2
    float element3;
    void** element4;
};

_Static_assert( sizeof( struct invisible ) <= VISIBLE_SIZE );

struct visible New( void )
{
    struct invisible i = { 1 , '2' , 3.0F , NULL };
    struct visible v = { 0 };
    memcpy( &v , &i , sizeof(i) );
    return v;
}

void Next( struct visible* v )
{
    struct invisible i = { 0 };
    memcpy( &i , &v , sizeof(i) );
    i.element1++;    //make some changes
    const int next = i.element;  
    memcpy( &v , &i , sizeof(i) );
    return next;
}

implementation.h

#define VISIBLE_SIZE 24    //make sure it greater or equal to the size of struct invisible
struct visible
{
    unsigned char[VISIBLE_SIZE];
};

struct visible New( void );
int Next( struct visible* v );

user.c

#include "implementation.h"

void Func( void )
{
    struct visible v = New();
    while( 1 )
    {
         const int next = Next( &v );
         if( next == 10 )
         {
              break;
         }
         printf( "%d\n" , next );
    }
}

This example only touches the member element1. In a real implementation, you can modify the invisible struct freely.

Rucksack answered 27/6, 2016 at 14:51 Comment(0)
R
2

Short answer: There isn't a good way. The best bet is to stick with having the API return an object that the caller frees.

Long answer: Here is an alternative, allowing stack allocation of an opaque object. There are caveats:

  1. You need to know the size of the actual object
  2. You need to understand alignment requirements for your machine and toolkit (a little)
    • alignment of fields
    • end-padding of structs, to achieve field alignment

Caveat 1 can be handled by having a utility function to print the size (not exported to the user API), along with assertions to catch errors.

Caveat 2 can be handled by adding an element of the type with the strictest alignment requirements in the user-visible definition (though it needn't be in the same place.)

Maintaining alignment avoids the need to use serialization as used in @2501's answer.

In the example below, you can ignore code below "// trivial implementation" comments. They're simply to provide a complete working example, but the algorithms are irrelevant to the OP.

map.h

#include <stdlib.h>

#define MAP_ITER_SIZE 16

typedef struct {
  void *p;          // force alignment to match implementation
  char space[MAP_ITER_SIZE-sizeof(void*)];
} map_iterator;

typedef struct map map;

map *map_new(void);
void map_iterator_init(map_iterator *iter, map *m);
int map_iterator_next(map_iterator *iter, int *p_key);

map_user.c

#include <stdlib.h>
#include <stdio.h>
#include "map.h"

int main(int argc, char * argv[])
{
    map_iterator it;
    int key;

    map *m = map_new();
    map_iterator_init(&it, m);
    while (map_iterator_next(&it, &key)) {
        printf("%d\n", key);
    }
}

map.c

#include <stdlib.h>
#include <assert.h>
#include "map.h"

#define INITIAL_KEY (-1)

struct map {
    int key_count;
    int first_key;
};

// Keep struct size consistent with MAP_ITER_SIZE in map.h
typedef struct {
    map *m;
    int cur_key;
} map_iterator_impl;

map *map_new(void) {
    map *m = malloc(sizeof(struct map));

    // trivial implementation for example only
    m->key_count = 2;
    m->first_key = 10;
}

void map_iterator_init(map_iterator *iter, map *m)
{
    map_iterator_impl *iter_impl = (map_iterator_impl *)iter;

    assert(sizeof(map_iterator) == sizeof(map_iterator_impl)); // optimizes out

    // trivial implementation for example only
    iter_impl->m = m;
    iter_impl->cur_key = INITIAL_KEY;      // not a valid key
}

int map_iterator_next(map_iterator *iter, int *p_key)
{
    map_iterator_impl *iter_impl = (map_iterator_impl *)iter;

    // trivial implementation for example only
    if (iter_impl->cur_key == INITIAL_KEY) {
        iter_impl->cur_key = iter_impl->m->first_key;
    } else {
        ++iter_impl->cur_key;
    }

    if (iter_impl->cur_key - iter_impl->m->first_key >= iter_impl->m->key_count) {
        return 0;
    }

    *p_key = iter_impl->cur_key;
    return 1;
}

unsigned int get_impl_size()
{
    return (unsigned int) sizeof(map_iterator_impl);
}

Pundits will argue against this and they'll have good points. The main argument is that the code isn't portable without jumping through hoops to get the SIZE constant correct for all supported (processor,compiler) cases. You also need to know what data type has the biggest alignment requirement, for each case.

Rectangle answered 26/2, 2020 at 17:12 Comment(0)
H
-2

There is function called alloca, that reserves memory on caller's stack. So your code may look like:

// map.h
typedef struct map_iterator map_iterator;
extern int map_iterator_size;
// map.c
struct map_iterator
{
    // Implementation details
};
int map_iterator_size = sizeof(struct map_iterator);
// client.c
map *m = map_new();
map_iterator *it = alloca(map_iterator_size);
map_iterator_new(m, it);
void *key, *value;
while (map_iterator_next(it, &key, &value)) {
    // Use key, value
}
Hapte answered 27/6, 2016 at 10:0 Comment(3)
Thanks, but I forgot to mention that I am looking for a strictly C99 standards-compliant solution, sorry about that.Virtu
alloca is one of those functions you should never think of using, given its history of involvement in stack corruption vulnerabilities. Furthermore, this has very little to do with API design, because it's a coding mistake/choice that the API user makes. Finally, incomplete types don't have a sizeof().Kizzee
I don't see any difference (from stack corruption perspective) between alloca() and having the object directly on stack. the sizeof is computed at place with complete defintion.Hapte

© 2022 - 2024 — McMap. All rights reserved.