How does the C offsetof macro work? [duplicate]

R

4

20

Possible Duplicate:
Why does this C code work?
How do you use offsetof() on a struct?

I read about this offsetof macro on the Internet, but the page didn't explain what it is used for.

#define offsetof(a,b) ((int)(&(((a*)(0))->b)))

What is it trying to do and what is the advantage of using it?

Ri answered 26/10, 2011 at 1:47 Comment(1)

That offsetof macro is incorrect. They should cast to size_t, not int, and they should probably subtract (char*)0 from the result before casting even though it's a null pointer constant. – Deficiency 26/10, 2011 at 2:27

K

19

It has no advantages and should not be used, since it invokes undefined behavior (and uses the wrong type - int instead of size_t).

The C standard defines an offsetof macro in stddef.h which actually works, for cases where you need the offset of an element in a structure, such as:

#include <stddef.h>

struct foo {
    int a;
    int b;
    char *c;
};

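/* Type tags for the descriptor table below (INT and CHARPTR must be defined somewhere). */
enum { INT, CHARPTR };

struct struct_desc {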
    const char *name;
    int type;
    size_t off;
};

static const struct struct_desc foo_desc[] = {
    { "a", INT, offsetof(struct foo, a) },
    { "b", INT, offsetof(struct foo, b) },
    { "c", CHARPTR, offsetof(struct foo, c) },
};

which would let you programmatically fill the fields of a struct foo by name, e.g. when reading a JSON file.
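As a rough sketch of how such a table might be consumed, the code below looks up a field by name and writes a value at the recorded offset. The `field_type` enum, the `set_field` helper, and the convention of passing every value as a string are assumptions made for illustration; none of them come from the answer itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type tags; the answer uses INT and CHARPTR without defining them. */
enum field_type { INT, CHARPTR };

struct foo {
    int a;
    int b;
    char *c;
};

struct struct_desc {
    const char *name;
    int type;
    size_t off;
};

static const struct struct_desc foo_desc[] = {
    { "a", INT, offsetof(struct foo, a) },
    { "b", INT, offsetof(struct foo, b) },
    { "c", CHARPTR, offsetof(struct foo, c) },
};

/* Hypothetical helper: store `value` into the field called `name`.
 * Returns 0 on success, -1 if no field has that name. */
static int set_field(struct foo *obj, const char *name, char *value)
{
    assert(obj != NULL && name != NULL && value != NULL);

    for (size_t i = 0; i < sizeof foo_desc / sizeof foo_desc[0]; i++) {
        if (strcmp(foo_desc[i].name, name) != 0)
            continue;

        /* Offsets are counted in bytes from the start of the struct. */
        char *field = (char *)obj + foo_desc[i].off;

        switch (foo_desc[i].type) {
        case INT:
            *(int *)field = atoi(value);
            return 0;
        case CHARPTR:
            *(char **)field = value;   /* stores the pointer, does not copy */
            return 0;
        }
    }
    return -1;
}
```

A parser could then call something like `set_field(&f, "a", text)` for each key it reads, without a hand-written `if` chain per field.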

Koloski answered 26/10, 2011 at 1:55 Comment(12)

I am sorry - how does the offsetof macro cause undefined behavior, especially since it was defined in the C standard? – Londrina 26/10, 2011 at 2:21

The standard offsetof macro from stddef.h does not invoke UB. Defining your own hack to compute offsets this way does invoke UB. – Koloski 26/10, 2011 at 2:24

Please quote me the standard reference that says defining your own version of the macro causes undefined behaviour – Londrina 26/10, 2011 at 2:27

@Adrian: He didn't say, "defining your own version of the macro causes undefined behaviour." He specifically said, "Defining your own hack to compute offsets this way does invoke UB." In the code, at this point: ((a*)(0))-> you've invoked undefined behavior by dereferencing null. – Fusco 26/10, 2011 at 2:29

@GMan - where the hell have you referenced null? It's casting null as an a* pointer. And "Defining your own hack": is that a technical term for some code that I don't know after 20 years of C programming? Let's see how Linux defines it: #ifndef offsetof # define offsetof(T,F) ((unsigned int)((char *)&((T *)0L)->F - (char *)0L)) #endif Hmm, looks very similar to the OP's. – Londrina 26/10, 2011 at 2:36

@Adrian: x-> is defined to be (*x).. In our case x is (a*)0, and *x dereferences null. And congrats: after twenty years of C you still don't know what implementation-defined behavior is. Quoting a specific definition on a specific implementation at a specific time has nothing to do with the language definition of the macro. The language states the effects of the macro and that's it; it doesn't define an implementation. I mean hell, you quoted the standard yourself; where in there does it state the definition of the macro? – Fusco 26/10, 2011 at 2:38

@AdrianCornish: The implementation is allowed to define offsetof however it likes as long as it implements the correct behavior. Your application does not have this privilege because it can't define the behavior of anything; it can only use already-defined language constructs. That's how C works. – Koloski 26/10, 2011 at 2:42

6.5.2.3 does not use the word "dereference", but specifies it as "the named member of the object to which the first expression points". Since (a*)(0) does not point to an object of type a, the behavior is undefined (by virtue of not being defined). – Koloski 26/10, 2011 at 2:49

[I love language standard debates] Which subclause (are you using c99) I cannot find one. I would argue: 6.3.2.3 Pointers 1 A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer. – Londrina 26/10, 2011 at 2:54

The text you cited is irrelevant. No pointer to incomplete or object type is converted to a pointer to void in the bogus macro. – Koloski 26/10, 2011 at 2:57

@R.. what is the type of foo_desc here? Did you mean food_desc[] instead? – Clasp 15/3, 2014 at 10:3

I added the include directive because it just felt right after this edit suggestion got rejected. – Slob 17/12, 2016 at 10:17

C

47

R.. is correct in his answer to the second part of your question: this code is not advised when using a modern C compiler.

But to answer the first part of your question, what this is actually doing is:

(
  (int)(         // 4.
    &( (         // 3.
      (a*)(0)    // 1.
     )->b )      // 2.
  )
)

Working from the inside out, this is ...

1. Casting the value zero to the struct pointer type a*
2. Getting the struct field b of this (illegally placed) struct object
3. Getting the address of this b field
4. Casting the address to an int

Conceptually, this places a struct object at memory address zero and then finds out what the address of a particular field is. Because the struct starts at address zero, that address equals the field's offset from the start of the struct. Knowing the offset of each field lets you write your own serializers and deserializers to convert structs to and from byte arrays.
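To make the serializer idea concrete, here is a minimal sketch that packs a struct into a byte array field by field, skipping any padding. It uses the standard `offsetof` from `<stddef.h>` rather than the hand-written macro. The `record` struct, the `field_layout` table, and the `record_pack` / `record_unpack` helpers are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record type used only for this illustration. */
struct record {
    char  tag;
    int   id;
    short flags;
};

/* One entry per field: where it lives in the struct and how many bytes it has. */
struct field_layout {
    size_t off;
    size_t size;
};

static const struct field_layout record_layout[] = {
    { offsetof(struct record, tag),   sizeof(char)  },
    { offsetof(struct record, id),    sizeof(int)   },
    { offsetof(struct record, flags), sizeof(short) },
};

#define RECORD_FIELDS (sizeof record_layout / sizeof record_layout[0])

/* Copy each field into buf back to back, leaving out padding bytes.
 * Bytes are written in native byte order. Returns the number of bytes written. */
static size_t record_pack(const struct record *r, unsigned char *buf)
{
    assert(r != NULL && buf != NULL);
    size_t n = 0;
    for (size_t i = 0; i < RECORD_FIELDS; i++) {
        memcpy(buf + n, (const unsigned char *)r + record_layout[i].off,
               record_layout[i].size);
        n += record_layout[i].size;
    }
    return n;
}

/* Inverse of record_pack: read the fields back in the same order. */
static void record_unpack(struct record *r, const unsigned char *buf)
{
    assert(r != NULL && buf != NULL);
    size_t n = 0;
    for (size_t i = 0; i < RECORD_FIELDS; i++) {
        memcpy((unsigned char *)r + record_layout[i].off, buf + n,
               record_layout[i].size);
        n += record_layout[i].size;
    }
}
```

Since the bytes are copied in native order, a buffer produced this way is only meaningful to a program built for a compatible platform; a portable format would also have to fix the byte order and field widths.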

Of course, actually dereferencing a null pointer would normally crash your program, but here compilers typically compute the result at compile time, so no null pointer is dereferenced at runtime. As far as the standard is concerned, the expression is still undefined behavior.
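With the standard macro, the compile-time nature is guaranteed rather than incidental: `offsetof` from `<stddef.h>` expands to an integer constant expression of type `size_t`. The sketch below, which assumes a C11 compiler for `static_assert` and uses a made-up `header` struct, shows it in places where only a compile-time constant is accepted.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical struct used only for this illustration. */
struct header {
    unsigned char version;
    unsigned int  length;
    unsigned char payload[16];
};

/* Checked by the compiler; no code runs for these. */
static_assert(offsetof(struct header, version) == 0,
              "the first member starts at offset 0");
static_assert(offsetof(struct header, payload) >=
                  offsetof(struct header, length) + sizeof(unsigned int),
              "members are laid out in declaration order");

/* A file-scope array size must be a constant expression. */
static unsigned char prefix_copy[offsetof(struct header, payload)];

/* So must an enumerator value. */
enum { PAYLOAD_OFFSET = offsetof(struct header, payload) };

/* Number of bytes that precede the payload in struct header. */
static size_t header_prefix_size(void)
{
    return sizeof prefix_copy;
}
```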

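On many of the systems this idiom was written for, an int was the same size as a pointer (often 32 bits), so this actually worked. Where pointers are wider than int, as on typical 64-bit platforms, the cast can truncate the address.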

Clambake answered 26/10, 2011 at 2:15 Comment(1)

Excellent! Thank you. The key to me was placing a struct object at memory address zero and then finding out at what the address of a particular field is. – Sluggish 17/10, 2018 at 10:43
