Is it legal to cast a struct to a union containing it?
C

This is a follow-up to a question about printing a common member from different structs.

I thought that unions permit examining the common initial sequence of two of their members. So I ended up with the following code:

#include <stdio.h>

struct foo {
    const char *desc;
    float foo;
};
struct bar {
    const char *desc;
    int bar;
};
union foobar {
    struct foo foo;
    struct bar bar;
};

void printdesc(const union foobar * fb) {
    printf("%s\n", fb->foo.desc);          // allowed per 6.5.2.3 Structure and union members
}

int main() {

    struct bar bb = {"desc bar", 2};

    union foobar fb = { .bar=bb};

    printdesc((union foobar *) &(fb.bar)); // allowed per 6.7.2.1 Structure and union specifiers
    printdesc((union foobar *) &bb);       // legal?

    return 0;
}

It compiles without even a warning and gives the expected result:

desc bar
desc bar

The point here is the line with the // legal? comment. I have converted a bar * into a foobar *. When the bar is a member of a foobar union, this is permitted per 6.7.2.1 Structure and union specifiers. But here the bar is a standalone object, and I do not know.

Is it permitted to convert a pointer to a bar object to a pointer to a foobar object if the bar was not declared as a member of a foobar?

The question is not about whether it can work in a specific compiler. I am pretty sure that it does with all the common compilers in their current versions. The question is about whether it is legal C code.


Here is my current research.

References from draft n1570 for C11:

6.5.2.3 Structure and union members § 6

... if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them...

6.7.2.1 Structure and union specifiers § 16

... A pointer to a union object, suitably converted, points to each of its members ..., and vice versa...

Cinemascope answered 11/3, 2021 at 10:16 Comment(11)
You might want to consider the list in 6.5p7, since according to note 88 "the intent of this list is to specify those circumstances in which an object may or may not be aliased." (Introit)
@user3386109: I think that "an aggregate or union type that includes one of the aforementioned types among its members" in 6.5p7 covers my use case. (Cinemascope)
I think, unless otherwise specified, what is guaranteed about the result of a cast is specified in this sentence: "A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer." (Nubbly)
"I think that «an aggregate or union type that includes one of the aforementioned types among its members» in 6.5p7 covers my use case." But fb->foo.desc is not an access through a union type. (Nubbly)
@SergeBallesta My experience is that 6.5p7 is so poorly written that you'll never get any two people to agree on what it means. You may want to check the C17 spec to see if anything has changed in that paragraph. (Introit)
@LanguageLawyer: my comment only concerned the list in 6.5p7 and the comment from user3386109. (Cinemascope)
@user3386109: I was not aware of the n2176 draft for C17. I have just checked it, and no relevant parts of 6.5, 6.5.2.3 or 6.7.2.1 have changed. (Cinemascope)
@SergeBallesta That's unfortunate. (Introit)
@Introit Pointer conversion rules are also heavily underspecified in the C standard. For example, what does "suitably converted" in 6.7.2.1p16 mean? So it is really a thankless job trying to answer such questions. (Nubbly)
@LanguageLawyer: I'm surprised the 'suitably converted' phrase causes trouble. Suppose you have union U { Member1 m1; Member2 m2; }. Further, given union U *up;, the 'suitably converted' comment means that Member1 *mp1 = (Member1 *)up; is well defined, and so is Member2 *mp2 = (Member2 *)up;. Trying OtherType *otp = (OtherType *)up; is not 'suitably converted', nor is Member1 *mp1 = (Member2 *)up; (though the chances are that the result is the same for the latter, even though it is not 'suitably converted'). (Gelid)
If you have a union of two structs of different sizes, the union has the size of the bigger struct, so if you cast a pointer to the smaller struct to a pointer to the union, you might segfault. (Cockayne)

In general, no.

Suppose you had the following definitions:

struct foo {
    int flag;
    double foo;
};
struct bar {
    int flag;
    int bar;
};
union foobar {
    struct foo foo;
    struct bar bar;
};

The likely alignment of struct foo would be 8 while the likely alignment of struct bar would be 4. So if you did this:

struct bar b;
union foobar *fb = (union foobar *)&b;

You could run into an alignment issue. Section 6.3.2.3p7 of C11 states:

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.

So if b is not aligned on an 8-byte boundary, the conversion itself has undefined behavior, before the pointer is even dereferenced.
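As a minimal sketch of one way around the problem, assuming the definitions above: instead of casting, copy the standalone struct into a real union object, so that alignment and size are guaranteed by the union's own declaration. The helper names `foobar_from_bar` and `foobar_flag` are hypothetical and only serve the illustration.

```c
#include <assert.h>   /* static_assert, assert */
#include <stdalign.h> /* alignof */
#include <string.h>   /* memset */

struct foo { int flag; double foo; };
struct bar { int flag; int bar; };
union foobar { struct foo foo; struct bar bar; };

/* A union is at least as strictly aligned and at least as large as each
   of its members, which is exactly why a standalone struct bar may be
   unsuitable storage for a union foobar. */
static_assert(alignof(union foobar) >= alignof(struct bar), "union alignment");
static_assert(sizeof(union foobar) >= sizeof(struct bar), "union size");

/* Hypothetical helper: build a genuine union object from a struct bar. */
union foobar foobar_from_bar(const struct bar *b)
{
    union foobar fb;
    memset(&fb, 0, sizeof fb); /* give the unused tail bytes a known value */
    fb.bar = *b;               /* the union now contains a struct bar */
    return fb;
}

/* Hypothetical helper: read the common initial member through the other
   union member, which 6.5.2.3p6 permits for a real union object. */
int foobar_flag(const union foobar *fb)
{
    return fb->foo.flag;
}
```

With this, something like `union foobar fb = foobar_from_bar(&b);` followed by `foobar_flag(&fb)` inspects the shared member without ever converting a `struct bar *` to a `union foobar *`. The cost is a copy of the struct.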

Anglophile answered 28/5, 2022 at 4:19 Comment(0)

One serious landmine off the top of my head:

#include <assert.h>  // static_assert
#include <string.h>  // memcpy

void copydesc(const union foobar * fb, union foobar * fb_copy) {
  static_assert(sizeof(*fb) == sizeof(union foobar), "broken compiler");
  // NOTE: Buffer overflow when either pointer was cast from a struct smaller than `union foobar`.
  memcpy(fb_copy, fb, sizeof(*fb));
}

Most programmers won't catch this and the few that do are too valuable to waste time on it. And compilers/static-analysis-tools may not warn against your technique.

A safer approach, though still not necessarily bulletproof, is to add a struct containing the shared leading fields, add constructor tests, and list the common struct first in each variant, just as you neatly laid out the shared member first in your example, which also makes the code more readable.

It's a bad sign whenever you find yourself wondering whether some not-so-common construct works correctly. It can be argued that such constructs are useful if you really know what you're doing (that's one feature of C) but this introduces significant risk that an unexpected chain reaction of independently correct parts (like the buffer-overflow above) causes serious issues.
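A rough sketch of the common-struct idea, under the assumption that the shared fields are moved into a header type placed first in each variant. The names `struct common`, `hdr` and `getdesc` are hypothetical and do not come from the question. It relies on the guarantee in 6.7.2.1 that a pointer to a structure object, suitably converted, points to its initial member.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical header type holding the fields shared by every variant. */
struct common { const char *desc; };

/* Each variant lists the common struct as its first member. */
struct foo { struct common hdr; float foo; };
struct bar { struct common hdr; int bar; };

/* Works on the shared header only, so no union and no cast to a larger
   or more strictly aligned type is involved. */
const char *getdesc(const struct common *c)
{
    return c->desc;
}

void printdesc(const struct common *c)
{
    printf("%s\n", getdesc(c));
}
```

A caller can pass `&bb.hdr` directly, or convert a pointer to the whole struct with `(const struct common *)&bb`. Both designate the first member, and since `struct common` is never larger than the enclosing struct, the size problem shown above does not arise.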

Kinch answered 28/5, 2022 at 4:6 Comment(0)