Are C-structs with the same member types guaranteed to have the same layout in memory?

4

21

Essentially, if I have

typedef struct {
    int x;
    int y;
} A;

typedef struct {
    int h;
    int k;
} B;

and I have A a, does the C standard guarantee that ((B*)&a)->k is the same as a.y?

Luisluisa answered 6/11, 2013 at 5:17 Comment(1)
No, I don't think the standard does guarantee that. In practice, compilers will do it as you want and expect, but the standard does not guarantee it. It is undefined behaviour; anything could happen.Airliner
18

Are C-structs with the same member types guaranteed to have the same layout in memory?

Almost yes. Close enough for me.

From n1516, Section 6.5.2.3, paragraph 6:

... if a union contains several structures that share a common initial sequence ..., and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

This means that if you have the following code:

struct a {
    int x;
    int y;
};

struct b {
    int h;
    int k;
};

union {
    struct a a;
    struct b b;
} u;

If you assign to u.a, the standard says that you can read the corresponding values from u.b. It stretches the bounds of plausibility to suggest that struct a and struct b can have different layout, given this requirement. Such a system would be pathological in the extreme.
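As a minimal sketch of that guarantee (the names `union ab` and `read_k_through_union` are made up for illustration and are not from the standard), the following writes through `u.a` and reads the common initial sequence back through `u.b`. Every access goes through the union object itself, which is the case the quoted paragraph covers:

```c
#include <assert.h>

struct a { int x; int y; };
struct b { int h; int k; };

/* Declared at file scope, so the completed union type is visible
   wherever the members are inspected below. */
union ab {
    struct a a;
    struct b b;
};

/* Hypothetical helper: stores via u.a, then inspects the common
   initial sequence via u.b. */
int read_k_through_union(int x, int y)
{
    union ab u;
    u.a.x = x;
    u.a.y = y;
    assert(u.b.h == x);  /* first member of the common initial sequence */
    return u.b.k;        /* second member, same value as u.a.y */
}
```

Calling `read_k_through_union(3, 7)` should give back `7`, since `k` corresponds to `y` in the common initial sequence.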

Remember that the standard also guarantees that:

  • The value of a structure object is never a trap representation (although the value of one of its members may be).

  • Addresses of fields in a structure increase (a.x is always before a.y).

  • The offset of the first field is always zero.
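The last two guarantees can be checked with `offsetof`. This is only a sketch: it assumes a C11 compiler (for `static_assert`), and the function name `members_in_declaration_order` is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct a { int x; int y; };

/* The first member sits at offset zero; checked at compile time. */
static_assert(offsetof(struct a, x) == 0, "first member must be at offset 0");

/* Hypothetical helper: returns 1 if x is laid out before y. */
int members_in_declaration_order(void)
{
    struct a obj = { 1, 2 };
    /* A pointer to the struct, suitably converted, points to its first member. */
    assert((void *)&obj == (void *)&obj.x);
    return offsetof(struct a, x) < offsetof(struct a, y);
}
```

On a conforming implementation the function should return `1`. Note that none of this says how much padding follows `x`; it only fixes the order and the starting offset.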

However, and this is important!

You rephrased the question,

does the C standard guarantee that ((B*)&a)->k is the same as a.y?

No! And it very explicitly states that they are not the same!

struct a { int x; };
struct b { int x; };
int test(int value)
{
    struct a a;
    a.x = value;
    return ((struct b *) &a)->x;
}

This is an aliasing violation.
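If the goal is just to get the value across, one way to sidestep the aliasing problem is to copy the bytes with `memcpy` instead of casting the pointer. The sketch below is an alternative I am suggesting, not something the standard text above prescribes, and `test_memcpy` is a made-up name. It relies on `x` being the first member of both structs (offset zero) and guards the size assumption with a C11 `static_assert`:

```c
#include <assert.h>
#include <string.h>

struct a { int x; };
struct b { int x; };

/* Guard: the byte copy below assumes both structs have the same size. */
static_assert(sizeof(struct a) == sizeof(struct b), "struct sizes differ");

/* Hypothetical replacement for test(): no lvalue of type struct b
   is ever used to access the object a. */
int test_memcpy(int value)
{
    struct a a;
    struct b b;
    a.x = value;
    memcpy(&b, &a, sizeof b);  /* copies the object representation */
    return b.x;
}
```

Optimizing compilers typically turn a small fixed-size `memcpy` like this into a plain register move, so the copy is usually free in practice.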

Kataway answered 6/11, 2013 at 6:3 Comment(11)
Why N1516? I'm referring to N1570…Octant
@Potatoswatter: It's what I had lying around. The language has been there since ANSI C days anyway (section 3.3.2.3).Kataway
If a complete union type declaration containing both struct a and struct b is visible where code inspects the struct member, a conforming and non-buggy compiler will recognize the possibility of aliasing. Some compiler writers who only want to abide by the standard when it suits them will break such code even though the Standard guarantees that it will work; that merely means their compilers are not conforming.Prowler
@Prowler Yes, but not a single compiler (that uses strict aliasing during optimization) I know of implements this rule, so it cannot be relied upon. In the future this clause might be removed. Standards are mostly crap anyway, most compilers do not really follow them.Flameproof
@Ivan: The only two compilers I know of which wouldn't make the CIS rule useful are gcc and clang in their buggy and non-conforming "-fstrict-aliasing" mode. For whatever reason, those compilers have latched onto an interpretation of the aliasing rules which requires compilers to handle an unworkable number of corner cases (many of which gcc and clang don't handle even in Strictly Conforming code), while deliberately ignoring actions which derive one lvalue or pointer from another. What other compilers do you know of where the CIS rule wouldn't be useful?Prowler
@Ivan: Perhaps it would have been helpful if the authors of the Standard, beyond just saying in a footnote that the purpose of the rule is to clarify when compilers must recognize aliasing, had also clarified that deriving an lvalue from another and using the derived lvalue does not involve aliasing unless the derived lvalue is used sometime after the next time code does something with the original, or enters a function or loop that does something with the original. That should perhaps have been obvious in 1989, but compiler writers no longer see things that way.Prowler
@Prowler -fstrict-aliasing is also the default mode for those compilers and its purpose is to allow compilers to use optimizations that standard tries to allow them to use. At least GCC, Clang and ICC do not implement this rule. MSVC does not use strict aliasing, so it is conformant. In my experience C rules that were not adopted by C++ standard are usually buggy. Clang uses C++ semantics in at least few cases (this one and loop termination at least) even when compiling in C mode.Flameproof
@Ivan: I think the authors of C89 intended the rule to be simple and straightforward, but as written it fails to allow even something so basic as someStruct.member = value; unless member has a character type. Problems with this rule became apparent with Defect Report #28, but the response reached a correct conclusion using a totally nonsensical rationale, and that silly rationale formed the basis for C99's unnecessary and unworkable "effective type" rules.Prowler
I don't see how it's an aliasing violation. Both xs will have the same address and will be accessed using the same type - int.Isologous
@wonder.mice: It’s not enough that x has the same type in both. The problem is that a has type struct a, and you’re accessing it through a type of struct b. Here is a link that shows you how a compiler will optimize based on aliasing: gcc.godbolt.org/z/7PMjbT try removing -fstrict-aliasing and seeing how the generated code changes.Kataway
@Isologous If I understand correctly, the problem arises if: - You modified the source member. - The compiler decided to keep the new value of the member in a register and not update the corresponding memory location until a later time. - During that time, you decided to read the same memory location (through another pointer, a pointer that the compiler doesn't know corresponds to the same location) and assign the value to the destination member. Thus, using the outdated value instead of the new one.Theurich
7

Piggybacking on the other replies with a warning about section 6.5.2.3. Apparently there is some debate about the exact wording of anywhere that a declaration of the completed type of the union is visible, and at least GCC doesn't implement it as written. There are a few tangential C WG defect reports here and here with follow-up comments from the committee.

Recently I tried to find out how other compilers (specifically GCC 4.8.2, ICC 14, and clang 3.4) interpret this using the following code from the standard:

// Undefined, result could (realistically) be either -1 or 1
struct t1 { int m; } s1;
struct t2 { int m; } s2;
int f(struct t1 *p1, struct t2 *p2) {
    if (p1->m < 0)
        p2->m = -p2->m;
    return p1->m;
}
int g() {
    union {
        struct t1 s1;
        struct t2 s2;
    } u;
    u.s1.m = -1;
    return f(&u.s1,&u.s2);
}

GCC: -1, clang: -1, ICC: 1 and warns about the aliasing violation

// Global union declaration, result should be 1 according to a literal reading of 6.5.2.3/6
struct t1 { int m; } s1;
struct t2 { int m; } s2;
union u {
    struct t1 s1;
    struct t2 s2;
};
int f(struct t1 *p1, struct t2 *p2) {
    if (p1->m < 0)
        p2->m = -p2->m;
    return p1->m;
}
int g() {
    union u u;
    u.s1.m = -1;
    return f(&u.s1,&u.s2);
}

GCC: -1, clang: -1, ICC: 1 but warns about aliasing violation

// Global union definition, result should be 1 as well.
struct t1 { int m; } s1;
struct t2 { int m; } s2;
union u {
    struct t1 s1;
    struct t2 s2;
} u;
int f(struct t1 *p1, struct t2 *p2) {
    if (p1->m < 0)
        p2->m = -p2->m;
    return p1->m;
}
int g() {
    u.s1.m = -1;
    return f(&u.s1,&u.s2);
}

GCC: -1, clang: -1, ICC: 1, no warning

Of course, without strict aliasing optimizations all three compilers return the expected result every time. Since clang and gcc don't have distinguished results in any of the cases, the only real information comes from ICC's lack of a diagnostic on the last one. This also aligns with the example given by the standards committee in the first defect report mentioned above.

In other words, this aspect of C is a real minefield, and you'll have to be wary that your compiler is doing the right thing even if you follow the standard to the letter. All the worse since it's intuitive that such a pair of structs ought to be compatible in memory.

Economize answered 6/11, 2013 at 8:40 Comment(9)
Thanks a lot for the links, although they're largely inconsequential sadly. For what little it might be worth, the consensus among the few (lay)people I've discussed this with seems to be that it means the function must be passed the union, not raw pointers to the contained types. This, however, defeats the point of using a union in the first place, to my mind. I've got a question about this clause - specifically its notable (and perhaps accidental?) exclusion from C++ - over here: https://mcmap.net/q/246365/-union-39-punning-39-structs-w-quot-common-initial-sequence-quot-why-does-c-99-but-not-c-stipulate-a-39-visible-declaration-of-the-union-type-39/2757035Gorget
Not inconsequential at all! Via a 2nd GCC discussion linked from yours, we see that C++ may have deliberately rejected this - whereas C didn't really think before adding this wording, has never really taken it seriously, & might be reversing it: gcc.gnu.org/bugzilla/show_bug.cgi?id=65892 From there, we get to C++ DR 1719 open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1719 which suggests a major wording change that seems to make C++'s perspective on exactly where such structs can be 'punned' very clear. I've collected this and much more into an answer on my linked questionGorget
@underscore_d: The interpretation used by gcc is nonsensical. Two arbitrary structure types S1 and S2 will generally not appear together in a union declaration unless code somewhere relies upon the CIS guarantee with respect to the structures involved. If S1 and S2 include member m as part of the CIS, then given S1 *p1; S2 *p2;, a presumption that p1->m might alias p2->m does not seem overly pessimistic. If such access weren't required, why declare the union type?Prowler
@Prowler The C union visibility rule is batcrazy. It makes the semantics of a function potentially dependent on definitions that aren't even referenced in the function. It makes code ridiculously brittle. Code should never depend on things it doesn't name. It's the language semantics equivalent of hearsay: someone else used a type in some context, so it means that... horrible. No language should have that.Ceballos
@curiousguy: For the CIS rule to be useful on compilers that are incapable of recognizing the act of deriving a pointer or lvalue of one type from a pointer or lvalue of another as sequenced relative to other actions involving those types, there needs to be a means of telling the compiler "this pointer will identify one of these structure types, and I don't know which one, but I need to be able to use CIS members of one to access CIS members of all of them". Having union declarations serve that purpose in addition to declaring union types would avoid the need to introduce a new directive...Prowler
...for that purpose. Note that the way 6.5p7 is written, given struct foo {int x;} *p, it;, something like p=&it; p->x=4; would invoke UB since it uses an lvalue of type int to modify an object of type struct foo, but the authors of the Standard expect that compiler writers won't be so obtuse as to pretend they shouldn't treat that as defined. The Standard has never made any reasonable attempt to fully specify the full range of semantics that should be supported by an implementation targeting any particular platform and purpose. The nonsensical "effective type" rules can't even...Prowler
...handle the most basic operations on structure members of non-character types. If one were to tweak 6.5p7 to say that any byte of storage which is changed during any particular execution of a function or loop must be accessed within its lifetime exclusively via lvalues that are derived--during that execution--from the same object or elements of the same array, and that all use of a derived lvalue in relation to a byte precede the next use of the parent in relation to that byte, one could ditch everything to do with "effective types" and make things both simpler and more powerful.Prowler
@Prowler Long story short, strict aliasing is a train wreck.Ceballos
@curiousguy: It's the worst symptom of a more general problem: the refusal of the Standard to consider "quality of implementation" issues, and its reliance upon implementers to recognize situations where features and guarantees beyond those mandated by the Standard may be necessary to make an implementation be suitable for various purposes.Prowler
3

This sort of aliasing specifically requires a union type. C11 §6.5.2.3/6:

One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

This example follows:

The following is not a valid fragment (because the union type is not visible within function f):

struct t1 { int m; };
struct t2 { int m; };
int f(struct t1 *p1, struct t2 *p2)
{
    if (p1->m < 0)
          p2->m = -p2->m;
    return p1->m;
}

int g() {
    union {
          struct t1 s1;
          struct t2 s2;
    } u;
    /* ... */
    return f(&u.s1, &u.s2);
}

The requirements appear to be that 1. the object being aliased is stored inside a union and 2. that the definition of that union type is in scope.
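One conservative reading of those two requirements is to hand the function the union itself rather than pointers to its members, so that every access names the union member explicitly. A sketch of that variant follows; `union u12`, `f_union`, and `g_union` are hypothetical names, not part of the standard's example:

```c
#include <assert.h>

struct t1 { int m; };
struct t2 { int m; };

/* File-scope declaration: the completed union type is visible in f_union. */
union u12 {
    struct t1 s1;
    struct t2 s2;
};

/* Hypothetical variant of f that receives the union, not member pointers. */
int f_union(union u12 *p)
{
    if (p->s1.m < 0)
        p->s2.m = -p->s2.m;
    return p->s1.m;
}

int g_union(void)
{
    union u12 u;
    u.s1.m = -1;
    assert(u.s2.m == -1);  /* common initial sequence read via the union object */
    return f_union(&u);
}
```

Here `g_union()` should return `1`, because the compiler can see that `p->s1.m` and `p->s2.m` are members of the same union object. The cost is that `f_union` now only works on objects that really live inside a `union u12`.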

For what it's worth, the corresponding initial-subsequence relationship in C++ does not require a union. And in general, such union dependence would be an extremely pathological behavior for a compiler. If there's some way the existence of a union type could affect a concrete memory model, it's probably better not to try to picture it.

I suppose the intent is that a memory access verifier (think Valgrind on steroids) can check a potential aliasing error against these "strict" rules.

Octant answered 6/11, 2013 at 5:56 Comment(12)
C++ might not stipulate that the union declaration is required, but still it behaves identically to C - not allowing aliasing on 'naked' pointers to union members - via both GCC and Clang. See @ecatmur's test on my question here about why this clause was left out of C++: https://mcmap.net/q/246365/-union-39-punning-39-structs-w-quot-common-initial-sequence-quot-why-does-c-99-but-not-c-stipulate-a-39-visible-declaration-of-the-union-type-39/2757035 Any thoughts readers might have on this difference would be very welcome. I suspect this clause should be added to C++ and was just accidentally omitted for 'inheritance' from C99, where it was added (C90 did not have it).Gorget
@Gorget The visibility part was purposely omitted from C++ because it's widely considered to be ludicrous and unimplementable (or at least distant from the practical considerations of any implementation). Alias analysis is part of the compiler back-end, and declaration visibility is typically only known in the front-end.Octant
I did read about the derisive response to N685 here: gcc.gnu.org/bugzilla/show_bug.cgi?id=65892 Is there an 'on the record' account of the decision to exclude this from C++? (I wonder whether C will end up reversing it, too, since it's never been clearly enough defined or supported to be of any real use anyway.) Thanks in advance for any more concrete info on this.Gorget
@Gorget The folks in that discussion are essentially "on the record" there. Andrew Pinski is a hardcore GCC backend guy. Martin Sebor is an active C committee member. Jonathan Wakely is an active C++ committee member and language/library implementer. That page is more authoritative, clear, and complete than anything I could write.Octant
Thanks! JW was the only person I'd seen before. Because I'm not used to summing up these types of situation, would you agree the following is accurate? (A) The discussion confirms that, all along, the wording without the 'visible declaration' bit was intended to mean 'punning' of qualifying structs in a union must happen via a locally visible instance of said union. (B) N685 was a misreading of that, applied to the union type and aliasing, mandating complexity that most implementors disagreed with and ignored. (C) The C++ reflector quoted shows a conscious decision to ignore N685Gorget
@Gorget The intent of N685 isn't particularly clear, since it doesn't go into much depth as to why its proposed words actually solve the problem. C++, which omits the N685 wording, is also undecided (or perhaps finally reaching consensus) as to what can be done with pointers into the initial subsequence. The reflector quote shows someone deriving proper rules from practicalities, not the standard. The C and C++ committees (via Martin and Clark) will try to find a consensus and hammer out wording so the standard can finally say what it means.Octant
@Potatoswatter: What's the ambiguity? The "union visibility" rule defines a means by which structures that need to allow interchangeable access to a common initial sequence can indicate that requirement. If the requirement were that something be accessed through the union, then the requirement that its declaration be "visible" would be meaningless since one can't access things through a union whose declaration isn't visible.Prowler
@Potatoswatter: What's really needed is a recognition that the purpose of the rule was to avoid forcing overly pessimistic aliasing assumptions in cases where a compiler had no reason to expect aliasing. Unless the example in the rationale was deliberately disingenuous, I see no reason to believe that the authors of the Standard wouldn't have intended that, given foo *p,*q;, a presumption that ((bar*)p)->x=5; might alter q would be considered "overly pessimistic", or that a quality compiler shouldn't be capable of recognizing aliasing in such cases.Prowler
@Prowler Are you suggesting that rule violations should be permitted when they are done openly? That position seems to be obvious to Linus but it's shocking at some level.Ceballos
@curiousguy: Given the stated purpose of the rule, and given that the rationale for at least the C Standard makes clear that UB was intended to allow the makers of quality implementations to use their judgment in supporting useful behaviors beyond those mandated by the Standard, and given that the rules as written (at least in the C Standard) don't even mandate the behavior of something like struct S {int x;} s={0}; s.x=1; as UB because it modifies an object of type struct S using an lvalue of type int, in violation of 6.5p7, I think it's pretty clear...Prowler
...that the authors did not intend 6.5p7 to fully describe all cases compilers should support. Instead, they expected that compiler writers would be able to better judge the situations when they should recognize an access to a derived pointer or lvalue as being an access or potential access to the original value. The problem is that some compiler writers have gotten a warped idea that the Standard was ever intended to fully describe all behaviors programmers should expect from quality implementations, even though the rationale makes it clear that was not the case.Prowler
@curiousguy: To put it another way, the authors of the Standard didn't mandate that compilers should recognize the use of derived lvalues in situations where they can see them because (1) they thought it obvious that compilers should recognize such uses in cases where they can see them, and (2) such a requirement would be meaningless without a spec as to what cases compilers should notice, but (3) they thought the exact range of cases where compilers notice such derivation to be a Quality-of-Implementation matter.Prowler
0

I want to expand on @Dietrich Epp 's answer. Here is a quote from C99:

6.7.2.1 point 14 ... A pointer to a union object, suitably converted, points to each of its members ... and vice versa.

Which means we can copy the memory from a struct to a union containing it:

#include <string.h>  /* for memcpy */

struct a
{
    int foo;
    char bar;
};

struct b
{
    int foo;
    char bar;
};

union ab
{
    struct a a;
    struct b b;
};

void test(struct a *aa)
{
    union ab ab;
    memcpy(&ab, aa, sizeof *aa);

    // ...
}

C99 also says:

6.5.2.3 point 5 One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence ..., and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the complete type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types .... for a sequence of one or more initial members.

Which means the following will also be legal after the memcpy:

ab.a.bar;
ab.b.bar;

The struct could be initialized in a separate translation unit and the copying is done in the standard library (out of the control of the compiler). Thus, memcpy will copy byte-by-byte the value of the object of type struct a and the compiler has to ensure the result is valid for both structs. The compiler cannot do anything other than generate instructions that read from the corresponding memory offset for both of those lines, thus the address needs to be the same.

Even though it is not stated explicitly, I would say the standard implies that C-structs with the same member types have the same layout in memory.
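Since this is an inference rather than an explicit guarantee, it may be worth letting the compiler confirm it for the types you actually care about. The sketch below assumes C11 (`static_assert`) and uses an invented helper name, `layouts_match`; it fails to compile if the two structs ever disagree on member offsets or total size:

```c
#include <assert.h>
#include <stddef.h>

struct a { int foo; char bar; };
struct b { int foo; char bar; };

/* Compile-time checks of what this particular implementation does. */
static_assert(offsetof(struct a, foo) == offsetof(struct b, foo), "foo offsets differ");
static_assert(offsetof(struct a, bar) == offsetof(struct b, bar), "bar offsets differ");
static_assert(sizeof(struct a) == sizeof(struct b), "struct sizes differ");

/* Hypothetical runtime mirror of the checks above: 1 if the layouts agree. */
int layouts_match(void)
{
    return offsetof(struct a, bar) == offsetof(struct b, bar)
        && sizeof(struct a) == sizeof(struct b);
}
```

Passing these checks only says the layouts agree on the current compiler and target. It does not make pointer casts between `struct a *` and `struct b *` any more legal under the aliasing rules.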

Bovine answered 4/9, 2022 at 10:42 Comment(0)
