Common initial sequence and alignment

While thinking of a counter-example for this question, I came up with:

struct A
{
    alignas(2) char byte;
};

But if that's legal and standard-layout, is it layout-compatible with this struct B?

struct B
{
    char byte;
};

Furthermore, if we have

struct A
{
    alignas(2) char x;
    alignas(4) char y;
};
// possible alignment, - is padding
// 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15
//  x  -  -  -  y  -  -  -  x  -  -  -  y  -  -  -

struct B
{
    char x;
    char y;
}; // no padding required

union U
{
    A a;
    B b;
} u;

Is there a common initial sequence for A and B? If so, does it include A::y & B::y? I.e., may we write the following w/o invoking UB?

u.a.y = 42;
std::cout << u.b.y;

(answers for C++1y / "fixed C++11" also welcome)


  • See [basic.align] for alignment and [dcl.align] for the alignment-specifier.

  • [basic.types]/11 says for fundamental types "If two types T1 and T2 are the same type, then T1 and T2 are layout-compatible types." (an underlying question is whether A::byte and B::byte have layout-compatible types)

  • [class.mem]/16 "Two standard-layout struct types are layout-compatible if they have the same number of non-static data members and corresponding non-static data members (in declaration order) have layout-compatible types."

  • [class.mem]/18 "Two standard-layout structs share a common initial sequence if corresponding members have layout-compatible types and either neither member is a bit-field or both are bit-fields with the same width for a sequence of one or more initial members."

  • [class.mem]/18 "If a standard-layout union contains two or more standard-layout structs that share a common initial sequence, and if the standard-layout union object currently contains one of these standard-layout structs, it is permitted to inspect the common initial part of any of them."
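For contrast with the alignas case, here is a minimal sketch of the uncontroversial use of the rule quoted above, where no alignment-specifier is involved. The names Circle, Label, Shape and read_kind_through_label are hypothetical and only serve as illustration; the common initial sequence here is just the first member, kind.

```cpp
// Hypothetical types: both are standard-layout and start with an int,
// so they share a common initial sequence of exactly one member (kind).
struct Circle { int kind; double radius; };
struct Label  { int kind; char text; };

union Shape {
    Circle circle;
    Label  label;
};

// Initializes the first union member (circle), then inspects the common
// initial part through the other member, which the quoted rule permits.
int read_kind_through_label() {
    Shape s = { Circle{1, 2.0} };
    return s.label.kind;  // kind is in the common initial sequence
}
```

Reading s.label.text here would not be covered, because double and char are not layout-compatible and the sequence ends after kind.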

Of course, on a language-lawyer level, another question is what it means that the inspection of the common initial sequence is "permitted". I guess some other paragraph might make the above u.b.y undefined behaviour (reading from an uninitialized object).

Salyers answered 1/2, 2014 at 15:33 Comment(12)
I don't think this is a good example. The structure with an int and a char has int alignment. That alignas(2) attribute for char byte as a first element is a no-op because that first element already has alignas(int) alignment. A possibly better example: struct A {int x; alignas(double) char byte;}; - Damn
@DavidHammen Ouch, true, I meant to add padding after the byte. Fixing... - Salyers
@DavidHammen I hope the example is better now. - Salyers
There is no padding in front of x (regarding the ASCII art where x is at 02). - Alveta
@DieterLücking Hmm no, that would be illegal. There can be no padding at the beginning of a standard-layout struct. But the Standard doesn't allow the "odd" alignment I had in mind either, so I've removed that line. The remaining one is the alignment g++ and clang++ seem to be using. - Salyers
Hmm. I thought I had an answer, but then I thought some more. The more I look at the standard, the more there appears to be a misalignment problem. Is alignas a part of the type-id or not? In some places it appears that this is the case; in others, it appears that this definitely is not the case. - Damn
@DavidHammen Yeah.. I started wondering about the whole issue when I tried static_assert(std::is_same<decltype(A::byte), char>::value, "!"); which then led to this question. - Salyers
Side note: If a class using alignas on its members is not intended to be standard-layout, then sizeof(A) could be four, with the second member at offset 0, and the first at offset 2. Somewhat more relevant note: the current wording of "standard-layout" already makes the literal requirements unimplementable for other reasons. Details here. I looked for open issues regarding alignment too, but found nothing of interest. - Kingofarms
Nasty. But yeah, looks like the Standard doesn't address this well enough. The "obvious" intent is to define layout-compatible structs and common initial sequences as involving the same base classes with same alignments and the same member types with same alignments. - Insensibility
@Tshepang According to the tag wiki, [union] is for SQL UNION, whereas [unions] is for C, C++ etc. unions. - Salyers
That feels forced @dyp. We need better tags, maybe c-union. - Parvis
@Tshepang I agree. Maybe [union] should be replaced by [SQL-UNION] and [unions] by [c-union]. Maybe there's been some discussion on meta? Otherwise, it might be worth a new question there. Edit: just upvoted your suggestion :) - Salyers
C
2

I cannot speak for the C++11 standard, but I am a firmware/microchip programmer and have had to use features of this kind, which have existed for a long time (pragma pack, alignment attributes).

Using alignas cannot be considered "standard layout", thus all the implications are useless. Standard layout means one fixed alignment distribution (per architecture - usually all is align(min(sizeof,4)) or some may be align(8)). The standard probably wants to say what is obvious: without using special features (align,pack) structures are compatible on the same architecture if they appear to be the same (same types in same order). Otherwise, they may or may not be compatible - depending on architecture (may be compatible on one architecture but different on another).

Consider this struct:

struct foo{ char b; short h; double d; int i; };

On one architecture (e.g. x86 32bit) it is what it seems to be, but on Itanium or ARM it actually looks like this:

struct foo{ char b, _hidden_b; short h; int _maybe_hidden_h; double d; int i; };

Notice _maybe_hidden_h: it can be omitted in the older AEABI (align to max 4), or be present for 64-bit (8-byte) alignment.

x86 Standard Layout (pack(1)):

alignas(1) char b; alignas(1) short h; alignas(1) double d; alignas(1) int i;  

32bit Alignment Standard Layout (pack(4) - ARM architecture, older version - EABI)

alignas(1) char b; alignas(2) short h; alignas(4) double d; alignas(4) int i;

64bit Alignment Standard Layout (pack(8) - Itanium and newer ARM/AEABI)

alignas(1) char b; alignas(2) short h; alignas(8) double d; alignas(4) int i;
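Rather than relying on a per-architecture table, the actual layout can be queried from the compiler in use. The following is only a sketch: print_foo_layout is a hypothetical helper name, and the numbers it prints depend on the target ABI, so no expected output is given.

```cpp
#include <cstddef>
#include <cstdio>

struct foo { char b; short h; double d; int i; };

// Prints where the compiler actually placed each member on this target.
void print_foo_layout() {
    std::printf("b: offset %zu, align %zu\n", offsetof(foo, b), alignof(char));
    std::printf("h: offset %zu, align %zu\n", offsetof(foo, h), alignof(short));
    std::printf("d: offset %zu, align %zu\n", offsetof(foo, d), alignof(double));
    std::printf("i: offset %zu, align %zu\n", offsetof(foo, i), alignof(int));
    std::printf("sizeof(foo) = %zu, alignof(foo) = %zu\n", sizeof(foo), alignof(foo));
}

// Properties that hold regardless of the ABI's padding choices.
static_assert(offsetof(foo, b) == 0, "no padding before the first member");
static_assert(offsetof(foo, h) % alignof(short) == 0, "h is suitably aligned");
static_assert(sizeof(foo) % alignof(foo) == 0, "size is a multiple of alignment");
```

Calling print_foo_layout() from main on two different targets shows whether the hidden padding described above is present on each of them.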

To your example:
offsetof(A,y) = 4 while offsetof(B,y) = 1, and the union does not change that (thus &u.a.y != &u.b.y)
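The offsets for the question's structs can be checked at compile time. This is a sketch under the assumption of a mainstream ABI that inserts only the padding alignment requires; the constants offset_a_y, offset_b_y and y_offsets_match are names introduced here for illustration.

```cpp
#include <cstddef>

struct A {
    alignas(2) char x;
    alignas(4) char y;
};

struct B {
    char x;
    char y;
};

union U {
    A a;
    B b;
};

// Where each struct places its second member.
constexpr std::size_t offset_a_y = offsetof(A, y);
constexpr std::size_t offset_b_y = offsetof(B, y);

// True only if u.a.y and u.b.y would occupy the same byte of the union.
constexpr bool y_offsets_match = (offset_a_y == offset_b_y);

// A::y must sit on a 4-byte boundary and cannot overlap A::x at offset 0.
static_assert(offset_a_y % 4 == 0 && offset_a_y != 0, "A::y is at offset 4 or later");
static_assert(offsetof(A, x) == 0 && offsetof(B, x) == 0, "both x members at offset 0");
```

If y_offsets_match is false on a given compiler, a write to u.a.y cannot be observed through u.b.y without extra bookkeeping, whatever the wording about common initial sequences says.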

Candlewood answered 26/7, 2014 at 11:3 Comment(0)
R
2

It looks like a hole in the standard. The responsible thing would be to file a defect report.

Several things, though:

  • Your first example doesn't really demonstrate a problem. Adding a short after the char would also have the effect of aligning the char to a 2-byte boundary, without changing the common subsequence.
  • alignas is not C++-only; it was added simultaneously to C11. Since the standard-layout property is a cross-language compatibility facility, it is probably preferable to require corresponding alignment specifiers to match than to disqualify a class with a nonstatic member alignment-specifier.
  • There would be no problem if the member alignment specifiers appertained to the types of the members. Other problems may result from the lack of adjustment to types, for example a function parameter ret fn( alignas(4) char ) may need to be mangled for the ABI to process it correctly, but the language might not provide for such adjustment.
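To illustrate the last point: when the alignment is attached to a class type instead of to a member declaration, the type system already distinguishes the two cases. A minimal sketch, with AlignedChar, A2 and B2 as hypothetical names that do not appear in the question:

```cpp
#include <type_traits>

// Alignment attached to a type: AlignedChar is a distinct type from char.
struct alignas(4) AlignedChar { char value; };

struct A2 { char x; AlignedChar y; };
struct B2 { char x; char y; };

static_assert(!std::is_same<AlignedChar, char>::value,
              "the aligned wrapper is a different type");
static_assert(alignof(AlignedChar) == 4, "alignment travels with the type");
static_assert(std::is_standard_layout<A2>::value &&
              std::is_standard_layout<B2>::value,
              "both structs remain standard-layout");
```

With this formulation the second members have different types, so the layout-compatibility wording has something to compare.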
Revue answered 27/7, 2014 at 12:10 Comment(1)
Oops, of course. The first example was the more language-lawyery formulation of the problem that manifests in the second example. - Salyers
D
0

(an underlying question is whether A::byte and B::byte have layout-compatible types)

Yes. This is the essential part. The alignas attribute appertains to the entity declared, not to its type. This can easily be tested with std::is_same and decltype.
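A minimal sketch of that test, using the structs from the first example in the question. The checks on alignment are an addition for illustration and only assert what the alignment-specifier guarantees.

```cpp
#include <type_traits>

struct A { alignas(2) char byte; };
struct B { char byte; };

// The alignment-specifier does not show up in the member's declared type.
static_assert(std::is_same<decltype(A::byte), char>::value,
              "A::byte is still declared as char");
static_assert(std::is_same<decltype(A::byte), decltype(B::byte)>::value,
              "both members have the same type");

// Both classes are reported as standard-layout.
static_assert(std::is_standard_layout<A>::value, "A is standard-layout");
static_assert(std::is_standard_layout<B>::value, "B is standard-layout");

// The alignment does affect the enclosing class, though.
static_assert(alignof(A) % 2 == 0, "A is at least 2-byte aligned");
static_assert(sizeof(A) % 2 == 0, "so its size is a multiple of 2");
```

So the member types compare equal, while the enclosing classes can still differ in size and alignment.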

I.e., may we write the following w/o invoking UB?

This is therefore not UB; the relevant paragraphs have been quoted by you.

EDIT: Pardon me, this can of course result in UB, because the padding between members is not defined (or is implementation-defined) (§9.2/13)! I accidentally misread the example: I thought it accessed x instead of y. With x it actually always works, whereas with y it theoretically doesn't have to (though in practice it always will).

Disannul answered 2/5, 2014 at 15:44 Comment(7)
How is that implemented, then? u.a.y = 42; writes to the second byte of the structure; if u.b.y shall contain the same value one would need either to track the active member of the union or also write to the fifth byte of the structure in u.a.y = 42;, right? - Salyers
Oh, wait a minute, I just realized your example is completely different from what I thought. You use y instead of x. Well, now that is NOT defined, since there can be any padding between members! I'll add that to the post as well :) - Disannul
Yes, the padding between members is implementation-defined. But if there's a common initial sequence, you might access all those members of this sequence (not just the first). The question now is: What is the common initial sequence for those two structs? Does it include the second member? I'd say no, but where is this specified? - Salyers
You gave the necessary quote yourself: if the members have layout-compatible types. And since the types are the same, you may modify them. Alignment is automatically taken from the strictest requirement of all the union members. Still, I find this weird, since it would make assumptions about the compatibility of structs... - Disannul
So you mean that u.a.y will also have an alignment of 4 since u.a is in the same union as u.b? Then other assignments like A x; x = u.a; would have to be changed. I'm not yet convinced how this is all intended to fit together. - Salyers
Yes, it will have the same alignment. Otherwise this rule couldn't apply. Or the standard somehow does not specify precisely enough how layout-compatibility is related to the alignment of objects declared with alignas. - Disannul
I suspect the latter is the case ;) That's why I keep asking.. Well, maybe I should ask this on the isocpp mailing list to see whether it's considered a defect or not. - Salyers
