What happened to the "aggregate or union type that includes one of the aforementioned types" strict aliasing rule?

Previously, in basic.lval, there was this bullet point:

an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),

In the current draft, it is gone.

There is some background information at WG21's site: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1359r0.html#2051:

The aliasing rules of 7.2.1 [basic.lval] paragraph 10 were adapted from C with additions for C++. However, a number of the points either do not apply or are subsumed by other points. For example, the provision for aggregate and union types is needed in C for struct assignment, which in C++ is done via constructors and assignment operators in C++, not by accessing the complete object.

Can anyone explain what this means? What does this strict aliasing rule have to do with struct assignment in C?

cppreference says about this rule:

These bullets describe situations that cannot arise in C++

I don't understand why this is true. For example,

struct Foo {
    float x;
};

float y;
float z = reinterpret_cast<Foo*>(&y)->x;

The last line seems to do what the bullet point describes. It accesses y (a float) through an aggregate, which includes a float (member x).

Can anyone shed some light on this?

Freitag answered 3/7, 2019 at 22:12 Comment(6)
The first bullet point never made much sense anyway (in either language) so I'm glad to see it gone! - Actinomycete
@Actinomycete could you explain the "in either language" claim? - Chondrule
@LanguageLawyer in C or C++ - Actinomycete
@Actinomycete I mean why it does not make sense in C - Chondrule
@LanguageLawyer say you have struct S { int a; float b; }; and float *c = malloc(100); and int and float are the same size with no padding; then c[0] = c[1] = 0; struct S s = *(struct S *)c; is not an aliasing violation according to that text, because the float c[0] is accessed through an aggregate type that has a float as a member (never mind that the offset of the member doesn't correspond to the memory being accessed) - Actinomycete
@M.M: If one assumes that a compiler that can see that an lvalue is derived from another will treat an access to the former (for purposes of N1570 6.5p7) as though made using the latter, then an access to myUnion.member1 would be an access to myUnion, which is a union type that contains myUnion.member2, and thus such an access would be allowed to modify the stored value of the latter. - Ocasio

The lvalue you use to access the stored value of y is not *reinterpret_cast<Foo*>(&y), which has type Foo, but reinterpret_cast<Foo*>(&y)->x, which has type float. Accessing a float through an lvalue of type float is fine. In C++, you cannot "access the value of a union or struct" as a whole; you can only access individual members. The rationale you quoted points to a difference between C and C++:

  struct X { int a, b; };
  struct X v1 = {1, 2}, v2;
  v2 = v1;

In C, the standard says that the assignment loads the value of v1 (as a whole) in order to assign it to v2. Here the values of the objects v1.a and v1.b (both of type int) are accessed using an lvalue of type struct X (which is not int).

In C++, the standard says that the assignment calls the compiler-generated assignment operator, which is equivalent to

struct X {
   ...
   struct X& operator=(const struct X& other)
   {
       a = other.a;
       b = other.b;
       return *this;
   }
};

In this case, calling the assignment operator does not access any value, because the RHS is passed by reference. And executing the assignment operator accesses the two int fields separately (which is fine, even without the aggregate rule), so this is again not accessing a value through an lvalue of type struct X.
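As a rough illustration of the two points above, here is a minimal sketch, assuming C++11 or later. The counter X::assignments and the helper assignment_uses_operator are hypothetical names introduced only for this example. The static_asserts check, without evaluating or dereferencing anything, that the member access expression is an lvalue of type float while the dereferenced pointer alone would be an lvalue of type Foo. The hand-written operator= stands in for the compiler-generated one so that the call becomes observable.

```cpp
#include <type_traits>
#include <utility>

struct Foo {
    float x;
};

// Unevaluated operands only: no pointer is actually dereferenced here.
// p->x is an lvalue of type float, so decltype((p->x)) is float&.
static_assert(std::is_same<decltype((std::declval<Foo*>()->x)), float&>::value,
              "member access yields an lvalue of type float");
// *p on its own is an lvalue of type Foo.
static_assert(std::is_same<decltype((*std::declval<Foo*>())), Foo&>::value,
              "dereferencing yields an lvalue of type Foo");

struct X {
    int a, b;
    static int assignments;  // hypothetical counter, for illustration only

    // Hand-written stand-in for the implicitly generated copy assignment:
    // each int member is read and written through an lvalue of type int.
    X& operator=(const X& other) {
        a = other.a;
        b = other.b;
        ++assignments;
        return *this;
    }
};
int X::assignments = 0;

// Returns true if "v2 = v1" went through X::operator= exactly once
// and copied both members.
bool assignment_uses_operator() {
    X v1 = {1, 2};
    X v2 = {0, 0};
    const int before = X::assignments;
    v2 = v1;  // a call to X::operator=, not an access through an lvalue of type X
    return X::assignments == before + 1 && v2.a == 1 && v2.b == 2;
}
```

Calling assignment_uses_operator() should return true: the assignment is an ordinary member function call, and the only stored values read inside it are the two int members.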

Sabbatarian answered 3/7, 2019 at 22:32 Comment(8)
What would the rule say in circumstances where storage received from malloc is written using one PODS and read with another that has the same layout? - Ocasio
I agree with this answer, but I think an unwritten assumption in the question is that A.B "accesses" A (and so P->B "accesses" *P). I don't see anything in the Standard that clearly states that's not so, but the fact that A.B on its own does not imply modification or lvalue-to-rvalue of anything supports it. - Kamin
@Ocasio This rule still is OK with it. If the two types are layout-compatible, the individual members are layout-compatible. If they are layout-compatible, accessing members from one POD type through a pointer to a different POD type is covered by an earlier bullet in the "aliasing clause". - Sabbatarian
@MichaelKarcher: From what I can tell, neither gcc nor clang supports such usage. What's needed to make the C rule work is recognition that an access via pointer which is visibly derived from an lvalue should be considered as an access to that lvalue, but gcc doesn't support that usage either. - Ocasio
What if Foo is struct Foo { char c; float x; };? Can't we say that in float z = reinterpret_cast<Foo*>(&y)->x; we access y through an lvalue of type float and so this is fine, too? - Gleaning
No, it's not fine. There is no float object at reinterpret_cast<Foo*>(&y)->x, so this access is undefined behaviour. But it's not the rule discussed in this question that forbids it. - Sabbatarian
float y; reinterpret_cast<Foo*>(&y)->x is finally fixed in cplusplus.github.io/CWG/issues/2535.html - Chondrule
That's a strange CWG issue actually, since int i = 0; int j = reinterpret_cast<C&>(i).m; is well defined (as well as float y; reinterpret_cast<Foo*>(&y)->x), as shown here. They have probably changed the wording for the UB they put in the example. - Ulloa
