On the term "(strict) aliasing violation" relating to class member access

This question refers to the current C++20 draft. The quoted passages have been slightly modified from previous standard iterations, but not in relevant ways as far as I know.

I am looking for clarification on the terms "aliasing violation" or "strict aliasing violation".

My previous impression was that these terms refer specifically to violations of the standard paragraph [basic.lval]/11:

If a program attempts to access ([defns.access]) the stored value of an object through a glvalue whose type is not similar ([conv.qual]) to one of the following types the behavior is undefined:

  • the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to the dynamic type of the object, or
  • a char, unsigned char, or std​::​byte type.

If a program invokes a defaulted copy/move constructor or copy/move assignment operator for a union of type U with a glvalue argument that does not denote an object of type cv U within its lifetime, the behavior is undefined. [ Note: Unlike in C, C++ has no accesses of class type. — end note ]

The note at the end is clarified further through notes and normative references in [defns.access] to mean that only scalar types can be accessed. See also the notes section on the related cppreference page.

Therefore it seems to me that the paragraph can never apply to access of non-static members of classes through the wrong class type and that "(strict) aliasing violation" cannot apply e.g. to the following example:

struct A { void f() {}; int a; };
struct B { void f() {}; int a; };

int main() {
    auto a = new A; // aligned for all non-overaligned types
    B& b = reinterpret_cast<B&>(*a);
    b.f();   //1
    b.a = 1; //2
}

Assume that sizeof(A) == sizeof(B) and offsetof(A, a) == offsetof(B, a) (although I don't think either makes a difference and the latter is trivial because the classes are standard-layout). Another variation of interest would be struct A { };, but again I think there won't be a difference.
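The layout assumptions above could be checked at compile time roughly as follows. This is only a sketch: the constant name layouts_match is hypothetical, and the sizeof equality is an assumption that holds on common implementations rather than something the standard guarantees.

```cpp
#include <cstddef>      // offsetof
#include <type_traits>  // std::is_standard_layout

struct A { void f() {} int a; };
struct B { void f() {} int a; };

// Member functions do not affect layout, so both classes are standard-layout
// and offsetof is well-defined for them.
static_assert(std::is_standard_layout<A>::value, "A must be standard-layout");
static_assert(std::is_standard_layout<B>::value, "B must be standard-layout");

// Hypothetical helper constant (not part of the original question):
// true when A and B have the same size and place `a` at the same offset.
constexpr bool layouts_match =
    sizeof(A) == sizeof(B) && offsetof(A, a) == offsetof(B, a);

static_assert(layouts_match, "assumed: A and B share size and member offset");
```

Even when these checks pass, they say nothing about whether accessing an A through a B glvalue is defined; they only pin down the layout premise of the question.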

In the case of b.f() there is no access to any scalar object at all. According to [expr.ref], in the latter case the referenced object is supposed to be the data member a of the B object referenced by b, but since that clearly doesn't exist, I suppose forming the member access might already be undefined behavior?

Both //1 and //2 should clearly be undefined behavior in some way though. I think //1 violates [class.mfct.non-static]/2, but I am not exactly sure which paragraph //2 violates exactly.

Which standard paragraphs do the two lines //1 and //2 violate exactly and is this violation considered to be covered by the terms "(strict) aliasing violation" as well?

Vyse answered 27/2, 2020 at 17:56 Comment(23)
You are correct about both. //2 is undefined by omission. //1 is undefined by the paragraph you cite.Relent
What storage would you be expecting //2 to access even in the absence of the so-called "aliasing" rules?Purehearted
@Purehearted I don't expect it to access anything. I expect it to be undefined behavior whether or not the aliasing rule as quoted above exists. My question is 1. whether it is made explicitly undefined by a standard clause or is just undefined by omission (the latter according to T.C.) and 2. whether one should call this an "aliasing violation" when there never actually is anything being aliased. I could have given A an int member as well and I think in that case referring to it as aliasing violation would be common, but the quoted aliasing rule is still not violated by that.Vyse
@walnut: With regard to #1, neither the C nor C++ Standard makes any effort to specify all of the constructs which do nothing with no side effects, but instead they only consider those which might (in the absence of a specification) have "accidental" side-effects (such as calling free(0) which is explicitly specified as a no-op). With regard to #2, I've seen nothing that would suggest that C++ was not intended to retain C's ability to implicitly create objects of structure type in untyped storage by writing the members thereof without having to first use a trivial constructor upon them...Purehearted
...but I don't think the Standard adequately describes how structure-member access operators work on lvalues that don't "yet" identify objects of the proper type, and I doubt that it would be possible for the Committee members to ever reach a consensus with regard to whether certain tricky corner cases should be expected to behave predictably. What's especially sad is the refusal of some people to recognize that some constructs that compilers with optimizations disabled almost unanimously process in the same predictable fashion offer useful semantics that should be, but aren't...,Purehearted
...available via any Standard-recognized means. Such recognition, even if combined with introduction of new constructs to support such semantics and deprecation of the old ones, would imply that quality compilers should seek to support code which hasn't been rewritten to use the new constructs.Purehearted
Even if you remove all "aliasing rules", you still can't use non existing objects. And you can't implicitly create an object by assignment either (as is (or should be) done implicitly w/ unions) w/o storage for the object. And b.a doesn't refer to a union member, so it can't be an lvalue "in the future": a normal lvalue must already refer to an object. Only few specific lvalues can be in the future.Judijudicable
Anyway, even w/o your two "clearly be undefined behavior" lines, the reinterpret_cast<B&>(a) itself looks like a potential alignment violation, as there is no reason why an A would always be sufficiently aligned to store a B. (And you don't even create a B here, but you couldn't anyway.)Judijudicable
Now, even if you managed to solve alignment and make the cast legit, and get a "proper" reference to the wrong type, do you have storage sufficient for an int? If you do, you can use placement new for an int. And if you have storage for a B, you can create a B. (Careful w/ automatic dtor invocation at the end of scope if you create objects that way. Here dtors are trivial so it's OK.) In that case, cast+assignment is not expected to implicitly create objects, even w/ a fixed std. (Std is known to be defective WRT lifetime.)Judijudicable
@Judijudicable Good catch with the alignment issue, I forgot about it. That is fixed now. Of course I know that there is no B object and that we can therefore not access any B member. My question was whether this is stated explicitly somewhere in the standard or whether it is UB by omission. (T.C. pretty much answered that in the first comment). And secondly I want to know whether this (and the function call) would be called "aliasing violations". The question results from an answer I wrote mentioning that "aliasing violations" can only apply to scalar objects, that was heavily downvoted.Vyse
@Vyse You can try to interpret the rule as not applying to non union classes (and it's very intuitive to do so), but you can't exclude unions. Like scalar, unions must clearly be accessed as such, as there is no way to do member-wise copy of unions, unlike structs.Judijudicable
b.a = 1; accesses out of bounds (at best), so it is moot whether there is also a strict aliasing problemPopular
@Judijudicable traditionally there have been no rules about alignment of casts in C++, since the form of the strict aliasing rule already excludes any possible code that produces a misaligned access.Popular
@Popular The standard states that the reinterpret_cast I am doing here results in an unspecified value if not properly aligned (see timsong-cpp.github.io/cppwp/n4659/expr.static.cast#13). Using such a value in any way would cause undefined behavior, making my points moot.Vyse
Regarding your first comment: I have added a variation where both classes have (probably) same layout, in which case it would make sense to talk about accessing that memory to some degree. But still I think this doesn't change anything and my conclusion is, similar to yours, that the aliasing rule is never relevant in such a case. This seems to be contra to how the term "strict aliasing violation" is commonly used though. I see it often used in relation to access on results from "invalid" reinterpret_casts on class types.Vyse
@Vyse the code in your question is correctly aligned however (since you used new) . IMO it'd be clearer if the question just used your second definition of A, since there can then be no issue with accessing out of bounds like the first version does havePopular
The rule has always been very underspecified and so required a healthy dose of intent in applying it. For example, in C, equivalent code has been ruled out by interpreting that (*p).x accesses *p and the rule applies to *p. I have no idea whatsoever about the intent of the latest changes in C++ though, so it seems we're lost in the murk for now...Popular
@Popular I fixed the alignment issue after curiousguy commented about it. I have also exchanged the two variations of the code as you suggested.Vyse
@Vyse OK. The text in expr.ref/6.2 seems to define b.a via the text "the expression designates the named member of the object designated by the first expression". The first expression b does now designate an object (albeit an object of type A, not B), and that object does have a named member a. So I'm not sure on what grounds T.C.'s argument of "undefined by omission" would now apply. Of course I do strongly doubt that this was the intent since A's member a might be somewhere elsePopular
@Popular eel.is/c++draft/expr#ref-4 specifies that the named member is referring to the one of the class type of the expression, not of the dynamic type of the object pointed-to. (And I think we agree that there can't be any magical connection between same-name members in different classes, right?) The omission here would be in that this sentence has no meaning if E1 doesn't actually refer to an object of the expression type, I guess.Vyse
@Vyse expr.ref/4 says that the id-expression shall name a member of the class of E1's type -- which it does -- but says nothing about the meaning of E1.E2 (which is covered by expr.ref/6). I'm skeptical of any long chain of inference based on nuances of wording of multiple paragraphs. It'd be nice if the standard just clearly stated this sort of thing.Popular
@Relent would you care to write an answer for the updated code? Particularly, laying out how the paragraphs which would make a.a = 1; defined don't apply to b.a = 1Popular
The "named member" is B::a. There's no B::a in an A.Relent

What aliasing is

I am looking for clarification on the terms "aliasing violation" or "strict aliasing violation".

That's going to be difficult, because these terms are vague and colloquial in C++. The only use in normative wording is [valarray.syn] p2 which is a pretty much meaningless paragraph.

[basic.lval] p11 (the paragraph you've quoted) has a somewhat relevant footnote though:

The intent of this list is to specify those circumstances in which an object can or cannot be aliased.

Generally, aliasing means that overlapping memory is reachable through two pointers or references, so that accesses through one can interfere with accesses through the other. Informally, it is the opposite of what restrict promises in C.

To reduce the amount of possible aliasing, C++ makes it so that e.g. int* and float* cannot alias each other, since accessing an int through a glvalue of type float would be UB.
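As a rough illustration of why this matters, here is a minimal sketch. The function names store_both, float_bits and example_usage are hypothetical, and float_bits assumes a 32-bit float. The first function shows the kind of code where a compiler is allowed to assume no overlap; the second shows a well-defined way to inspect a float's bytes without reading it through an unrelated glvalue type.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reading an int through a float glvalue would be UB, so an optimizer may
// assume that *i and *f never overlap and return 1 without reloading *i.
int store_both(int* i, float* f) {
    *i = 1;
    *f = 2.0f;
    return *i;
}

// Well-defined alternative to type punning through a cast: copy the bytes
// of the float into an integer object of the same size.
std::uint32_t float_bits(float value) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Hypothetical usage with two distinct objects, which is always fine.
void example_usage() {
    int i = 0;
    float f = 0.0f;
    assert(store_both(&i, &f) == 1);
    assert(f == 2.0f);
    assert(float_bits(1.0f) != float_bits(2.0f));
}
```

Calling store_both with a pointer obtained by casting the address of the same object to both types would be exactly the situation the rule forbids.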

See also What is the Strict Aliasing Rule and Why do we care?

The nuances of accessing

The note at the end is clarified further through notes and normative references in [defns.access] to mean that only scalar types can be accessed.

The current wording says that "Only glvalues of scalar type can be used to access objects."

struct A { void f() {}; int a; };
struct B { void f() {}; int a; };

In your example, it is possible to access A and B. If you wrote x.a where x is of type A, then you would be accessing an object of type A through a glvalue of type int, and through a glvalue of type A, albeit only the former is "used" for the access itself.
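To make the two glvalues in that expression concrete, here is a small sketch (the variable name x follows the text above; everything else is illustrative). It only shows the types of the expressions involved and takes no position on which of them counts as performing the access.

```cpp
#include <type_traits>

struct A { void f() {} int a; };

A x{};  // aggregate-initialized, so x.a starts out as 0

// The expression x is an lvalue of class type A.
static_assert(std::is_same<decltype((x)), A&>::value,
              "x is an lvalue of type A");

// The expression x.a is an lvalue of scalar type int; this is the glvalue
// that is actually used to read or write the int object.
static_assert(std::is_same<decltype((x.a)), int&>::value,
              "x.a is an lvalue of type int");
```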

This is one of those cases where the intention of the committee is very clear, and A and B should not alias each other. However, the wording doesn't convey that intent as well as it could.

See also editorial pull request #4777, which changed the wording around "access" in the note.

Undefined behavior in your examples

Let's take a slightly simplified version of your code:

int main() {
    A a;
    B& b = reinterpret_cast<B&>(a);
    b.f();   //1
    b.a = 1; //2
}

//1 is undefined behavior because

A non-static member function may be called for an object of its class type, or for an object of a class derived ([class.derived]) from its class type [...]

- [class.mfct.non.static] p1

When calling b.f(), this calls B::f(), but the object is actually of type A, not of type B. This makes it UB by omission (see footnote 1).

//2 is undefined behavior because

If a program attempts to access the stored value of an object through a glvalue whose type is not similar to one of the following types the behavior is undefined: - [...]

- [basic.lval] p11

b.a = 1 is

  • accessing an object of type A through a glvalue b of type B, and
  • accessing its subobject A::a through a glvalue b.a of type int.

The second part is okay; the first part is undefined behavior. This is obvious when you forget about the note (which isn't normative anyway):

  • obviously, an object of type A is being accessed, and
  • obviously, this is through a glvalue b of type B (though not directly, but that isn't required)

Another way to think of it as UB is to look into [expr.ref] p6.2 and consider that in E1.E2, the designated subobject is one that doesn't exist if E1 doesn't have the right type, so it would be UB by omission.

Either way, the intention of the committee and implementers is that this is undefined behavior in some way.
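For contrast, there is one situation where reading B::a from storage that holds an A is well-defined: the common initial sequence rule for standard-layout structs that are members of the same union ([class.mem]). The following is a minimal sketch of that rule, not a fix for the code in the question; the names AB, read_through_b and example_usage are hypothetical.

```cpp
#include <cassert>

struct A { void f() {} int a; };
struct B { void f() {} int a; };

// A and B are standard-layout structs whose common initial sequence is the
// single member `int a`.
union AB {
    A as_a;
    B as_b;
};

int read_through_b(int value) {
    AB u{};             // empty braces initialize the first member, as_a
    u.as_a.a = value;   // as_a is the active member
    // Reading a member of the common initial sequence through the inactive
    // member as_b is permitted for standard-layout unions.
    return u.as_b.a;
}

// Hypothetical usage.
void example_usage() {
    assert(read_through_b(7) == 7);
}
```

The permission covers reads within a union only. It does not extend to a reinterpret_cast between unrelated class types as in the question.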


1) There used to be another paragraph that made it UB if the object doesn't have the right type, but that was redundant, and everything is already said up here.

Hesler answered 28/8, 2023 at 23:2 Comment(1)
Until some compiler writers started abusing the term, aliasing applied to the access of an object or resource by multiple references which were not freshly visibly related to each other, nor to a common base reference, within a context where both were used. The rules were meant to say when compilers must allow for the possibility of such aliasing of seemingly-unrelated references, rather than to invite compilers to be blind to relationships between references.Purehearted
