Undead objects ([basic.life]/8): why is reference rebinding (and const modification) allowed?

Asked 12/12, 2019 at 6:31 Answered 12/12, 2019 at 14:12

c++ language-lawyer constants lifetime placement-new

The "undead" clause

By "undead clause" I mean the C++ rule that, after an object is destroyed, a new object created at the same address can sometimes be considered the same object as the old one. That rule has always existed in C++, but the additional conditions attached to it have changed over time.

I was made to read the latest undead clause by this question. The revised conditions in Lifetime [basic.life]/8 are:

(8.1) the storage for the new object exactly overlays the storage location which the original object occupied, and

Well, duh. An object at a different address would not be the same object.

(8.2) the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and

Again, duh.

(8.4) neither the original object nor the new object is a potentially-overlapping subobject ([intro.object]).

It cannot be a base class subobject, the classic case, or a member with a special declaration that makes its address not unique. Again, duh.

(8.3) the original object is neither a complete object that is const-qualified nor a subobject of such an object, and

Now that's interesting. The object being replaced can't be either:

a complete const object
part of a complete const object

On the other hand, the object being resurrected can be:

a const member subobject
a subobject of such const member
an element in an array of const objects

Const subobject

So it seems to me that all of these objects x can be resurrected:

Const member subobject

struct CI {
  const int x;
};

CI s = { 1 };
new ((void*)&s.x) int(2);
int r = s.x; // OK, 2

Subobject of const member:

struct T {
  int x;
};

struct CT {
  const T m = { 1 };
};

CT s;
new ((void*)&s.m.x) int (2);
int r = s.m.x;

Element in an array of const objects:

const int x[1] = { 1 };
new ((void*)&x[0]) int (2);
int r = x[0];

Classes with const and reference members

Also, objects of class type with const or reference members do not seem to be prohibited; the resurrected object is still called x.

Class with a const member:

struct CIM {
  CIM(int i): m(i) {}
  const int m;
};

CIM x(1);
new ((void*)&x) CIM(2);
int r = x.m; // OK, 2

Class with a reference member:

struct CRM {
  CRM (int &r): m(r) {}
  int &m;
};

int i=1,j=2;
CRM x(i);
new ((void*)&x) CRM(j);
int r = x.m; // OK, 2

The questions

1. Is that interpretation of the clause correct?
2. If so, is there any other clause that forbids these overwriting operations?
3. If so, is that intended? Why was that changed?
4. Is that a breaking change for code generators? Do all compilers really support that? Don't they optimize based on const members and const array elements being immutable, and on references not being rebindable?
5. BONUS QUESTION: does that affect the ROM-ability of const objects with an adequate storage class (not dynamically created objects, of course) and an adequate initializer?

Note: I added the bonus later because putting constants in ROM came up in the discussion.

Olericulture answered 12/12, 2019 at 6:31 Comment(15)

I suspect the intention is "neither an object that is const-qualified nor a subobject of such an object". Not sure though. – Ballflower 12/12, 2019 at 10:32

@L.F. Even if the object being replaced is not const, it can contain a const member that will also be replaced (not by placement new itself, but by the constructor). – Olericulture 12/12, 2019 at 10:36

"If so, is that intended? Why was that changed?". Yes. 'Coz Яussians wanted it. – Translate 12/12, 2019 at 13:10

@LanguageLawyer So all roads really lead to Putin? – Olericulture 12/12, 2019 at 14:14

I think "can be considered the same object" is a bit of an unfortunate wording. Not like it's truly wrong, just... it got me puzzled. The standard merely says that pointers and references to the old object still "work", which most people would probably expect to be within the realm of "yeah of course, what else!" even if the standard didn't explicitly say so. My first reaction to reading "same object" was "Huh, no way! How is that supposed to work.". – Stanfordstang 12/12, 2019 at 18:46

@Stanfordstang Are you referring to the fact that pointers are trivial types, and on simple flat ptr architectures (ptr = address value), that means that 2 ptr w/ same bit presentation must point to the same object. But the triviality of ptr is a lie of the std. In the real world ptr are not trivial. If they were, you would be able to sometimes derive a ptr to an object from a ptr to another w/ arithmetic. I have posted many Q re: ptr representation like Is memcpy of a pointer the same as assignment? (C question but still) – Olericulture 12/12, 2019 at 20:40

In these (heavily downvoted) questions, I was searching for the right terminology. Someone proposed "mystical" for the property that 2 ptr of identical numeric value would not be "equal" in terms of their abstract value. Note that ptr are unlike integers. Think Java (which has pointers under the "reference" terminology). Integers can be enumerated (at least small ones, not big ints) but references can't (security property). C++ has no such "security" (you can cast int to ptr at will) but the C++ impl has such an expectation. – Olericulture 12/12, 2019 at 20:47

For the array case, the array itself is also considered const-qualified. See eel.is/c++draft/basic.type.qualifier#6.sentence-2 – Bedtime 13/12, 2019 at 0:19

@Bedtime So at least an array of const elements can't be overwritten. Good. – Olericulture 13/12, 2019 at 0:25

Regarding point 5.: [basic.life]/10 forbids placing new objects in storage occupied by non-dynamic const complete objects at all. Since it only makes sense to put complete objects in ROM and they can only be placed there if the complete object is const, there is no issue with that here. – Manxman 13/12, 2019 at 6:18

@Manxman Actually [basic.life]/10 doesn't make much sense. Yet another LL question! – Olericulture 13/12, 2019 at 6:33

I refer to the "obvious" fact (and sometimes very wrong fact, e.g. with virtual classes, and member pointers) that a pointer is just a simple integer value that refers to an address. So if you pull the carpet below an object's feet and don't tell anyone, and create a new object then even if the standard doesn't say so, then "of course" the pointer still points at an object, and if it's the same type, it will "of course" work. That's the naive expectation that most people (me included) will have. Certainly a pointer in reality is not always just an integer, and does not always point – Stanfordstang 13/12, 2019 at 12:16

[...cont] to the same address. And sure enough, in these cases, it will not "just work". But still, I think the general thing as stated by the standard (pointers "just work") is not something very special or unexpected. Now on the other hand "considered the same object" is very much unexpected. Hence I said the wording is a little unlucky. It would e.g. suggest that if I delete the integer 5 at some address and allocate a new integer with value 7, then I can consider it being the same object (consequentially, I would face the surprise that 5 == 7). Now that is a surprise :-) – Stanfordstang 13/12, 2019 at 12:19

@Stanfordstang You need to differentiate pure C++ and separately compiled code. Separate compilation is done with interfaces w/ the outside governed by the ABI. Internal C++ code must follow the C++ rules. Take f.ex. the type aliasing rules: they must be followed by C++ (or C) code. But go through an ABI boundary and you can interpret any bag of bits as any type with compatible ABI definition. You can write an int64_t in one function and read an IEEE double in another function, even if they are in the same TU, if you went through the ABI by calling a separately compiled function. – Olericulture 13/12, 2019 at 21:44

(...) Also, you don't need to call a constructor to separately construct an object; on code using MSVC++ convention: you could memcpy any object even polymorphic (w/o virtual base); on code using GCC class representation: you can copy any object with memcpy even one w/ a virtual base. It isn't C++ legal but it's ABI legal. Once you cross the ABI boundary, nobody knows what you did. What happens in Vegas stays in Vegas (it's forgotten) and what happens in a separate compiled module is forgotten. The gains and loss in Vegas are kept and so is the state of all objects in memory. – Olericulture 13/12, 2019 at 22:26

It would be surprising if the standard's requirements on object lifetime were not all in [basic.life].

There is little chance that the "complete" adjective was inadvertently added to the noun "object" in the standard paragraph you cite.

In the paper P0137, one can read this rationale (paper cited in @LanguageLawyer's comment below):

This is necessary to allow types such as std::optional to contain const subobjects; the existing restriction exists to allow ROMability, and so only affects complete objects.
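To make that rationale concrete, here is a minimal sketch of an optional-like holder whose payload has const members. The names `Slot` and `Point` are hypothetical and not taken from P0137 or from any library; the sketch uses a byte buffer plus `std::launder` (C++17), so it does not itself depend on the relaxed rule, but it shows the destroy-and-recreate pattern that such types need.

```cpp
#include <new>      // placement new, std::launder
#include <utility>  // std::forward

// Hypothetical payload type with const members.
struct Point {
    const int x;
    const int y;
};

// Hypothetical minimal optional-like holder (illustration only).
template <class T>
class Slot {
    alignas(T) unsigned char buf_[sizeof(T)];
    bool engaged_ = false;

public:
    Slot() = default;
    Slot(const Slot&) = delete;
    Slot& operator=(const Slot&) = delete;
    ~Slot() { reset(); }

    // Destroy any current value, then create a new T in the same storage.
    template <class... Args>
    T& emplace(Args&&... args) {
        reset();
        T* p = ::new (static_cast<void*>(buf_)) T{std::forward<Args>(args)...};
        engaged_ = true;
        return *p;
    }

    void reset() {
        if (engaged_) {
            get().~T();
            engaged_ = false;
        }
    }

    // std::launder yields a pointer to the T currently living in buf_.
    T& get() { return *std::launder(reinterpret_cast<T*>(buf_)); }
};
```

Calling `emplace` twice on a `Slot<Point>` replaces an object that has const subobjects, which is the situation the quoted rationale describes for `std::optional`.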

To reassure ourselves, we can verify that compilers follow the standard wording to the letter: they perform constant propagation for complete const objects but not for const member subobjects of non-const complete objects:

Let's consider this code:

struct A{const int m;};

void f(const int& a);

auto g(){
    const int x=12;
    f(x);
    return x;
}

auto h(){
    A a{12};
    f(a.m);
    return a.m;
}

Both Clang and GCC generate this assembly when targeting x86_64:

g():                                  # @g()
        push    rax
        mov     dword ptr [rsp + 4], 12
        lea     rdi, [rsp + 4]
        call    f(int const&)
        mov     eax, 12                 # the returned value can only be 12
        pop     rcx
        ret
h():                                  # @h()
        push    rax
        mov     dword ptr [rsp], 12
        mov     rdi, rsp
        call    f(int const&)
        mov     eax, dword ptr [rsp]    # the content of a.m is returned
        pop     rcx
        ret

The returned value is placed in register eax (according to the ABI specification: System V x86 processor specific ABI):

In the function g, the compiler is free to assume that x cannot be changed across the call to f, because x is a complete const object. So the value 12 is placed directly in the eax register as an immediate value: mov eax, 12.
In the function h, the compiler is not free to assume that a.m cannot be changed across the call to f, because a.m is not a subobject of a complete const object. So after the call to f, the value of a.m must be loaded from memory into eax: mov eax, dword ptr [rsp].
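To see why the reload in h matters, consider one possible definition of f. The body below is an assumption made for illustration (the code above only declares f): it replaces the int object that the reference designates. As I read the C++20 rules, this is allowed when the reference designates a const member of a non-const complete object; under C++17 it would formally have been undefined.

```cpp
#include <new>  // placement new

struct A { const int m; };

// Hypothetical body for f: creates a new int in the storage of the
// object that `a` refers to.
void f(const int& a) {
    ::new (const_cast<int*>(&a)) int(13);
}

int h() {
    A a{12};     // `a` is a non-const complete object
    f(a.m);      // a.m is a const member subobject and gets replaced
    return a.m;  // read back from memory after the call
}
```

The same f called from g, where x is a complete const object, would have undefined behaviour, which is why the compiler may return the immediate 12 there.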

Genro answered 12/12, 2019 at 14:12 Comment(7)

"There are few chances that the "complete" adjective has been inadvertedly added" wg21.link/p0137r1 – Translate 12/12, 2019 at 17:3

@LanguageLawyer This paper gives the rational: This is necessary to allow types such as std::optional to contain const subobjects; the existing restriction exists to allow ROMability, and so only affects complete objects. – Genro 12/12, 2019 at 18:9

"the existing restriction exists to allow ROMability" That's quite a stretch. The restriction exists to allow reasonable optimizations many of which have nothing to do with ROM (which might be used for object with static duration, maybe those with automatic duration in main (but I doubt it's done in practice) but that's all). – Olericulture 12/12, 2019 at 20:50

@Olericulture The draft note is actually referring to what is now [basic.life]/10, which originally forbade placing new objects into memory occupied by any non-dynamic const object at all. The change is to loosen that requirement to storage within a const complete object, so that placement-new on const (sub-)objects makes sense in the first place, but still allowing const complete objects to be placed in ROM. – Manxman 13/12, 2019 at 6:32

@Manxman Sorry, I misread that quote! I thought "the existing restriction" referred to another restriction. – Olericulture 13/12, 2019 at 6:38

Is one storage created by the implementation for each automatic complete object, or for each object? I.e. do class members each get storage? – Olericulture 13/12, 2019 at 22:29

@Manxman Whether an object is covered by the ROM-ability clause and whether it's considered immutable over its lifetime are independent properties. – Olericulture 16/12, 2019 at 2:37

Most of the necessary bits of the answer have already been given by other users, so I am collecting them in a community wiki so that this question can have an accepted answer.

1. Yes, other than with respect to a const array (as pointed out in the comments by T.C.). An array whose elements are const is itself a const object, so if it is the complete object, then its elements cannot be destroyed and recreated in place (as pointed out in the comments by walnut).
2. No. As Language Lawyer pointed out in the comments, the fact that it's allowed was explicitly intended. The Russian national body asked for it to be allowed, and C++20 was correspondingly amended to allow it.
3. See above.
4. In three parts:
   - As pointed out in Oliv's answer, compilers are indeed aware that they're not supposed to optimize based on the assumption that a const member of a non-const complete object can't be replaced, because the standard says that they can be replaced.
   - The OP also asks whether this is a "breaking change". Well, in what sense? Since C++17 did not allow this form of replacement (leading to likely UB if attempted) and C++20 did allow it, it means that if the behaviour of a compiler changed in order to implement the C++20 behaviour, then there would be less UB, which is not a breaking change.
   - Did compilers ever perform this kind of optimization in the C++17 era, which was subsequently disallowed by C++20? The Russian national body comment hints at a decent argument as to why the answer is likely "no". The linked US national body comment is even more explicit. An example of the code that would have been broken by this kind of optimization is given at the end of this answer to avoid formatting issues. Here, the first S object created in v is destroyed and then a new S object is created in its place. v.data() is a pointer that originally pointed to the old object. Does it now point to the new object (resulting in 2 being printed) or does it point to no object, resulting in UB? I think we can safely assume that no widely used compiler would treat this code as UB in order to apply the optimization that was technically allowed in C++17. (Of course, there are caveats: compilers could have used magic to turn off this optimization in standard library classes, or they could have made the push_back operation always replace the stored pointer with a laundered version of itself, but in practice, this was not done.) In general, it does not seem that compilers had to change their behaviour due to the relaxation of the replacement rules in C++20.
5. No, it doesn't. Obviously only complete const objects were ever ROMable in the first place (not const subobjects of non-const complete objects), and as discussed, complete const objects aren't affected by this change.

Code:

     #include <iostream>
     #include <vector>
     struct S {
         S(int x) : x(x) {}
         const int x;
     };
     int main() {
         std::vector<S> v;
         v.push_back(S(1));
         v.pop_back();
         v.push_back(S(2));
         std::cout << v.data()->x;
     }
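The array exception in the first point can be checked at compile time. This is a small sketch assuming C++17; the names `ConstIntArray`, `CI`, and the two constants are made up for illustration. An array of const elements is itself reported as const-qualified, whereas a class that merely has a const member is not.

```cpp
#include <type_traits>

using ConstIntArray = const int[1];  // array whose element type is const
struct CI { const int x; };          // non-const class with a const member

// The array type counts as const-qualified because its elements are.
constexpr bool array_of_const_is_const = std::is_const_v<ConstIntArray>;

// The class type is not const-qualified, so a CI variable declared
// without const is a non-const complete object.
constexpr bool class_with_const_member_is_const = std::is_const_v<CI>;

static_assert(array_of_const_is_const);
static_assert(!class_with_const_member_is_const);
```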

Sverige answered 12/12, 2019 at 6:31 Comment(0)
