Which union member becomes active after placement new

Regarding this code:

#include <new>      // placement new
#include <string>

int main()
{
    union u {
        u() { i = 0; }
        ~u() {}

        int i;
        std::string s1;
        std::string s2;
    } u;

    new (&u) std::string{};
}

[intro.object]/2 says that

Objects can contain other objects, called subobjects. A subobject can be a member subobject ([class.mem]), a base class subobject ([class.derived]), or an array element. An object that is not a subobject of any other object is called a complete object. If an object is created in storage associated with a member subobject or array element e (which may or may not be within its lifetime), the created object is a subobject of e's containing object if:
— the lifetime of e's containing object has begun and not ended, and
— the storage for the new object exactly overlays the storage location associated with e, and
— the new object is of the same type as e
(ignoring cv-qualification).

There is no requirement how an object is created in the storage associated with a member subobject. The code doesn't have to nominate the subobject in the argument of the address-of operator if the subobject is a member of a standard-layout union or the first member of a non-union class object. It is enough to get the address of the containing object to designate the storage of the member subobject in such cases.

«There is no requirement how an object is created» means, among other things, that the pointer given to placement new does not have to point to the subobject, mainly because there may be no object to point to (note that [intro.object]/2 does not require the subobject to be within its lifetime). On the std-discussion mailing list it was asked whether, given an object x of type struct A { unsigned char buf[1]; };, there is a difference between new (&x) A{} and new (x.buf) A{}. The answer was no: in both cases x.buf would provide storage for A{}, because

The wording in [intro.object] and [basic.life] concern themselves with the storage address represented by a pointer, not the object to which it points.


[class.union]/1 swears that «At most one of the non-static data members of an object of union type can be active at any time».

Which one became active in the code above, s1 or s2?

Kravits answered 17/1, 2019 at 13:25 Comment(18)
What is the reason you use placement new here, instead of just plain assignment to a member? Is it just plain curiosity, or is there some underlying problem? Or perhaps some existing code that uses this?Kinney
@Someprogrammerdude assignment does not start the lifetime of union members of std::string types.Kravits
Does the note in [class.union/1], cited above, apply? i.e. s1 and s2 have a common initial sequence which covers their entire sequence of data members, therefore they are indistinguishable?Draftsman
@PeterHull this note requires one of such standard-layout struct members to be active. But I don't know which one became active. Anyway, I can replace std::string with double and this note won't apply.Kravits
To see how to change the active member of a union see this: #46350220Brunette
Why do you think either s1 or s2 becomes active? It says "at most one ..."Turncoat
@Turncoat good question. So, even though I've created an object in storage associated with a member subobject, its lifetime did not start?Kravits
No. Sometimes a glvalue referring to the old object may automatically refer to the new object, but that's not your case (the lifetime of u.s1 or u.s2 has not even begun), and even in that case, they are definitely two objects.Turncoat
My first guess is "neither".Recrudesce
@Turncoat so you claim that non-trivial types should not be members of unions, because you can't start their lifetime? And this note is lying?Kravits
See issue 1404. And I still think it is impossible to use a non-trivial data member if its lifetime has not ever begun.Turncoat
Assuming you properly activate an std::string in the union don't both s1 and s2 become active because std::string and std::string, being the same type exactly, share a common prefix and accessing the common prefix parts of union members is well defined?Maracanda
@Turncoat 1404 is about recreating objects with const/reference subobjects. You said that since the union member's name never referred to an existing object, it won't refer to one when you start the lifetime of this member.Kravits
Irrespective of how the dialogue developed here, this is still an extremely good question. I'm starting to think that the standard doesn't describe this situation adequately. Hopefully an expert wades in. @LanguageLawyer: do ping me in a couple of days if still no adequate answer: I'll put a bounty on the question.Capreolate
There was a similarly interesting question about placement new more generally last week (with respect to reusing storage). I think that whole feature is just really underspecified in a few places.Emma
@LightnessRacesinOrbit The placement new feature is as old as C++ standardisation, yet the wording was changed significantly. (Non-trivial types in a union are a more recent feature, of course.) The wording regarding union member lifetime is also recent. It shows that the C++ spec is severely lacking on the basic stuff.Estrada
@Estrada Yep.Emma
@Capreolate you may put a bounty, but the Standard just does not have an answer to this question.Kravits

A pointer is an address, but to the object model, it is more than an address. It points to a specific object at that address. Multiple objects can exist at a certain address, but that doesn't mean that pointers to any of those objects are simultaneously pointers to other objects at that address. Consider what [expr.unary.op]/1 says of pointer indirection:

the result is an lvalue referring to the object or function to which the expression points.

Not to "an object at that address"; it is an lvalue referring to the object being pointed to. So clearly, in the C++ object model, multiple objects can exist at the same address, but a specific pointer into that address does not point to all of those objects. It only points to one of them.

[expr.unary.op]/2 says "The result of the unary & operator is a pointer to its operand". Therefore, &u points to u, which is of type u (BTW, was it really necessary to name the object the same as the type?). &u does not point to u.i, u.s1 or u.s2. All of those are guaranteed to share the same address as &u, but &u itself only points to u.
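The distinction between "same address" and "same object pointed to" can be made concrete with a minimal sketch. The union is renamed U here, and the helper members_share_address is a hypothetical name introduced only for illustration. It checks that the four pointers compare equal once converted to const void*, even though each has its own type and, in the object model described here, designates a different object.

```cpp
#include <string>

// Same shape as the union in the question, renamed U for readability.
union U {
    U() : i(0) {}   // i is the initially active member
    ~U() {}         // nothing to destroy while i is active

    int i;
    std::string s1;
    std::string s2;
};

// Hypothetical helper: compares the address of the union object with the
// addresses of its members. Only addresses are formed; no inactive member
// is read.
bool members_share_address(U& u)
{
    const void* whole = &u;     // U*           -> points to u
    const void* pi    = &u.i;   // int*         -> points to u.i
    const void* ps1   = &u.s1;  // std::string* -> points to u.s1
    const void* ps2   = &u.s2;  // std::string* -> points to u.s2
    return whole == pi && pi == ps1 && ps1 == ps2;
}
```

Calling members_share_address on any U object is expected to yield true; what differs between the pointers is their type and the object each one designates, not the address.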

So the question now becomes, what is the storage represented by &u? Well, per [intro.object]/1, we know that "An object occupies a region of storage". If &u points to the object u, that pointer must therefore represent the region of storage occupied by that object. Not the storage of any of its subobjects; it is the storage for that object. In its entirety.

Now we get to new(&u) std::string{}. This expression creates an object of type std::string within the storage represented by &u. That reuses the storage of the object u, which, in accord with [basic.life]/1.4, ends the lifetime of u. That in turn ends the lifetime of its active member subobject.

So the answer to your question is that neither becomes active, because the object u doesn't exist anymore.
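For contrast, here is a minimal sketch of placement new that nominates a member explicitly, which under the reading above creates the object in the storage of u.s1 rather than reusing the storage of the whole union. The type name U and the helper activate_s1_and_read are assumptions of this sketch, not part of the original code. The string is read through the pointer returned by the new-expression, so the example does not depend on how the member name u.s1 relates to the new object.

```cpp
#include <new>
#include <string>

// Same shape as the union in the question, renamed U for readability.
union U {
    U() : i(0) {}   // i is the initially active member
    ~U() {}         // the user is responsible for destroying a string member

    int i;
    std::string s1;
    std::string s2;
};

// Hypothetical helper: constructs a std::string in the storage of u.s1,
// copies its value out, and destroys it again before u goes away.
std::string activate_s1_and_read(const char* text)
{
    U u;                                             // u.i is active
    std::string* p = new (&u.s1) std::string(text);  // pointer nominates u.s1
    std::string copy = *p;                           // read via the returned pointer
    p->~basic_string();                              // end the string's lifetime
    return copy;                                     // ~U() then has nothing to clean up
}
```

The explicit destructor call matters because ~U() is empty: without it, the std::string created in the union's storage would never be destroyed.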

Rhombohedral answered 17/1, 2019 at 17:4 Comment(17)
I know about pointer values. And I've already covered this in the question: «There is no requirement how an object is created in the storage associated with a member subobject». Which means you don't need to «point to» a member subobject. Richard Smith agrees here: «The wording in [intro.object] and [basic.life] concern themselves with the storage address represented by a pointer, not the object to which it points» groups.google.com/a/isocpp.org/d/msg/std-discussion/GHwA_pOc4CA/…Kravits
The most promising answer thus far in my opinion. I hope you don't mind my edit. I'm not convinced that @LanguageLawyer is confused: seems clued up to me.Capreolate
@LanguageLawyer: I don't agree with that interpretation of the standard. [basic.life] and everything about unions stops making sense under that interpretation, since by that reasoning, even new(&u.s1) std::string would apply to u.s2, so both subobjects would be activated, which is explicitly disallowed. So I choose the interpretation where the standard makes sense.Rhombohedral
@NicolBolas: That last comment makes perfect sense to me. I believe this answer to be correct. Have an upvote.Capreolate
«pointer must therefore represent the region of storage occupied by that object» A pointer value only represents the address of the first byte in storage.Kravits
@LanguageLawyer: I don't understand that. Would you mind explaining the relevance of that comment please?Capreolate
@LanguageLawyer: OK, but where exactly did "only" come from? That a pointer represents an address does not mean it doesn't represent more than that. Otherwise, as I pointed out, [expr.unary.op]/1 doesn't work. It should also be noted that it says the "value of the pointer", not the pointer itself.Rhombohedral
@Capreolate I don't understand where the quoted comes from. The only thing about pointers representing something I know is representing the address of the first byte.Kravits
@NicolBolas given your interpretation, it is impossible to start lifetime of a member using new(&u.s1) std::string, because &u.s1 does not point to an object. No object exists or ever existed there.Kravits
@LanguageLawyer: [basic.life]/6 explains how you can use pointers to things that are about to become objects but aren't in their lifetime yet, or used to be within their lifetime but aren't anymore. u.s1 is an object outside of its lifetime, so it applies. That is, all of the union members are always objects, but they're not always within their lifetimes. Just look at [class.union]; it frequently talks about the members as being objects even if they aren't active.Rhombohedral
@Capreolate «The most promising answer thus far in my opinion» This could be a solution to the problem, but it requires changing the standard. I was thinking similarly about how to solve this. But I was not thinking in terms of pointer values, only about syntactically nominating a member, like in the assignment lifetime-starting rules: timsong-cpp.github.io/cppwp/n4659/class.union#5Kravits
@NicolBolas given your interpretation, it becomes hard to use arrays of unsigned char/std::byte to provide storage. Because what is usually given to placement new, is a pointer to an array member (representing only the storage under this member, in your opinion), not a pointer to the whole array. See examples here timsong-cpp.github.io/cppwp/n4659/intro.object#3Kravits
@LanguageLawyer: Except that [intro.object]/3 explicitly lays out how byte arrays provide storage. A pointer to the first element of a byte array is undeniably pointing to storage "associated with another object e of type “array of N unsigned char ” or of type “array of N std::byte”". So it counts as providing storage; the C++ object model knows when a pointer points into an array.Rhombohedral
@NicolBolas «[intro.object]/3 explicitly lays out how byte arrays provide storage» It requires an object to be created in storage associated with an array. If I give a pointer to an array member, then the storage it represents is associated with this member, not the array. Right?Kravits
@LanguageLawyer: My point is this: a pointer points to an object. That object has storage. In the case of [intro.object]/3, pointing to an array element subobject points to a piece of storage from that array. Therefore, it is pointing to storage "associated with that array". It may only be pointing to one object in that storage, but the storage itself is associated with the array. It talks about "storage associated with" rather than "an array" precisely because arrays decay to pointers.Rhombohedral
@NicolBolas I understand why your POV could be attractive, but can't agree. Both &u and &u.s1 represent the beginning of the same storage. And I don't see why if a storage of a subobject is associated with storage of the containing object, the storage of an object is not associated with the storage of its subobject.Kravits
Let us continue this discussion in chat.Rhombohedral
