Constructing an STL container with placement new and destroying it safely afterwards

This code implements an unrestricted union which provides access by name and by index to any of its three members.

Since std::string is non-trivially constructed and destroyed, I need to provide a special constructor and destructor for the union.

#include <iostream>
#include <string>

using namespace std ;

union MyUnion{
    string parts[3] ;
    struct{ string part1, part2, part3 ; } ;
    
    MyUnion(){
        new(parts+0) string ; //constructs the 3 strings in-place
        new(parts+1) string ;
        new(parts+2) string ;
    }
    ~MyUnion(){
        parts[0].~string() ; //calls string's destructor
        parts[1].~string() ;
        parts[2].~string() ;
    }
} ;

int main(){
    
    MyUnion u ;
    
    u.part1 = "one" ; //access by name
    u.part2 = "two" ;
    u.part3 = "three" ;

    cout << u.parts[0] << endl ; //access by index
    cout << u.parts[1] << endl ;
    cout << u.parts[2] << endl ;
}

This example compiles and works fine (seemingly), but my questions are:

  • Is it safe to do this?
  • Can I be sure that there will be no memory leaks?
  • What if string's constructor throws an exception? Does that need to be caught so as to not try to destroy an object that was never constructed?

Note

The code compiles in VC2015, which does support unnamed structs. Please disregard that detail.

Chibouk answered 27/4, 2017 at 22:38 Comment(14)
There are no unnamed structs in C++. Your code is not valid. - Paprika
Assuming you fix the name problem, the construction is OK, but u.part1 = "one" would be undefined behaviour. The active member is parts, and you are only allowed to access the active member. - Brion
u.part1 = "one" ; is UB. You cannot access the non-active member of a union. - Nichy
@DOUGLASO.MOEN, the actual string's content is in heap memory. The overlapping only happens in the stack-memory part of the string object. - Chibouk
@NathanOliver, there's no active member. The overlapping is between two identical types. It makes no difference to access one or the other. - Chibouk
@Chibouk They overlap in memory, but the language does not guarantee how they overlap. - Institute
@ephemient, even if the overlapping types are the same? Wouldn't that make both occupy the exact same space in memory? - Chibouk
You're allowed to read from an inactive member of a union only if your union contains standard-layout types, and the members in question contain a common initial sequence. There's no requirement for std::string to be standard layout. - Extremely
@Chibouk Going strictly by the C and C++ language specifications, no. It's common for compilers to provide some stronger guarantees than required by the standard, though. For example, C++ guarantees &u.part1 < &u.part2 but doesn't require that &u.part1 + 1 == &u.part2. A compiler might promise that. - Institute
@getfree Your constructor makes the array the active member. Also, they have to have a common initial sequence. A class type and an array, even of the same type and number, do not share a common initial sequence. - Nichy
I can never understand why people resort to this kind of nasal-demon-infested hackery to save typing two characters at the use site. Just write some one-line accessors. - Stickweed
@T.C., some of us like to experiment with the language to learn what's possible and what's not. This was not meant to be part of a real-world project. - Chibouk
"The details of that allocation are implementation-defined, and it's undefined behavior to read from the member of the union that wasn't most recently written. Many compilers implement, as a non-standard language extension, the ability to read inactive members of a union." Source: en.cppreference.com/w/cpp/language/union So you need to check your compiler documentation. - Virtu
You can experiment with a compiler, which in general tells you nothing about another compiler. C++ is a specification. It says how compilers may and may not behave. There's very little room for experimentation with it. - Tabathatabb

Is it safe to do this?

No. First, the common initial sequence rule only allows reading of members, not writing:

In a standard-layout union with an active member (9.3) of struct type T1, it is permitted to read a non-static data member m of another union member of struct type T2 provided m is part of the common initial sequence of T1 and T2; the behavior is as if the corresponding member of T1 were nominated.

Secondly, a common initial sequence is only defined for standard-layout types:

The common initial sequence of two standard-layout struct (Clause 9) types is [...]

and std::string is not required to be standard-layout.

Doty answered 28/4, 2017 at 17:20 Comment(11)
The active member is the one most recently written to. By writing to an inactive member you just make it the active one. - Chibouk
@Chibouk No, that's not how it works. Writing to the inactive member is UB. - Doty
What's your definition of active member then? - Chibouk
@Chibouk It's not my definition: "In a union, a non-static data member is active if its name refers to an object whose lifetime has begun and has not ended (3.8)." - Doty
So by writing to a member you make it active. That's my interpretation. What's yours? - Chibouk
@Chibouk Writing to a member doesn't begin lifetime, so that interpretation is incorrect. See [basic.life]. - Doty
Then I don't understand how a member becomes active in the first place. If it's not the act of writing that makes a member active, what is it? - Chibouk
@Chibouk You need to begin lifetime. The way you begin lifetime is acquiring storage and initialization. In this case we already have storage, so we just need initialization. That is, placement new. - Doty
@Doty 9.3 Unions/5 ... When the left operand of an assignment operator involves a member access expression (5.2.5) that nominates a union member, it may begin the lifetime of that union member, as described below... - Tabathatabb
@n.m. std::string doesn't have a trivial default constructor. - Doty
@Doty That's right, but it is not true that only placement new can begin the lifetime of a union member. This code would be invalid even if it used int instead of std::string anyway, because there's no common initial sequence in there. - Tabathatabb

Is it safe to do this?

It depends on what you are willing to call safe. The code certainly invokes undefined behaviour by any reasonable interpretation of the standard.

You cannot read an inactive member of a union, except when there's a common initial sequence involved (9.3 Unions). These union members have no common initial sequence, because the notion is only defined for two standard-layout structs (9.2 Class Members/20) and one member of the union is not a struct at all. It's an array, so it cannot have a common initial sequence with anything.

This also applies to analogous code that uses primitive types, e.g. int x[3]; and struct {int x0, x1, x2;};. There's not even a guarantee that x2 and x[2] have the same address.

Tabathatabb answered 24/5, 2017 at 19:5 Comment(0)
