Why does the use of `std::aligned_storage` allegedly cause UB due to it failing to "provide storage"?
Inspired by: Why is std::aligned_storage to be deprecated in C++23 and what to use instead?

The linked proposal P1413R3 (that deprecates std::aligned_storage) says that:

Using aligned_* invokes undefined behavior (The types cannot provide storage.)

This refers to [intro.object]/3:

If a complete object is created ([expr.new]) in storage associated with another object e of type “array of N unsigned char” or of type “array of N std​::​byte” ([cstddef.syn]), that array provides storage for the created object if: ...

The standard then goes on to use the term "provides storage" in a few definitions, but I don't see it saying anywhere that using a different type as storage for placement-new (that fails to "provide storage") causes UB.

So, the question is: What makes std::aligned_storage cause UB when used for placement-new?

Vincentia answered 11/4, 2022 at 22:4 Comment(6)
The best I found was in an answer to a tweet by Vittorio Romeo - "The types don't provide storage in a general sense because they themselves are formal objects. It'd be akin to using any random POD as a source of storage which violates the object model." which kind of makes sense.Kristiekristien
@TedLyngmo I'm not sure about it violating the object model. The lifetime of the original object ends when the storage is reused... My best guess is that calling ~aligned_storage_t() on such an object causes UB because aligned_storage_t is dead at that point.Vincentia
I don't see how the suggested replacement is any better. Enhancing alignas seems the best option, assuming that core language feature meets with any approval by the committee, and by the compiler vendors (who have representation on the committee). (Step 1: go to moon. Step 2: get rock. How hard could it be?)Battista
Yeah, I'm not 100% sure either. I'm hoping for the lawyers to come in and straighten this out :)Kristiekristien
@TedLyngmo "types … are formal objects …" which kind of makes sense This pile of words has zero sense.Ronnironnica
@LanguageLawyer :-) Perhaps he should have added "instances of" in there. That's what I was thinking when reading it anyway.Kristiekristien
V
11

The paper appears to be wrong on this.

If std::aligned_storage_t failed to "provide storage", then most uses of it would indirectly cause UB (see below).

But whether std::aligned_storage_t can actually "provide storage" appears to be unspecified. A common implementation, a struct with an alignas(Y) unsigned char arr[X]; member, (seemingly) does "provide storage" according to [intro.object]/3, even if you pass the address of the whole structure into placement-new rather than the address of the array. Even though this specific implementation isn't mandated now, I believe mandating it would be a simple non-breaking change.
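To make the "common implementation" concrete, here is a minimal sketch of such a struct. The name my_aligned_storage and the helper round_trip are hypothetical; the standard does not mandate this layout, and real library definitions may differ. Whether the placement-new is well-defined is exactly what this thread debates; the sketch only shows the shape of the code under the reading given above.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>

// Hypothetical stand-in for one common std::aligned_storage_t layout.
// Assumption: the standard does not require this particular definition.
template <std::size_t Len, std::size_t Align>
struct my_aligned_storage {
    alignas(Align) unsigned char arr[Len];
};

// Creates an int in the buffer, reads it back, and returns the value.
inline int round_trip(int value) {
    using buffer_t = my_aligned_storage<sizeof(int), alignof(int)>;
    static_assert(std::is_standard_layout<buffer_t>::value,
                  "buffer must be standard-layout");
    static_assert(alignof(buffer_t) == alignof(int),
                  "struct inherits the member's alignment");

    buffer_t buf;
    // The struct and its first member share an address, so the new object
    // is created in the storage the array occupies.
    assert(static_cast<void*>(&buf) == static_cast<void*>(buf.arr));

    int* p = ::new (static_cast<void*>(&buf)) int(value);
    return *p;  // int is trivially destructible, so no destructor call
}
```

Calling round_trip(42) passes the address of the whole struct to placement-new, which is the case described in the paragraph above.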


If std::aligned_storage_t actually didn't "provide storage", then most use cases would cause UB:

Placement-new into an object that fails to "provide storage" is legal by itself, but...

This ends the lifetime of the object that failed to "provide storage" (aligned_storage_t), and, recursively, all enclosing objects. The next time you access any of those, you get UB.

Even if aligned_storage_t is not nested within other objects (which is rare), you'd have to be careful when destroying it, since calling its destructor would also cause UB, since its lifetime has already ended.

[basic.life]/1.5

... The lifetime of an object o of type T ends when:

— the storage which the object occupies ... is reused by an object that is not nested within [the object]

[intro.object]/4

An object a is nested within another object b if:

— a is a subobject of b, or

— b provides storage for a, or

— there exists an object c where a is nested within c, and c is nested within b.
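For contrast, a buffer that is itself an array of unsigned char is the case [intro.object]/3 names directly, so under the rules quoted above the created object would be nested within the array and the array's lifetime would not end. The sketch below assumes the remaining conditions of that paragraph (elided in the question) are met; build_in_byte_array is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Builds a std::string inside a raw byte array and returns its length.
inline std::size_t build_in_byte_array(const char* text) {
    using String = std::string;

    // An "array of N unsigned char" with the alignment of the payload type.
    alignas(String) unsigned char buf[sizeof(String)];

    // buf decays to unsigned char*, which converts to void* for placement-new.
    String* s = ::new (static_cast<void*>(buf)) String(text);
    std::size_t n = s->size();

    // The string is not trivially destructible, so destroy it explicitly
    // before the buffer goes out of scope.
    s->~String();
    return n;
}
```

Here the array object buf is still alive after the placement-new, so nothing special is needed when it leaves scope.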

Vincentia answered 8/5, 2022 at 8:6 Comment(8)
Your answer misses the explanation why aligned_storage_t defined as here will die from creating an object on top of itRonnironnica
@LanguageLawyer I took "fails to provide storage" for granted from the proposal. Now when I'm thinking about it, I'm not sure. Do you know the answer?Vincentia
If that bullet means something else than someone may forget to use _t/::type after std::aligned_storage, then it is just bullsh^W nonsenseRonnironnica
@LanguageLawyer They mention forgetting _t/::type separately, so this does smell of BS. I tried to fix the answer...Vincentia
I provided a separate answer but "even if you pass the address of the whole structure into placement-new, rather than the array" is answered by the standard directly. The address of the whole struct is the address of the array. They are "pointer-interconvertible" (keyword in the standard to search for).Lustrous
Also note that the standard doesn't mandate a specific implementation for aligned_storage, but it does mandate that the implementation works. This is another way to end the debate. "The member typedef type shall be a trivial standard-layout type suitable for use as uninitialized storage for any object whose size is at most Len and whose alignment is a divisor of Align." (20.15.7.6) If any given implementation is UB then it doesn't conform with the standard.Lustrous
@MatthewM. But does being "suitable for use as uninitialized storage" necessarily imply that it provides storage? It could be enough to merely be "reusable storage".Undirected
I sure think "suitable for use as ... storage" has the same meaning as "suitable to provide storage". Note that "provides storage" implies both that (1) it's suitable to store the object and (2) it's actually storing it.Lustrous
S
7

The C++ standard lets only a very restricted set of types serve as storage for other objects, and the types in that set cannot have alignment packaged into the type itself.

Imagine:

#include <cstddef>  // std::size_t, std::byte (C++17)

template<std::size_t N>
using bytes=std::byte[N];
template<std::size_t S, std::size_t A>
struct alignas(A) aligned{
  bytes<S> data;
};

You cannot use &aligned<12,4> to store another object safely. You cannot make a typedef that carries alignment with it with this property.

You could use aligned<12,4> a; &a.data or similar, but that is syntactically different.
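A minimal sketch of that "syntactically different" form, constructing into the member array rather than the enclosing struct. The payload type point and the helper sum_via_member_array are hypothetical names added for illustration, and std::byte assumes C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

template <std::size_t N>
using bytes = std::byte[N];  // std::byte requires C++17

template <std::size_t S, std::size_t A>
struct alignas(A) aligned {
    bytes<S> data;
};

// Hypothetical payload type used only for this illustration.
struct point {
    int x, y, z;
};

// Creates a point inside the member array and returns x + y + z.
inline int sum_via_member_array(int x, int y, int z) {
    aligned<sizeof(point), alignof(point)> a;

    // a.data is an array of std::byte; it decays to std::byte*, which is
    // the pointer handed to placement-new (not the address of `a` itself).
    point* p = ::new (static_cast<void*>(a.data)) point{x, y, z};
    return p->x + p->y + p->z;  // point is trivially destructible
}
```

The caller has to name the member (a.data) explicitly, which is why a plain typedef cannot hide this step.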

Now, the standard could get around it by adding wording; but the existing definition of aligned_storage does not have this magic wording, and no construct in C++ can have the properties users of aligned_storage_t are expecting without such wording. I mean, UB is UB, so the compiler is free to interpret your program as if it was a program in a language with that wording... but that is swatting a standard error with a nuclear bomb.

Shouse answered 11/4, 2022 at 23:35 Comment(11)
Hmm. But does the standard actually say that using a type that can't "provide storage" for placement-new causes UB?Vincentia
If there is a problem with alignas on aligned, then why not place alignas where the standard places it?Ronnironnica
@LanguageLawyer you still need to access foo<7,1>::type.__data: there is no typedef for aligned storage, and constructing in a type containing an array is not the same as constructing in the array.Shouse
constructing in a type containing an array is not the same as constructing in the array Any proofs?Ronnironnica
@LanguageLawyer No, to prove that I'd have to audit the entire standard. They are clearly not the same object (the object X and the array contained in X are distinct objects, even if they have the same size). There are plenty of rules that state ways in which they are interchangeable. Proving they are not interchangeable in this case cannot be done, because the standard could contain a clause anywhere that makes them interchangeable. And I could be wrong, after all.Shouse
What do you think about cplusplus.github.io/CWG/issues/2470.html?Ronnironnica
The very existence of this bullet in the definition of provides storage suggests that one does not need to pass a pointer to the array to new. Or it means «there is no smaller array object that satisfies these constraints among those objects pointers to which were passed to single argument of placement new»?Ronnironnica
@Yakk-AdamNevraumont The section you're looking for is 6.8.2 Compound Types and the definition of "pointer-interconvertible". Yes, you can get a pointer to the unsigned char array contained in aligned_storage_t from a pointer to aligned_storage_t, itself. They have the same address and reinterpret_cast will obtain one from the other.Lustrous
@Yakk-AdamNevraumont And as far as aligned_storage_t goes, the alignment requirement is placed on the contained unsigned char array (and not directly on aligned_storage_t). But since pointers between the two are interconvertible, aligned_storage_t must inherit that alignment. Keep in mind that alignment is a property of the address itself (the value of the pointer). You don't "lose alignment" by casting through void*, for example. A particular alignment only means that a certain number of least-significant bits of the pointer value are 0.Lustrous
@LanguageLawyer "provides storage" is a term related to an object hierarchy. Suppose you have a large array A. Within it, you construct several small arrays B1, B2, B3, etc. Finally, you construct some object C within B1. The "provides storage" definition would apply to both A and B1 related to C except for that last bullet. B1 "provides storage" for C. A does not "provide storage" for C. But, B1 is "nested within" A. It's all about defining an object hierarchy.Lustrous
@LanguageLawyer And with that object hierarchy determined (who "provides storage", is "nested within", or is a "subobject of"), the Note following bullet 3.3 can be put into action (defining the lifetime of contained objects when storage is reused). Regarding passing a pointer to new: I recommend studying the example and note this line: B *b = new (a.a + 8) B; They are definitely not passing a pointer to the array (if you mean unsigned char (*) [32]). They are passing an unsigned char * offset 8 bytes after the beginning of the array to new.Lustrous
L
1

Preface: This is a long answer. Sorry about that! The fundamental answer is short and sweet. But there are a lot of bad arguments out there so there are a lot of basics to touch on.

The proposal claims "Using aligned_* invokes undefined behavior (The types cannot provide storage.)" but provides no additional discussion, proof, or references to language in the standard.

This appears to be a "language-lawyering" position based on an overly literal interpretation of "provides storage" as used in 6.7.2 of the C++ standard. The claim appears to draw from section 6.7.2.3 which states:

If a complete object is created (7.6.2.7) in storage associated with another object e of type “array of N unsigned char” or of type “array of N std::byte” (17.2.1), that array provides storage for the created object ...

The apparent argument then goes on to say, to paraphrase, "Well, aligned_storage_t isn't an array of bytes. It is a struct/union/class that contains an array of bytes. So while that array could provide storage, aligned_storage_t itself cannot." The obvious issue with this argument is... if section 6.7.2.3 doesn't apply here, some other section in the standard still might. There are plenty of other "things" in the standard that "provide storage".

But all of that is irrelevant. What isn't in debate is that the unsigned char array contained within aligned_storage_t can provide storage. We can access that storage if we can get a pointer to it (e.g., for placement new or memcpy of a trivially-copyable type).

And per the standard, in the clearest language, we can get an unsigned char* (which points to the beginning of array storage) through the pointer to aligned_storage_t. Getting a pointer to the 'storage' contained inside aligned_storage_t this way is defined behavior:

std::aligned_storage_t<sizeof(T), alignof(T)> buf;
unsigned char* storage = reinterpret_cast<unsigned char*>(&buf);
T* tptr = new(storage) T;
tptr->~T();

Why? Because aligned_storage_t is implemented either as a union with the unsigned char array as a non-static data member or as a standard-layout class (e.g., POD struct) with the array as the first non-static data member.

See section 6.8.2 Compound types of the standard on pages 72-73:

4 Two objects a and b are pointer-interconvertible if:

(4.2) — one is a union object and the other is a non-static data member of that object (11.5), ...

(4.3) — one is a standard-layout class object and the other is the first non-static data member of that object, ...

And the term "pointer-interconvertible" means:

If two objects are pointer-interconvertible, then they have the same address, and it is possible to obtain a pointer to one from a pointer to the other via a reinterpret_cast (7.6.1.9). [Note: An array object and its first element are not pointer-interconvertible, even though they have the same address. — end note]

And that's it. The array contained in aligned_storage_t does provide storage and we can legally get a pointer to that storage through the "pointer-interconvertible" rules.
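The (4.3) rule can be sketched with a small standard-layout struct. The names wrapper and first_member_shares_address are hypothetical. This sketch casts to a pointer to the array object (the first member) and only then lets the array decay, which is a slightly more conservative route than casting straight to unsigned char*.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical standard-layout class whose first non-static data member
// is a byte array, mirroring the layout discussed above.
struct wrapper {
    unsigned char bytes[16];
    int tail;
};

// Returns true if the pointer obtained from the struct via reinterpret_cast
// refers to the first member, and the decayed pointer to its first element.
inline bool first_member_shares_address() {
    static_assert(std::is_standard_layout<wrapper>::value,
                  "rule (4.3) applies to standard-layout classes");

    wrapper w{};

    // Struct object -> its first non-static data member (the array object).
    unsigned char (*arr)[16] = reinterpret_cast<unsigned char (*)[16]>(&w);

    // Array-to-pointer conversion yields a pointer to the first element.
    unsigned char* first = *arr;

    return arr == &w.bytes && first == &w.bytes[0];
}
```

The static_assert documents the precondition: the guarantee is stated for standard-layout classes (and unions), not for arbitrary class types.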


EDIT: To address the discussion with Language Lawyer in the comments, there is in fact a literal "pointer to an array" (fun fact). Semantically, it has one more level of indirection (in a sense, a pointer to a pointer) than an array type. But logically, the pointer to an object of type array N T -> points to the beginning of array storage -> points to the location of the first element in the array. So:

  char buf[4];
  char (*ptr_buf)[4] = &buf;
  char* ptr_elem0 = &buf[0]; 

ptr_buf and ptr_elem0 have different types but the same address. The array and its first element are not pointer-interconvertible. See the accepted answer over here. This language in the standard forbids something like this:

struct MyStruct {
  char name[4];
  int value;
};

void g(char *chr) {
  char (*name)[4] = reinterpret_cast<char (*)[4]>(chr); // Invalid
  MyStruct* s = reinterpret_cast<MyStruct *>(name);
  // This function uses a pointer to the first element in the array to 
  // access another member of the containing struct. C++ forbids this.
  s->value = 10;
}

void f() {
  MyStruct s;
  g(&s.name[0]);
}

But that language and restriction are irrelevant here. These pointer-interconvertible "rules" matter only for getting access to the first non-static data member within aligned_storage_t (the array that provides storage). It doesn't matter that we can't interconvert from unsigned char* back to unsigned char (*)[sizeof(T)] (because we aren't even attempting to do that).

An array is convertible to a pointer to the first element (per Section 7.3.2). And that pointer to the first element can be used to access the entire array (because pointer arithmetic has defined behavior and arrays are contiguous). That pointer is all we need to provide to placement new. Note that operator new takes a void* so array-to-pointer conversion will happen anyways. It makes no difference if we do that array-to-pointer conversion ourselves beforehand.
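As a small illustration of the array-to-pointer conversion described here (decay_demo is a hypothetical helper name), the pointer to the array and the pointer to its first element have different types but compare equal as addresses, and the element pointer reaches the whole array through pointer arithmetic:

```cpp
#include <cassert>
#include <type_traits>

// Returns true if the pointer-to-array and the decayed pointer hold the
// same address and the decayed pointer can reach the last element.
inline bool decay_demo() {
    char buf[4] = {'a', 'b', 'c', 'd'};

    char (*ptr_buf)[4] = &buf;  // pointer to the array object
    char* ptr_elem0 = buf;      // array-to-pointer conversion

    static_assert(!std::is_same<decltype(ptr_buf), decltype(ptr_elem0)>::value,
                  "the two pointers have different types");
    static_assert(std::is_same<std::decay<decltype(buf)>::type, char*>::value,
                  "char[4] decays to char*");

    bool same_address =
        static_cast<void*>(ptr_buf) == static_cast<void*>(ptr_elem0);
    bool reaches_last = *(ptr_elem0 + 3) == 'd';
    return same_address && reaches_last;
}
```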

Lustrous answered 27/7, 2022 at 14:31 Comment(16)
How does unsigned char* storage = reinterpret_cast<unsigned char*>(&buf); get a pointer to the array?Ronnironnica
Understand that this is the closest thing to a "pointer to the array" in the language. There is a pointer to the beginning of the array and there's the length of the array. An "array" is not a first class object in C/C++ like a struct or union is. There is no such thing as "unsigned char[] *". Is that what you're asking?Lustrous
Apologies for the many edits and deleted comments. It took me a minute to grok exactly what you were asking.Lustrous
There is no such thing as unsigned char[] * or unsigned char[sizeof(T)]* in C or C++ Ehm. But there is unsigned char(*)[] or unsigned char(*)[sizeof(T)]. Which are pointers to an array.Ronnironnica
Well, that's cool. I've never seen one of those. And how does their existence impact the argument? We both know that the pointer to the beginning of the array is a pointer to the array and has been used as such since the invention of C.Lustrous
And how does their existence impact the argument? The Note you've cited even says how: An array object and its first element are not pointer-interconvertibleRonnironnica
It's also worth noting that placement new, memcpy, etc. do not take these strange animals as arguments. Placement-new accepts the array itself which decays to a pointer to the beginning of the array. memcpy accepts a void* which, again, points to the beginning of the array.Lustrous
We both know that the pointer to the beginning of the array is a pointer to the array I don't even know what «the pointer to the beginning of the array» is.Ronnironnica
Yes. The pointer to the array object and the pointer to the first element are not interconvertible. But the array object (not its pointer) and the pointer to the first element are.Lustrous
But the array object (not its pointer) and the pointer to the first element are 🤦‍♂️Ronnironnica
The array object is convertible to a pointer to its first element. You'll find this in Section 7.3.2 Array-to-pointer conversion. This is so basic. How is this a question? If you're not understanding the distinction between "pointer to array" and "array" (which decays to a pointer), read my edits. Also see this SO answer: #47924603Lustrous
@MatthewM. An array object is convertible to a pointer to its first element, but not pointer-interconvertible (nor "interconvertible", which isn't actually defined by the standard). I think your second-to-last comment seemed to imply the latter.Undirected
@Undirected As you said, I did not use pointer-interconvertible (a "legal" term, defined by the standard). I did use interconvertible (an English word with appropriate meaning). An array object and a pointer to its first element are interconvertible in this context (which is all about defined/legal access to the storage in an array). You can access the full storage of an array, equivalently and in a defined manner, whether you have the named array object in-scope or a pointer to the beginning of that array.Lustrous
@MatthewM. "an English word with appropriate meaning" Maybe, but it's ambiguous in this context. Only now have you made that meaning more clear. Though it is still worth noting that a pointer to the first element of an array is not convertible (as defined in the standard) back to the array.Undirected
@Undirected True, you cannot char array_copy[4] = reinterpret_cast<char[4]>(ptr) or some such absurdity. reinterpret_cast isn't going to call memcpy() for you either. If that needs saying, okay. And yet, void f(char []);, void f(char [3]), and void f(char [4]) are all redefinitions of void f(char*);. As far as programmers should concern themselves, an array is a pointer type. Case in point: void f(char array[4]) takes array by pointer and not "by-value".Lustrous
And you can do this: char array[4] = { ... }; char copy[4]; char* ptr = &array[0]; memcpy(copy, ptr, 4); The sort of "conversion" (within the defined behavior of the standard) that I clearly meant to any good-faith reader! All of this bad-faith "lawyering" is exactly why we have this "aligned_storage_t is UB" nonsense in the first place.Lustrous
