Is it defined behavior to place exotically aligned objects in the coroutine state?


Edit: Thanks to everyone for the answers and replies. Language Lawyer's answer is technically the correct one, so it is the accepted one, but Human-Compiler's answer is the only one that meets the criteria for the bounty (getting 2+ points) and that elaborates enough on the question's specific topic.

Full question

Is it defined behavior to have an object b placed in the coroutine state (by e.g. having it as a parameter, or preserving it across a suspension point), where alignof(b) > __STDCPP_DEFAULT_NEW_ALIGNMENT__?

Example:

#include <cstddef>

inline constexpr std::size_t large_alignment =
    __STDCPP_DEFAULT_NEW_ALIGNMENT__ * 2;

struct alignas(large_alignment) behemoth {
  void attack();
  unsigned char data[large_alignment];
};

task<void> invade(task_queue &q) {
  behemoth b{};
  co_await submit_to(q);
  b.attack();
}

Explanation

When a coroutine is called, heap memory for the coroutine state is allocated via operator new.

This call to operator new may take one of the following forms:

passing all arguments passed to the coroutine following the size requested, or if no such overloads can be found,
passing just the size requested.

Whichever form the call takes, note that it doesn't use the overloads accepting a std::align_val_t, which are necessary to allocate memory that must be aligned more than __STDCPP_DEFAULT_NEW_ALIGNMENT__. Therefore, if an object whose alignment is larger than __STDCPP_DEFAULT_NEW_ALIGNMENT__ must be saved in the coroutine state, there should be no way to guarantee that the object will end up properly aligned in memory.
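To make the difference between the two families of overloads concrete, here is a minimal sketch assuming a C++17 (or later) compiler. The helper names is_aligned and aligned_new_honors are made up for illustration; only ::operator new, ::operator delete and std::align_val_t come from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Same constant as in the question: twice the default operator new alignment.
inline constexpr std::size_t large_alignment =
    __STDCPP_DEFAULT_NEW_ALIGNMENT__ * 2;

// Hypothetical helper: true if p is a multiple of alignment.
// alignment must be a nonzero power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical helper: allocates through the std::align_val_t overload and
// reports whether the returned pointer honors the requested alignment.
inline bool aligned_new_honors(std::size_t size, std::size_t alignment) {
  void* p = ::operator new(size, static_cast<std::align_val_t>(alignment));
  const bool ok = is_aligned(p, alignment);
  // Memory from the aligned overload must be released with the matching
  // aligned operator delete.
  ::operator delete(p, static_cast<std::align_val_t>(alignment));
  return ok;
}
```

A plain ::operator new(size) call may happen to return a pointer that satisfies large_alignment, but only the std::align_val_t overload is required to.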

Experimentation

Godbolt

async f(): Assertion `reinterpret_cast<uintptr_t>(&b) % 32ull == 0' failed.

so it definitely doesn't work on GCC trunk (11.0.1 20210307). Replacing 32 with 16 (which equals __STDCPP_DEFAULT_NEW_ALIGNMENT__) eliminates this assertion failure.

godbolt.org cannot run Windows binaries, but the assertion fires with MSVC on my computer as well.

Nullifidian answered 9/3, 2021 at 12:29 Comment(7)

Related: I'd like to know the same for things like std::variant/any/optional? – Parrott 12/3, 2021 at 21:4

@Parrott optional and variant don't seem to have this problem. [optional.optional.1]: "The contained value shall be allocated in a region... suitably aligned for the type T." [variant.variant.1]: "The contained value shall be allocated in a region... suitably aligned for all types in Types." It is easy to meet the alignment requirement for optional and variant by using either a union or aligned_storage_t<sizeof(T), alignof(T)> as the buffer, and their alignments automatically propagate to the enclosing types. – Nullifidian 13/3, 2021 at 4:22
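A small sketch of the propagation described in this comment, assuming C++17. The behemoth type mirrors the one in the question; manual_storage and contained_value_is_aligned are hypothetical names used only for illustration, not how any particular standard library implements optional.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <variant>

struct alignas(__STDCPP_DEFAULT_NEW_ALIGNMENT__ * 2) behemoth {
  unsigned char data[__STDCPP_DEFAULT_NEW_ALIGNMENT__ * 2];
};

// Hypothetical buffer in the spirit of aligned_storage_t<sizeof(T), alignof(T)>:
// the alignas on the member raises the alignment of the enclosing struct too.
template <class T>
struct manual_storage {
  alignas(T) unsigned char buffer[sizeof(T)];
  bool engaged = false;
};

// The alignment of the contained type propagates to the wrapper.
static_assert(alignof(manual_storage<behemoth>) >= alignof(behemoth));
static_assert(alignof(std::optional<behemoth>) >= alignof(behemoth));
static_assert(alignof(std::variant<int, behemoth>) >= alignof(behemoth));

// Runtime check that the value stored inside an optional is suitably aligned.
inline bool contained_value_is_aligned() {
  std::optional<behemoth> o{behemoth{}};
  return reinterpret_cast<std::uintptr_t>(&*o) % alignof(behemoth) == 0;
}
```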

@Parrott Conversely, I cannot find any alignment requirement for any in [any.class]. However, in practice it would be quite difficult not to meet the right alignment anyway, as a new expression automatically uses the align_val_t-accepting overloads for types that have extended alignment. MSVC's, libstdc++'s and libc++'s impls all seem to handle extended alignment correctly (Godbolt). – Nullifidian 13/3, 2021 at 4:57
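The overload selection made by a new expression can be observed with class-scope allocation functions. This is a sketch assuming C++17; tracked_behemoth and its flag are hypothetical and exist only to record which overload was chosen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

struct alignas(__STDCPP_DEFAULT_NEW_ALIGNMENT__ * 2) tracked_behemoth {
  // Records which allocation function the last new expression selected.
  static inline bool used_aligned_overload = false;

  static void* operator new(std::size_t size) {
    used_aligned_overload = false;
    return ::operator new(size);
  }
  static void* operator new(std::size_t size, std::align_val_t alignment) {
    used_aligned_overload = true;
    return ::operator new(size, alignment);
  }
  static void operator delete(void* p) noexcept { ::operator delete(p); }
  static void operator delete(void* p, std::align_val_t alignment) noexcept {
    ::operator delete(p, alignment);
  }

  unsigned char data[__STDCPP_DEFAULT_NEW_ALIGNMENT__ * 2];
};

// Creates and destroys one object with an ordinary new expression and
// reports whether the std::align_val_t overload was the one called.
inline bool new_expression_used_aligned_overload() {
  tracked_behemoth* b = new tracked_behemoth;
  assert(reinterpret_cast<std::uintptr_t>(b) % alignof(tracked_behemoth) == 0);
  const bool result = tracked_behemoth::used_aligned_overload;
  delete b;
  return result;
}
```

The coroutine state allocation described in the question does not get this treatment, which is the asymmetry being asked about.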

Dup of anything from stackoverflow.com/… – Semitrailer 13/3, 2021 at 17:14

@LanguageLawyer Just because two questions have the same answer doesn't mean they're duplicates. Example: Why are perpetual motion machines impossible? vs How does a bomb calorimeter work? – Nullifidian 14/3, 2021 at 3:9

So your point is: one can create 10 questions about different contexts: Are over-aligned types allowed in function bodies?, Are over-aligned types allowed in function parameters?, Are over-aligned types allowed as class members?, Are over-aligned types allowed in for loop initializer? etc. and none of these questions should be closed as a dup? – Semitrailer 14/3, 2021 at 10:45

@LanguageLawyer Short answer: No, they shouldn't. Long answer: No, because [c++] questions do not equal [language-lawyer] questions. Limitations of real-world impls are as important to the topic as the overarching standard, if not more so. This question in particular should not be closed as a dup because it illustrates a unique corner case of compiler support from major vendors, and highlights a potential standard defect (unless a rationale can be given against permitting the extended-alignment alloc functions for allocating coro states). – Nullifidian 15/3, 2021 at 2:10


From [basic.align]/3:

It is implementation-defined whether any extended alignments are supported and the contexts in which they are supported.

So the answer is: implementation-defined. The second part of the sentence suggests that an implementation may support creating objects of over-aligned types in ordinary functions and yet not support it in coroutines.

[basic.align]/9:

If a request for a specific extended alignment in a specific context is not supported by an implementation, the program is ill-formed.

Implementations which do not emit a diagnostic message when they don't support extended alignment are not conforming.

Semitrailer answered 13/3, 2021 at 17:8 Comment(6)

Another interesting context is "passed by value to a function". – Isobaric 13/3, 2021 at 17:25

I disagree with this answer. The paragraphs that define how coroutines obtain their storage are clear that it comes from the ::operator new(std::size_t) overload, which does not handle over-alignment. This would make it undefined behavior and not implementation-defined, because there is no way for an implementation to support this without deviating from the standard. – Pistareen 13/3, 2021 at 17:26

@Human-Compiler I've already addressed your comment in the comment under your answer. I don't see where the standard says that an implementation shall allocate no more than the sum of the sizeofs of objects with automatic storage duration, or something like this. An implementation can allocate sizeof(over-aligned) + alignof(over-aligned) and place the over-aligned object at a suitably aligned offset within the storage. – Semitrailer 13/3, 2021 at 17:32

There is no place in the C++ language as far as I'm aware where this is, or has been, done -- even though it could be. Additionally this has to happen for the entire state of the stack that may be resumed, and you may have more than one over-aligned object. I'm not sure if you're actually suggesting that each one goes through this transformation, or that the compiler does one large transformation for all stack objects up to that point -- but either case seems ridiculous IMO – Pistareen 13/3, 2021 at 17:39

@Human-Compiler I'm not sure if you're actually suggesting that each one goes through this transformation, or that the compiler does one large transformation for all stack objects up to that point Is there any essential difference? I think this only affects necessary buffer size. – Semitrailer 13/3, 2021 at 17:52

It's not that ridiculous. Just allocate an additional max(alignof(captured-things)) + sizeof(void*) bytes, and position the entire frame at a suitable offset within the allocation. The extra void* is to remember the start of the allocation so it can be freed. – Isobaric 13/3, 2021 at 18:50
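The over-allocation scheme described in this comment can be sketched in ordinary library code. This is only an illustration under the assumption of C++17; allocate_overaligned and free_overaligned are hypothetical names, and nothing here claims to be what any compiler actually does for coroutine frames.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

// Obtains storage from the plain ::operator new(std::size_t) and returns a
// pointer inside it that satisfies `alignment` (a nonzero power of two).
inline void* allocate_overaligned(std::size_t size, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

  // Extra room: one void* to remember the start of the allocation, plus
  // worst-case padding to reach the next aligned address.
  const std::size_t total = size + alignment + sizeof(void*);
  unsigned char* raw = static_cast<unsigned char*>(::operator new(total));

  // First address that leaves room for the back-pointer, rounded up.
  const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(raw);
  const std::uintptr_t first_usable = base + sizeof(void*);
  const std::uintptr_t aligned =
      (first_usable + alignment - 1) &
      ~(static_cast<std::uintptr_t>(alignment) - 1);
  unsigned char* result = raw + (aligned - base);

  // Store the original pointer immediately before the aligned block.
  void* start = raw;
  std::memcpy(result - sizeof(void*), &start, sizeof(start));
  return result;
}

// Releases storage obtained from allocate_overaligned.
inline void free_overaligned(void* p) noexcept {
  if (p == nullptr) return;
  void* start = nullptr;
  std::memcpy(&start, static_cast<unsigned char*>(p) - sizeof(void*),
              sizeof(start));
  ::operator delete(start);
}
```

The cost is up to alignment + sizeof(void*) wasted bytes per allocation, which matches the buffer-size concern raised earlier in this thread.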


From my reading, this would be undefined behavior.

[dcl.fct.def.coroutine]/9 covers the lookup order for determining the allocation function that will be used should the coroutine need additional storage. The lookup order is quite clear:

An implementation may need to allocate additional storage for a coroutine. This storage is known as the coroutine state and is obtained by calling a non-array allocation function ([basic.stc.dynamic.allocation]).

The allocation function's name is looked up in the scope of the promise type. If this lookup fails, the allocation function's name is looked up in the global scope. If the lookup finds an allocation function in the scope of the promise type, overload resolution is performed on a function call created by assembling an argument list. The first argument is the amount of space requested, and has type std::size_t. The lvalues p1…pn are the succeeding arguments.

If no viable function is found ([over.match.viable]), overload resolution is performed again on a function call created by passing just the amount of space required as an argument of type std::size_t.

(Emphasis mine)

This explicitly mentions that the new overload it will call must start with a std::size_t argument, and may optionally operate on a list of lvalue references p1, p2, ..., pn (if it's found in the scope of the promise).

Since in your above example there is no custom operator new defined for the promise type, that means it must select ::operator new(std::size_t) as the overload.

As you already know, ::operator new is only guaranteed to be aligned to __STDCPP_DEFAULT_NEW_ALIGNMENT__ -- which is below the extended alignment required for the coroutine storage. This effectively makes using any extended-aligned type in a coroutine undefined behavior due to misalignment.

Because of how strict the wording is that it must call ::operator new(std::size_t), this should be consistent on any system that implements C++20 correctly. If an implementation chose to support extended-aligned types, it would technically be violating the standard by calling the wrong new overload (which would be an observable deviation).

Judging by the wording on the overload resolution for the allocation function, I think in a case where you require extended-alignment, you should be defining a member-based operator new for your promise that is aware of the possible alignment requirement.
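A minimal sketch of such a member-based operator new, assuming C++17. The name overaligned_frame_allocation is hypothetical; the idea is that a promise type could inherit from it so that lookup in the promise scope finds these functions. The alignment has to be picked by hand, and whether a given compiler then places over-aligned locals at suitably aligned offsets inside the frame is not something this sketch can guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical mixin for a promise type: the size-only allocation function
// found in promise scope forwards to the global aligned overloads.
template <std::size_t Alignment>
struct overaligned_frame_allocation {
  static_assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0,
                "Alignment must be a power of two");

  static void* operator new(std::size_t size) {
    void* p = ::operator new(size, static_cast<std::align_val_t>(Alignment));
    assert(reinterpret_cast<std::uintptr_t>(p) % Alignment == 0);
    return p;
  }

  // Must mirror operator new: release with the aligned global delete.
  static void operator delete(void* p) noexcept {
    ::operator delete(p, static_cast<std::align_val_t>(Alignment));
  }
};
```

For example, a promise declared as struct promise_type : overaligned_frame_allocation<64> { ... } would have its coroutine state requested through the function above.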

Pistareen answered 13/3, 2021 at 5:43 Comment(5)

The problem is that the alignment needs of the coroutine state are determined by the coroutine function body, which I have no knowledge or control of even if I define a class-scope operator new for my promise type. I can compute the maximum alignment among the coroutine parameters, but not for local variables. – Nullifidian 13/3, 2021 at 9:32

list of lvalue references p1, p2, ..., pn Could you show the word «references» in the standard text? – Semitrailer 13/3, 2021 at 9:47

How does the fact that ::operator new(std::size_t) is used prove that over-aligned types are not supported? One can always allocate enough memory to place an over-aligned object at some offset. – Semitrailer 13/3, 2021 at 11:3

@LanguageLawyer Yes, I thought about that solution some hours ago as well. – Nullifidian 13/3, 2021 at 13:35

I also think that the quirky wording "just the amount of space required" (vs. "requested") leaves some room for interpretation here. – Cult 18/3, 2021 at 14:58


Just to recap on the responses this question has received so far:

As @LanguageLawyer has pointed out, compilers have no obligation to support extended-alignment types in any context, and the extent to which those types are supported is allowed to vary among different contexts. Therefore, it is conformant for compilers to not guarantee that b is correctly aligned.
Even though the standard mandates that the non-extended-alignment versions of operator new should always be used, a compiler could still manage to place the coroutine state at the correct alignment by overallocating, and choosing a sufficiently aligned address at run time. But again, it technically does not have to support anything.
In today's status quo, OP's code sample definitely isn't correct, though in a more rigorous sense it is only undefined behavior when at run time a b actually gets misplaced and leads to some form of misaligned access (UB is a runtime property).

Nullifidian answered 18/3, 2021 at 10:49 Comment(0)
