Preamble
This section describes what I am trying to do with the code; skip to the next section to see the actual issue.
I want to use coroutines in an embedded system, where I can't afford too many dynamic allocations. Therefore, I am trying the following: I have non-copyable, non-movable awaitable types for the various queries to peripherals. When querying a peripheral, I use something like auto result = co_await Awaitable{params}. The constructor of the awaitable prepares the request to the peripheral, registers its internal buffer to receive the reply, and registers its ready flag in the promise. The coroutine is then suspended.
Later, the buffer will be filled, and the ready flag will be set to true. After this, the coroutine knows that it can be resumed, which then causes the awaitable to copy out the result from the buffer before being destroyed.
The awaitable is non-copyable and non-movable to force guaranteed copy elision everywhere, so that I can be sure that the pointers to buffer and ready remain valid until the awaitable has been awaited (at least that was the plan...)
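The non-copyable, non-movable property described above can be checked at compile time. The following is a minimal sketch, assuming the AwaitableBase and Awaitable shapes from the example in the next section (Awaitable is reduced to its data members here); the static_assert lines are an addition for illustration and are not part of the original code. Note that these traits only confirm that user-written copies and moves are rejected; they say nothing about whether the compiler itself emits a memcpy.

```cpp
#include <type_traits>

// Same shape as the AwaitableBase in the example: all copy/move operations deleted.
struct AwaitableBase {
    AwaitableBase() = default;
    AwaitableBase(const AwaitableBase&) = delete;
    AwaitableBase(AwaitableBase&&) = delete;
    AwaitableBase& operator=(const AwaitableBase&) = delete;
    AwaitableBase& operator=(AwaitableBase&&) = delete;
    char buffer[65];
};

// Aggregate wrapper, reduced to its data members for this check.
struct Awaitable {
    AwaitableBase base;
    bool ready{false};
};

// The base rejects copy and move construction.
static_assert(!std::is_copy_constructible<AwaitableBase>::value, "base must not be copyable");
static_assert(!std::is_move_constructible<AwaitableBase>::value, "base must not be movable");

// The wrapper inherits the restriction through its non-movable member.
static_assert(!std::is_copy_constructible<Awaitable>::value, "wrapper must not be copyable");
static_assert(!std::is_move_constructible<Awaitable>::value, "wrapper must not be movable");
static_assert(!std::is_copy_assignable<Awaitable>::value, "wrapper must not be copy-assignable");
static_assert(!std::is_move_assignable<Awaitable>::value, "wrapper must not be move-assignable");
```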
The issue
I am encountering an issue with ARM GCC 11.3 in the following code:
#include <cstring>
#include <coroutine>

struct AwaitableBase {
    AwaitableBase() = default;
    AwaitableBase(const AwaitableBase&) = delete;
    AwaitableBase(AwaitableBase&&) = delete;
    AwaitableBase& operator=(const AwaitableBase&) = delete;
    AwaitableBase& operator=(AwaitableBase&&) = delete;
    char buffer[65];
};

struct task {
    struct promise_type
    {
        bool* ready_ptr;
        task get_return_object() { return {}; }
        std::suspend_never initial_suspend() noexcept { return {}; }
        std::suspend_always final_suspend() noexcept { return {}; }
        void return_void() {}
        void unhandled_exception() {}
    };
};

struct Awaitable {
    AwaitableBase base;
    bool ready{false};
    bool await_ready() { return false; }
    void await_suspend(std::coroutine_handle<task::promise_type> handle)
    {
        handle.promise().ready_ptr = &ready;
    }
    int await_resume() { return 2; }
};

AwaitableBase make_awaitable_base()
{
    return AwaitableBase{};
}

task example()
{
    co_await Awaitable{make_awaitable_base()};
}
When compiling this with ARM GCC 11.3 without any optimizations, the generated code contains a memcpy call that moves the AwaitableBase object around (excerpt from Godbolt):
ldr r3, [r7, #4]
adds r3, r3, #87
mov r0, r3
bl make_awaitable_base()
ldr r2, [r7, #4]
ldr r3, [r7, #4]
add r0, r2, #21
adds r3, r3, #87
movs r2, #65
mov r1, r3
bl memcpy
ldr r3, [r7, #4]
movs r2, #0
strb r2, [r3, #86]
ldr r3, [r7, #4]
adds r3, r3, #21
mov r0, r3
bl Awaitable::await_ready()
This breaks my code, as I am relying on the fact that the object cannot be moved/copied. It was my understanding that making an object non-copyable & non-movable should prevent it from being copied with memcpy.
Observations/Comments
- The memcpy is no longer present in GCC 13.1 - unfortunately, I am stuck with 11.3
- The memcpy is not present if I remove the aggregate initialization of Awaitable wrapped around AwaitableBase (and instead make AwaitableBase itself the awaitable) - this doesn't work for me because I'd like to wrap other awaitables with Awaitable to modify their behavior
- The memcpy is not present without the co_await
- As noted previously, I need the awaitable to have a stable address, as I rely on the fact that I can look at the ready_ptr stored in the promise to check if the awaitable is done.
Question(s)
How can I work around this?
Is it a bug with the compiler, or am I misunderstanding something about guaranteed copy elision? Is it undefined behavior to rely on the fact that the address of the temporary should not change for the duration of the co_await expression?
co_await expression. If you hoist it out like co_await []{ return Awaitable{make_awaitable_base()}; }(); it doesn't seem to memcpy anymore. Does that workaround work in your code base? – Yorker
return Awaitable{make_awaitable_base()}; can eliminate the memcpy call: godbolt.org/z/6hqxj8foP Can you use it as a workaround? – Erivan