Consider the following code:
#include <utility>
#include <string>

int bar() {
    std::pair<int, std::string> p {
        123, "Hey... no small-string optimization for me please!" };
    return p.first;
}
(simplified thanks to @Jarod42 :-) ...)
I expect the function to be implemented as simply:
bar():
        mov eax, 123
        ret
but instead, the implementation calls operator new(), constructs an std::string with my literal, then calls operator delete(). At least - that's what gcc 9 and clang 9 do (GodBolt). Here's the clang output:
bar(): # @bar()
        push rbx
        sub rsp, 48
        mov dword ptr [rsp + 8], 123
        lea rax, [rsp + 32]
        mov qword ptr [rsp + 16], rax
        mov edi, 51
        call operator new(unsigned long)
        mov qword ptr [rsp + 16], rax
        mov qword ptr [rsp + 32], 50
        movups xmm0, xmmword ptr [rip + .L.str]
        movups xmmword ptr [rax], xmm0
        movups xmm0, xmmword ptr [rip + .L.str+16]
        movups xmmword ptr [rax + 16], xmm0
        movups xmm0, xmmword ptr [rip + .L.str+32]
        movups xmmword ptr [rax + 32], xmm0
        mov word ptr [rax + 48], 8549
        mov qword ptr [rsp + 24], 50
        mov byte ptr [rax + 50], 0
        mov ebx, dword ptr [rsp + 8]
        mov rdi, rax
        call operator delete(void*)
        mov eax, ebx
        add rsp, 48
        pop rbx
        ret
.L.str:
        .asciz "Hey... no small-string optimization for me please!"
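For comparison, here is a sketch of a variant that stores the literal as a const char* instead of an std::string. The name bar_no_string is hypothetical and not part of the original code. Because the whole function is usable in a constant expression, no allocation can be involved, so this is the kind of code for which the two-instruction output above would be expected:

```cpp
#include <utility>

// Hypothetical variant of bar(): the pair holds a pointer to the literal
// rather than a std::string, so nothing is allocated.
constexpr int bar_no_string() {
    std::pair<int, const char*> p {
        123, "Hey... no small-string optimization for me please!" };
    return p.first;
}

// Evaluated at compile time (assumes C++14 or later, where the std::pair
// constructors are constexpr).
static_assert(bar_no_string() == 123,
              "bar_no_string() is usable in a constant expression");
```

This does not answer the question about std::string; it only isolates the string as the part that keeps the optimizer from reducing bar() to a constant.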
My question is: Clearly, the compiler has full knowledge of everything going on inside bar(). Why is it not "eliding"/optimizing the string away? More specifically:
- At the basic level, there's the code between the new() and delete() calls, which AFAICT the compiler knows results in nothing useful.
- Secondarily, the new() and delete() calls themselves. After all, small-string optimization is allowed by the standard AFAIK, so even though clang/gcc haven't chosen to use it here - they could have; meaning that it's not actually required to call new() or delete() there.
I'm particularly interested in what part of this is directly due to the language standard, and what part is compiler non-optimality.
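To make the second point concrete, here is a minimal sketch of how a program can watch the allocation happen. The global operator new is replaceable, and a replacement may have side effects of its own, which AFAICT is one reason dropping the calls is not a trivially safe transformation. The counter g_new_calls and the helper new_calls_during_bar are hypothetical names introduced for illustration. Whether the helper reports 0 or 1 depends on the compiler, the standard library, and the optimization level, so no particular output is claimed:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <utility>

// Hypothetical counter, bumped by the replacement operator new below.
static std::size_t g_new_calls = 0;

// Replacement for the global allocation function: counts each call,
// then forwards to malloc.
void* operator new(std::size_t size) {
    ++g_new_calls;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}

// Matching replacements for the deallocation functions.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Same function as in the question.
int bar() {
    std::pair<int, std::string> p {
        123, "Hey... no small-string optimization for me please!" };
    return p.first;
}

// Hypothetical helper: number of operator new calls made by one bar() call.
std::size_t new_calls_during_bar() {
    const std::size_t before = g_new_calls;
    bar();
    return g_new_calls - before;
}
```

Calling new_calls_during_bar() from main() at different optimization levels shows whether a given toolchain kept or removed the allocation.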
new/delete observable behavior? (and new can throw) – Villegas
foo() and therefore for bar(). – Gussiegussman
delete new int; – Villegas
delete[] new int[10]; can be optimized out". but "allocation" is indeed too "general", as only new-expressions can be elided. – Villegas
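The distinction drawn in the last comment can be sketched as follows. Since C++14, an implementation is allowed to omit the call to a replaceable global allocation function made by a new-expression. A direct call to ::operator new is an ordinary function call, and as far as that wording goes it is not covered. Both function names below are hypothetical. Both functions return 123; only the generated code may differ:

```cpp
#include <new>

// Allocation through a new-expression paired with a delete-expression:
// the standard permits the implementation to omit the allocation.
int via_new_expression() {
    int* p = new int(123);
    int value = *p;
    delete p;
    return value;
}

// Allocation through direct calls to the allocation and deallocation
// functions, with the object created by placement new.
int via_operator_new_call() {
    void* raw = ::operator new(sizeof(int));
    int* p = ::new (raw) int(123);
    int value = *p;
    ::operator delete(raw);
    return value;
}
```

Comparing the assembly of the two functions on a given compiler shows whether it makes use of the permission, and whether it treats the direct calls any differently.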