Is this "elision failure" language-mandated?
Asked Answered
G

3

7

Consider the following code:

#include <utility>
#include <string>

int bar() {
    std::pair<int, std::string> p { 
        123, "Hey... no small-string optimization for me please!" };
    return p.first;
}

(simplified thanks to @Jarod42 :-) ...)

I expect the function to be implemented as simply:

bar():   
        mov eax, 123
        ret

but instead, the implementation calls operator new(), constructs an std::string with my literal, then calls operator delete(). At least - that's what gcc 9 and clang 9 do (GodBolt). Here's the clang output:

bar():                                # @bar()
        push    rbx
        sub     rsp, 48
        mov     dword ptr [rsp + 8], 123
        lea     rax, [rsp + 32]
        mov     qword ptr [rsp + 16], rax
        mov     edi, 51
        call    operator new(unsigned long)
        mov     qword ptr [rsp + 16], rax
        mov     qword ptr [rsp + 32], 50
        movups  xmm0, xmmword ptr [rip + .L.str]
        movups  xmmword ptr [rax], xmm0
        movups  xmm0, xmmword ptr [rip + .L.str+16]
        movups  xmmword ptr [rax + 16], xmm0
        movups  xmm0, xmmword ptr [rip + .L.str+32]
        movups  xmmword ptr [rax + 32], xmm0
        mov     word ptr [rax + 48], 8549
        mov     qword ptr [rsp + 24], 50
        mov     byte ptr [rax + 50], 0
        mov     ebx, dword ptr [rsp + 8]
        mov     rdi, rax
        call    operator delete(void*)
        mov     eax, ebx
        add     rsp, 48
        pop     rbx
        ret
.L.str:
        .asciz  "Hey... no small-string optimization for me please!"

My question is: Clearly, the compiler has full knowledge of everything going on inside bar(). Why is it not "eliding"/optimizing the string away? More specifically:

  1. At the basic level there's the code between then new() and delete(), which AFAICT the compiler knows results in nothing useful.
  2. Secondarily, the new() and delete() calls themselves. After all, small-string-optimization is allowed by the standard AFAIK, so even though clang/gcc hasn't chosen to use that - it could have; meaning that it's not actually required to call new() or delete() there.

I'm particularly interested in what part of this is directly due to the language standard, and what part is compiler non-optimality.

Gussiegussman answered 23/3, 2020 at 15:0 Comment(9)
Isn't new/delete observable behavior? (and new can throw)Villegas
@Jarod42: 1. I'm not sure it is, in C++20, if you can have constexpr vectors. 2. Even if it is - if you're allowed, in theory, to use a small-string-optimization with upto 1,000,000 characters, that means you're allowed different observable behaviors for foo() and therefore for bar().Gussiegussman
According to en.cppreference.com/w/cpp/language/new#Allocation, allocation can be elided only since C++14.Villegas
inconsistent-behavior-of-compiler-optimization-of-unused-string show issue with compilers for delete new int;Villegas
I would say that SSO should be consistent, else it is mainly an ODR violation.Villegas
@Jarod42: "I would say" - what does the standard say about that, actually? If there isn't a strict consistence requirement, it won't be an ODR violation, because SSO could depend on how many times you've ever instantiated a string, or the phase of the moon, or whatever.Gussiegussman
@Villegas "allocation can be elided only since C++14" Only if std functions are replaced. Which the compiler doesn't know during separate compilation, of course. For whole program optimization, they could always be elided.Negrillo
@curiousguy: Where do you see that restriction? "delete[] new int[10]; can be optimized out". but "allocation" is indeed too "general", as only new-expressions can be elided.Villegas
@Villegas Maybe my phrasing was problematic: I only replied to "only since C++14" = "no optimization allowed by the std. But it's only relevant when there is potentially user code, that is a replaced (de)allocation function. Which the compiler can know w/ whole program optimization.Negrillo
B
4

Nothing in your code represents "elision" as that term is commonly used in a C++ context. The compiler is not permitted to remove anything from that code on the grounds of "elision".

The only grounds a compiler has to remove the creation of that string is on the basis of the "as if" rule. That is, is the behavior of the string creation/destruction visible to the user and therefore not able to be removed?

Since it uses std::allocator and the standard character traits, the basic_string construction and destruction itself is not being overridden by the user. So there is some basis for the idea that the string's creation is not a visible side-effect of the function call and thus could be removed under the "as if" rule.

However, because std::allocator::allocate is specified to call ::operator new, and operator new is globally replaceable, it is reasonable to argue that this is a visible side effect of the construction of such a string. And therefore, the compiler cannot remove it under the "as if" rule.

If the compiler knows that you have not replaced operator new, then it can in theory optimize the string away.

That doesn't mean that any particular compiler will do so.

Bolanos answered 23/3, 2020 at 15:32 Comment(15)
So a new expression can be elided (ie, skipped, with no call to operator ::new) or combined starting in C++14; but (a) is std::allocator specified in a way compatible with that elision, (I hope so!) (b) does the standard actually require that std::string use that allocator to get memory at all? (with SBO, that wording would be tricky...)Depict
1. I was using the term "elision" more broadly". Do you suggest I remove the term from the question? 2. So, the bottom line of your answer is that a compiler can legitimately compile bar() to my desired two-liner?Gussiegussman
@einpoklum: More details added.Bolanos
" And therefore, the compiler cannot remove it under the "as if" rule." <- But the compiler is not required to allocate anything for an std::string, since SSO is allowed.Gussiegussman
@einpoklum: But it is not required to not call it as well.Bolanos
But what I'm saying is that " And therefore, the compiler cannot remove it under the "as if" rule." is not true, because the compiler is not required to do any allocation in the first place, regardless of whether operator new() has been replaced.Gussiegussman
@Yakk-AdamNevraumont: from cppreference "delete[] new int[10]; can be optimized out, but operator delete(operator new(10)); cannot." and std::allocator<T>::allocate has to call ::operator new... so std::allocator is incompatible with that elision.Villegas
@Villegas Because small str optimization is allowed, it's obvious that doing allocations isn't part of the specification of std::string.Negrillo
Yes, looks like [expr.new/12] "An implementation is allowed to omit a call to a replaceable global allocation function ([new.delete.single], [new.delete.array]). When it does so, the storage is instead provided by the implementation or provided by extending the allocation of another new-expression." is a bit too tightly scoped to be applicable here. A shame really.Outbreak
@Deduplicator: Doesn't the "as if" rule also apply to "providing storage"?Gussiegussman
"Doesn't the "as if" rule also apply to "providing storage"?" Yes, that is not observable. However, it is not mandated that replaced "operator new" only provide storage and not do anything observable. So you need language which allows it not to be called.Toleration
@JeffGarrett: But I wasn't talking about a replaced operator new, but rather on an implementation providing storage without calling any operator new.Gussiegussman
@Villegas I read that too, but I wouldn't trust cppreference for that level of detail. I'd want to check std allocator's specification. And that was just argument 1; argument 2 (that std string isn't saying it allocates) is strong enough to carry it.Depict
You shouldn't try to "read between the lines" of cppreference - we aren't written for language-lawyer-level fine parsing. But if we explicitly say that the compiler isn't allowed to optimize something out, there's a very good chance that it isn't. And if we explicitly say that the implementation is allowed to optimize something out, there's a very good chance that it is. @Yakk-AdamNevraumontBurgonet
@Burgonet Ah nice, I was figuring there was something like that in the standard. Because it would be dumb if not (not because I found it).Depict
G
4

Following the discussion in various answers and comments here, I have now filed the following bugs against GCC and LLVM regarding this issue:

  1. GCC bug 94293: [missed optimization] new+delete of unused local string not removed

    Minimal testcase (GodBolt):

    void foo() {
        int *p = new int[1];
        *p = 42;
        delete[] p;
    }
    
  2. GCC bug 94294: [missed optimization] Useless statements populating local string not removed

    Minimal testcase (GodBolt):

    void foo() {
        std::string s { "This is not a small string" };
    }
    
  3. LLVM bug 45287: [missed optimization] failure to drop unused libstdc++ std::string.

    Minimal testcase (GodBolt):

    void foo() {
        std::string s { "This is not a small string" };
    }
    

Thanks goes to: @JeffGarret, @NicolBolas, @Jarod42, Marc Glisse .

Update, August 2021: With recent versions of clang++, g++ and libstc++, all of these minimal testcases eschew memory allocation as one would expect. clang++ also has this behavior for OP's program in the question, but GCC still allocates and deallocates.

Gussiegussman answered 23/3, 2020 at 22:33 Comment(0)
T
2

The question is can the program

int bar() {
    std::pair<int, std::string> p { 
        123, "Hey... no small-string optimization for me please!" };
    return p.first;
}

be validly be optimized to

int bar() {
    return 123;
}

tldr, yes, I think.

And clang does with libc++: godbolt

About std::string, the standard says string.require/3

Every object of type basic_­string uses an object of type Allocator to allocate and free storage for the contained charT objects as needed.

"as needed". std::string is allowed to decide when to use the allocator (which is I believe the justification for SSO being valid). Its member functions do not mandate an allocation. Therefore the allocation may be elided.

Toleration answered 23/3, 2020 at 19:54 Comment(5)
First, I wasn't aware GodBolt's clang++ defaulted to libstdc++, so thanks for that. But, why is it that it's willing to optimize the string away with libc++, but not with libstdc++? I mean, the implementations can't be that different, can they?Gussiegussman
When researching this, I found implications in the bug trackers that (1) gcc is unable to optimize away allocations generally and (2) clang is thwarted by the simplest of try/catch. libstdc++ does have a little try/catch in between the new and delete, as far as I can tell.Toleration
I say implications, because I was unable to find multiple sources or corroborate them. For example, in this gcc bug (gcc.gnu.org/bugzilla/show_bug.cgi?id=71258), it says "GCC cannot eliminate calls to 'new' currently." ... But it's old (2016). But it's also still open.Toleration
This bug for LLVM shows even the most trivial of try/catch's, handled in the same function, with complete knowledge available to clang, cannot be optimized away (and one could reason that this would interfere with collapsing logic before and after such a construct): bugs.llvm.org/show_bug.cgi?id=35052Toleration
Apparently, it's not the exceptions that are the cause of the problem. See the links in my "answer".Gussiegussman

© 2022 - 2024 — McMap. All rights reserved.