Do rvalue references have the same overhead as lvalue references?

Consider this example:

#include <utility>

// runtime dominated by argument passing
template <class T>
void foo(T t) {}

int main() {
    int i(0);
    foo<int>(i); // fast -- int is scalar type
    foo<int&>(i); // slow -- lvalue reference overhead
    foo<int&&>(std::move(i)); // ???
}

Is foo<int&&>(std::move(i)) as fast as foo<int>(i), or does it involve pointer overhead like foo<int&>(i)?

EDIT: As suggested, running g++ -S gave me the same 51-line assembly file for foo<int>(i) and foo<int&>(i), but foo<int&&>(std::move(i)) resulted in 71 lines of assembly code (it looks like the difference came from std::move).

EDIT: Thanks to those who recommended g++ -S with different optimization levels -- using -O3 (and making foo noinline) I was able to get output which looks like xaxxon's solution.

Oleaster answered 14/8, 2018 at 3:0 Comment(11)
Premature optimization? -- Pyrrolidine
@Rakete1111: Yes, this is just a curiosity. -- Oleaster
Measure and find out. -- Frobisher
Well, semantically it is possible to do a double copy in the case of an rvalue reference, but for real cases I would expect the compiler to use pointers. After all, code with real rvalues (not made by std::move), and big ones at that, say a std::vector constructed on the fly, would be better off with pass-by-non-const-pointer. -- Behalf
"running g++ -S gave me the same 51-line assembly" - try it with different optimization levels (-O1 vs -O2 vs -O3 vs -Os). -- Hockett
I second @JesperJuhl: -S is virtually never meaningful without -Os, and especially not with -O0 or -O3. Only -S -Os produces near-readable assembler code that shows what's actually going on. That said, your template foo<> is not even trying to actually use its parameter. The optimizer will throw out what you try to look at. For proper analysis, define three non-template functions like int foo_noref(int arg) { return arg; } in a separate file and compile with -S -Os. Then do the same for the calls void bar_noref() { int i = 0; foo_noref(i); }. -- Kiely
Yes, it adds the pointer, so all uses of the referred object will involve a pointer indirection. On the other hand, if the object you are referencing was bigger, then passing by value could incur the full cost of making a copy, even if you then only accessed one member within the function. Pass the object on to another method by value and you make another copy. Sometimes this is what you want to do, but generally it is good policy to pass simple values by value and larger objects by const ref. -- Chaussure
@GemTaylor: Well put. I've been looking for some template metaprogramming tool to pass simple values by value and larger objects by const ref for arbitrary types, which is part of why I brought this question up. -- Oleaster
@TaylorNichols When it comes to TMP everything is inlined, and the compiler optimiser should reduce most reference parameters back to the original declaration, so it shouldn't matter whether you use const references or value copies. -- Chaussure
If it does make a difference, you can always add conditional SFINAE and have 2 versions, but I suspect it mostly won't make much difference. is_integral will be your friend here. -- Chaussure
@GemTaylor: I'm mainly thinking about non-temporary variables, such as class members, which won't get inlined. Also, if functions take n parameters I'd have 2^n versions, so I'm still considering the cleanest implementation. -- Oleaster
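
The "simple values by value, larger objects by const ref" policy from the comments above can be sketched with a single alias template instead of 2^n overloads. This is only an illustration: param_t and aliases are hypothetical names that do not appear in the thread, and std::is_scalar is one possible choice of predicate (is_integral, as suggested above, would be narrower).

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical helper (not from the thread): choose how a T should be passed.
// Scalars (integers, floating point, pointers, enums) go by value;
// everything else goes by const reference.
template <class T>
using param_t =
    typename std::conditional<std::is_scalar<T>::value, T, const T&>::type;

// Reports whether the first parameter is the very same object as `original`.
// T is deduced from the second parameter only, because param_t<T> is a
// non-deduced context.
template <class T>
bool aliases(param_t<T> p, const T& original) {
    return &p == &original;
}

static_assert(std::is_same<param_t<int>, int>::value,
              "int is passed by value");
static_assert(std::is_same<param_t<std::string>, const std::string&>::value,
              "std::string is passed by const reference");
```

With int i, aliases(i, i) is false because the first parameter is a copy; with a std::string s, aliases(s, s) is true because the first parameter is a reference to s. Whether this makes any measurable difference after inlining is a separate question that only profiling can answer.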

In your specific situation, it's likely they are all the same. The code generated on godbolt with gcc -O3 (https://godbolt.org/g/XQJ3Z4) for:

#include <utility>

// runtime dominated by argument passing
template <class T>
int foo(T t) { return t;}

int main() {
    int i{0};
    volatile int j;
    j = foo<int>(i); // fast -- int is scalar type
    j = foo<int&>(i); // slow -- lvalue reference overhead
    j = foo<int&&>(std::move(i)); // ???
}

is:

    mov     dword ptr [rsp - 4], 0 // foo<int>(i);
    mov     dword ptr [rsp - 4], 0 // foo<int&>(i);
    mov     dword ptr [rsp - 4], 0 // foo<int&&>(std::move(i)); 
    xor     eax, eax
    ret

The volatile int j is there so that the compiler cannot optimize away all the code. Without it, the compiler would know that the results of the calls are discarded, and the whole program would optimize to nothing.

However, if you force the function not to be inlined, things change a bit. With int __attribute__ ((noinline)) foo(T t) { return t;} the output is:

int foo<int>(int):                           # @int foo<int>(int)
        mov     eax, edi
        ret
int foo<int&>(int&):                          # @int foo<int&>(int&)
        mov     eax, dword ptr [rdi]
        ret
int foo<int&&>(int&&):                          # @int foo<int&&>(int&&)
        mov     eax, dword ptr [rdi]
        ret

above: https://godbolt.org/g/pbZ1BT
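
For anyone who wants to reproduce this without templates, here is a minimal sketch that could be pasted into godbolt. The names by_value, by_lref, by_rref, call_all and the NOINLINE macro are made up for this example, and __attribute__((noinline)) is a GCC/Clang extension, so on other compilers the macro expands to nothing and the calls may be inlined again.

```cpp
#include <cassert>
#include <utility>

// Assumption: GCC or Clang. Elsewhere NOINLINE is empty and has no effect.
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

// One function per parameter kind, so each gets its own symbol in the output.
NOINLINE int by_value(int t)  { return t; }  // argument arrives as a value
NOINLINE int by_lref(int& t)  { return t; }  // argument arrives as a reference
NOINLINE int by_rref(int&& t) { return t; }  // argument arrives as a reference

// Caller that keeps every result alive through a volatile store,
// the same trick as the volatile int j above.
int call_all(int i) {
    volatile int sink;
    sink = by_value(i);
    sink = by_lref(i);
    sink = by_rref(std::move(i));
    return sink;
}
```

A quick sanity check is assert(call_all(7) == 7). The interesting part is the generated assembly for the three functions, which should be compared at the optimization level you actually build with.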

For questions like these, learn to love https://godbolt.org and https://quick-bench.com/ (Quick Bench requires you to learn how to properly use Google Benchmark).

Crus answered 14/8, 2018 at 3:15 Comment(4)
I like the volatile int trick. Theoretically, could the compiler also optimize away the calls to foo(i) because it knows the input is discarded? -- Oleaster
@TaylorNichols without the volatile, the whole program is optimized to nothing: godbolt.org/g/e3n6BA. With volatile, it means the compiler doesn't know that something doesn't happen between each assignment (something that's not present in the code), so it has to actually do "the right thing" which means setting a value "as if" it had called the function. -- Crus
That makes sense, so foo never even gets called and we just get j = i; in each case, hence the mov statements. I suppose my question would be more relevant if foo was actually called. -- Oleaster
Well, then it looks like ref vs no-ref are a little different: godbolt.org/g/pbZ1BT -- Crus

Efficiency of parameter passing depends on the ABI.

For example, on Linux the Itanium C++ ABI specifies that references are passed as pointers to the referred object:

3.1.2 Reference Parameters

Reference parameters are handled by passing a pointer to the actual parameter.

This is independent of the reference category (rvalue/lvalue reference).

For a broader view, I found this quote in a document on calling conventions from the Technical University of Denmark, which analyzes most compilers:

References are treated as identical to pointers in all respects.

So both rvalue and lvalue references involve pointer overhead on all of the ABIs covered there, at least when the call is not inlined.
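
The language-level side of this can be checked directly: both kinds of reference parameter name the caller's object, while a by-value parameter is a separate copy. The sketch below uses made-up helper names and only shows the observable semantics, not what any particular ABI puts in registers.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helpers: each reports whether its parameter is the object
// whose address the caller passed alongside it.
bool lref_aliases(int& r, const int* caller)  { return &r == caller; }
bool rref_aliases(int&& r, const int* caller) { return &r == caller; }
bool value_aliases(int v, const int* caller)  { return &v == caller; }

// Writing through an rvalue reference changes the caller's object, which
// the callee can only do if it is handed the object's location.
void reset(int&& r) { r = 0; }
```

Given int i = 5, lref_aliases(i, &i) and rref_aliases(std::move(i), &i) are both true, value_aliases(i, &i) is false, and after reset(std::move(i)) the caller sees i == 0. Once a call is inlined, the optimizer is free to drop the pointer entirely, as the godbolt output in the other answer shows.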

Peignoir answered 14/8, 2018 at 10:43 Comment(1)
Thanks -- I was hoping to find some official docs on this. I suppose rvalue references would have to use something like pointers, as moving from an object often involves resetting its variables from a different scope. -- Oleaster
