Should I always move on `sink` constructor or setter arguments?

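#include <string>
#include <utility>

struct TestConstRef {
    std::string str;
    // Copies the argument into the member.
    TestConstRef(const std::string& mStr) : str{mStr} { }
};

struct TestMove {
    std::string str;
    // Takes the argument by value, then moves it into the member.
    TestMove(std::string mStr) : str{std::move(mStr)} { }
};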

After watching GoingNative 2013, I understood that sink arguments should always be passed by value and moved with std::move. Is TestMove::ctor the correct way of applying this idiom? Is there any case where TestConstRef::ctor is better/more efficient?


What about trivial setters? Should I use the following idiom or pass a const std::string&?

struct TestSetter {
    std::string str;
    void setStr(std::string mStr) { str = std::move(mStr); }
};
Esau answered 7/9, 2013 at 13:12 Comment(4)
That claim seems dubious to me. Passing by const& and then initializing will call a single copy constructor. Passing by value and moving will call a copy constructor followed by a move assignment operator.Santalaceous
@Yuushi: In general, the move constructor of most classes is nearly free (equivalent to a swap). Also, you are forgetting the cases where you initialize the argument from a temporary (or a moved-from variable).Preadamite
@MatthieuM. I realize that the move constructor is generally nearly free. However, if you are initializing from a temporary/moved from variable, why not declare it to take an rvalue reference explicitly?Santalaceous
@Santalaceous Then it doesn't work for anything else. Sure, you can overload, but it's extra code (even if you don't type it out twice, it can lead to the same problems as excessive inlining or template bloat). Just to save a single move, which is usually as cheap as handing over a reference (perhaps it has to touch two words instead of one, but that's like one clock cycle).Twentieth

The simple answer is: yes.


The reason is quite simple as well: if you store by value, you might need either to move (from a temporary) or to make a copy (from an l-value). Let us examine what happens in both situations, with both approaches.

From a temporary

  • if you take the argument by const-ref, the temporary is bound to the const-ref and cannot be moved from, thus you end up making a (useless) copy.
  • if you take the argument by value, the value is initialized from the temporary (moving), and then you yourself move from the argument, thus no copy is made.

One limitation: this does not help for a class without an efficient move-constructor (such as std::array<T, N>), because you then pay for two moves, each as expensive as a copy, instead of a single copy.

From an l-value (or const temporary, but who would do that...)

  • if you take the argument by const-ref, nothing happens there, and then you copy it (cannot move from it), thus a single copy is made.
  • if you take the argument by value, you copy it in the argument and then move from it, thus a single copy is made.

One limitation: the same... classes for which moving is akin to copying.

So, the simple answer is that in most cases, by using a sink you avoid unnecessary copies (replacing them by moves).

The single limitation is classes for which the move constructor is as expensive (or nearly as expensive) as the copy constructor, in which case having two moves instead of one copy is worse. Thankfully, such classes are rare (arrays are one case).
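
To make the counting above concrete, here is a minimal sketch that tallies copy and move constructions for both parameter styles. `Tracker`, `ByConstRef`, `ByValue`, the two global counters and the `from_lvalue`/`from_rvalue` helpers are hypothetical names invented for this illustration; they are not part of the question's code.

```cpp
#include <utility>

// Hypothetical counters, used only for this illustration.
int g_copies = 0;
int g_moves = 0;

// Hypothetical instrumented type that records how it is constructed.
struct Tracker {
    Tracker() = default;
    Tracker(const Tracker&) { ++g_copies; }
    Tracker(Tracker&&) noexcept { ++g_moves; }
};

struct Counts { int copies; int moves; };

// Same shape as TestConstRef: const-ref parameter, copied into the member.
struct ByConstRef {
    Tracker t;
    explicit ByConstRef(const Tracker& arg) : t{arg} {}
};

// Same shape as TestMove: by-value parameter, moved into the member.
struct ByValue {
    Tracker t;
    explicit ByValue(Tracker arg) : t{std::move(arg)} {}
};

// Construct Sink from an lvalue and report what it cost.
template <class Sink>
Counts from_lvalue() {
    Tracker src;
    g_copies = g_moves = 0;
    Sink s{src};
    (void)s;
    return Counts{g_copies, g_moves};
}

// Construct Sink from an rvalue (here: an explicitly moved-from variable).
template <class Sink>
Counts from_rvalue() {
    Tracker src;
    g_copies = g_moves = 0;
    Sink s{std::move(src)};
    (void)s;
    return Counts{g_copies, g_moves};
}
```

Under these assumptions, `from_lvalue<ByValue>()` reports one copy plus one move, while `from_rvalue<ByValue>()` reports two moves and no copy; `ByConstRef` reports a single copy in both cases. When the argument is a true temporary such as `Tracker{}`, compilers typically elide the first move, so the by-value version then costs a single move.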

Preadamite answered 7/9, 2013 at 13:30 Comment(10)
Thanks for the clear reply! Does this apply to trivial setters as well? (second example in the original post, edited in later)Esau
"yes" is the answer to which question? :)Asmodeus
@VittorioRomeo: to any method which makes a copy of the argument, be it constructor, setter, or just some destructive computation.Preadamite
@MatthieuM. Have you profiled this? Before searching SO I was considering this myself, and in every instance that I've tested, passing by value is consistently slower on both g++ 4.8.1 and clang++ 3.4. I tested with a string member and in the case of passing by reference to the setter where the member already had enough space allocated (non empty string), the pass by value was drastically slower. This was all passing lvalues, I've not looked at rvalues yet.Carbamate
@Troy: I suspect that the gain is due to the member having enough space allocated, and the assignment operator taking advantage of it by overwriting the existing string in place without allocation. On the other hand, when you use the pass-by-value idiom, the copy made needs to allocate new storage; memory allocation is not cheap (especially when using the default allocator). Indeed, I had not considered that in the case of a setter, the copy could be made via a copy assignment operator, which has different dynamics than a copy constructor.Preadamite
I didn't explain myself well: every call passing by reference was faster (although only very slightly); only the pass by reference to the setter was a lot faster when the space was already allocated, which you obviously lose on pass by value. Even if the space is cleared at each set, it's on par with pass by value. See here: pastebin.com/vAdLCUeX Both gcc 4.8.1 and clang++ give identical output. It seems there may be room for some of these moves to be elided in the future, but gcc/clang are not doing it right now, so pass by reference continues to be faster in most cases.Carbamate
@Troy: please note that when you clear a string (or vector) it retains its underlying memory buffer, so subsequent assignments need not allocate. I am surprised at the results you get; however, investigating would require checking how the compiler interprets the test case (AST/ABT) and the generated IR (or assembly, but IR is more readable). Did you use libc++ with clang, or keep using libstdc++?Preadamite
libstdc++ only. I don't have libc++ configured on my system.Carbamate
@Troy: Ah, that may be the issue. libstdc++ uses a non-conforming implementation of std::string based on COW (Copy On Write). It means that internally all copies of a string share the same underlying buffer (kinda like a shared_ptr). As a result copies are extremely cheap (but accessing a character for modification is not O(1) as it may trigger a real copy of the underlying buffer).Preadamite
No. I'm not using a standard type at all. I'm using a user-defined type called 'TestString'. I'll tidy the code up in the morn and paste it somewhere. I had to do some horrible hacks to benchmark passing rvalues without having to repeatedly construct a new object, so I wasn't planning on sharing it ;) It's simply a wrapper around char*.Carbamate
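
The setter discussion in the comments above can be sketched as follows: a by-value setter called with an l-value performs a copy construction followed by a move assignment, while a const-ref setter performs a single copy assignment (which, for types like std::string, may reuse the member's existing buffer). `Tracker`, `Holder`, the counters and the `measure_*` helpers are hypothetical names made up for this sketch; it only counts operations and says nothing about actual timings.

```cpp
#include <utility>

// Hypothetical counters, used only for this illustration.
int g_copy_ctor = 0;
int g_copy_assign = 0;
int g_move_assign = 0;

// Hypothetical instrumented type that records constructions and assignments.
struct Tracker {
    Tracker() = default;
    Tracker(const Tracker&) { ++g_copy_ctor; }
    Tracker(Tracker&&) noexcept {}
    Tracker& operator=(const Tracker&) { ++g_copy_assign; return *this; }
    Tracker& operator=(Tracker&&) noexcept { ++g_move_assign; return *this; }
};

struct Holder {
    Tracker t;
    // Sink-style setter: parameter by value, then move-assigned.
    void setByValue(Tracker arg) { t = std::move(arg); }
    // Classic setter: parameter by const-ref, then copy-assigned.
    void setByConstRef(const Tracker& arg) { t = arg; }
};

struct SetterCounts { int copy_ctor; int copy_assign; int move_assign; };

void reset_counts() { g_copy_ctor = g_copy_assign = g_move_assign = 0; }

// Call the by-value setter with an lvalue and report the operations used.
SetterCounts measure_by_value() {
    Holder h;
    Tracker src;
    reset_counts();
    h.setByValue(src);
    return SetterCounts{g_copy_ctor, g_copy_assign, g_move_assign};
}

// Call the const-ref setter with an lvalue and report the operations used.
SetterCounts measure_by_const_ref() {
    Holder h;
    Tracker src;
    reset_counts();
    h.setByConstRef(src);
    return SetterCounts{g_copy_ctor, g_copy_assign, g_move_assign};
}
```

The copy construction in the by-value path always builds a fresh object, which is where a string would have to allocate; the copy assignment in the const-ref path is the operation that can potentially overwrite existing storage in place.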

A bit late, as this question already has an accepted answer, but anyways... here's an alternative:

struct Test {
    std::string str;
    Test(std::string&& mStr) : str{std::move(mStr)} { } // 1
    Test(const std::string& mStr) : str{mStr} { } // 2
};

Why would that be better? Consider the two cases:

From a temporary (case // 1)

Only one move-constructor is called for str.

From an l-value (case // 2)

Only one copy-constructor is called for str.

It probably can't get any better than that.

But wait, there is more:

No additional code is generated on the caller's side! The calling of the copy- or move-constructor (which might be inlined or not) can now live in the implementation of the called function (here: Test::Test) and therefore only a single copy of that code is required. If you use by-value parameter passing, the caller is responsible for producing the object that is passed to the function. This might add up in large projects and I try to avoid it if possible.
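
As a rough check of the "one move or one copy" claim, the overload pair can be instrumented the same way. `Tracker`, `Overloaded`, the counters and the two helper functions are hypothetical names for this sketch only.

```cpp
#include <utility>

// Hypothetical counters, used only for this illustration.
int g_copies = 0;
int g_moves = 0;

// Hypothetical instrumented type that records how it is constructed.
struct Tracker {
    Tracker() = default;
    Tracker(const Tracker&) { ++g_copies; }
    Tracker(Tracker&&) noexcept { ++g_moves; }
};

// Same shape as Test above: one overload per value category.
struct Overloaded {
    Tracker t;
    explicit Overloaded(Tracker&& arg) : t{std::move(arg)} {}  // 1
    explicit Overloaded(const Tracker& arg) : t{arg} {}        // 2
};

struct Counts { int copies; int moves; };

// Construct from an lvalue: overload 2 is selected.
Counts overloaded_from_lvalue() {
    Tracker src;
    g_copies = g_moves = 0;
    Overloaded o{src};
    (void)o;
    return Counts{g_copies, g_moves};
}

// Construct from an rvalue: overload 1 is selected.
Counts overloaded_from_rvalue() {
    Tracker src;
    g_copies = g_moves = 0;
    Overloaded o{std::move(src)};
    (void)o;
    return Counts{g_copies, g_moves};
}
```

With these definitions the l-value path costs exactly one copy and the r-value path exactly one move, because the parameter is a reference in both overloads and no intermediate object is created.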

Kilovolt answered 7/9, 2013 at 14:27 Comment(6)
As far as I understand, this is the best possible approach in terms of performance and codegen, but requires one extra constructor - correct?Esau
@VittorioRomeo: Yes, that is the drawback - you always need two overloads. If a method has multiple parameters, this can become quite a burden and it's worth considering going back to by-value parameters.Kilovolt
Sounds like a language defect, to be honest. The number of required ctors may grow exponentially... is there any proposal to fix this?Esau
@VittorioRomeo: Not that I'm aware of but that doesn't mean much. :)Kilovolt
I agree, it does seem like a language defect. Would be nice to be able to have a single constructor such as you can when using templates and perfect forwarding.Carbamate
What if I need to accept two parameters in the constructor - should there be 4 overloads of the constructor?Ablate
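
One way to avoid writing four overloads for two parameters is the template and perfect-forwarding approach mentioned in the comments above. The following is only a sketch under assumptions: `Tracker`, `Pair`, the counters and `measure_mixed` are hypothetical names, and the template is left unconstrained, so it accepts any argument types and reports mismatches only inside the constructor body.

```cpp
#include <utility>

// Hypothetical counters, used only for this illustration.
int g_copies = 0;
int g_moves = 0;

// Hypothetical instrumented type that records how it is constructed.
struct Tracker {
    Tracker() = default;
    Tracker(const Tracker&) { ++g_copies; }
    Tracker(Tracker&&) noexcept { ++g_moves; }
};

struct Pair {
    Tracker first;
    Tracker second;

    // A single constructor template covers all four lvalue/rvalue
    // combinations: each argument is forwarded with its own value category.
    template <class A, class B>
    Pair(A&& a, B&& b)
        : first(std::forward<A>(a)), second(std::forward<B>(b)) {}
};

struct Counts { int copies; int moves; };

// Pass one lvalue and one rvalue and report the operations used.
Counts measure_mixed() {
    Tracker a;
    Tracker b;
    g_copies = g_moves = 0;
    Pair p(a, std::move(b));
    (void)p;
    return Counts{g_copies, g_moves};
}
```

Here the l-value argument is copied once and the r-value argument is moved once, matching what the hand-written overloads would do. The trade-off is that the constructor becomes a template, which brings back the code-generation concern raised in the answer.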
