Why is value taking setter member functions not recommended in Herb Sutter's CppCon 2014 talk (Back to Basics: Modern C++ Style)?
Asked Answered
B

4

75

In Herb Sutter's CppCon 2014 talk Back to Basics: Modern C++ Style he refers on slide 28 (a web copy of the slides are here) to this pattern:

class employee {
  std::string name_;
public:
  void set_name(std::string name) noexcept { name_ = std::move(name); }
};

He says that this is problematic because when calling set_name() with a temporary, noexcept-ness isn't strong (he uses the phrase "noexcept-ish").

Now, I have been using the above pattern pretty heavily in my own recent C++ code mainly because it saves me typing two copies of set_name() every time - yes, I know that can be a bit inefficient by forcing a copy construction every time, but hey I am a lazy typer. However Herb's phrase "This noexcept is problematic" worries me as I don't get the problem here: std::string's move assignment operator is noexcept, as is its destructor, so set_name() above seems to me to be guaranteed noexcept. I do see a potential exception throw by the compiler before set_name() as it prepares the parameter, but I am struggling to see that as problematic.

Later on on slide 32 Herb clearly states the above is an anti-pattern. Can someone explain to me why lest I have been writing bad code by being lazy?

Bruno answered 8/10, 2014 at 15:41 Comment(18)
If I remember correctly, Herb says that noexcept is a myth here, because the allocation that happens on callee's side can throw (e.g. when building a std::string from raw string literal or other std::string). So the body does not throw, but calling that function may still result in an exception being thrown (std::bad_alloc)Russelrussell
Sure, but noexcept makes no guarantees about the implementation using it, just about its own implementation. If your comment is indeed true, I then don't get how a set_name(const std::string &) + set_name(std::string &&) noexcept is any better, you just change the location of the potential exception throw, nothing more.Bruno
The function is marked noexcept, but calling it can throw. The fact that the exception is thrown before the function body itself is entered starts to get into "how many angels can dance on the head of a pin" territory.Eldoneldora
I honestly had always considered that as a given. If you do something which makes the compiler call a throwing function for you, you accept it throws.Bruno
I mostly agree with you, but I can see Herb's point. The noexcept really only says "If calling this function doesn't throw, I promise you won't get an exception", which is arguably less useful than it could be.Klemperer
So, you are saying that users see noexcept on a function, and therefore assume that no matter how you call that function it will never throw? Is that a normal interpretation?Bruno
@NiallDouglas noexcept is more for the compiler than the users.Edea
I'm not sure how normal it is, and of course it's wrong if they forget to account for any functions that are called implicitly before the function starts, but I would not be surprised if many programmers are less conscious of every implicit function call than you are :)Klemperer
I have to admit I see every single line of C++ as always potentially throwing exceptions, and code accordingly. Regarding noexcept being more for the compiler, not all - I have a number of times written special loops if std::is_nothrow_move_constructible returns true because I can skip exception safety, and that can lead to enormous simplifications.Bruno
I think it's like any time you are constructing from anything that may throw on construction (anything?), it may throw, but the function does not in itself throw. I see this like using a safe thing in an unsafe environment then the weakest link wins. I still see this as a noexcept method though and not "noexeptish"?Eversole
Why would you be making two copies of set_name()?Teacart
@VaughnCato: Herb's recommended solution is two set_name overloads, one being set_name(const std::string &) and the other being set_name(std::string &&) noexcept. I, since C++ 11 came along, know it is the right thing to type out both but I am fundamentally too lazy.Bruno
@NiallDouglas: He only recommended creating two methods for special optimization cases.Teacart
IMO what Herb does in that talk is derive general advice from some benchmark he ran on std::string. That's dumb, and makes the whole exercise rather pointless. Don't make lone const& overloads in general. Other than the thing about noexcept, Herb's arguments for a lone const& should not be convincing at all for anything other than std::string.Leodora
@VaughnCato: That's a fair point, but equally he didn't (clearly) say what the default first form you write ought to be other than it should be clear and simple. For example, I like most programmers tend to forget to spatter noexcept all over the place. It currently only really matters for move constructors and assignment, but I suspect I am not alone in wishing that the compiler would do it for me.Bruno
@NiallDouglas: I thought he was quite clear in stating that the default form should be passing by const reference. He explicitly stated that the default was the same as it was in C++98.Teacart
@VaughnCato: Good god no, that is definitely not what he was claiming. He gave generally good rules on when reference passing vs value passing is a good idea - though, incidentally, he left out the major one of "if a parameter passes through an externally linked function" where you generally are best advised to always use const lvalue references. Passing by value, where the compiler could get a chance to optimise by assuming no-alias or where ownership safety is improved, is always strongly recommended.Bruno
@NiallDouglas: At the bottom of slide 23, he shows a table of "C++98: Reasonable Default Advice", and at the top of slide 24, he shows a table of "Modern C++: Reasonable Default Advice", and both tables are the same. I'm not sure what isn't clear there.Teacart
M
47

Others have covered the noexcept reasoning above.

Herb spent much more time in the talk on the efficiency aspects. The problem isn't with allocations, its with unnecessary deallocations. When you copy one std::string into another the copy routine will reuse the allocated storage of the destination string if there's enough space to hold the data being copied. When doing a move assignment the destination string's existing storage must be deallocated as it takes over the storage from the source string. The "copy and move" idiom forces the deallocation to always occur, even when you don't pass a temporary. This is the source of the horrible performance that is demonstrated later in the talk. His advice was to instead take a const ref and if you determine that you need it have an overload for r-value references. This will give you the best of both worlds: copy into existing storage for non-temporaries avoiding the deallocation and move for temporaries where you're going to pay for a deallocation one way or the other (either the destination deallocates prior to the move or the source deallocates after the copy).

The above doesn't apply to constructors since there's no storage in the member variable to deallocate. This is nice since constructors often take more than one argument and if you need to do const ref/r-value ref overloads for each argument you end up with a combinatorial explosion of constructor overloads.

The question now becomes: how many classes are there that reuse storage like std::string when copying? I'm guessing that std::vector does, but outside of that I'm not sure. I do know that I've never written a class that reuses storage like this, but I have written a lot of classes that contain strings and vectors. Following Herb's advice won't hurt you for classes that don't reuse storage, you'll be copying at first with the copying version of the sink function and if you determine that the copying is too much of a performance hit you'll then make an r-value reference overload to avoid the copy (just as you would for std::string). On the other hand, using "copy-and-move" does have a demonstrated performance hit for std::string and other types that reuse storage, and those types probably see a lot of use in most peoples code. I'm following Herb's advice for now, but need to think through some of this a bit more before I consider the issue totally settled (there's probably a blog post that I don't have time to write lurking in all this).

Mollie answered 9/10, 2014 at 19:45 Comment(2)
This is a great answer. If you can slightly modify it to mention that this "anti-pattern" label only applies because std::string maintains capacity (as Herb mentioned), and if we weren't using a complex type capable of reusing its existing capacity and would always have to deallocate no matter what, I'll mark this as the accepted answer. Thank, by the way, you really hit the nail on the head for me.Bruno
I edited the last paragraph to make what reservations I have about Herb's advice more explicit. I was being a bit vague about them before.Mollie
T
12

There were two reasons considered for why passing by value might be better than passing by const reference.

  1. more efficient
  2. noexcept

In the case of setters for members of type std::string, he debunked the claim that passing by value was more efficient by showing that passing by const reference typically produced fewer allocations (at least for std::string).

He also debunked the claim that it allows the setter be noexcept by showing that the noexcept declaration is misleading, since an exception can still occur in the process of copying the parameter.

He thus concluded that passing by const reference was to be preferred over passing by value, at least in this case. However, he did mention that passing by value was a potentially good approach for constructors.

I do think that the example for std::string alone is not sufficient to generalize to all types, but it does call into question the practice of passing expensive-to-copy but cheap-to-move parameters by value, at least for efficiency and exception reasons.

Teacart answered 8/10, 2014 at 16:43 Comment(8)
You've basically just described the slides. We know all this (from the slides). And it is incorrect to say that the value taking set_name() can generate an exception, because it cannot due to the noexcept.Bruno
@NiallDouglas: I agree that it is all covered in the slides, but how does it not answer your question?Teacart
I find the efficiency argument quite compelling, but you brush it aside in the question :) Howard dislikes the copy-and-swap idiom for similar (in)efficiency reasons. It performs an allocation (and therefore is slower and could throw) even in cases where name_.capacity() > name.length() and so you only really need a memcpy. That alone seems enough to label it an anti-pattern and argue for providing two overloadsKlemperer
@JonathanWakely I also find the efficiency argument compelling, but not enough for me to bother writing two overloads for every single setter function I write. If benchmarking shows it to be a problem, or I am trying to get a Boost library past review, then I bother, otherwise not.Bruno
@VaughnCato: My question asked why are value taking setter methods an anti-pattern? I suppose this hinges on what you or I consider an anti-pattern. For me personally, potentially reduced performance in some situations does not make something an anti-pattern so if there is no loss of noexcept, I am currently thinking Herb is overegging his claim here.Bruno
@nialldouglas: I agree that he may be over-generalizing, based on a single example. Maybe it should be said that passing by value is a non-pattern instead of an anti-pattern. Clearly if you have a good reason to pass by value in certain situations, you should do it.Teacart
Passing by value can be noexcept, but as Herb points out in the talk, "not really", because in many cases, a copy must be made to be passed, and creating that copy can throw. Sure, the exception happens before the call is made, but it happens nonetheless.Loran
@NiallDouglas If you don't want to write two overloads, then why would you prefer the copy version to the const reference one? Do your benchmarks show that it is better in any way?Chyme
U
6

Herb has a point, in that taking by-value when you already have storage allocated within can be inefficient and cause a needless allocation. But taking by const& is almost as bad, as if you take a raw C string and pass it to the function, a needless allocation occurs.

What you should take is the abstraction of reading from a string, not a string itself, because that is what you need.

Now, you can do this as a template:

class employee {
  std::string name_;
public:
  template<class T>
  void set_name(T&& name) noexcept { name_ = std::forward<T>(name); }
};

which is reasonably efficient. Then add some SFINAE maybe:

class employee {
  std::string name_;
public:
  template<class T>
  std::enable_if_t<std::is_convertible<T,std::string>::value>
  set_name(T&& name) noexcept { name_ = std::forward<T>(name); }
};

so we get errors at the interface and not the implementation.

This isn't always practical, as it requires exposing the implementation publically.

This is where a string_view type class can come in:

template<class C>
struct string_view {
  // could be private:
  C const* b=nullptr;
  C const* e=nullptr;

  // key component:
  C const* begin() const { return b; }
  C const* end() const { return e; }

  // extra bonus utility:
  C const& front() const { return *b; }
  C const& back() const { return *std::prev(e); }

  std::size_t size() const { return e-b; }
  bool empty() const { return b==e; }

  C const& operator[](std::size_t i){return b[i];}

  // these just work:
  string_view() = default;
  string_view(string_view const&)=default;
  string_view&operator=(string_view const&)=default;

  // myriad of constructors:
  string_view(C const* s, C const* f):b(s),e(f) {}

  // known continuous memory containers:
  template<std::size_t N>
  string_view(const C(&arr)[N]):string_view(arr, arr+N){}
  template<std::size_t N>
  string_view(std::array<C, N> const& arr):string_view(arr.data(), arr.data()+N){}
  template<std::size_t N>
  string_view(std::array<C const, N> const& arr):string_view(arr.data(), arr.data()+N){}
  template<class... Ts>
  string_view(std::basic_string<C, Ts...> const& str):string_view(str.data(), str.data()+str.size()){}
  template<class... Ts>
  string_view(std::vector<C, Ts...> const& vec):string_view(vec.data(), vec.data()+vec.size()){}
  string_view(C const* str):string_view(str, str+len(str)) {}
private:
  // helper method:
  static std::size_t len(C const* str) {
    std::size_t r = 0;
    if (!str) return r;
    while (*str++) {
      ++r;
    }
    return r;
  }
};

such an object can be constructed directly from a std::string or a "raw C string" and nearly costlessly store what you need to know in order to produce a new std::string from it.

class employee {
  std::string name_;
public:
  void set_name(string_view<char> name) noexcept { name_.assign(name.begin(),name.end()); }
};

and as now our set_name has a fixed interface (not a perfect forward one), it can have its implementation not visible.

The only inefficiency is that if you pass in a C-style string pointer, you somewhat needlessly go over its size twice (first time looking for the '\0', second time copying them). On the other hand, this gives your target information about how big it has to be, so it can pre-allocate rather than re-allocate.

Usurer answered 14/10, 2014 at 18:1 Comment(4)
Regarding your comment about inefficiency when passing a null-terminated string pointer via string_view: even if you were to hand that pointer directly to std::string::assign, the implementation would almost certainly take two passes to scan the length and copy anyway. For strings larger than the SSO size, anyway.Stumpf
@Stumpf Imagine a push_back based implementation that clears the target, then pushes each character, checking for '\0' before stopping. This only reads once. The inability to preallocate the target buffer is a cost: but if you are constantly assigning similar length strings such an implementation could be faster. A string_view<char> cannot duplicate that: a template implementation could. In fact, a std::basic_string<char>::operator= overload for char const* could probably walk the input character looking for \0 copying until out of room, then search for size & resize mayhap.Usurer
One can indeed use trampoline types to save on writing two overloads for every member function taking a capacity capable parameter. It does start to feel a bit like asking how many angels can dance on the head of a pin by this stage though ...Bruno
@NiallDouglas However, the range_view, array_view and string_view classes have lots of utility outside of this problem: the ability to take non-copy subsections of ranges, contiguous arrays or strings is very powerful. This happens to be just another use of them.Usurer
M
2

You have two ways of calling that methods.

  • With rvalue parameter, as long as the move constructor of the parameter type is noexcept there is no problem (in case of std::string most probably is noexcept), in any case would be better use a conditional noexcept (to be sure the parameters is noexcept)
  • With lvalue parameter, in this case the copy constructor of the parameter type would be called and almost sure it will need some allocation (that could throw).

In cases like this that the use could be miss used, it's better to avoid. The client of the class assume that no exception is throw as specified, but in valid, compilable, not suspicious C++11 could throw.

Minor answered 8/10, 2014 at 15:56 Comment(4)
std::string's move assignment operator and destructor are guaranteed to be noexcept. And its move constructor, though curiously not if that move constructor takes a reference to an allocator.Bruno
That's a defect: cplusplus.github.io/LWG/lwg-active.html#2063 string move assignment should not be unconditionally noexcept, because the allocators might be unequal and not propagate, requiring reallocation. For move construction the allocator always propagates, so the storage can be pilfered. For the allocator-extended move constructor you must reallocate if the supplied allocator doesn't match the one in the rvalue string.Klemperer
Ach, damn allocators again. Yes, I saw the exact same problem with my proposed boost::concurrent_unordered_map, that makes a lot of sense. Thanks Jonathan.Bruno
Damn allocators indeed! Although I see that you said std::string, and as that uses std::allocator for which propagate_on_container_move_assignment is true, it's true that std::string's move assignment will never throw in practice, even if the noexcept guarantee is removed in a future standard. That doesn't apply to std::basic_string in general though.Klemperer

© 2022 - 2024 — McMap. All rights reserved.