value semantics vs output params with large data structures
Asked Answered
B

3

11

2013 Keynote: Chandler Carruth: Optimizing the Emergent Structures of C++

  • 42:45
    You don't need output parameters, we have value semantics in C++. ... Anytime you see someone arguing that nonono I'm not going to return by value because copy would cost too much, someone working on an optimizer says they're wrong. All right? I have never yet seen a piece of code where that argument was correct. ... People don't realize how important value semantics are to the optimizer because it completely clarifies the aliasing scenarios.

Can anyone put this in the context of this answer: https://stackoverflow.com/a/14229152

I hear that being repeated on and on but, well, for me a function returning something is a source. Output parameters by reference take that characteristic out of a function, and removing such hard coded characteristic from a function allows one to manage outside instead, how output will be stored/reused.

My question is, even in the context of that SO answer, is there a way to tell, restructuring the code some other equivalent way, "ok now see, value semantics in this way doesn't lose for the output param version", or Chandler comments were specific to some contrived situations? I had even seen Andrei Alexandrescu arguing this in a talk and telling you can't escape using by ref output for better performance.

For another take on Andrei's comments see Eric Niebler: Out Parameters, Move Semantics, and Stateful Algorithms.

Blitzkrieg answered 10/2, 2014 at 4:34 Comment(4)
Chandler statement might not be accurate for the current compilers, because C++11 compilers are not mature enough, as they've just finished the C++11 features (MSVC++ still working on them!); but they will soon improve on them, making the resulting code much faster, decreasing the gap between return-by-value and output-parameter idioms. So I believe within few years (2 or so), the gap will be so small that you can forget output-parameter idiom completely.Rhinehart
My take on this is that you can't enforce a single technique on all circumstances. Although return values should be preferred (RVO or move optimizations; also more intuitive in some sense), in cases where output params lead to more efficient usage (like in getline), then by all means use them.Sloat
@MarkGarcia: An interesting read regarding getline and value semantics: ericniebler.com/2013/10/13/out-parameters-vs-move-semanticsBaum
@BenjaminLindley oh, thanks for remembering that, I did follow it when it was released, and it's related to the Alexandrescu talk I refer to. I'm gonna revisit it.Blitzkrieg
S
3

It's either an overstatement, generalization, a joke, or Chandler's idea of "Perfectly Reasonable Performance" (using modern C++ toolchains/libs) is unacceptable to my programs.

I find it a fairly narrow scope of optimizations. Penalties exist beyond that scope which cannot be ignored due to the actual complexities and designs found in programs - heap allocations were an example for the getline example. The specific optimizations may or may not always be applicable to the program in question, despite your attempts to reduce them. Real world structures will reference memory which may alias. You can reduce that, but it's not practical to believe that you can eliminate aliasing (from an optimizer's perspective).

Of course, RBV can be a great thing - it's just not appropriate for all cases. Even the link you referenced pointed out how one can avoid a ton of allocations/frees. Real programs and the data structures found in them are far more complex.

Later in the talk, he goes on to criticize the use of member functions (ref: S::compute()). Sure, there is a point to take away, but is it really reasonable to avoid using these language features entirely because it makes the optimizer's job easier? No. Will it always result in more readable programs? No. Will these code transformations always result in measurably faster programs? No. Are the changes required to transform your codebase worth the time you invest? Sometimes. Can you take away some points and make better informed decisions which impact some of your existing or future codebase? Yes.

Sometimes it helps to break down how exactly your program would execute, or what it would look like in C.

The optimizer will not solve all performance problems, and you should not rewrite programs with the assumption that the programs you are dealing with are "completely brain dead and broken designs", nor should you believe that using RBV will always result in "Perfectly Reasonable Performance". You can utilize new language features and make the optimizer's job easier, though there is much to gain there are often more important optimizations to invest your time on.

It's fine to consider the proposed changes; ideally you would measure the impact of such changes in real world execution times and impact on your source code before adopting these suggestions.

For your example: Even copying+assigning large structures by value can have significant costs. Beyond the cost of running constructors and destructors (along with their associated creation/cleanup of the resources they acquire and own, as pointed out in the link you reference), even things as simple as avoiding unnecessary structure copies can save you a ton of CPU time if you use references (where appropriate). The structure copy may be as simple as a memcpy. These aren't contrived problems; they appear in actual programs and the complexity can increase greatly with your program's complexity. Is reducing aliasing of some memory and other optimizations worth the costs, and does it result in "Perfectly Reasonable Performance"? Not always.

Swiftlet answered 11/2, 2014 at 17:33 Comment(0)
E
1

I was about to implement a solution using string_view and ranges and then found this:

This endorses value semantics output by return and I accept it as prettier than the current choice in the referred SO answer. This design also would take the characteristic of being a source out of a function even though it is returning. Simple illustration: one could have an outer reserved vector being repetitively filled by a returned range.

In any case, I'm not sure whether such version of split would help the optimizer in any sense (I'm talking in context to Chandler's talk here).

Notice

An output param version enforces the existence of a named variable at call site, which may be ugly to the eyes, but may turn debugging the call site always easier.

Sample solution

While std::split doesn't arrive, I've exercised the value semantics output by return version this way:

#include <string>
#include <string_view>
#include <boost/regex.hpp>
#include <boost/range/iterator_range.hpp>
#include <boost/iterator/transform_iterator.hpp>

using namespace std;
using namespace std::experimental;
using namespace boost;

string_view stringfier(const cregex_token_iterator::value_type &match) {
    return {match.first, static_cast<size_t>(match.length())};
}

using string_view_iterator =
    transform_iterator<decltype(&stringfier), cregex_token_iterator>;

iterator_range<string_view_iterator> split(string_view s, const regex &r) {
    return {
        string_view_iterator(
            cregex_token_iterator(s.begin(), s.end(), r, -1),
            stringfier
        ),
        string_view_iterator()
    };
}

int main() {
    const regex r(" +");
    for (size_t i = 0; i < 1000000; ++i) {
        split("a b c", r);
    }
}

I have used Marshall Clow's string_view libc++ implementation found at https://github.com/mclow/string_view.

I have posted the timings at the bottom of the referred answer.

Erasme answered 10/2, 2014 at 4:34 Comment(2)
"I'm not sure whether such version of split would help the optimizer in any sense (I'm talking in context to Chandler's talk here)". the std::split() proposal vends string_views - a reference to a non-terminated, externally owned character sequence. although you would avoid copying the substrings, it would not help much in the way of optimizations wrt aliasing.Swiftlet
@Swiftlet yes... something like this is what I was foreseeing, thanks for the comment.Blitzkrieg
K
1

The problem with output parameters as described in the linked question is that they generally make the common calling case (i.e., you don't have a vector storage to use already) much more verbose than normal. For example, if you used return by value:

auto a = split(s, r);

if you used output parameters:

std::vector<std::string> a;
split(s,r,a);

The second one looks much less pretty to my eyes. Also, as Chandler mentioned, the optimizer can do much more with the first one than the second, depending on the rest of your code.

Is there a way we can get the best of both worlds? Emphatically yes, using move semantics:

std::vector<std::string> split(const std::string &s, const std::regex &r, std::vector<std::string> v = {})
{
    auto rit = std::sregex_token_iterator(s.begin(), s.end(), r, -1);
    auto rend = std::sregex_token_iterator();
    v.clear();
    while(rit != rend)
    {
        v.push_back(*rit);
        ++rit;
    }

    return v;
}

Now, in the common case, we can call split as normal (i.e., the first example) and it will allocate a new vector storage for us. In the important but rare case when we have to split repeatedly and we want to re-use the same storage, we can just move in a storage that persists between calls:

int main()
{
    const std::regex r(" +");
    std::vector<std::string> a;
    for(auto i=0; i < 1000000; ++i)
      a = split("a b c", r, std::move(a));
    return 0;
}

This runs just as fast as the output-argument method and what's going on is quite clear. You don't have to make your function hard to use all the time just to get good performance some of the time.

Kanazawa answered 11/2, 2014 at 21:35 Comment(3)
Sorry, but for me at last, this option is worse at call site, for the case that matters here, than both solutions I refer to, using ranges or using simple output parameters... It looks like easier to misuse too, by users, which I don't like.Blitzkrieg
If you're only using a function in this way, then yes, I agree that this trick is silly. However, it is at least semantically valid, and it's better for the optimizer. One of my pet peeves is output references for functions that almost never re-use storage. This is quite common among C++ programmers and makes code much more verbose everywhere just so the rare case might be faster.Kanazawa
even though I agree code may get less pretty and more verbose, you may check the upside about this point I refer to in my answer as a notice... I think that's worth pondering at last.Blitzkrieg

© 2022 - 2024 — McMap. All rights reserved.