Function argument pass-by-value faster than pass-by-reference?

Asked 3/6, 2022 at 9:2 Answered 3/6, 2022 at 9:43

I had an interview where I was given the following function declaration:

int f1(const std::vector<int> vec)

I suggested that instead of copying the vector, we should use a const reference (not to mention that a const copy does not make much sense), but the interviewer claimed that the compiler's copy elision would handle it. I couldn't come up with a strong argument on the spot, but today I did some research.

I implemented the following two simple example functions:

int f1(const std::vector<int> vec) {
    const auto num = vec.size();
    return num * num;
}


int f2(const std::vector<int>& vec) {
    const auto num = vec.size();
    return num * num;
}

From the godbolt assembly, it is clear that the f2 function has 2 additional instructions, so it should be slower. (I think 2 mov instructions are almost free on modern CPUs.)

I also used quick-bench to measure the two solutions, and it confirmed my suspicion that passing by const-ref is faster (even for 1 element).

I suspected that maybe copy-elision is not allowed because of benchmark::DoNotOptimize(result);, but after removing it, I received a similar result.

Now I have these results, but I think it is still not convincing enough.

What do you think?

Do you have any good argument for using one over the other?

Iodoform answered 3/6, 2022 at 9:2 Comment(5)

unclear how copy-elision should come into play here. – Rijeka 3/6, 2022 at 9:3

It shouldn't matter once the functions are inlined. You should test them in real-world code that does something useful and compiler may change inlining behavior. – Undertaker 3/6, 2022 at 9:5

Without compiler optimisations viewing the assembly for performance analysis is fairly meaningless – Casefy 3/6, 2022 at 9:11

Surprisingly to me, in this simple case gcc does optimize out the vector copy, but only because it is smart enough to understand that we only care about the size of the vector. In the general case, where you actually need to do some complex operation on the vector passed as an argument, I would rather avoid passing it by value. – Litta 3/6, 2022 at 9:22

You should have told your interviewer that very smart people wouldn't have implemented std::string_view and std::span if he/she were right. 😉 – Trahurn 8/6, 2022 at 17:53

Looking at the assembly of the functions f1 and f2 won't work. With optimizations enabled they are likely to look completely identical. (Except that in some ABIs it is the callee's responsibility to destroy function parameters, in which case f1 will look much longer. But that doesn't mean anything by itself. There will be destruction of any created object somewhere anyway, whether in the caller or the callee.)

The performance difference here doesn't lie inside the function, but in the potential caller.

Function parameters are constructed in the context of the caller, not in the context of the callee. So, here if f1 is called like

std::vector<int> vec{/*...*/};
auto res = f1(vec);

then vec will be copied into the function parameter in the caller. The copy cannot be elided. (Of course, the compiler is free to optimize the copy away if it can see that doing so won't change the observable behavior of the program, but that generally can't happen if the function isn't inlined, for example because its definition is in another translation unit and there is no link-time optimization.)

With

std::vector<int> vec{/*...*/};
auto res = f2(vec);

no copy is done, even conceptually. Copy elision is not relevant.
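To make the difference between the two calls observable, here is a minimal sketch that counts copy constructions. CountedVec, copy_count, by_value and by_ref are hypothetical names introduced only for this illustration; they mirror the shapes of f1 and f2 but are not from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical counter: incremented by every copy construction below.
static int copy_count = 0;

// Hypothetical wrapper around std::vector<int> with an observable copy.
struct CountedVec {
    std::vector<int> data;
    explicit CountedVec(std::vector<int> d) : data(std::move(d)) {}
    CountedVec(const CountedVec& other) : data(other.data) { ++copy_count; }
};

// Same shapes as f1 (by value) and f2 (by const reference).
std::size_t by_value(const CountedVec vec) { return vec.data.size(); }
std::size_t by_ref(const CountedVec& vec) { return vec.data.size(); }

// Copies made by a single call with an lvalue argument.
int copies_when_passed_by_value() {
    CountedVec vec(std::vector<int>{1, 2, 3});
    copy_count = 0;
    by_value(vec);  // the parameter is copy-constructed in the caller
    return copy_count;
}

int copies_when_passed_by_ref() {
    CountedVec vec(std::vector<int>{1, 2, 3});
    copy_count = 0;
    by_ref(vec);  // only a reference is bound, nothing is constructed
    return copy_count;
}

// Small self-check that can be called from main().
void check_copy_counts() {
    assert(copies_when_passed_by_value() == 1);
    assert(copies_when_passed_by_ref() == 0);
}
```

Because the copy constructor has a visible side effect, the compiler has to perform it for the by-value call; with a plain std::vector it may only drop the copy under the as-if rule.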

It is true that e.g.

std::vector<int> vec{/*...*/};
auto res = f1(std::move(vec));

will use the move constructor and will generally be cheap enough that it might be faster than using f2 under some circumstances where the function can't be inlined. (However that will depend on how the ABI specifies that a std::vector is passed to functions as well.)
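As a rough illustration of why the move is cheap, the sketch below checks that the by-value parameter takes over the caller's heap buffer instead of allocating a new one. The function names are hypothetical, and the check assumes the default allocator, where a moved-from vector hands its buffer to the new object.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Same parameter style as f1; reports whether the parameter owns the
// buffer located at `expected`.
bool param_owns_buffer(const std::vector<int> vec, const int* expected) {
    return vec.data() == expected;
}

// Moves a local vector into the parameter and checks that no new
// buffer was allocated for it.
bool move_hands_over_buffer() {
    std::vector<int> vec{1, 2, 3};
    const int* original = vec.data();
    return param_owns_buffer(std::move(vec), original);
}

// Small self-check that can be called from main().
void check_move() {
    assert(move_hands_over_buffer());
}
```

After the call the caller's vec is in a moved-from state, so this only pays off when the caller no longer needs its contents.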

It is also true that e.g.

auto res = f1(std::vector<int>{/*...*/});

will construct only one std::vector<int> if copy elision is applied (guaranteed since C++17 and optional before). However, the same is true if f2 were used.
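The prvalue case can be checked the same way by counting every construction. This is a sketch with a hypothetical Tracked type and counter; it assumes C++17, or an older standard with the compiler's default elision left enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical counter: incremented by every constructor of Tracked.
static int construction_count = 0;

struct Tracked {
    std::vector<int> data;
    explicit Tracked(std::size_t n) : data(n) { ++construction_count; }
    Tracked(const Tracked& o) : data(o.data) { ++construction_count; }
    Tracked(Tracked&& o) noexcept : data(std::move(o.data)) { ++construction_count; }
};

// Same shapes as f1 (by value) and f2 (by const reference).
std::size_t by_value(const Tracked t) { return t.data.size(); }
std::size_t by_ref(const Tracked& t) { return t.data.size(); }

// Objects constructed when the argument is a temporary (prvalue).
int constructions_for_by_value() {
    construction_count = 0;
    by_value(Tracked(3));  // the prvalue initializes the parameter directly
    return construction_count;
}

int constructions_for_by_ref() {
    construction_count = 0;
    by_ref(Tracked(3));  // one temporary is created and bound to the reference
    return construction_count;
}

// Small self-check that can be called from main().
void check_prvalue() {
    assert(constructions_for_by_value() == 1);
    assert(constructions_for_by_ref() == 1);
}
```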

In some situations f1 might be a better choice, because the compiler can be sure that it doesn't modify vec from the caller. (A const reference does not strictly mean that the referenced object can't be modified.) This may allow the compiler to make some optimizations in the caller without having to analyze the body of f1 that it might not be able to do with f2.
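The parenthetical about const references can be illustrated with a deliberately bad callee. The function names below are hypothetical; the const_cast is well-defined here only because the caller's object was not declared const.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Takes a const reference but still modifies the caller's object.
// Legal only because the referenced object itself is non-const.
void sneaky_by_ref(const std::vector<int>& vec) {
    const_cast<std::vector<int>&>(vec).push_back(42);
}

// Takes a copy: whatever it does cannot reach the caller's object.
void harmless_by_value(std::vector<int> vec) {
    vec.push_back(42);
}

std::size_t caller_size_after_by_ref() {
    std::vector<int> vec{1, 2, 3};
    sneaky_by_ref(vec);
    return vec.size();  // the caller's vector grew
}

std::size_t caller_size_after_by_value() {
    std::vector<int> vec{1, 2, 3};
    harmless_by_value(vec);
    return vec.size();  // the caller's vector is unchanged
}

// Small self-check that can be called from main().
void check_aliasing() {
    assert(caller_size_after_by_ref() == 4);
    assert(caller_size_after_by_value() == 3);
}
```

Without seeing the body of the by-reference function, the compiler has to assume the first behavior is possible, which is what can block optimizations in the caller.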

So all in all I would say that in some circumstances they are equally good, in some circumstances f2 is definitely the better choice, and in some rarer situations f1 might be the better choice. I would go with f2 by default.

However, looking at the bigger picture, I would try to avoid having a function which iterates over a vector take the vector itself as parameter. Functions like this can typically be written more generally to apply to arbitrary ranges and hence it would probably be easy to make them templates and use either the iterator interface or the C++20 range interface used by standard library algorithms. This way, if in the future someone decides to use some other container than std::vector, the function will still just work.
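A minimal sketch of that idea using the classic iterator interface follows. sum_of_squares is a hypothetical example function, not something from the question; a C++20 version could take a range instead of an iterator pair.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Works with any pair of input iterators over values convertible to
// long long, so the container type is not part of the signature.
template <typename InputIt>
long long sum_of_squares(InputIt first, InputIt last) {
    long long total = 0;
    for (; first != last; ++first) {
        const long long x = *first;
        total += x * x;
    }
    return total;
}

// Usage with two different containers; no copy of either is made.
long long demo_with_vector() {
    const std::vector<int> values{1, 2, 3};
    return sum_of_squares(values.begin(), values.end());
}

long long demo_with_list() {
    const std::list<int> values{1, 2, 3};
    return sum_of_squares(values.begin(), values.end());
}

// Small self-check that can be called from main().
void check_generic() {
    assert(demo_with_vector() == 14);
    assert(demo_with_list() == 14);
}
```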

Kristalkristan answered 3/6, 2022 at 9:43 Comment(1)

Just as examples: in godbolt.org/z/n9d88nGGn you see that f1() will call new twice, while in godbolt.org/z/KKTxev63r f2() will only call new once. And finally, in godbolt.org/z/xP3jfY1nK you see f1() called with std::move and the compiler optimizing away one of the vectors. But that's because the new and delete in the constructor/destructor cancel each other out and the constructor/destructor calls are trivial. Note: it's the original vector that gets optimized away; the copy for f1() remains, I believe. – Selfsuggestion 3/6, 2022 at 15:43

Just assuming copy elision is not a good idea: copying a vector is very expensive, involving memory management and potentially copying a lot of data. Risking this copy for some marginal gain is not a good bet unless you know the behavior of your current and future compilers (copy elision may be done, but is not mandated). One way to check whether copy elision was actually done is to compare the pointer to the vector data on the caller side with the one inside the function. Even if speed is not the greatest concern, most programmers expect pass-by-const-ref over pass-by-value and thus won't have to question the reason for the decision.
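The pointer check mentioned above could look roughly like this. The function names are hypothetical, and the caller passes its own data() pointer in so that the comparison happens while both vectors are alive.

```cpp
#include <cassert>
#include <vector>

// By value: the parameter is a separate vector with its own buffer.
bool shares_buffer_by_value(const std::vector<int> vec, const int* caller_data) {
    return vec.data() == caller_data;
}

// By const reference: the parameter is the caller's vector itself.
bool shares_buffer_by_ref(const std::vector<int>& vec, const int* caller_data) {
    return vec.data() == caller_data;
}

// Small self-check that can be called from main().
void check_buffers() {
    std::vector<int> vec{1, 2, 3};
    assert(!shares_buffer_by_value(vec, vec.data()));  // a copy was made
    assert(shares_buffer_by_ref(vec, vec.data()));     // no copy
}
```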

Mind you, since C++17 copy elision is mandatory for initialization (from the link: when the initializer expression is a prvalue of the same class type (ignoring cv-qualification) as the variable type).
