Why can lambdas be better optimized by the compiler than plain functions?
Asked Answered
S

3

195

In his book The C++ Standard Library (Second Edition) Nicolai Josuttis states that lambdas can be better optimized by the compiler than plain functions.

In addition, C++ compilers optimize lambdas better than they do ordinary functions. (Page 213)

Why is that?

I thought when it comes to inlining there shouldn't be any difference any more. The only reason I could think of is that compilers might have a better local context with lambdas and such can make more assumptions and perform more optimizations.

Stogy answered 5/12, 2012 at 11:38 Comment(5)
Related.Yoshikoyoshio
Basically, the statement applies to all function objects, not just lambdas.Hampson
That would be incorrect because function pointers are function objects too.Dispensable
@litb: I think I disagree with that.^W^W^W^W^W^W (after looking into standard) I wasn't aware of that C++ism, though I think in common parlance (and according to wikipedia), people mean instance-of-some-callable-class when they say function object.Tresatrescha
Some compilers can better optimize lambdas than plain functions, but not all :-(Shears
C
199

The reason is that lambdas are function objects so passing them to a function template will instantiate a new function specifically for that object. The compiler can thus trivially inline the lambda call.

For functions, on the other hand, the old caveat applies: a function pointer gets passed to the function template, and compilers traditionally have a lot of problems inlining calls via function pointers. They can theoretically be inlined, but only if the surrounding function is inlined as well.

As an example, consider the following function template:

template <typename Iter, typename F>
void map(Iter begin, Iter end, F f) {
    for (; begin != end; ++begin)
        *begin = f(*begin);
}

Calling it with a lambda like this:

int a[] = { 1, 2, 3, 4 };
map(begin(a), end(a), [](int n) { return n * 2; });

Results in this instantiation (created by the compiler):

template <>
void map<int*, _some_lambda_type>(int* begin, int* end, _some_lambda_type f) {
    for (; begin != end; ++begin)
        *begin = f.operator()(*begin);
}

… the compiler knows _some_lambda_type::operator () and can inline calls to it trivially. (And invoking the function map with any other lambda would create a new instantiation of map since each lambda has a distinct type.)

But when called with a function pointer, the instantiation looks as follows:

template <>
void map<int*, int (*)(int)>(int* begin, int* end, int (*f)(int)) {
    for (; begin != end; ++begin)
        *begin = f(*begin);
}

… and here f points to a different address for each call to map and thus the compiler cannot inline calls to f unless the surrounding call to map has also been inlined so that the compiler can resolve f to one specific function.

Clementinaclementine answered 5/12, 2012 at 11:43 Comment(8)
Perhaps it's worth mentioning that instantiating the same function template with a different lambda expression will create a whole new function with a unique type, which might well be a drawback.Parental
FWIW: if the function passed to map (via function pointer) is small and defined in the same source file, it's quite possible that it will get inlined when the loop is expanded.Dainedainty
@Dainedainty Absolutely. The problem is when handling functions which cannot be inlined (because they are too large). Here the call to the callback can still be inlined in the case of a lambda, but not in the case of a function pointer. std::sort is the classical example of this using lambdas instead of a function pointer here brings up to seven-fold (probably more, but I have no data on that!) performance increases.Clementinaclementine
@KonradRudolph I get that point, but if the function is so large or tricky that the compiler won't inline it, then you probably wouldn't write that in a lambda anyway, so the comparison doesn't really hold. Unless I'm missing something - does putting a large operation into a lambda force inlining? Because it seems to me the compiler still has the option of generating function code for a lambda rather than inlining, if it's a big operation, and it's converted to a function pointer before use. Certainly it seems that lambdas are a good way to ensure simple operations are inlined.Dainedainty
@Dainedainty You’re confusing two functions here: the one we are passing the lambda to (e.g. std::sort, or map in my example) and the lambda itself. The lambda is usually small. The other function – not necessarily. We’re concerned with inlining calls to the lambda inside the other function.Clementinaclementine
@KonradRudolph Clarification: I meant this: if you write int dblit(int x){return x*2;} at file scope, and pass '&dblit' as the last parm to map() in your example, it can and likely will get inlined in the expansion of map, in the same way as your lambda. Depending on optimization options.Dainedainty
@Dainedainty I know. This is literally what the last sentence of my answer says, though.Clementinaclementine
What I find curious (having just stumbled upon it) is that given a simple boolean function pred whose definition is visible, and using gcc v5.3, std::find_if(b, e, pred) does not inline pred, but std::find_if(b, e, [](int x){return pred(x);}) does. Clang manages to inline both, but doesn't produce code as fast as g++ with the lambda.Wye
G
29

Because when you pass a "function" to an algorithm you are in fact passing in a pointer to function so it has to do an indirect call via the pointer to the function. When you use a lambda you are passing in an object to a template instance specially instantiated for that type and the call to the lambda function is a direct call, not a call via a function pointer so can much more likely be inlined.

Gurtner answered 5/12, 2012 at 11:44 Comment(1)
"the call to the lambda function is a direct call" - indeed. And the same thing is true for all function objects, not just lambdas. It's just function pointers that can't be inlined as easily, if at all.Atronna
K
-2

Lambdas are not faster or slower than usual functions. Please correct me if wrong.

First, what is a the difference between lambda and usual function:

  1. Lambda can have capture.
  2. Lambda with high possibility will be simply removable during compilation from object file, due it has internal linkage.

Let's speak about capture. It doesn't give any performance to the function, due compiler has to pass additional object with data needed to handle a capture. Anyway, if you will just use lambda function in place, it will be easily optimized. Also if lambda won't use capture, you can cast your lambda to a function pointer. Why? Because it is just a usual function if it has no capture.

void (*a1)() = []() {
    // ...
};
void _tmp() {
    // ...
}
void (*a2)() = _tmp;

Both examples above are valid.

Speaking about the removal of function from object file. You can simply put your function to anonymous namespace, and it will make a deal. Function will be a little bit more happy to be inlined, due it is not used anywhere except your file.

auto a1 = []() {
    // ...
};

namespace {
    auto a2() {
        // ...
    }
}

Functions above will be the same by performance.

Also I noticed that function pointers and lambdas are compared. It is not a good thing to do because they are different. When you have a pointer to function, it can point to a variety of different functions, and it can be changed in runtime, because it is just a pointer to memory. Lambda can't do that. It always operates with only one function, due information which function to call is stored in type itself.

You can write code with function pointers like that:

void f1() {
    // ...
}
void f2() {
    // ...
}
int main() {
    void (*a)();
    a = f1;
    a = f2;
}

It is absolutely fine. And you can't write code with lambdas in this way:

int main() {
    auto f1 = []() {
        // ...
    };
    auto f2 = []() {
        // ...
    };
    f2 = f1; // error: no viable overloaded '='
}

If some libraries accept function pointers, it does not mean that lambdas can be better optimized by the compiler than plain functions, because the question is not about common libraries and function pointers.

Kithara answered 23/12, 2020 at 15:47 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.