How can the use of C++11's 'auto' improve performance?

Asked 10/9, 2015 at 19:30 Answered 24/9, 2015 at 10:3

245

I can see why the auto type in C++11 improves correctness and maintainability. I've read that it can also improve performance (Almost Always Auto by Herb Sutter), but this part lacks a good explanation.

How can auto improve performance?
Can anyone give an example?

Evenings answered 10/9, 2015 at 19:30 Comment(4)

See herbsutter.com/2013/06/13/… which talks about avoiding accidental implicit conversions, e.g. from gadget to widget. It's not a common problem. – Rhenium 10/9, 2015 at 19:37

Do you accept “makes it less likely to unintentionally pessimize” as a performance improvement? – Vendue 10/9, 2015 at 19:37

Performance of code cleaning up in the future only, maybe – Jellyfish 10/9, 2015 at 21:52

We need a short answer: No if you're good. It can prevent 'noobish' mistakes. C++ has a learning curve which kills those who do not make it after all. – P 11/9, 2015 at 16:44

329

auto can aid performance by avoiding silent implicit conversions. An example I find compelling is the following.

std::map<Key, Val> m;
// ...

for (std::pair<Key, Val> const& item : m) {
    // do stuff
}

See the bug? Here we are, thinking we're elegantly taking every item in the map by const reference and using the new range-for expression to make our intent clear, but actually we're copying every element. This is because std::map<Key, Val>::value_type is std::pair<const Key, Val>, not std::pair<Key, Val>. Thus, when we (implicitly) have:

std::pair<Key, Val> const& item = *iter;

Instead of taking a reference to an existing object and leaving it at that, we have to do a type conversion. You are allowed to take a const reference to an object (or temporary) of a different type as long as there is an implicit conversion available, e.g.:

int const& i = 2.0; // perfectly OK

The type conversion is an allowed implicit conversion for the same reason you can convert a const Key to a Key, but we have to construct a temporary of the new type in order to allow for that. Thus, effectively our loop does:

std::pair<Key, Val> __tmp = *iter;       // construct a temporary of the correct type
std::pair<Key, Val> const& item = __tmp; // then, take a reference to it

(Of course, there isn't actually a __tmp object; it's just there for illustration. In reality, the unnamed temporary is bound to item and lives as long as item does.)

Just changing to:

for (auto const& item : m) {
    // do stuff
}

just saved us a ton of copies - now the referenced type matches the initializer type, so no temporary or conversion is necessary, we can just do a direct reference.
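To make the difference observable, here is a minimal sketch that counts copies. Val, make_map and the two copies_with_* helpers are hypothetical names made up for illustration; Key is assumed to be int.

```cpp
#include <cassert>
#include <map>
#include <utility>

using Key = int;  // assumption: any copyable key type would do

// Hypothetical value type that counts how often it is copy-constructed.
struct Val {
    static int copies;
    int payload;
    Val() : payload(0) {}
    Val(Val const& other) : payload(other.payload) { ++copies; }
};
int Val::copies = 0;

// Builds a map with n entries; the counter is reset later, before measuring.
std::map<Key, Val> make_map(int n) {
    assert(n >= 0);
    std::map<Key, Val> m;
    for (int i = 0; i < n; ++i) {
        m[i].payload = i;
    }
    return m;
}

// Element type spelled out without the const on the key: every iteration
// converts std::pair<const Key, Val> into a temporary std::pair<Key, Val>.
int copies_with_spelled_out_pair(std::map<Key, Val> const& m) {
    Val::copies = 0;
    for (std::pair<Key, Val> const& item : m) {
        (void)item;
    }
    return Val::copies;
}

// auto deduces std::pair<const Key, Val>, so the reference binds directly.
int copies_with_auto(std::map<Key, Val> const& m) {
    Val::copies = 0;
    for (auto const& item : m) {
        (void)item;
    }
    return Val::copies;
}
```

Calling copies_with_spelled_out_pair on a map of five elements should report at least one Val copy per element, while copies_with_auto should report none.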

Quezada answered 10/9, 2015 at 19:42 Comment(4)

@Quezada Can you explain why the compiler would happily make copies instead of complain about trying to treat an std::pair<const Key, Val> const & as an std::pair<Key, Val> const &? New to C++11, not sure how range-for and auto plays into this. – Photocurrent 10/9, 2015 at 20:15

@Quezada Thanks for the explanation. That's the piece I was missing - for some reason, I thought that you couldn't have a constant reference to a temporary. But of course you can - it'll just cease to exist at the end of its scope. – Photocurrent 10/9, 2015 at 20:30

@barry I get you, but the problem is that then there isn't an answer that covers all of the reasons to use auto that increase performance. So I'll write it up in my own words below. – Charlottecharlottenburg 10/9, 2015 at 20:31

I still don't think this is proof that "auto improves performance". It's just an example that "auto helps prevent programmer mistakes that destroy performance". I submit that there is a subtle yet important distinction between the two. Still, +1. – Dilapidated 11/9, 2015 at 15:17

Because auto deduces the type of the initializing expression, there is no type conversion involved. Combined with templated algorithms, this means that you can get a more direct computation than if you were to make up a type yourself – especially when you are dealing with expressions whose type you cannot name!

A typical example comes from (ab)using std::function:

std::function<bool(T, T)> cmp1 = std::bind(f, _2, 10, _1);  // bad
auto cmp2 = std::bind(f, _2, 10, _1);                       // good
auto cmp3 = [](T a, T b){ return f(b, 10, a); };            // also good

std::stable_sort(begin(x), end(x), cmp?);  // cmp? stands for cmp1, cmp2 or cmp3

With cmp2 and cmp3, the entire algorithm can inline the comparison call, whereas if you construct a std::function object, not only can the call not be inlined, but you also have to go through the polymorphic lookup in the type-erased interior of the function wrapper.
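A compilable sketch of the three variants follows. It assumes T is int, uses std::stable_sort as an algorithm that accepts a binary comparator, and invents a body for f (ordering by "bucket" of ten) purely for illustration. It only shows that all three forms compute the same result; whether the call is actually inlined depends on the compiler and is not demonstrated here.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <type_traits>
#include <vector>

using namespace std::placeholders;

// Hypothetical stand-in for f: true if x falls into a higher bucket than y.
bool f(int x, int bucket, int y) {
    assert(bucket > 0);
    return x / bucket > y / bucket;
}

// Type-erased comparator: every comparison goes through std::function.
std::vector<int> sort_with_function(std::vector<int> x) {
    std::function<bool(int, int)> cmp1 = std::bind(f, _2, 10, _1);
    std::stable_sort(x.begin(), x.end(), cmp1);
    return x;
}

// auto keeps the exact (unnamable) type returned by std::bind.
std::vector<int> sort_with_bind(std::vector<int> x) {
    auto cmp2 = std::bind(f, _2, 10, _1);
    static_assert(!std::is_same<decltype(cmp2), std::function<bool(int, int)>>::value,
                  "cmp2 is not type-erased");
    std::stable_sort(x.begin(), x.end(), cmp2);
    return x;
}

// auto keeps the closure type of the lambda.
std::vector<int> sort_with_lambda(std::vector<int> x) {
    auto cmp3 = [](int a, int b) { return f(b, 10, a); };
    std::stable_sort(x.begin(), x.end(), cmp3);
    return x;
}
```

With the input {35, 18, 31, 12, 20}, each function orders the values by their tens digit and keeps ties in their original order.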

Another variant on this theme is that you can say:

auto && f = MakeAThing();

This is always a reference, bound to the value of the function call expression, and never constructs any additional objects. If you didn't know the returned value's type, you might be forced to construct a new object (perhaps as a temporary) via something like T && f = MakeAThing(). (Moreover, auto && even works when the return type is not movable and the return value is a prvalue.)
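As a small sketch of the last point, assume a hypothetical type Pinned that can be neither copied nor moved. The body of MakeAThing below is made up for illustration; it uses return {} so that the return object is initialized in place even under C++11 rules.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    static int constructed;
    int value;
    Pinned() : value(42) { ++constructed; }
    Pinned(Pinned const&) = delete;
    Pinned& operator=(Pinned const&) = delete;
};
int Pinned::constructed = 0;

// Illustrative factory: list-initializes the return object directly,
// so no copy or move constructor is required.
Pinned MakeAThing() { return {}; }

int use_thing() {
    auto&& f = MakeAThing();  // deduced as Pinned&&, bound to the temporary
    static_assert(std::is_same<decltype(f), Pinned&&>::value,
                  "auto&& deduces an rvalue reference for a prvalue");
    assert(f.value == 42);
    f.value += 1;             // the temporary lives as long as f does
    return f.value;
}
```

Only one Pinned object is ever constructed here; the reference extends the lifetime of the returned temporary instead of creating a second object.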

Burgas answered 10/9, 2015 at 19:38 Comment(3)

So this is "avoid type erasure" reason to use auto. Your other variant is "avoid accidental copies", but needs embellishment; why does auto give you speed over simply typing the type there? (I think the answer is "you get the type wrong, and it silently converts") Which makes it a less-well explained example of Barry's answer, no? Ie, there are two basic cases: auto to avoid type erasure, and auto to avoid silent type errors that accidentally convert, both of which have a run time cost. – Charlottecharlottenburg 10/9, 2015 at 20:16

"not only can the call not be inlined" -- why's that then? Do you mean that in principle something prevents the call being devirtualized after data flow analysis if the relevant specialisations of std::bind, std::function and std::stable_partition have all been inlined? Or just that in practice no C++ compiler will inline aggressively enough to sort out the mess? – Judicatory 10/9, 2015 at 22:22

@SteveJessop: Mostly the latter - after you go through the std::function constructor, it'll be very complex to see through the actual call, especially with small-function optimisations (so you don't actually want devirtualization). Of course in principle everything is as-if... – Burgas 10/9, 2015 at 22:44

There are two categories.

auto can avoid type erasure. There are unnamable types (like lambdas), and almost unnamable types (like the result of std::bind or other expression-template like things).

Without auto, you end up having to type erase the data down to something like std::function. Type erasure has costs.

std::function<void()> task1 = []{std::cout << "hello";};
auto task2 = []{std::cout << " world\n";};

task1 has type erasure overhead -- a possible heap allocation, difficulty inlining it, and virtual function table invocation overhead. task2 has none. Lambdas need auto or other forms of type deduction to be stored without type erasure; other types are technically nameable, but so complex that in practice they need it too.
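A sketch of where the state ends up in each case, using a hypothetical lambda with a large by-value capture. The function names are made up, and the remark about heap allocation describes typical standard library implementations rather than a guarantee.

```cpp
#include <array>
#include <cassert>
#include <functional>

// Type-erased storage: the wrapper has a fixed size, so a capture this large
// typically does not fit its small buffer and is usually heap-allocated.
int sum_via_function(std::array<int, 64> const& data) {
    std::function<int()> task1 = [data]() -> int {
        int s = 0;
        for (int v : data) s += v;
        return s;
    };
    assert(static_cast<bool>(task1));  // the wrapper holds a callable
    return task1();                    // indirect call through the wrapper
}

// Deduced storage: the closure object itself contains the captured array.
int sum_via_auto(std::array<int, 64> const& data) {
    auto task2 = [data]() -> int {
        int s = 0;
        for (int v : data) s += v;
        return s;
    };
    static_assert(sizeof(task2) >= sizeof(data),
                  "the closure stores its capture inline");
    return task2();                    // direct call to the closure's operator()
}
```

Both functions return the same sum; the difference lies only in how the callable is stored and invoked.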

Second, you can get types wrong. In some cases, the wrong type will work seemingly perfectly, but will cause a copy.

Foo const& f = expression();

will compile if expression() returns Bar const& or Bar or even Bar&, where Foo can be constructed from Bar. A temporary Foo will be created, then bound to f, and its lifetime will be extended until f goes away.

The programmer may have meant Bar const& f and not intended to make a copy there, but a copy is made regardless.
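The silent conversion can be made visible with a counter. In this sketch Foo and Bar are hypothetical types, and expression() is assumed to return a Bar const& to a long-lived object.

```cpp
#include <cassert>

struct Bar {
    int value;
};

// Hypothetical Foo with an implicit converting constructor that counts its uses.
struct Foo {
    static int conversions;
    int value;
    Foo(Bar const& b) : value(b.value) { ++conversions; }  // implicit on purpose
};
int Foo::conversions = 0;

// Illustrative expression(): returns a reference to a static Bar.
Bar const& expression() {
    static Bar const bar{7};
    return bar;
}

// Wrong type spelled out: compiles, but builds a temporary Foo from the Bar.
int bind_as_foo() {
    Foo::conversions = 0;
    Foo const& f = expression();
    (void)f;
    return Foo::conversions;
}

// auto deduces Bar, so the reference refers to the original object.
int bind_with_auto() {
    Foo::conversions = 0;
    auto const& f = expression();
    assert(&f == &expression());  // same object, no temporary
    (void)f;
    return Foo::conversions;
}
```

bind_as_foo reports one conversion and bind_with_auto reports none, even though both lines look like plain reference bindings.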

The most common example is the type of *std::map<A,B>::const_iterator, which is std::pair<A const, B> const& and not std::pair<A,B> const&, but this is just one instance of a whole category of errors that silently cost performance. You can construct a std::pair<A, B> from a std::pair<const A, B>. (The key in a map is const because editing it is a bad idea.)

Both @Barry and @KerrekSB first illustrated these two principles in their answers. This is simply an attempt to highlight the two issues in one answer, with wording that aims at the problem rather than being example-centric.

Charlottecharlottenburg answered 10/9, 2015 at 20:25 Comment(0)

The existing three answers give examples where using auto makes it “less likely to unintentionally pessimize”, effectively making it "improve performance".

There is a flip side to the coin. Using auto with objects whose operators don't return the basic object can result in incorrect (but still compilable and runnable) code. For example, this question asks how using auto gave different (incorrect) results using the Eigen library, i.e. the following lines

const auto    resAuto    = Ha + Vector3(0.,0.,j * 2.567);
const Vector3 resVector3 = Ha + Vector3(0.,0.,j * 2.567);

std::cout << "resAuto = " << resAuto <<std::endl;
std::cout << "resVector3 = " << resVector3 <<std::endl;

resulted in different output. Admittedly, this is mostly due to Eigen's lazy evaluation, but that code is/should be transparent to the (library) user.
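The mechanism can be sketched without Eigen. The Vec3 and LazySum types below are a hypothetical, heavily simplified expression template and not Eigen's actual implementation; they only show that auto captures the lazy expression while an explicit Vec3 forces evaluation.

```cpp
#include <cassert>

struct Vec3;

// Lazy sum: keeps references to its operands and adds only when accessed.
struct LazySum {
    Vec3 const& lhs;
    Vec3 const& rhs;
    double operator[](int i) const;
};

struct Vec3 {
    double v[3];
    Vec3(double x, double y, double z) : v{x, y, z} {}
    Vec3(LazySum const& s);  // converting from the expression evaluates it
    double operator[](int i) const {
        assert(i >= 0 && i < 3);
        return v[i];
    }
};

inline double LazySum::operator[](int i) const { return lhs[i] + rhs[i]; }
inline Vec3::Vec3(LazySum const& s) : v{s[0], s[1], s[2]} {}
inline LazySum operator+(Vec3 const& a, Vec3 const& b) { return LazySum{a, b}; }

// auto stores the unevaluated expression, so later changes to a show through.
double z_via_auto() {
    Vec3 a(0, 0, 1), b(0, 0, 2);
    auto lazy = a + b;   // LazySum, not Vec3
    a.v[2] = 10;
    return lazy[2];      // computed now: 10 + 2
}

// Naming the type forces evaluation at the point of initialization.
double z_via_vec3() {
    Vec3 a(0, 0, 1), b(0, 0, 2);
    Vec3 eager = a + b;  // evaluated here: 1 + 2
    a.v[2] = 10;
    return eager[2];
}
```

In the quoted Eigen snippet one operand is a temporary, so a stored reference may additionally dangle; the sketch sidesteps that by keeping both operands alive.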

While performance hasn't been greatly affected here, using auto to avoid unintentional pessimization might be classified as premature optimization, or at least wrong ;).

Northeastward answered 24/9, 2015 at 10:3 Comment(1)

Added the opposite question: #38416331 – Eurhythmics 16/7, 2016 at 21:34
