Why is it not good to use recursive inheritance for std::tuple implementations?
Asked Answered
P

3

26

In this question, Howard Hinnant said

Some implementations of std::tuple use recursive inheritance. But the good ones don't. ;-)

Can someone please shed some light on that?

Playback answered 9/3, 2012 at 22:11 Comment(3)
maybe it has to do with inefficient forwarding of arguments?Ruthanneruthe
would probably be helpful if someone could outline the difference between those good ones and the some ones.Uncommunicative
@Cheers: I would think inlining would deal with that -- but then again, I have to admit I have run into compilers that flatly decided to stop inlining to a depth of more than 3 or 4.Marbut
M
34

A non-recursive implementation has better compile-time performance. Believe it or not, in a heavily used library facility like std::tuple, how it is implemented can impact (for better or worse), the compile times the client sees. Recursive implementations tend to produce compile times that are linear in the depth of recursion (or can be even worse).

This impacts more than just the instantiation of the tuple itself. std::get<I>(tuple) for example will take a linear amount of compile time for one implementation and a constant amount of compile time for another implementation. This impact can rapidly deteriorate (or not) when dealing with tuples of tuples. I.e. the recursive implementation could result in O(N^2) compile time while the non-recursive implementation is still O(1).

Fwiw, the libc++ implementation lays the objects out in the order specified by the client, but optimizes away space for empty components using the compiler's empty base class optimization facility.

Maighdiln answered 10/3, 2012 at 2:39 Comment(3)
This question (and your answer) prompted me to explore the non-recursive tuple implementation. I wrote a post about it: mitchnull.blogspot.com/2012/06/…. I'd be grateful if you could spare a bit of time to read it and point out any mistakes. ThanksWorship
I used Howards implementation to improve a smaller program of mine. I also included index tricks in the tests to improve tuple_element regarding compile time and executable size. To my surprise things got not better but worse. I am using gcc 4.8.2. Does the newest gcc something that optimized tail-head template recursion? Or do the mentioned tricks only work for corner cases? Is there any example or success story that proves the effectiveness of the optimizations?Apothem
I made some measurements which suggest that libc++ compile time scales much better than the libstdc++ and msvc implementations. However gcc appears to be generally faster to compile than clang, hence the advantages of scaling only become apparent for large tuples (100s of elements).Catercorner
A
3

I don't recall Andrei Alexandrescu's GoingNative 2012 talk exactly, but he talked about this point, and one of the points he mentioned was memory layout. If I have a std::tuple<int, short, char, char>, it will be in memory as char, short, int and this layout will take (on my system) 4 bytes more than if they were laid out as int, short, char. R. Martinho Fernandes has reminded me that the best thing to do would be to order these in memory in an order that minimizes padding, which is neither the order given nor reverse order. (Naïve inheritance does reverse order).

If I write std::tuple<int, char, short, char>, a tuple that works by naïve inheritance would put these in the order char, short, int in memory, using 3 bytes of padding, when optimal has zero bytes of padding. (Either int, short, char, char or char, char, short, int).

Assuming I'm right that it's about padding, then R. Martinho Fernandes said "[my argument] doesn't preclude the use of recursive inheritance for the actual implementation in the optimal order.", so that is why I specify that naïve inheritance is bad.

(The order in memory does not mean that get<0> will give a different object, and R. Martinho Fernandes correctly notes that the order should be invisible to the user. However, these were the points as I have been reminded from the GoingNative event.)

The video is at http://channel9.msdn.com/Events/GoingNative/GoingNative-2012/Variadic-Templates-are-Funadic and the slides are at http://ecn.channel9.msdn.com/events/GoingNative12/GN12VariadicTemplatesAreFunadic.pdf.

Armillas answered 9/3, 2012 at 22:13 Comment(12)
Where does he talk about anything else in his comment?Playback
For some reason I thought Johannes was talking about GoingNative, dunno why I thought that.Armillas
It wouldn't be hard to get things in the desired order. The real problem, I'm pretty sure, is that deeply nested templates puts a heavy burden on the compiler, and makes error messages much more difficult.Marbut
If a tuple is "deeply nested" enough to burden the compiler, you may want to look at your code again. I don't think there's a way to do it without difficult error messages.Armillas
ah so it is not recursive inheritance that is bad, but naive recursive inheritance? I wonder whether @Howard was onto something else and has another point in favor of another technique.Playback
Wow. My edit says I deleted 90 characters! I don't think so... I just fixed the spelling of naïve. Silly Stack Overflow. Counting is hard!Claque
@JohannesSchaub-litb: Assuming I'm right that it's about padding, then R.Martinho Fernandes said "[my argument] doesn't preclude the use of recursive inheritance for the actual implementation in the optimal order."Armillas
@JamesMcNellis: It probably says that because I edited at the same time and your change overwrote mine, and removed links (which are longish) (I almost said "sliced" instead, wierd)Armillas
@MooingDuck: if you have such deeply nested tuples then I'd say the last thing you want to do is look at your code again... ;)Hodess
Assuming we have a tuple_impl that has a similar interface to std::tuple and assuming a metafunction that given a variadic pack T... computes a layout-optimized tuple_impl<U...> then I think that's a good headstart in having an std::tuple implementation that performs better than a naive layout regardless how tuple_impl is actually implemented (i.e. nested inheritance or not). I'd say the argument is a red herring.Rrhagia
@MooingDuck Hey, I wrote a very crude implementation of a tuple that always gives you the optimal layout: ideone.com/Z5hfo. It computes the optimal order (#ifdefed for libc++ and libstdc++'s implementations), and wraps a tuple from the standard library taking care of hiding the real order from the user. I still can't get all the get<> calls to compile on GCC, but all the static_asserts are passing, and it works fine on clang with libc++. I have no idea how to do compile-time performance analysis, so it's possible this is inefficient. Ping me in chat if you have any questions :)Tiepolo
@R.MartinhoFernandes Very impressive! I'll have to check that out Monday!Armillas
J
3

One reason not to use a chain of base classes is that there is no chain of constructors involved: the arguments are directly forwarded to the appropriate subobject. Also, it seems that a non-recursive implementation puts a lot less strain on the compiler and creates a lot less [internal] symbols. Not to mention that it is actually easier not to a chain of base classes.

Joellenjoelly answered 9/3, 2012 at 23:7 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.