Why is it not good to use recursive inheritance for std::tuple implementations?

Asked 9/3, 2012 at 22:11 Answered 10/3, 2012 at 2:39

In this question, Howard Hinnant said

Some implementations of std::tuple use recursive inheritance. But the good ones don't. ;-)

Can someone please shed some light on that?

Playback answered 9/3, 2012 at 22:11 Comment(3)

maybe it has to do with inefficient forwarding of arguments? – Ruthanneruthe 9/3, 2012 at 22:40

would probably be helpful if someone could outline the difference between those good ones and the some ones. – Uncommunicative 9/3, 2012 at 22:52

@Cheers: I would think inlining would deal with that -- but then again, I have to admit I have run into compilers that flatly decided to stop inlining to a depth of more than 3 or 4. – Marbut 9/3, 2012 at 23:15

A non-recursive implementation has better compile-time performance. Believe it or not, in a heavily used library facility like std::tuple, how it is implemented can impact (for better or worse), the compile times the client sees. Recursive implementations tend to produce compile times that are linear in the depth of recursion (or can be even worse).

This impacts more than just the instantiation of the tuple itself. std::get<I>(tuple) for example will take a linear amount of compile time for one implementation and a constant amount of compile time for another implementation. This impact can rapidly deteriorate (or not) when dealing with tuples of tuples. I.e. the recursive implementation could result in O(N^2) compile time while the non-recursive implementation is still O(1).

Fwiw, the libc++ implementation lays the objects out in the order specified by the client, but optimizes away space for empty components using the compiler's empty base class optimization facility.

Maighdiln answered 10/3, 2012 at 2:39 Comment(3)

This question (and your answer) prompted me to explore the non-recursive tuple implementation. I wrote a post about it: mitchnull.blogspot.com/2012/06/…. I'd be grateful if you could spare a bit of time to read it and point out any mistakes. Thanks – Worship 3/6, 2012 at 11:17

I used Howards implementation to improve a smaller program of mine. I also included index tricks in the tests to improve tuple_element regarding compile time and executable size. To my surprise things got not better but worse. I am using gcc 4.8.2. Does the newest gcc something that optimized tail-head template recursion? Or do the mentioned tricks only work for corner cases? Is there any example or success story that proves the effectiveness of the optimizations? – Apothem 3/12, 2013 at 19:9

I made some measurements which suggest that libc++ compile time scales much better than the libstdc++ and msvc implementations. However gcc appears to be generally faster to compile than clang, hence the advantages of scaling only become apparent for large tuples (100s of elements). – Catercorner 23/9, 2023 at 17:12

I don't recall Andrei Alexandrescu's GoingNative 2012 talk exactly, but he talked about this point, and one of the points he mentioned was memory layout. If I have a std::tuple<int, short, char, char>, it will be in memory as char, short, int and this layout will take (on my system) 4 bytes more than if they were laid out as int, short, char. R. Martinho Fernandes has reminded me that the best thing to do would be to order these in memory in an order that minimizes padding, which is neither the order given nor reverse order. (Naïve inheritance does reverse order).

If I write std::tuple<int, char, short, char>, a tuple that works by naïve inheritance would put these in the order char, short, int in memory, using 3 bytes of padding, when optimal has zero bytes of padding. (Either int, short, char, char or char, char, short, int).

Assuming I'm right that it's about padding, then R. Martinho Fernandes said "[my argument] doesn't preclude the use of recursive inheritance for the actual implementation in the optimal order.", so that is why I specify that naïve inheritance is bad.

(The order in memory does not mean that get<0> will give a different object, and R. Martinho Fernandes correctly notes that the order should be invisible to the user. However, these were the points as I have been reminded from the GoingNative event.)

The video is at http://channel9.msdn.com/Events/GoingNative/GoingNative-2012/Variadic-Templates-are-Funadic and the slides are at http://ecn.channel9.msdn.com/events/GoingNative12/GN12VariadicTemplatesAreFunadic.pdf.

Armillas answered 9/3, 2012 at 22:13 Comment(12)

Where does he talk about anything else in his comment? – Playback 9/3, 2012 at 22:13

For some reason I thought Johannes was talking about GoingNative, dunno why I thought that. – Armillas 9/3, 2012 at 22:18

It wouldn't be hard to get things in the desired order. The real problem, I'm pretty sure, is that deeply nested templates puts a heavy burden on the compiler, and makes error messages much more difficult. – Marbut 9/3, 2012 at 22:34

If a tuple is "deeply nested" enough to burden the compiler, you may want to look at your code again. I don't think there's a way to do it without difficult error messages. – Armillas 9/3, 2012 at 22:39

ah so it is not recursive inheritance that is bad, but naive recursive inheritance? I wonder whether @Howard was onto something else and has another point in favor of another technique. – Playback 9/3, 2012 at 22:40

Wow. My edit says I deleted 90 characters! I don't think so... I just fixed the spelling of naïve. Silly Stack Overflow. Counting is hard! – Claque 9/3, 2012 at 22:41

@JohannesSchaub-litb: Assuming I'm right that it's about padding, then R.Martinho Fernandes said "[my argument] doesn't preclude the use of recursive inheritance for the actual implementation in the optimal order." – Armillas 9/3, 2012 at 22:42

@JamesMcNellis: It probably says that because I edited at the same time and your change overwrote mine, and removed links (which are longish) (I almost said "sliced" instead, wierd) – Armillas 9/3, 2012 at 22:43

@MooingDuck: if you have such deeply nested tuples then I'd say the last thing you want to do is look at your code again... ;) – Hodess 9/3, 2012 at 23:2

Assuming we have a tuple_impl that has a similar interface to std::tuple and assuming a metafunction that given a variadic pack T... computes a layout-optimized tuple_impl<U...> then I think that's a good headstart in having an std::tuple implementation that performs better than a naive layout regardless how tuple_impl is actually implemented (i.e. nested inheritance or not). I'd say the argument is a red herring. – Rrhagia 10/3, 2012 at 2:53

@MooingDuck Hey, I wrote a very crude implementation of a tuple that always gives you the optimal layout: ideone.com/Z5hfo. It computes the optimal order (#ifdefed for libc++ and libstdc++'s implementations), and wraps a tuple from the standard library taking care of hiding the real order from the user. I still can't get all the get<> calls to compile on GCC, but all the static_asserts are passing, and it works fine on clang with libc++. I have no idea how to do compile-time performance analysis, so it's possible this is inefficient. Ping me in chat if you have any questions :) – Tiepolo 7/4, 2012 at 3:4

@R.MartinhoFernandes Very impressive! I'll have to check that out Monday! – Armillas 7/4, 2012 at 6:6

One reason not to use a chain of base classes is that there is no chain of constructors involved: the arguments are directly forwarded to the appropriate subobject. Also, it seems that a non-recursive implementation puts a lot less strain on the compiler and creates a lot less [internal] symbols. Not to mention that it is actually easier not to a chain of base classes.

Joellenjoelly answered 9/3, 2012 at 23:7 Comment(0)

Recommended topics

Hot tags