Can Tail Call Optimization and RAII Co-Exist?

Asked 22/7, 2013 at 16:37 Answered 22/7, 2013 at 17:19

Solved c++recursion compiler-construction tail-recursion raii

I can't think of a true RAII language that also has tail call optimization in the specs, but I know many C++ implementations can do it as an implementation-specific optimization.

This poses a question for those implementations that do: given that destructors are invoked at the end of a automatic variable's scope and not by a separate garbage collection routine, doesn't it violate TCO's constraint that a recursive call must be the last instruction at the end of a function?

For example:-

#include <iostream>

class test_object {
public:
    test_object() { std::cout << "Constructing...\n"; }
    ~test_object() { std::cout << "Destructing...\n"; }
};

void test_function(int count);

int main()
{
    test_function(999);
}

void test_function(int count)
{
    if (!count) return;
    test_object obj;
    test_function(count - 1);
}

"Constructing..." would be written 999 times and then "Destructing..." another 999 times. Ultimately, 999 test_object instances would be automatically allocated before the unwind. But assuming an implementation had TCO, would 1000 stack frames exist or just 1?

Does the destructor after the recursive call collide with the defacto TCO implementation requirements?

Basketball answered 22/7, 2013 at 16:37 Comment(7)

Interesting though actually. I was going to say "of course it doesn't prevent TCO", but the more I look at it, the more I think it does... – Eo 22/7, 2013 at 16:43

There is no tail call here (and check your code, it doesn't compile as is). Though in theory a good compiler could notice the pattern and turn this into 2 loops. – Seep 22/7, 2013 at 16:44

Short answer: yes, deterministic destructors prevent TCO. Long answer: actually some compilers (such as LLVM I believe) implement a more permissive form of TCO and might tolerate more cases... – Piet 22/7, 2013 at 16:45

Yeah, it's kinda hard to tell. I was wondering at what stage of execution the destruction actually happens. I'd assumed it would happen just before the instruction pointer being put back and the stack frame removed, but surely that would happen after the last function statement? I think I may have to get a disassembler out... – Basketball 22/7, 2013 at 16:46

Thanks for the code fix. Serves me right for some inattentive last-minute class renaming... – Basketball 22/7, 2013 at 16:51

@ljackman: actually, returning in C++ is supposed to be a two-steps process, and destruction intervenes between those steps. So the timeline is: 1. "Callee" evaluate the return expression and put the return result in a specific "return slot" (may involve a copy/move), 2. "Callee" execute destructors, 3. "Caller" copy/move result from "return slot" into the variable (or temporary) in its own frame. Any two of the copy/move may be optimized away, but semantically the moment the destructors are scheduled is well defined. – Piet 22/7, 2013 at 17:13

You learn something new everyday... I suppose it depends whether the implementation allows TCO before or after step 2, given that a void function doesn't need to return anything anyway. – Basketball 22/7, 2013 at 17:17

Taken at face-value, it would certainly seem like RAII works against TCO. However, remember that there are a number of ways in which the compiler can "get away with it", so to speak.

The first and most obvious case is if the destructor is trivial, meaning that it is the default destructor (compiler-generated) and all the sub-objects have trivial destructors too, then the destructor is effectively non-existent (always optimized away). In that case, TCO can be performed as usual.

Then, the destructor could be inlined (it's code is taken and put directly in the function as opposed to being called like a function). In that case, it boils down to just having some "clean-up" code after the return statement. The compiler is allowed to re-order operations if it can determine that the end-result is the same (the "as-if" rule), and it will do so (in general) if the re-ordering leads to better code, and I would assume TCO is one of the considerations being applied by most compilers (i.e., if it can re-order things such that the code becomes suitable for TCO, then it will do it).

And for the rest of the cases, where the compiler cannot be "smart enough" to do it on its own, then it becomes the responsibility of the programmer. The presence of this automatic destructor call does make it a bit harder for the programmer to see the TCO-inhibiting clean-up code after the tail call, but it doesn't make any difference in terms of the ability of the programmer to make the function a candidate for TCO. For example:

void nonRAII_recursion(int a) {
  int* arr = new int[a];
  // do some stuff with array "arr"
  delete[] arr;
  nonRAII_recursion(--a);  // tail-call
};

Now, a naive RAII_recursion implementation might be:

void RAII_recursion(int a) {
  std::vector<int> arr(a);
  // do some stuff with vector "arr"
  RAII_recursion(--a);  // tail-call
};  // arr gets destroyed here, not good for TCO.

But a wise programmer can still see that this won't work (unless the vector destructor is inlined, which is likely in this case), and can rectify the situation easily:

void RAII_recursion(int a) {
  {
    std::vector<int> arr(a);
    // do some stuff with vector "arr"
  }; // arr gets destroyed here
  RAII_recursion(--a);  // tail-call
};

And I'm pretty sure you could demonstrate that there are essentially no cases where this kind of trick could not be used to ensure that TCO can be applied. So, RAII merely makes it a bit harder to see if TCO can be applied. But I think programmers that are wise enough to design TCO-capable recursive calls are also wise enough to see those "hidden" destructor calls that would need to be forced to occur before the tail-call.

ADDED NOTE: Look at it this way, the destructor hides away some automatic clean-up code. If you need the clean-up code (i.e., non-trivial destructor), you will need it whether you use RAII or not (e.g., C-style array or whatever). And then, if you want TCO to be possible, it must be possible to do the cleaning up before doing the tail-call (with or without RAII), and it is possible, then it is possible be force the RAII objects to be destroyed before the tail-call (e.g., by putting them inside an extra scope).

Shortfall answered 22/7, 2013 at 17:19 Comment(7)

If the vector's destructor has an externally-visible side-effect, I think the C++ standard would require that they all be constructed, and then all be destructed in the reverse order, would it not? Knowing whether the destructor of arr could have any side-effects would require analysis of all code that ever receives a pointer or reference to arr. Maybe possible in some cases, but not all. – Resonator 24/7, 2013 at 16:40

@supercat: The standard specifies observable behavior (i.e., behaves "as if" destructed in reverse order), that's a bit of a Schrodinger's cat problem, as soon as you add code to make the order of execution observable, the standard guarantees the order, but when you remove that code, you can't be sure of the order of things you can't observe. – Shortfall 24/7, 2013 at 17:56

@supercat: As for destructing the vector having "externally-visible side-effects", that's of course dependent on the compiler and the situation. It certainly cannot always be determined. Also, how smart the compiler is w.r.t. mem-allocations is also critical (e.g., does it see it as just a function-call with possible side-effects, does it consider changes to the heap as a visible side-effect, etc.), and compilers do have some freedom in that. I agree that clean-up code is hard to re-order safely. – Shortfall 24/7, 2013 at 17:57

I would expect that most compilers would not be smart enough to do TCO to the second code snippet even with an inlined destructor, but haven’t tested it. The reason is that the compiler doesn’t know if // do some stuff with vector "arr" includes putting a pointer to arr (or a pointer to one of the elements of arr) in a place that the recursive call can find (static local variable, global variable, a, etc.). Even if the compiler knows this didn’t happen, I think the delete[] call in arr’s destructor is still not reorderable unless the compiler uses strict pointer safety. – Francois 22/2, 2015 at 2:30

If the TCO is all for implementing the PTC (proper tail call) guarantee, which essentially requires O(1) space complexity of the nested active calls, then the cleanup calls can be deferred to the enclosing non-tail context and merged into one instance when the effects are idempotent: there is no difference between one call and more than one calls of the cleanup. Note the space consumption itself is not a visible side effect contributing to observable behaviors (in almost every language I've seen). This is still difficult to prove, though. – Garber 29/3, 2021 at 20:21

Such idempotent cleanups in C++ including release of the space of the automatic storage (technically, not counted as "deallocation"), and calls of default (not user-defined) ::operator delete calls (subject to deallocation merging since C++14) determined statically (as if consteval-able) with no leak of information of the arguments. There need more assumptions provided by the language spec for other cases, otherwise it is at least impossible to assume it is safe to ignore the side effects when the definition of the deallocation function is not visible in the C++ translation units. – Garber 29/3, 2021 at 20:30

Also note not only the cleanup may violate the requirements of TCO. Conversions of the returned value can take effect even later than the cleanup calls. For example, when returning an lvalue from a function with an object type as its return type, there is at least a copy initialization (optionally with calls to some user-defined conversion functions). It is still sometimes possible to do TCO because the conversion can be idempotent, with or without the copy elision. – Garber 29/3, 2021 at 20:36

If the compiler perform the TCO then the order in which destructors are called is changed with respect to when it doesn't do TCO.

If the compiler can prove that this reordering doesn't matter (e.g. if the destructor is trivial) then as per the as-if rule it can perform TCO. However, in your example the compiler can't prove that and will not do TCO.

Quianaquibble answered 22/7, 2013 at 17:1 Comment(4)

In the case of RAII, it will never be able to prove that the reordering doesn't matter, because almost by definition, it does matter. – Mammal 22/7, 2013 at 17:58

@James Kanze. Indeed. If we see RAII in the sense that the destructor releases a resource (which is the most accepted sense of the term) then it's impossible to do TCO. If we see RAII in a broader sense, meaning that destructors are called when leaving the current scope, then, in some cases, it's still possible to do TCO. – Quianaquibble 22/7, 2013 at 20:30

Destructors which do nothing are not RAII. And in general, the presence of non-trivial constructors or destructors does inhibit TCO. – Mammal 23/7, 2013 at 7:54

Additional note: as of C++14, the standard explicitly allows merging of some new-expressions without the aid of as-if rules. So the case gets more interesting. – Garber 3/4, 2016 at 9:58

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags