Quicksort and tail recursive optimization

Asked 30/9, 2013 at 12:27 Answered 8/6, 2022 at 18:3

recursion language-agnostic quicksort tail-recursion

In Introduction to Algorithms p169 it talks about using tail recursion for Quicksort.

The original Quicksort algorithm earlier in the chapter is (in pseudo-code)

Quicksort(A, p, r)
{
 if (p < r)
 {
  q: <- Partition(A, p, r)
  Quicksort(A, p, q)
  Quicksort(A, q+1, r)
 }
}

The optimized version using tail recursion is as follows

Quicksort(A, p, r)
{
 while (p < r)
 {
  q: <- Partition(A, p, r)
  Quicksort(A, p, q)
  p: <- q+1
 }
}

Where Partition sorts the array according to a pivot.

The difference is that the second algorithm only calls Quicksort once to sort the LHS.

Can someone explain to me why the 1st algorithm could cause a stack overflow, whereas the second wouldn't? Or am I misunderstanding the book.

Quixotic answered 30/9, 2013 at 12:27 Comment(5)

The complexity is obviously different. Does the book say anything about that? – Vedetta 30/9, 2013 at 12:29

does that mean just the 2nd algorithm is more efficient but wouldn't necessarily stop an overflow – Quixotic 30/9, 2013 at 12:30

I'm having a hard time seeing how the second version is tail-recursive. A recursive function is tail recursive if the last computation performed is the recursive call (or more generally doesn't need keep any local data around for after the call). This is what allows the new stack frame to be overlaid on top of the existing one. However in the second version above, there is a computation after the recursive call, and in particular q is still required. This means that at least part of the stack frame has to be retained. It might work if q was passed to and returned from the recursive call – Wilton 21/2, 2015 at 23:37

This optimized version does not have better memory worst time. You need to tail optimize on the longest side for that like at: https://mcmap.net/q/531550/-how-to-optimize-quicksort – Emphatic 28/2, 2015 at 10:30

The answers just explain tail recursion but say almost nothing about the code in the question. – Feeley 27/6, 2016 at 9:47

First let's start with a brief, probably not accurate but still valid, definition of what stack overflow is.

As you probably know right now there are two different kind of memory which are implemented in too different data structures: Heap and Stack.

In terms of size, the Heap is bigger than the stack, and to keep it simple let's say that every time a function call is made a new environment(local variables, parameters, etc.) is created on the stack. So given that and the fact that stack's size is limited, if you make too many function calls you will run out of space hence you will have a stack overflow.

The problem with recursion is that, since you are creating at least one environment on the stack per iteration, then you would be occupying a lot of space in the limited stack very quickly, so stack overflow are commonly associated with recursion calls.

So there is this thing called Tail recursion call optimization that will reuse the same environment every time a recursion call is made and so the space occupied in the stack is constant, preventing the stack overflow issue.

Now, there are some rules in order to perform a tail call optimization. First, each call most be complete and by that I mean that the function should be able to give a result at any moment if you interrupts the execution, in SICP this is called an iterative process even when the function is recursive.

If you analyze your first example, you will see that each iteration is defined by two recursive calls, which means that if you stop the execution at any time you won't be able to give a partial result because you the result depends of those calls to be finished, in this scenario you can't reuse the stack environment because the total information is split between all those recursive calls.

However, the second example doesn't have that problem, A is constant and the state of p and r can be locally determined, so since all the information to keep going is there then TCO can be applied.

Injustice answered 30/9, 2013 at 15:1 Comment(5)

I'm still somewhat confused, because the line p: <- q+1 depends on results from further recursions. from what i understood about TRO is it works when it gives to future recursions and doesn't take, meaning the solution is iterative and can exit at any point without storing return calls on the stack. – Quixotic 30/9, 2013 at 15:15

@Quixotic Not quite, that line depends on the result of Partition. The implementation using while doesn't make it obvious but the stack is being reused on each iteration. I am at work right now, but later I will try to post here a version using only if statements so the optimization will be easier to understand. – Injustice 30/9, 2013 at 16:57

that would be great if you could. are you saying though that it would never cause a stackoverflow or just less likely? – Quixotic 1/10, 2013 at 21:55

see above to MK answer, he seems to be saying that stack overflow is possible with a stack depth of log n. seems to contradict what you are saying. – Quixotic 6/11, 2013 at 12:43

log(number of elements in an array) is always smaller than 64 in a 64-bit architecture, so you would need a very small stack. Either way, tail recursion optimization or moving to a loop isn't enough. You need to also always give the smaller slice to the recursive call that does add a new stack frame. – Pestilent 16/7, 2020 at 18:15

The essence of the tail recursion optimization is that there is no recursion when the program is actually executed. When the compiler or interpreter is able to kick TRO in, it means that it will essentially figure out how to rewrite your recursively-defined algorithm into a simple iterative process with the stack not used to store nested function invocations.
The first code snippet can't be TR-optimized because there are 2 recursive calls in it.

Tb answered 30/9, 2013 at 12:40 Comment(5)

just to clarify, aside from the obvious fact that 2nd algorithm has less complexity and less likely to overflow, are you saying that the 2nd will never overflow?? thanks – Quixotic 30/9, 2013 at 15:5

read this en.wikipedia.org/wiki/Quicksort it explains that the tail recursive version of QuickSort limits the stack depths to logn. So you can still overflow, just less likely. – Tb 30/9, 2013 at 15:13

@Tb "The first code snippet can't be TR-optimized because there are 2 recursive calls in it." <- This sentence is at least misleading, because clearly it can be TRO'ed (the optimized version is in the original post), although of course one call can't be optimized away. – Layla 16/6, 2016 at 0:18

@ManuelJacob I mean can not be optimized by the compiler/interpreter. Humans are different. – Tb 16/6, 2016 at 0:33

@Tb That depends on the compiler. In a quick test both clang and gcc eliminate the second recursive call in a straightforward C translation of the code. – Layla 16/6, 2016 at 0:42

Tail recursion by itself is not enough. The algorithm with the while loop can still use O(N) stack space, reducing it to O(log(N)) is left as exercise in that section of CLRS.

Assume we are working in a language with array slices and tail call optimization. Consider the difference between these two algorithms:

Bad:

Quicksort(arraySlice) {
 if (arraySlice.length > 1) {
  slices = Partition(arraySlice)
  (smallerSlice, largerSlice) = sortBySize(slices)
  Quicksort(largerSlice) // Not a tail call, requires a stack frame until it returns. 
  Quicksort(smallerSlice) // Tail call, can replace the old stack frame.
 }
}

Good:

Quicksort(arraySlice) {
 if (arraySlice.length > 1){
  slices = Partition(arraySlice)
  (smallerSlice, largerSlice) = sortBySize(slices)
  Quicksort(smallerSlice) // Not a tail call, requires a stack frame until it returns. 
  Quicksort(largerSlice) // Tail call, can replace the old stack frame.
 }
}

The second one is guarenteed to never need more than log2(length) stack frames because smallerSlice is less than half as long as arraySlice. But for the first one, the inequality is reversed and it will always need more than or equal to log2(length) stack frames, and can require O(N) stack frames in the worst case where smallerslice always has length 1.

If you don't keep track of which slice is smaller or larger, you will have similar worst cases to the first overflowing case, even though it will require O(log(n)) stack frames on average. If you always sort the smaller slice first, you will never need more than log_2(length) stack frames.

If you are using a language that doesn't have tail call optimization, you can write the second (not stack-blowing) version as:

Quicksort(arraySlice) {
 while (arraySlice.length > 1) {
  slices = Partition(arraySlice)
  (smallerSlice, arraySlice) = sortBySize(slices)
  Quicksort(smallerSlice) // Still not a tail call, requires a stack frame until it returns. 
 }
}

Another thing worth noting is that if you are implementing something like Introsort which changes to Heapsort if the recursion depth exceeds some number proportional to log(N), you will never hit the O(N) worst case stack memory usage of quicksort, so you technically don't need to do this. Doing this optimization (popping smaller slices first) still improves the constant factor of the O(log(N)) though, so it is strongly recommended.

Pestilent answered 21/3, 2017 at 13:30 Comment(0)

Well, the most obvious observation would be:

Most common stack overflow problem - definition

The most common cause of stack overflow is excessively deep or infinite recursion.

The second uses less deep recursion than the first (n branches per call instead of n^2) hence it is less likely to cause a stack overflow..

(so lower complexity means less chance to cause a stack overflow)

But somebody would have to add why the second can never cause a stack overflow while the first can.

source

Vedetta answered 30/9, 2013 at 12:34 Comment(0)

Well If you consider the complexity of the two methods the first method obviously has more complexity than the second since it calls Recursion on both LHS and RHS as a result there are more chances of getting stack overflow

Note: That doesnt mean that there are absolutely no chances of getting SO in second method

Montemayor answered 30/9, 2013 at 12:36 Comment(0)

In the function 2 that you shared, Tail Call elimination is implemented. Before proceeding further let us understand what is tail recursion function?. If the last statement in the code is the recursive call and does do anything after that, then it is called tail recursive function. So the first function is a tail recursion function. For such a function with some changes in the code one can remove the last recursion call like you showed in function 2 which performs the same work as function 1. This process is called tail recursion optimization or tail call elimination and following are the result of it

Optimizing in terms of auxiliary space
Optimizing in terms of recursion call overhead

Last recursive call is eliminated by using the while loop. The good thing is that for function 2, no auxiliary space is used for the right call as its recursion is eliminated using p: <- q+1 and the overall function does not have recursion call overhead. So whatever way partition happens maximum space needed is theta(log n)

Skeens answered 8/6, 2022 at 18:3 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags