Is Quicksort in-place or not? [duplicate]
So the space efficiency of Quicksort is O(log(n)). This is the space required to maintain the call stack.

Now, according to the Wikipedia page on Quicksort, this qualifies as an in-place algorithm, as the algorithm is just swapping elements within the input data structure.

According to this page, however, the space efficiency of O(log n) disqualifies Quicksort from being in-place, as its space efficiency is greater than O(1). According to this definition, any algorithm with space efficiency greater than O(1) is not in-place. So I assume this means that all recursive algorithms are by definition not in-place?

So obviously there are two different definitions of in-place here. Wikipedia is not always an entirely reliable source, so I consulted with one of my professors.

My professor agreed with the second definition. He said Quicksort is not in-place: even though the data remains in the input array, the extra space needed for the stack disqualifies it. My professor wrote a popular book on algorithms, so I respect his opinion greatly, but this answer just does not seem correct to me.

I see the property of in-place as being quite literal: the data stays in place. It does not move from its original data structure. To me, this definition is more useful, because it means you can perform your algorithm with pointers instead of having to copy data. This seems like a valuable property of an algorithm and worthy of the name "in-place".
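To make that literal reading concrete, here is a minimal sketch (my own, not taken from any of the sources discussed) of a Lomuto-style partition quicksort: every data movement is a swap within the input array, and the only extra storage per call is a handful of index variables.

```python
def partition(a, lo, hi):
    """Lomuto partition: rearrange a[lo..hi] around the pivot a[hi].

    All movement happens by swapping within the array itself; the only
    auxiliary storage is a few integer indices and one pivot copy.
    """
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i  # final position of the pivot


def quicksort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition(a, lo, hi)
        quicksort(a, lo, p - 1)  # this recursion is where the O(log n) stack comes from
        quicksort(a, p + 1, hi)


data = [5, 2, 9, 1, 7]
quicksort(data)
print(data)  # [1, 2, 5, 7, 9]
```

The elements never leave `data`; whether the index variables living on the call stack count against "in-place" is exactly the point of dispute.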

Horse answered 25/2, 2014 at 23:2 Comment(4)
This question reminds me of something I read recently about the ambiguous meanings of duck-typing and strong typing. We can know the benefits of Quicksort without labeling it as in-place or not in-place. I guess I'm arguing for the why-does-it-matter viewpoint. shrugs – Welltimed
The data does not stay in place. It leaks to the stack in the form of the indices with which the quicksort is invoked. – Bouffard
Related: one can write a highly modified version of the quick-sort that is in-place according to all definitions. However, there is some debate over whether or not it is still a quick-sort at this point. ideone.com/cv6VsT – Teeming
An alternative viewpoint is that log(N) is effectively O(1) for large values of N. I'm only partially tongue-in-cheek here: since 10^100, which is approximately 2^300, is far larger than the estimated number of subatomic particles in the universe, I'd feel pretty safe saying that for any realistic sorting problem log(N) is bounded by the constant 300. – Cyclades
Intro to Algorithms from MIT Press qualifies QuickSort as in-place: it sorts the elements within the array, with at most a constant number of them stored outside the array at any given time.

At the end of the day, people will always have differing opinions (is top-down memoization considered dynamic programming? Not to some "classical" folks); side with whoever you trust the most. I trust the editors and the authors at MIT Press (along with my professor, who qualifies it as in-place).

Typically, the problem with QuickSort is not that it doesn't sort in-place, but that it is not stable - satellite data is not kept in order.

Edit

Kuroi's point highlights a part of this argument that I think is very important.

Many argue that it requires O(log(n)) extra memory for recursive calls, and that the data is leaked in the form of the indices on the stack. However, this is negligible even for very large N (log2(1,000,000,000) =~ 30), and it ignores the fact that moving data on the heap typically takes much, much longer when size(data) >> size(index).

The indices of the data are not the elements themselves - so the elements of the data are not stored outside the array, other than a constant amount of them (in each call).
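One way to see this point is to make the recursion stack explicit. A rough sketch (the function name is my own) in which the auxiliary storage is visibly just (lo, hi) index pairs, never the elements:

```python
def quicksort_explicit(a):
    """Quicksort with an explicit stack standing in for the call stack.

    Everything stored outside the array `a` is an integer index pair;
    the elements themselves stay inside `a` at all times.
    """
    stack = [(0, len(a) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = a[hi]  # one element copied per partition: O(1) per "call"
        i = lo
        for j in range(lo, hi):
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        # Only index pairs go on the stack, mirroring the recursive version.
        stack.append((lo, i - 1))
        stack.append((i + 1, hi))
```

Whether you call those O(log n) index pairs a "leak" of the data or mere bookkeeping is the crux of the disagreement.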

Methylene answered 25/2, 2014 at 23:9 Comment(6)
Many people consider quicksort to not be in-place because it requires O(log n) additional data outside of the array. – Teeming
"However, this is negligible for very large sizes of N (log(1,000,000,000) =~ 30)" – that statement completely undermines the entire point of standard algorithmic notation of calling it O(log n) to start with. If it's negligible, then quicksort is an O(n) algorithm! – Teeming
@MooingDuck my point is that the associated cost is not in terms of accessing indices on the stack, which is of relatively small size. – Methylene
In asymptotic analysis, you can't just dismiss logarithmic terms! Anyway, the worst case is $\Theta(n)$ levels of recursion. Finally, if it's expensive to move the actual data items, don't move them; move pointers to them instead. This is standard, and so I don't think you can distinguish between the elements and their indices. – Mirna
@Mirna I think you raise some good points. 1) I wish Latex was on stackoverflow as well as cs :) 2) Yes, and the worst-case complexity is O(n^2), but in practice it is on average Theta(n log n), which is why it's preferred over and outperforms mergesort and heapsort (because of lower-order terms which asymptotic analysis ignores). 3) Again, in practice, I think pjs explains well in his comment why you shouldn't ever have to worry about recursion stack size. 4) I'm not performing asymptotic analysis, but rather providing an 'in-practice' view of log(n) recursive calls. Did I say 'in-practice' enough? – Methylene
I think this is a better answer than the one in the duplicate question, because it gives both perspectives and not just one. – Antidisestablishmentarianism
Strictly speaking, Quicksort has a space efficiency of O(n), since the degenerate case would require an index on the stack for each element of the array, though on average it will be O(log(n)). Given that, I don't think there is any way you could call it an "in place" algorithm, unless you go with a degenerate definition of "in place" meaning that the original data is not stored outside of the original array bounds (excluding copy/swap operations).

This definition of "in place" would be degenerate, because you could take any "out of place" algorithm and make it satisfy this "in place" requirement by having all of its extra data storage use pointers into the original array. Then, once the answer is found, you could reorder the original array "in place" using the pointer data.
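For what it's worth, the O(n) stack growth in the degenerate case can be capped: recurse only into the smaller partition and iterate on the larger one, which bounds the depth at O(log n). A minimal sketch of that well-known trick (naming is my own):

```python
def quicksort_bounded(a, lo=0, hi=None):
    """Quicksort whose stack depth is O(log n) even in the worst case.

    The smaller partition is handled by recursion (at most half the
    range, so at most log2(n) nested calls); the larger partition is
    handled by continuing the loop, tail-call style.
    """
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        # Lomuto partition around a[hi].
        pivot, i = a[hi], lo
        for j in range(lo, hi):
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        # Recurse on the smaller side, loop on the larger side.
        if i - lo < hi - i:
            quicksort_bounded(a, lo, i - 1)
            lo = i + 1
        else:
            quicksort_bounded(a, i + 1, hi)
            hi = i - 1
```

Even on already-sorted input (the worst case for a last-element pivot), the nesting stays shallow because the empty side is the one recursed into.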

Doer answered 26/2, 2014 at 3:18 Comment(4)
One can ensure that Quicksort uses O(log n) space in the worst case by always executing the branch with fewer elements before the branch with more elements. – Taiwan
Because quicksort chooses an element to split the array into two parts, it is possible (though not likely) that the element chosen will always split it into an empty set and a set of the remaining elements. This worst-case behaviour will lead to a worst-case time and space efficiency of O(n). – Doer
Most quicksort implementations I have seen will have this behaviour when the array being sorted contains elements that are all equal. This is because most quicksorts don't take the time to divide the elements that are equal to the splitting element evenly between the two sets. – Doer
"This worst case behaviour will lead to" no additional memory needed, as you process the empty array first and then (non-recursively) continue to the other part. So it's actually the best case w.r.t. memory. But I agree that many implementations are stupid. – Acuity
qsort does indeed swap data in place, but the stack space used for recursion grows as log2(N).

These two properties are not contradictory. Usually "in place" refers to heap memory, i.e. what you must explicitly allocate for the algorithm to work.

However, a logarithmic space complexity is basically negligible, except in pathological cases (say you want to do a quicksort on an 8-bit microcontroller with 512 bytes of stack :)).

Relief answered 25/2, 2014 at 23:16 Comment(4)
How is the stack memory not explicitly allocated? And what is the difference between "stack" and "heap" in terms of Turing machines? Though I agree that in that particular case it doesn't matter, it is still important from the theoretical point of view. =) – Bouffard
Bah, let's say I am more inclined to consider the practical point of view, then. If your stack can handle 1000 items, you will require only twice that size to handle 1 million. That makes O(log(N)) negligible for all practical purposes. And since qsort is a terminal algorithm (it won't call any more complex treatment, unless you do really weird things to compare two objects), the O(log(N)) consumption will be shadowed by anything of greater order. – Relief
@kuroi: It's absurd to talk in Big-O notation but not use a proper computational model. In the RAM model there is no distinction between heap and stack, and I know of no other model where there is one. – Newby
@NiklasB if you say so... – Relief
It all depends on the definition of "in-place" algorithm.

If you define "in-place" as requiring a constant amount of memory, then quicksort is not "in-place" as it requires log(N) memory for the recursion.

If you define "in-place" as the more human-friendly "does not move the data outside the input structure", then quicksort is again not "in-place". The data leaks to memory in the form of the indices with which the quicksort method is invoked, which are necessary for the algorithm to work. The contents of this additional memory depend directly on the input, so how is it not leaking?

If you define "in-place" as not-copying, then consider a silly algorithm for finding the sum of an array: create another array of length (n - 1) with elements like b[i] = a[i + 1] + a[0] / n. This is a kind of copying, even though the contents differ; the contents of this additional memory are a function of the input data, just like the indices held on the stack in the quicksort algorithm.

I think that the Wikipedia definition of "in-place" is the most useful, and according to it quicksort is not "in-place", as it uses a non-constant amount of memory.

Bouffard answered 25/2, 2014 at 23:40 Comment(1)
I see. I think my confusion was stemming from the difference between the formal definition of in-place and the casual definition of in-place. – Horse

© 2022 - 2024 — McMap. All rights reserved.