Tree sort: time complexity

Why is the average case time complexity of tree sort O(n log n)?

From Wikipedia:

Adding one item to a binary search tree is on average an O(log n) process (in big O notation), so adding n items is an O(n log n) process

But we don't add each item to a tree that already holds n items. We start with an empty tree and grow it one item at a time.

So it looks more like

log(1) + log(2) + ... + log(n) = log(1 * 2 * ... * n) = log(n!)

Am I missing something?
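
For concreteness, here is a minimal tree sort sketch in Python (my illustration of the algorithm under discussion, not code from the original post, assuming a plain unbalanced binary search tree as in the average-case analysis):

    # Minimal tree sort sketch: insert each item into the tree built so far,
    # then read the keys back with an in-order traversal.
    # The i-th insertion works on a tree of i - 1 items, so its average cost
    # is O(log i) -- summing gives log(1) + log(2) + ... + log(n) = log(n!).

    class Node:
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None

    def insert(root, key):
        # Average-case O(log(current tree size)) for random input order.
        if root is None:
            return Node(key)
        if key < root.key:
            root.left = insert(root.left, key)
        else:
            root.right = insert(root.right, key)
        return root

    def tree_sort(items):
        root = None
        for x in items:              # tree grows from 0 to n items
            root = insert(root, x)
        out = []
        def inorder(node):           # in-order traversal yields sorted order
            if node is not None:
                inorder(node.left)
                out.append(node.key)
                inorder(node.right)
        inorder(root)
        return out

    print(tree_sort([5, 2, 8, 1, 9]))   # [1, 2, 5, 8, 9]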

Rodina answered 2/8, 2016 at 3:49 Comment(0)

Showing that O(log(n!)) = O(n log(n)) takes two steps. First, expand log(n!),

log(1) + log(2) + ... + log(n)

We can agree that log(1), log(2), and every term up to log(n-1) is less than log(n). Therefore, the following inequality holds,

log(1) + log(2) + ... + log(n) <= log(n) + log(n) + ... + log(n) = n log(n)

Now the other half of the argument depends on the fact that the upper half of the numbers from 1 to n, i.e. n/2 + 1 through n, are each greater than n/2. Keeping only those n/2 terms and discarding the (non-negative) first half of the sum gives,

log(1) + log(2) + ... + log(n) >= log(n/2 + 1) + ... + log(n) >= log(n/2) + log(n/2) + ... + log(n/2) = (n/2) log(n/2)

Since (n/2) log(n/2) = (n/2)(log(n) - log(2)) grows proportionally to n log(n), the lower bound matches the upper bound up to a constant factor.

So with these two inequalities, log(n!) is sandwiched between (n/2) log(n/2) and n log(n), which proves that O(log(n!)) = O(n log(n)).
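
As a quick numerical sanity check of the sandwich (my addition, not part of the original answer), Python's math.lgamma(n + 1) computes ln(n!) directly:

    import math

    # Check (n/2)*log(n/2) <= log(n!) <= n*log(n) for a few values of n.
    # math.lgamma(n + 1) returns ln(n!) without computing the huge factorial.
    for n in [10, 100, 1000, 10**6]:
        log_fact = math.lgamma(n + 1)
        lower = (n / 2) * math.log(n / 2)
        upper = n * math.log(n)
        assert lower <= log_fact <= upper
        print(n, round(lower), round(log_fact), round(upper))

The printed values show log(n!) staying within a constant factor of n log(n), which is the content of the proof above.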

Cleotildeclepe answered 2/8, 2016 at 4:44 Comment(3)
Why is the second inequality required for the O definition? Also, I think the explanation in Wikipedia is misleading: in practice, the complexity is log(n!). The fact that it cannot get worse than log(n^n) is cool, but we can also say it cannot get worse than log(n^n^n). – Rodina
@Rodina Because according to big O notation, the lower bound (n/2) log(n/2) is of the same order as n log(n), so you get the sandwich O(n log(n)) <= O(log(n!)) <= O(n log(n)). – Cleotildeclepe
This is true, but when I look at the formal definition of big O notation, it looks like the first inequality is all it takes to show that log(n!) = O(n log(n)). Anyway, it's a good and simple explanation that reminded me of some details, and confirmed that the current Wikipedia "Tree sort" entry is not so accurate. – Rodina

O(log(n!)) = O(n log(n)).

https://en.wikipedia.org/wiki/Stirling%27s_approximation

(Answers must be 30 characters.)
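
For a concrete feel for why Stirling settles it (a small check I'm adding, not part of the original answer): Stirling's approximation says ln(n!) ~ n ln(n) - n + (1/2) ln(2*pi*n), and the n ln(n) term dominates, so log(n!) = Θ(n log n).

    import math

    # Compare exact ln(n!) with Stirling's approximation for a few n.
    for n in [10, 100, 1000]:
        exact = math.lgamma(n + 1)     # ln(n!)
        stirling = n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)
        print(n, round(exact, 3), round(stirling, 3))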

Chasechaser answered 2/8, 2016 at 4:7 Comment(3)
You don't need Stirling for that; logarithm rules and the definition of O are enough. But I was wondering why it's not common to specify in this case that the complexity is O(log n!). Often people use the O notation to mean "that's roughly the number of operations", although that's not accurate. In other words, in common use an algorithm of O(log n!) may be better than one of O(n log n). – Rodina
@Rodina I think what you are getting at is that log(n!) may be a sharper bound than n log(n). But in the world of Big-O, this is quite simply not the case: O(log(n!)) does indeed equal O(n log(n)), as proven here: #8118721. I guess the reason O(n log(n)) is preferred is because people find it easier to grasp. – Chasechaser
Yes. But as I said, the big O notation is often used informally to show the actual time complexity, which is not so accurate since it's only an upper bound on the time complexity. Anyway, it's a good answer. Thanks. – Rodina