What are the rules for the "Ω(n log n) barrier" for sorting algorithms?

I wrote a simple program that sorts in O(n). It is highly memory inefficient, but that's not the point.

It uses the principle behind a HashMap for sorting:

public class NLogNBreak {
    public static class LinkedListBack {
        public Node first = null;

        public LinkedListBack(int val){
            first = new Node();
            first.val = val;
        }
        public void insert(int i){
            Node n = new Node();
            n.val = i;
            n.next = first;
            first = n;
        }
    }

    private static class Node {
        public Node next = null;
        public int val;
    }

    // precondition: 0 <= in[i] <= max
    public static LinkedListBack[] sorted(int[] in, int max){
        LinkedListBack[] ar = new LinkedListBack[max + 1];
        for (int i = 0; i < in.length; i++) {
            int val = in[i];
            if(ar[val] == null){
                ar[val] = new LinkedListBack(val);
            } else {
                ar[val].insert(val);
            }
        }
        return ar;
    }
}
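
To make the "funky format" concrete, here's roughly how you'd read the result back out in ascending order (an illustrative helper I didn't include above; it would live inside the same class so it can see Node):

// Illustrative helper (not part of the class above): walk the buckets from
// index 0 up to max and print every value, duplicates included, in order.
public static void printSorted(LinkedListBack[] ar) {
    for (int i = 0; i < ar.length; i++) {
        if (ar[i] == null) continue;
        for (Node n = ar[i].first; n != null; n = n.next) {
            System.out.print(n.val + " ");
        }
    }
    System.out.println();
}

// e.g. printSorted(sorted(new int[]{5, 1, 3, 1, 4}, 5)) prints "1 1 3 4 5"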

So does this count as an O(n) sort, even though it returns the result in a funky format?

Pigling answered 23/8, 2011 at 1:7 Comment(8)
I think the "n log(n) barrier" only applies to sorting where you compare elements.Breakneck
@Breakneck Are there any other sorts where you don't compare? I can only think of the bucket sort, but that's n^2, I think.Pigling
Take a look at counting sort and radix sort; sometimes radix sort can be faster than quicksort, so it's worth knowing.Breakneck
Further, the nlog(n) barrier applies only when the range of the sorted list can be arbitrarily large (otherwise radix sort would be O(n)). @Ryan checking out Radix sort's computational complexity might give a good perspective. Edit: looks like titus already recommended radix sort.Hillier
take a look at this video and other lectures youtube.com/watch?v=0VqawRl3XzsBreakneck
@alex c Mine will take any size array, but a number greater than or equal to the largest value in the array must be given.Pigling
@Ryan technically, the range is still pre-bounded or this will not work. The nlog(n) barrier applies to arbitrary reals, for instance, while this applies to a fixed range of integers. The distinction is important -- many algorithms have different computational complexity for integers vs. reals (for instance, there is a P knapsack-problem solution assuming integral values and quantities, but the general knapsack problem is NP-complete).Hillier
This is essentially a counting sort. Counting sorts are not comparison-based sorts so the O(n log n) bound does not apply.Silden

To directly answer your question:

  1. Your sorting algorithm is technically not O(n) but rather O(n + max), since you need to create an array of size max, which takes O(max) time.
  2. This isn't a problem; in fact, it's a special case of a well-known sorting algorithm that breaks the Ω(n log n) barrier.

So what is this Ω(n log n) barrier? Where does it come from? And how do you break it?

The Ω(n log n) Barrier

The Ω(n log n) barrier is the information-theoretic lower bound on the average-case speed of any comparison-based sorting algorithm. If the only operation you are permitted to apply to array elements in order to distinguish them is some sort of comparison, then your sorting algorithm can't do better than Ω(n log n) in the average case.

To understand why this is, let's think about the state of the algorithm at any point during its execution. As the algorithm is running, it can gain some amount of information about the way that the input elements were ordered. Let's say that if the algorithm has some set of information X about the original ordering of the input elements, then the algorithm is in state X.

The crux of the Ω(n log n) argument (and several related arguments, as I'll discuss later) is that the algorithm has to have the ability to get into a large number of different states based on what the input is. Let's assume for now that the input to the sorting algorithm is an array of n distinct values. Because the algorithm can't tell anything about those elements other than the way that they're ordered, it doesn't really matter what the values being sorted are. All that matters is the order of those n elements relative to one another.

Now for the key step - let's suppose that there are f(n) unique ways of ordering the n input elements, and suppose that our sorting algorithm can't get into at least f(n) different states. In that case, there have to be two different orderings of the elements in the array that the algorithm always groups together into the same state. If this happens, then the sorting algorithm can't possibly sort both of those two input arrays correctly. The reasoning is that because the algorithm treats the two arrays identically, whatever steps it uses to reorder the elements of the first array will be the same as the steps it uses to reorder the elements of the second array. Since the two arrays aren't the same, there has to be at least one element that will be out of place in one of the two cases. Consequently, we know that the sorting algorithm has to be able to get into at least f(n) different states.

But how can the algorithm get into these different states? Well, let's think about this. Initially, the algorithm has no information at all about the ordering of the elements. When it makes its first comparison (say, between elements A[i] and A[j]), the algorithm can get into one of two states - one where A[i] < A[j] and one where A[i] > A[j]. More generally, every comparison that the algorithm makes can, in the best case, put the algorithm into one of two new states based on the result of the comparison. We can therefore think of a large binary tree structure describing the states that the algorithm can be in - each state has up to two children describing what state the algorithm gets into based on the result of the comparison that's made. If we take any path from the root of the tree down to a leaf, we get the series of comparisons that end up getting made by the algorithm on a particular input. In order to sort as quickly as possible, we want to make the fewest number of comparisons possible, and so we want this tree structure to have the smallest height possible.

Now, we know two things. First, we can think of all of the states the algorithm can get into as a binary tree. Second, that binary tree has to have at least f(n) different nodes in it. Given this, the smallest possible binary tree we can build has to have height at least Ω(log f(n)). This means that if there are f(n) different possible ways of ordering the array elements, we have to make at least Ω(log f(n)) comparisons on average, since otherwise we can't get into enough differing states.

To conclude the proof that you can't beat Ω(n log n), note that if the array has n distinct elements in it, then there are n! different possible ways of ordering the elements. Using Stirling's approximation, we have that log n! = Ω(n log n), and so we have to make at least Ω(n log n) comparisons in the average case to sort the input sequence.
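
If you'd rather not invoke the full Stirling machinery for that last step, a quick elementary bound gives the same result (keep only the larger half of the terms in the sum):

\log(n!) \;=\; \sum_{k=1}^{n} \log k \;\ge\; \sum_{k=\lceil n/2 \rceil}^{n} \log k \;\ge\; \frac{n}{2} \log \frac{n}{2} \;=\; \Omega(n \log n)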

Exceptions to the Rule

In the argument above, we saw that if you have n array elements that are all distinct, you cannot sort them with a comparison sort any faster than Ω(n log n). However, this starting assumption isn't necessarily valid. Many arrays that we'd like to sort may have duplicated elements in them. For example, suppose that I want to sort arrays that are composed solely of zeros and ones, such as this array here:

 0 1 0 1 1 1 0 0 1 1 1

In this case, it is not true that there are n! different arrays of zeros and ones of length n. In fact, there are only 2^n of them. From our result above, the lower bound is therefore only Ω(log 2^n) = Ω(n), so in principle a purely comparison-based sorting algorithm could sort these arrays in linear time. In fact, we absolutely can do this; here's a sketch of how we'd do it:

  1. Look at the first element.
  2. Copy all elements less than the first element into an array called 'less'
  3. Copy all elements equal to the first element into an array called 'equal'
  4. Copy all elements greater than the first element into an array called 'greater'
  5. Concatenate all three of these arrays together in the order less, equal, greater.

To see that this works, if 0 is our first element, then the 'less' array will be empty, the 'equal' array will have all the zeros, and the 'greater' array will have all the ones. Concatenating them then puts all the zeros before all the ones. Otherwise, if 1 is our first element, then the less array will hold the zeros, the equal array will hold the ones, and the greater array will be empty. Their concatenation is thus all zeros followed by all ones, as required.
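
Here is a rough Java rendering of those five steps (the class and method names are my own; note it only fully sorts arrays containing at most two distinct values, which is exactly the zeros-and-ones case):

import java.util.ArrayList;
import java.util.List;

public class ZeroOneSort {
    // Splits the array around its first element using only comparisons.
    // Correctly sorts any array with at most two distinct values.
    public static int[] sortTwoValued(int[] a) {
        if (a.length == 0) return a;
        int pivot = a[0];
        List<Integer> less = new ArrayList<>();
        List<Integer> equal = new ArrayList<>();
        List<Integer> greater = new ArrayList<>();
        for (int x : a) {
            if (x < pivot) less.add(x);
            else if (x == pivot) equal.add(x);
            else greater.add(x);
        }
        int[] out = new int[a.length];
        int i = 0;
        for (int x : less) out[i++] = x;     // everything smaller than the pivot
        for (int x : equal) out[i++] = x;    // every copy of the pivot
        for (int x : greater) out[i++] = x;  // everything larger than the pivot
        return out;
    }
}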

In practice, you wouldn't use this algorithm (you'd use a counting sort, as described below), but it shows that you can indeed beat Ω(n log n) with a comparison-based algorithm if the number of possible inputs to the algorithm is small.

Some comparison-based sorting algorithms are known to work very quickly on inputs that have multiple duplicated values. For example, it is known that Quicksort with a special partitioning step can take advantage of duplicated elements in the input array.
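
For the curious, the "special partitioning step" usually meant here is the three-way ("Dutch national flag" / fat-pivot) partition, which lumps all copies of the pivot together so they are never touched again. A minimal sketch, with names of my own choosing:

public class ThreeWayQuicksort {
    // Quicksort with a fat pivot: elements equal to the pivot end up in the
    // middle after partitioning and are excluded from both recursive calls.
    public static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int pivot = a[lo];
        int lt = lo, gt = hi, i = lo;
        while (i <= gt) {                      // Dutch-national-flag partition
            if (a[i] < pivot)      swap(a, lt++, i++);
            else if (a[i] > pivot) swap(a, i, gt--);
            else                   i++;
        }
        sort(a, lo, lt - 1);                   // strictly-less part
        sort(a, gt + 1, hi);                   // strictly-greater part
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}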

Non-Comparison Sorts

All of this discussion has assumed that we're talking about comparison-based sorting, where the only permitted operation on array elements is a comparison. However, if we know more about what elements we're going to be sorting and can perform operations on those elements beyond simple comparisons, then none of the above bounds hold any more. We're breaking the starting assumptions that led us to construct a binary tree of all the states of the algorithm, and so there's no reason to suspect that those bounds will still hold.

For example, if you know that the input values are drawn from a universe that only has |U| elements in it, then you can sort in O(n + |U|) time using a clever algorithm. Start off by creating |U| different buckets into which we can place the elements from the original array. Then, iterate across the array and distribute all of the array elements into the corresponding bucket. Finally, visit each of the buckets, starting with the bucket holding copies of the smallest element and ending with the bucket containing copies of the largest element, concatenating together all of the values you find. For example, let's see how to sort arrays consisting of the values 1 - 5. If we have this starting array:

1 3 4 5 2 3 2 1 4 3 5

Then we can put those elements into buckets like this:

Bucket     1  2  3  4  5
           -------------
           1  2  3  4  5
           1  2  3  4  5
                 3

Iterating across the buckets and concatenating their values together yields this:

1 1 2 2 3 3 3 4 4 5 5

which, sure enough, is a sorted version of our original array! The runtime here is O(n) time to go and distribute the original array elements into the buckets, then O(n + |U|) time to iterate across all the buckets putting the elements back together. Notice that if |U| = O(n), this runs in O(n) time, breaking the Ω(n log n) sorting barrier.
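
A minimal Java sketch of that bucket idea, assuming the values are integers in the range 0 to |U| - 1 and storing just a count per bucket (this is exactly the counting sort mentioned elsewhere on this page; the names are mine):

public class CountingSort {
    // Sorts values known to lie in 0 .. universeSize - 1 in O(n + |U|) time.
    public static int[] sort(int[] a, int universeSize) {
        int[] count = new int[universeSize];      // one bucket per possible value
        for (int x : a) count[x]++;               // O(n): drop each element in its bucket
        int[] out = new int[a.length];
        int i = 0;
        for (int v = 0; v < universeSize; v++) {  // O(n + |U|): read buckets back in order
            for (int c = 0; c < count[v]; c++) out[i++] = v;
        }
        return out;
    }
}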

If you are sorting integers, you can do much better than this by using radix sort, which runs in O(n lg |U|). If you're dealing with primitive ints, lg |U| is usually 32 or 64, so this is extremely fast. If you're willing to implement a particularly tricky data structure, you can use a van Emde Boas Tree to sort integers from 0 to U - 1 in time O(n lg lg U), again by exploiting the fact that integers consist of groups of bits that can be manipulated in blocks.
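
For concreteness, here is a sketch of a least-significant-digit radix sort over non-negative ints, processing one byte per pass (an illustration of the idea rather than a tuned implementation; the names are mine):

public class RadixSort {
    // LSD radix sort for non-negative ints: four passes of a stable counting
    // sort, one per byte, from least significant to most significant.
    public static void sort(int[] a) {
        int[] buf = new int[a.length];
        for (int shift = 0; shift < 32; shift += 8) {
            int[] count = new int[257];
            for (int x : a) count[((x >>> shift) & 0xFF) + 1]++;
            for (int d = 0; d < 256; d++) count[d + 1] += count[d];  // prefix sums -> start positions
            for (int x : a) buf[count[(x >>> shift) & 0xFF]++] = x;  // stable scatter by current byte
            System.arraycopy(buf, 0, a, 0, a.length);
        }
    }
}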

Similarly, if you know that your elements are strings, you can sort very quickly by building a trie out of the strings, then iterating across the trie to rebuild all the strings. Alternatively, you could consider the strings as numbers written in a large base (say, base 128 for ASCII text) and then use one of the integer sorting algorithms from above.
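
A sketch of the trie approach for plain ASCII strings (the node layout and names are my own choices; a production version would use a more compact child representation):

import java.util.ArrayList;
import java.util.List;

public class TrieSort {
    // One trie node: children indexed by character code, plus how many input
    // strings end exactly at this node (so duplicates are preserved).
    private static class TrieNode {
        TrieNode[] next = new TrieNode[128];  // assumes plain ASCII input
        int endCount = 0;
    }

    public static List<String> sort(List<String> input) {
        TrieNode root = new TrieNode();
        for (String s : input) {              // insert every string: O(total characters)
            TrieNode cur = root;
            for (char ch : s.toCharArray()) {
                if (cur.next[ch] == null) cur.next[ch] = new TrieNode();
                cur = cur.next[ch];
            }
            cur.endCount++;
        }
        List<String> out = new ArrayList<>();
        collect(root, new StringBuilder(), out);  // depth-first walk emits sorted order
        return out;
    }

    private static void collect(TrieNode node, StringBuilder prefix, List<String> out) {
        for (int i = 0; i < node.endCount; i++) out.add(prefix.toString());
        for (char c = 0; c < 128; c++) {
            if (node.next[c] != null) {
                prefix.append(c);
                collect(node.next[c], prefix, out);
                prefix.deleteCharAt(prefix.length() - 1);
            }
        }
    }
}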

In each of these cases, the reason that you can beat the information-theoretic barrier is that you're breaking the barrier's starting assumption, namely that you can only apply comparisons. If you can treat the input elements as numbers, or as strings, or as anything else that reveals more structure, all bets are off and you can sort extremely efficiently.

Possie answered 23/8, 2011 at 1:35 Comment(4)
Why do you need to know how many buckets there are? HashMap doesn't need to know. Why can't they be arbitrarily bucketed like hash map does?Soniasonic
You could indeed use a hash table here. That would introduce some randomness into the algorithm (whether that’s an issue for you depends on context). While that will reduce the asymptotic space usage, it won’t change the amount of time the algorithm needs. To see why, after distributing items into buckets, we need some mechanism to loop over the buckets in sorted order. Counting sort does this by counting upward from 0 to |U|. Using a hash table won’t speed that step up because hash tables don’t store items in sorted order.Possie
But wouldn't it allow creating buckets on the fly, instead of having to know how many there are at the beginning?Soniasonic
@Soniasonic Sure, you can do that. Or you could do a linear scan over the input array to see what values are there, then use that to size an array.Possie

That is called Radix Sort, and yes it breaks the n log(n) barrier, which is only a barrier in the comparison model. On the Wikipedia page for the comparison model you can see a list of sorts that use it, and a few that do not.

Radix sort sorts by putting each element in a bucket based on its value, and then concatenating all the buckets together again at the end. It only works with types like integers that have a finite number of possible values.

Normally a radix sort is done one byte or nibble at a time to reduce the number of buckets. See the wikipedia article on it, or search for more info.

Yours can also be made to sort negative numbers, and to allocate memory only for the buckets it actually uses, as improvements.

Quassia answered 23/8, 2011 at 1:23 Comment(12)
Technically this is a bucket sort. Radix sort works by iteratively applying bucket sort on all the bits, while bucket sort dumps every value into its own slot.Possie
@template, thanks for that. I learnt it the other way: (radix sort is a bucket sort where the number of buckets and range is a power of 2). In this case it can be called both a bucket sort (with a bucket size of 1) and a radix sort with a radix of 31 (Since he can only sort positive numbers).Quassia
@Possie Come to think of it, this is a bucket sort >.> I always thought bucket sorts were O(n^2), so I never even thought this could be one. Turns out it is.Pigling
You're both wrong -- this is actually a counting sort, only it uses a somewhat inefficient linked list to keep track of the counts.Hornback
@Hornback This is not a Counting sort, counting sort keeps track of how many times each number appears in the list rather than storing the numbers themselves in a linked list.Quassia
@Gabe- You can think of a counting sort with chaining as a bucket sort where each bucket is large enough to hold just one element.Possie
@Hornback I was actually doing counts before I posted this, but swapped it with a linked list so that you were actually conserving the elements instead of just the information.Pigling
it's counting sort and uses en.wikipedia.org/wiki/Unary_numeral_system to count the elementsBreakneck
Except he's not since the algorithm doesn't implement counting sortQuassia
@Paul: it's a modified counting sort. Titus is correct: it makes one "tick" each time it hits a number, but adding to the beginning of the linked list. As I said, it used to be a counting sort, till I changed it slightly to append to a linked list instead of count++Pigling
@Ryan Simply counting how many of each element there are doesn't make it a counting sort. Counting sort requires 3 loops that run one after the other each taking O(n).Quassia
@Paul: It's the first loop of a counting sort.Pigling
