QuickSort and Hoare Partition
Asked Answered
U

7

18

I have a hard time translating QuickSort with Hoare partitioning into C code, and can't find out why. The code I'm using is shown below:

void QuickSort(int a[],int start,int end) {
    int q=HoarePartition(a,start,end);
    if (end<=start) return;
    QuickSort(a,q+1,end);
    QuickSort(a,start,q);
}

int HoarePartition (int a[],int p, int r) {
    int x=a[p],i=p-1,j=r;
    while (1) {
        do  j--; while (a[j] > x);
        do  i++; while (a[i] < x);

        if  (i < j)
            swap(&a[i],&a[j]);
        else
            return j;
    }
}

Also, I don't really get why HoarePartition works. Can someone explain why it works, or at least link me to an article that does?

I have seen a step-by-step work-through of the partitioning algorithm, but I don't have an intuitive feel for it. In my code, it doesn't even seem to work. For example, given the array

13 19  9  5 12  8  7  4 11  2  6 21

It will use pivot 13, but end up with the array

 6  2  9  5 12  8  7  4 11 19 13 21 

And will return j which is a[j] = 11. I thought it was supposed to be true that the array starting at that point and going forward should have values that are all larger than the pivot, but that isn't true here because 11 < 13.

Here's pseudocode for Hoare partitioning (from CLRS, second edition), in case this is useful:

Hoare-Partition (A, p, r)
    x ← A[p]
    i ← p − 1
    j ← r + 1
    while  TRUE
        repeat   j ←  j − 1
            until     A[j] ≤ x
        repeat   i ←  i + 1
            until     A[i] ≥ x
        if  i < j
            exchange  A[i] ↔ A[j]
        else  return   j 

Thanks!

EDIT:

The right C code for this problem will end up being:

void QuickSort(int a[],int start,int end) {
    int q;
    if (end-start<2) return;
    q=HoarePartition(a,start,end);
    QuickSort(a,start,q);
    QuickSort(a,q,end);
}

int HoarePartition (int a[],int p, int r) {
    int x=a[p],i=p-1,j=r;
    while (1) {
        do  j--; while (a[j] > x);
        do  i++; while (a[i] < x);
        if  (i < j) 
            swap(&a[i],&a[j]);
        else 
            return j+1;
    }
}
Uncounted answered 25/8, 2011 at 22:51 Comment(4)
Check your data samples, you go from 12 to 11 elements (with 13 missing). That can't be.Scope
that is according to the link umiacs.umd.edu/~joseph/classes/enee641/assign5-solution.pdfUncounted
Just for clarification, what is "CLRS, second edition"? Please provide a standard citation or link to that book.Bloodletting
I do not understand why it doesn't work for you for two elements? if (end-start<2) return;Deep
I
10

To answer the question of "Why does Hoare partitioning work?":

Let's simplify the values in the array to just three kinds: L values (those less than the pivot value), E values (those equal to the pivot value), and G value (those larger than the pivot value).

We'll also give a special name to one location in the array; we'll call this location s, and it's the location where the j pointer is when the procedure finishes. Do we know ahead of time which location s is? No, but we know that some location will meet that description.

With these terms, we can express the goal of the partitioning procedure in slightly different terms: it is to split a single array into two smaller sub-arrays which are not mis-sorted with respect to each other. That "not mis-sorted" requirement is satisfied if the following conditions are true:

  1. The "low" sub-array, that goes from the left end of the array up to and includes s, contains no G values.
  2. The "high" sub-array, that starts immediately after s and continues to the right end, contains no L values.

That's really all we need to do. We don't even need to worry where the E values wind up on any given pass. As long as each pass gets the sub-arrays right with respect to each other, later passes will take care of any disorder that exists inside any sub-array.

So now let's address the question from the other side: how does the partitioning procedure ensure that there are no G values in s or to the left of it, and no L values to the right of s?

Well, "the set of values to the right of s" is the same as "the set of cells the j pointer moves over before it reaches s". And "the set of values to the left of and including s" is the same as "the set of values that the i pointer moves over before j reaches s".

That means that any values which are misplaced will, on some iteration of the loop, be under one of our two pointers. (For convenience, let's say it's the j pointer pointing at a L value, though it works exactly the same for the i pointer pointing at a G value.) Where will the i pointer be, when the j pointer is on a misplaced value? We know it will be:

  1. at a location in the "low" subarray, where the L value can go with no problems;
  2. pointing at a value that's either an E or a G value, which can easily replace the L value under the j pointer. (If it wasn't on an E or a G value, it wouldn't have stopped there.)

Note that sometimes the i and j pointer will actually both stop on E values. When this happens, the values will be switched, even though there's no need for it. This doesn't do any harm, though; we said before that the placement of the E values can't cause mis-sorting between the sub-arrays.

So, to sum up, Hoare partitioning works because:

  1. It separates an array into smaller sub-arrays which are not mis-sorted relative to each other;
  2. If you keep doing that and recursively sorting the sub-arrays, eventually there will be nothing left of the array that's unsorted.
Ie answered 26/7, 2014 at 15:34 Comment(2)
Where will the i pointer be, when the j pointer is on a misplaced value? We know it will be: Won't there be a possibility of i crossing j?Deep
"Won't there be a possibility of i crossing j?" Yes; at some point either i and j will cross, or they will stop on the same value. But when either of those two things happen, the condition (i < j) fails and the procedure returns the value j.Ie
J
5

I believe that there are two problems with this code. For starters, in your Quicksort function, I think you want to reorder the lines

 int q=HoarePartition(a,start,end);
 if (end<=start) return;

so that you have them like this:

 if (end<=start) return;
 int q=HoarePartition(a,start,end);

However, you should do even more than this; in particular this should read

 if (end - start < 2) return;
 int q=HoarePartition(a,start,end);

The reason for this is that the Hoare partition fails to work correctly if the range you're trying to partition has size zero or one. In my edition of CLRS this isn't mentioned anywhere; I had to go to the book's errata page to find this. This is almost certainly the cause of the problem you encountered with the "access out of range" error, since with that invariant broken you might run right off the array!

As for an analysis of Hoare partitioning, I would suggest starting off by just tracing through it by hand. There's also a more detailed analysis here. Intuitively, it works by growing two ranges from the ends of the range toward one another - one on the left-hand side containing elements smaller than the pivot and one on the right-hand side containing elements larger than the pivot. This can be slightly modified to produce the Bentley-McIlroy partitioning algorithm (referenced in the link) that scales nicely to handle equal keys.

Hope this helps!

Jacklin answered 25/8, 2011 at 23:5 Comment(20)
First off all thanks, that did help secondly i still get an error :Run-Time Check Failure #2 - Stack around the variable 'a' was corrupted.Uncounted
No on the j=r+1 thing. As you can tell from the calling pattern this uses the true C [inclusiveStart, ExclusiveEnd) convention.Scope
this explains the run time error,something weird heppenning- with just j it stops with the sorting one step b4 its done, and with j+1 its sorting the array but going out of range... what is wrong?Uncounted
@Henk Holterman- Are you sure about that? I tested this out on my machine and it definitely seems like he's using inclusive endpoints; if you provide it an array of ten elements and specify the range [0, 9] it correctly sorts it.Jacklin
@Ofek Ron- With that change the code is working on my machine. How are you specifying the endpoints of the range to sort?Jacklin
The line QuickSort(a,start,q); definitely indicates an exclusive end. You should be calling QuickSort(0,10) from the top. Make sure you test multiple datasets.Scope
thats exactly what i am doing notice the way i call the quicksort is: QuickSort(a,0,N);Uncounted
@Ofek Ron- Got it! There are two bugs in the code, and I've updated my answer to point you at what they are. Can you take another look at this?Jacklin
@Henk Holterman- Thank you for pointing out the flaw in my reasoning! I've updated my answer to correctly report what was going on. Can you double-check this to make sure it looks right?Jacklin
i have changed what u mentioned and still it does the sorting, yet going out of range...Uncounted
@Ofek Ron- On my system this works just fine even under Valgrind. Are you sure that you're starting with the code you've posted above and making the changes? Also, can you post the code where you're invoking quicksort? Perhaps you have the argument wrong?Jacklin
@Ofek Ron- What was the other minor change?Jacklin
@Jacklin - oh and btw the change QuickSort(a,start,q+1); you suggested is wrong, as it leads us to infinate loop... i would publish our conclusion if i could but it wont let meUncounted
If you combine returning j+1 from the Hoare step and iterating over that range as well, then I could see it looping infinitely. I'm not sure what's up with us getting different values, since when I'm running this on my machine with just my changes with Valgrind running there aren't any problems reported. We must be doing something subtle differently...Jacklin
I will put the final code on the original question, this is the only way that we can compare our results i guessUncounted
seems like im wrong again... it doesnt sort for all N now, help me look into that if you want toUncounted
Looks great! Passing all the tests on my end.Jacklin
i think we should come up with an explanation for why those changes needed to be done, im gonna try and get down to the bottom of it, if you do so too, share your thoughts please.Uncounted
Why do you say to rewrite the second line to QuickSort(a,start,q+1); Wouldn't this give an array out of bounds error? Even the note you linked to have QuickSort(a,start,q+1); and QuickSort(a,q-z,end);Solomon
@Jacklin I've come across two variations in the way quicksort calls itself. One way is QuickSort(a,start,q+1) and QuickSort(a,start,q-1). The other way is QuickSort(a,start,q+1) and QuickSort(a,start,q) (the -1 is missing). My current theory is this difference can be attributed to using Hoare's versus Lomuto's partition algorithm.Solomon
W
3

Your final code is wrong, since the initial value of j should be r + 1 instead of r. Otherwise your partition function always ignore the last value.

Actually, HoarePartition works because for any array A[p...r] which contains at least 2 elements(i.e. p < r), every element of A[p...j] is <= every element of A[j+1...r] when it terminates. So the next two segments that the main algorithm recurs on are [start...q] and [q+1...end]

So the right C code is as follows:

void QuickSort(int a[],int start,int end) {
    if (end <= start) return;
    int q=HoarePartition(a,start,end);
    QuickSort(a,start,q);
    QuickSort(a,q + 1,end);
}

int HoarePartition (int a[],int p, int r) {
    int x=a[p],i=p-1,j=r+1;
    while (1) {
        do  j--; while (a[j] > x);
        do  i++; while (a[i] < x);
        if  (i < j) 
            swap(&a[i],&a[j]);
        else 
            return j;
    }
}

More clarifications:

  1. partition part is just the translation of the pseudocode. (Note the return value is j)

  2. for the recursive part, note that the base case checking (end <= start instead of end <= start + 1 otherwise you will skip the [2 1] subarray )

Wrest answered 23/5, 2016 at 16:38 Comment(0)
B
2

First of all u misunderstood the Hoare's partition algorithm,which can be see from the translated code in c, Since u considered pivot as leftmost element of subarray.

Ill explain u considering the leftmost element as pivot.

int HoarePartition (int a[],int p, int r) 

Here p and r represents the lower and upper bound of array which can be part of a larger array also(subarray) to be partitioned.

so we start with the pointers(marker) initially pointing to before and after end points of array(simply bcoz using do while loop).Therefore,

i=p-1,

j=r+1;    //here u made mistake

Now as per partitioning we want every element to the left of pivot to be less than or equal to pivot and greater than on right side of pivot.

So we will move 'i' marker untill we get element which is greaterthan or equal to pivot. And similarly 'j' marker untill we find element less than or equal to pivot.

Now if i < j we swap the elements bcoz both the elements are in wrong part of array. So code will be

do  j--; while (a[j] <= x);                 //look at inequality sign
do  i++; while (a[i] >= x);
if  (i < j) 
    swap(&a[i],&a[j]);

Now if 'i' is not less than 'j',that means now there is no element in between to swap so we return 'j' position.

So now the array after partitioned lower half is from 'start to j'

upper half is from 'j+1 to end'

so code will look like

void QuickSort(int a[],int start,int end) {
    int q=HoarePartition(a,start,end);
    if (end<=start) return;
    QuickSort(a,start,q);
    QuickSort(a,q+1,end);
}
Burris answered 20/2, 2017 at 6:44 Comment(0)
E
2

Straightforward implementation in java.

public class QuickSortWithHoarePartition {

    public static void sort(int[] array) {
        sortHelper(array, 0, array.length - 1);
    }

    private static void sortHelper(int[] array, int p, int r) {
        if (p < r) {
            int q = doHoarePartitioning(array, p, r);
            sortHelper(array, p, q);
            sortHelper(array, q + 1, r);
        }
    }

    private static int doHoarePartitioning(int[] array, int p, int r) {
        int pivot = array[p];
        int i = p - 1;
        int j = r + 1;

        while (true) {

            do {
                i++;
            }
            while (array[i] < pivot);

            do {
                j--;
            }
            while (array[j] > pivot);

            if (i < j) {
                swap(array, i, j);
            } else {
                return j;
            }
        }

    }

    private static void swap(int[] array, int i, int j) {
        int temp = array[i];
        array[i] = array[j];
        array[j] = temp;
    }
}
Entrant answered 9/4, 2017 at 8:48 Comment(0)
S
1

You last C code works. But it's not intuitive. And now I'm studying CLRS luckily. In my opinion, The pseudocode of CLRS is wrong.(At 2e) At last, I find that it would be right if changing a place.

 Hoare-Partition (A, p, r)
 x ← A[p]
     i ← p − 1
     j ← r + 1
 while  TRUE
        repeat   j ←  j − 1
            until     A[j] ≤ x
    repeat   i ←  i + 1
            until     A[i] ≥ x
    if  i < j
              exchange  A[i] ↔ A[j]
    else  
              exchnage  A[r] ↔ A[i]  
              return   i

Yes, Add a exchange A[r] ↔ A[i] can make it works. Why? Because A[i] is now bigger than A[r] OR i == r. So We must exchange to guarantee the feature of a partition.

Sverre answered 18/10, 2013 at 14:8 Comment(1)
CLRS presents the Lomuto partitioning scheme, not Hoare.Perri
X
0
  1. move pivot to first. (eg, use median of three. switch to insertion sort for small input size.)
  2. partition,
    • repetitively swap currently leftmost 1 with currently rightmost 0.
      0 -- cmp(val, pivot) == true, 1 -- cmp(val, pivot) == false.
      stop if not left < right.
    • after that, swap pivot with rightmost 0.
Xl answered 14/1, 2017 at 14:52 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.