How to implement merge sort from "The Introduction to Algorithms" by Cormen and Co
Asked Answered
F

6

13

I'm learning algorithms from Cormen and Co. and I have problem with implementation of merge sort from their pseudocode. I compiled it by:

$ gcc -Wall -g merge_sort.c

I have a problem because for numbers:

2 4 5 7 1 2 3 6

The result is:

1 2 2 3 3 4 5 5 

I tried to read carefully the pseudo code but this does not help me. I want to know what I'm doing wrong. Below is my code:

#include <stdio.h>

#define SIZE 8

void merge(int *array_of_integers, int p, int q, int r) {
    int n1 = q - p + 1;
    int n2 = r - q; 
    int i, j, k;
    int left_array[n1 + 1];
    int right_array[n2 + 1];

    for (i = 0; i < n1; i++)
        left_array[i] = array_of_integers[p + i];
    for (j = 0; j < n2; j++)
        right_array[j] = array_of_integers[q + j];

    i = 0;
    j = 0;

    for (k = p; k < r; k++){
        if (left_array[i] <= right_array[j]) {
            array_of_integers[k] = left_array[i];
            i++;
        } else {
            array_of_integers[k] = right_array[j];
            j++;
        }   
    }
}

void merge_sort(int *array_of_integers, int p, int r) {
    if (p < r) {
        int q = (p + r) / 2;
        merge_sort(array_of_integers, p, q);
        merge_sort(array_of_integers, q + 1, r);
        merge(array_of_integers, p, q, r);
    }
}

void print_array(int *array_of_integers, int amout_of_integers) {
    int i;
    for(i = 0; i < amout_of_integers; i++)
        printf("%d ", array_of_integers[i]);
    puts("");
}

int main(void) {
    int dataset[] = { 2, 4, 5, 7, 1, 2, 3, 6 };

    print_array(dataset, SIZE);
    merge_sort(dataset, 0, SIZE);
    print_array(dataset, SIZE);

    return 0;
}

Edit: (Correct solution)

 void merge(int *array_of_integers, int p, int q, int r) {
     int n1 = q - p + 1;
     int n2 = r - q; 
     int i, j, k;
     int left_array[n1 + 1];
     int right_array[n2 + 1];

     left_array[n1] = 123456798;
     right_array[n2] = 123456798;

     for (i = 0; i < n1; i++)
         left_array[i] = array_of_integers[p + i];
     for (j = 0; j < n2; j++)
         right_array[j] = array_of_integers[q + j + 1];

     i = 0;
     j = 0;

     for (k = p; k <= r; k++) {
         if (left_array[i] <= right_array[j]) {
             array_of_integers[k] = left_array[i];
             i++;
         } else {
             array_of_integers[k] = right_array[j];
             j++;
         }
     }
 }

 void merge_sort(int *array_of_integers, int p, int r) {
     if(p < r) {
         int q = (p + r) / 2;
         merge_sort(array_of_integers, p, q);
         merge_sort(array_of_integers, q + 1, r);
         merge(array_of_integers, p, q, r);
     }
 }
Federalism answered 21/8, 2012 at 14:19 Comment(2)
There is a code review stack exchange site that might be of interest here: codereview.stackexchange.com . I don't use it myself though so am not sure whether this is more on topic there or not...Gary
This algorithm is broken: using sentinel values to avoid testing index values against array lengths is a doomed approach. What if the arrays contains values greater of equal to 123456798? You should probably disregard this reference book.Overstrain
T
9

There are two problems in your code.

One, you need to clarify what the parameters you are passing mean. Inside merge_sort, it looks like p is the first element to be sorted, and r is the last element to be sorted. But, where merge_sort is called, in main, it is passed 0 and SIZE. Here, 0 is the first element to be sorted, but SIZE cannot be the last element, because it is (presumably) the number of elements to be sorted. In your example, you are passing 8, but the last element to be sorted is 7. So decide whether you want to change merge_sort so that r is the number of elements or whether you want to change main to pass SIZE-1. Similarly, in merge, p seems to be the first element to merge, q is the last element of the first range (so q+1 is the first of the second), and r is the last element of the second range. But when you copy from array_of_integers to right_array, you copy from q+j. When j is zero, this copies the last element of the first range, but you want the first element of the second range. So you need to clear up these uses of the indices. (Also, you only need n1 and n2 elements for left_array and right_array, not n1+1 and n2+1.) Also check the loop on k, for(k = p; k < r; k++). What should the continuation condition on that loop be?

Two, when you merge left_array and right_array, you do not account for the fact that an array might be empty (because all elements have been copied out of it previously), so comparing left_array[i] to right_array[j] does not work because i or j is indicating an element outside of the left_array or of the right_array, respectively. For example, if i has reached its limit (n1), then you should not compare. Instead, you should just take an element from right_array.

Tantalite answered 21/8, 2012 at 14:45 Comment(5)
Thanks Eric! With your help, I solved the problem! Thanks, again. I pushed a correct code to github.Federalism
I think your new code does not properly handle all the cases that occur when merging left_array and right_array. Consider what happens when left_array is exhausted but right is not, and consider vice-versa.Tantalite
In the Cormen's book, they use infinity when they compare integers so I would used a big integer so I think now it should be better solution.Federalism
Of course, it isn't a beautiful solution but I want only solve a simple exercise.Federalism
@MateuszCzerwiński The github link is broken. Please post a link to a working solution if it isn't any trouble.Marital
M
6

this one works though its implemented in Java, the logic is the same obviously . I have taken care of all the points suggested in the answer by Eric. Please check out the code, it's self explanatory.

import java.util.*;
class MergeSort
{

    public static void main(String args[])
    {
        int testArray[] = {1,3,5,3,1,7,8,9};
        mergeSort(testArray,0,testArray.length-1);
        System.out.println(Arrays.toString(testArray));
    }

    protected static void mergeSort(int arr[], int p, int r)
    {
        int q;
        if (p<r)
        {
            q = (p+r)/2;
            mergeSort(arr,p,q);
            mergeSort(arr, q+1, r);
            merge(arr,p,q,r);   
        }   
    }

    protected static void merge(int arr[], int p, int q, int r)
    {    
        int n = q-p+1;
        int m = r-q;

        int L[] = new int[n+1];
        int R[] = new int[m+1];
        int i,j,k;

        for(i=0; i< n; i++)
        {
            L[i] = arr[p+i];    
        }
        for(j=0; j< m; j++)
        {
            R[j] = arr[q+j+1];    
        }

        L[n] = Integer.MAX_VALUE;
        R[m] = Integer.MAX_VALUE;

        i = 0;
        j = 0;
        for(k = p; k<= r; k++)
        {

            if( L[i]<=R[j])
            {
                arr[k] = L[i];
                i = i+1;
            }
            else
            {
                arr[k] = R[j];
                j = j+1;

            }           
        }
    }
}
Microgram answered 4/4, 2015 at 18:3 Comment(1)
Edit your code and comments to the code, it is better than self explanatory.Ludlow
D
1

Python implement

from math import inf


def merge(A, p, q, r):

    n1 = q - p + 1
    n2 = r - q

    L = [0] * (n1+1)
    R = [0] * (n2+1)

    for i in range(0, n1):
        L[i] = A[p + i]

    for j in range(0, n2):
        R[j] = A[q + j + 1]

    L[n1] = inf
    R[n2] = inf

    i = 0
    j = 0
    for k in range(p, r+1):
        if L[i] <= R[j]:
            A[k] = L[i]
            i = i + 1
        else:
            A[k] = R[j]
            j = j + 1


def mergesort(A, p, r):

    if p < r:
        q = (p + r)//2
        mergesort(A, p, q)
        mergesort(A, q + 1, r)
        merge(A, p, q, r)


A = [00, 11, 12, 13, 14, 15, 16, 17, 18, 2, 4,
     5, 7, 1, 2, 3, 6, 22, 23, 34, 56, 78, 77]

merge(A, 9, 12, 16)

print(A)

mergesort(A, 9, 16)
print(A)

print(A)
Dialyze answered 1/2, 2022 at 14:28 Comment(2)
Great Example! However, in the book, the left and right arrays are created by the following pseudocode, for i in range(0, n1): L[i] = A[p + i - 1] and for j in range(0, n2): R[j] = A[q + j] The value of i and j is also set to 1 in the book. Finally, the range for k is mentioned from p to r in the book. Could you please explain why you did it differently?Locris
array start from 0Dialyze
F
0
This one worked for me

    // MergeSortRevisionAgain.cpp : Defines the entry point for the console application.
//Understanding merge sort
#include <iostream>

using std::cout;
using std::endl;


//The declaration of the merge sort function
void merge(int A[], int p, int q, int r);
int* mergeSort(int A[], int p, int r);


int main()
{

    /*My Code to test for the merge sort*/
    int myArray[]{ 2,3,5,7,1,4,7,9};
    int lengthOfArray = sizeof(myArray) / sizeof(myArray[1]);
    int* sortedOutput = mergeSort(myArray, 0, lengthOfArray-1);

    for (int i = 0; i <lengthOfArray; i++)
    {
        cout << sortedOutput[i] << " ";
    }

    cout << endl;


    return 0;
}


void merge(int A[], int p, int q, int r)
{
    //Declaration of number of variable in each half
    int n1 = q - p + 1;                                                             //1. n1 = q - p + 1
    int n2 = r - q;                                                                 //2. n2 = r-q

    //Declaration of left and right part of the array
    int* leftArray= new int[n1+1] ;                                                 //3. Let L[1...n1+1] and ... 
    int* rightArray= new int[n2+1] ;                                                //... R[1...n2+1] be new arrays

    //Entering the for loop for the left side
    for (int i = 0; i < n1; i++)                                                    //4.for i = 1 to n1 NB(change i to 0 since index in c++ starts from 0)
    {
        leftArray[i] = A[p + i ];                                                   //5. L[i] = A[p+i-1] NB(change to A[p+i] since "i" was changed to 0 hence A[p,...,p+i)
    }

    //Entering the for loop for the right side
    for (int  j = 0; j < n2; j++)                                                   //6. for j = 1 to n2 NB(change j j= 0 since index in c++ starts from 0)
    {
        rightArray[j] = A[q + j+1];                                                 //7. R[i] = A[q + j ] NB(change to A[q+j+1] since "j" was changed to 0  hence A[q+1,...q+1+j]
    }

    leftArray[n1] = 999;                                                            //8. Set L[n1+1] = sentinel NB last value in leftArray will be the sentinel
    rightArray[n2] = 999;                                                           //9. Set L[n2 + 2] = sentinel NB last value in rightArray will be the sentinel

    int i = 0;                                                                      //10. i = 1 change to i = 0 since index starts from 0 in c++
    int j = 0;                                                                      //11. j = 1 change to j = 0 since index starts from 0 in c++

    for (int k = p; k <= r; k++)                                                    //12. for k = p to r - change as specified in code since index of array p = 0, r = lengthofArray - 1
    {
        if (leftArray[i] <= rightArray[j])                                          //13. L[i] <= R[j]
        {
            A[k] = leftArray[i];                                                    //14. A[k] = L[i]
            i = i + 1;                                                              //15. i = i + 1
        }
        else
        {
            A[k] = rightArray[j];                                                   //16. A[k] = R[j]
            j = j + 1;                                                              //17. j = j+1;
        }
    }

    delete leftArray;                                                               //18. Free allocated dynamic memory for leftArray
    leftArray = nullptr;                                                            //19. Set pointer to nullptr to prevent access to deleted memory
    delete rightArray;                                                              //20. Free allocated dynamic memory for rightArray              
    rightArray = nullptr;                                                           //21. Set pointer to nullptr to prevent access to deleted memory
}

int* mergeSort(int A[], int p, int r)
{
    if (p < r)
    {
        int q = floor((p + r) / 2);
        mergeSort(A, p, q );
        mergeSort(A, q + 1, r);
        merge(A, p, q, r);
    }

    return A;
}
Ferdinand answered 15/6, 2019 at 10:56 Comment(1)
Explain where the problem in the OP is.Osugi
D
0

Here is my attempt. Known bugs: since INT_MAX is used as a sentinel, sorting an array containing INT_MAX may cause the pointers to overflow during the merge.

#include <stdio.h>
#include <limits.h>
void merge(int A[], unsigned int p, unsigned int q, unsigned int r){
    unsigned int n1 = q - p; //differs from book because C indexes from 0
    unsigned int n2 = r - q;


    int L[n1 + 1]; // L contains the first elem of A, up to the midpoint (not including the midpoint)
    int R[n2 + 1]; // R contains the elems including the midpoint of A all the way to the end.

    L[n1] = INT_MAX; //INT_MAX is our sentinel, which will be used in the merge step. No possible int will be greater than INT_MAX, so during the merge,
    R[n2] = INT_MAX; // INT_MAX is similar to the infinity used in the book

    for (unsigned int i = 0; i < n1; i++){
        L[i] = A[p + i];
    }

    for (unsigned int i = 0; i < n2; i++){
        R[i] = A[q + i];
    }
    // Now we just need to merge L and R and sort A
    // The sorting occurs here, during the merge.
    unsigned int i = 0;
    unsigned int j = 0;

    for (unsigned int k = p; k < r; k++){
        if (L[i] <= R[j]){
            A[k] = L[i];
            i++;
        }
        else{
            A[k] = R[j];
            j++;
        }
    }
}
void merge_sort(int A[], unsigned int p, unsigned int r) { // input is array A, first elem p, and last elem + 1 r

    if (p < r - 1) { //differs from book... since C indexes from 0, if we have an array of size 1, we will subtract 1 to get 0 and then hit the base case

        // Otherwise, find the midpoint and divide and conquer
        unsigned int q = (p + r) / 2; //q is the midpoint of A
        merge_sort(A, p, q); //this must process the midpoint
        merge_sort(A, q, r); //this must process the elem after the midpoint to the last elem
        merge(A, p, q, r);
        return;


    }

}


int main(){

    int A[] = {432, 5, 99, 101, 43};
    unsigned int len_A = sizeof(A)/sizeof(A[0]);

    printf("original order of elems in A: \n");

    for (unsigned int i = 0; i < len_A; i++){
        printf("%d ", A[i]);
    }

    merge_sort(A, 0, len_A);

    printf("\n\n");
    printf("after performing merge_sort: \n");


    for (unsigned int i = 0; i < len_A; i++){
        printf("%d ", A[i]);
    }

    printf("\n\n");

return 0;
}
Deerstalker answered 10/1, 2020 at 23:15 Comment(0)
I
0
// merge.cpp
#include <iostream>
#include <vector>
using namespace std;

template <typename T>
void merge( vector<T> &arr, int p, int q, int r ) {

    int nl = q - p + 1;
    int nr = r - q;
    vector<T> larr(nl), rarr(nr);
    for ( int i = 0; i < nl; i++ )
        larr[i] = arr[p + i];
    for ( int j = 0; j < nr; j++ )
        rarr[j] = arr[q + j + 1];

    int i = 0, j = 0, k = p;

    while( i < nl && j < nr ) {
        if ( larr[i] <= rarr[j] ) {
            arr[k] = larr[i];
            i++;
        }
        else {
            arr[k] = rarr[j];
            j++;
        }
        k++;
    }
    while ( i < nl ) {
        arr[k] = larr[i];
        i++;
        k++;
    }
    while ( j < nr ) {
        arr[k] = rarr[j];
        j++;
        k++;
    }
}

template <typename T>
void merge_sort( vector<T> &arr, int p, int r ) {

    if ( p >= r )
        return;
    int q = ( p + r ) / 2;
    merge_sort( arr, p, q );
    merge_sort( arr, q + 1, r );
    merge( arr, p, q, r );
}

template <typename T>
void display( vector<T> &arr ) {

    int p = 0;
    int r = arr.size()-1;
    cout << "Before sorting: ";
    for ( size_t i = 0; i < arr.size(); i++ )
        cout << arr[i] << " ";
    cout << endl;
    merge_sort( arr, p, r );
    cout << "After sorting:  ";
    for ( size_t i = 0; i < arr.size(); i++ )
        cout << arr[i] << " ";
    cout << endl;
}

int main( void ) {

    vector<int>    a{ 12, 3, 7, 9, 14, 6, 11, 2 };
    vector<double> b{ 7.29, 7.25, -12.39, -12.875, 0.375, 0.37, 19.1, 5.76, 1.85 };
    vector<char>   c{ 'q', 'c', 'z', 'r', 'b', 'a', 's', 'j', 'm', 'w', 'f', 'p', 'g', 'f' };
    vector<string> d{ "Arvid", "Thirl", "Loki", "Athena", "Nimrod", "Zima", "Kirin" };
   
    display( a );
    display( b );
    display( c );
    display( d );
   
    return 0;
}
c++ -O3 -Wall -std=c++11 -o merge merge.cpp
./merge
Before sorting: 12 3 7 9 14 6 11 2 
After sorting:  2 3 6 7 9 11 12 14 
Before sorting: 7.29 7.25 -12.39 -12.875 0.375 0.37 19.1 5.76 1.85 
After sorting:  -12.875 -12.39 0.37 0.375 1.85 5.76 7.25 7.29 19.1 
Before sorting: q c z r b a s j m w f p g f 
After sorting:  a b c f f g j m p q r s w z 
Before sorting: Arvid Thirl Loki Athena Nimrod Zima Kirin 
After sorting:  Arvid Athena Kirin Loki Nimrod Thirl Zima 
Irruption answered 7/3 at 16:0 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.