Algorithms: Hybrid MergeSort and InsertionSort Execution Time

Good day SO community,

I am a CS student currently performing an experiment combining MergeSort and InsertionSort. It is understood that, below a certain threshold S, InsertionSort has a quicker execution time than MergeSort. Hence, by combining the two sorting algorithms, the total runtime should be reduced.

However, after running the experiment many times with a sample size of 1000 and varying values of S, the results do not give a definitive answer each time. Here is a picture of the better results obtained (note that about half of the time the result is not as definitive):

[Plot: Execution time for varying sizes of S, with sample size of 1000. Original MergeSort vs Hybrid MergeSort]

Now, trying the same algorithm code with a sample size of 3500:

[Plot: Execution time for varying sizes of S, with sample size of 3500]

Finally, trying the same algorithm code with a sample size of 500,000 (note that the y-axis is in milliseconds):

[Plot: Execution time for varying sizes of S, with sample size of 500,000; y-axis in milliseconds]

Logically, the Hybrid MergeSort should be faster when S <= 10, since InsertionSort does not have the recursive overhead of MergeSort. However, the results of my mini experiment say otherwise.

Currently, these are the Time Complexities taught to me:

MergeSort: O(n log n)

InsertionSort:

  • Best Case: θ(n)
  • Worst Case: θ(n^2)

Finally, I have found an online source, https://cs.stackexchange.com/questions/68179/combining-merge-sort-and-insertion-sort, which states that:

Hybrid MergeInsertionSort:

  • Best Case: θ(n + n log (n/x))
  • Worst Case: θ(nx + n log (n/x))
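
As a rough sketch of where these bounds come from (my own reasoning, assuming the hybrid sort effectively leaves about n/x runs of size x to InsertionSort and then merges them over log(n/x) levels):

    InsertionSort phase: n/x runs of size x        ->  (n/x)·θ(x) = θ(n) best,  (n/x)·θ(x^2) = θ(nx) worst
    Merge phase:         log2(n/x) levels, θ(n) work per level  ->  θ(n log (n/x))
    Total:               θ(n + n log (n/x)) best,  θ(nx + n log (n/x)) worst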

I would like to ask if there are results in the CS community that show definitive proof that a Hybrid MergeSort algorithm will work better than a normal MergeSort algorithm below a certain threshold, S, and if so, why?

Thank you so much SO community, it might be a trivial question, but it really will clarify many questions that I currently have regarding Time Complexities and stuff :)

Note: I am using Java for the coding of the algorithm, and the runtime could be affected by the way Java stores data in memory.

Code in Java:

    // Hybrid sort of arr[n..m] (inclusive): recurse as in MergeSort, but hand
    // sub-arrays of size <= s to InsertionSort. Returns the number of comparisons.
    // (comparison2 is a class-level counter declared elsewhere in the class.)
    public static int mergeSort2(int n, int m, int s, int[] arr){
        int mid = (n+m)/2, right=0, left=0;
        if(m-n<=s)
            return insertSort(arr,n,m);
        else
        {
            right = mergeSort2(n, mid, s, arr);
            left = mergeSort2(mid+1, m, s, arr);
            return right+left+merge(n,m,s,arr);
        }
    }

    // InsertionSort of arr[n..m] (inclusive); returns the number of comparisons.
    public static int insertSort(int[] arr, int n, int m){
        int temp, comp=0;
        for(int i=n+1; i<=m; i++){
            for(int j=i; j>n; j--){
                comp++;
                comparison2++;
                if(arr[j]<arr[j-1]){
                    temp = arr[j];
                    arr[j] = arr[j-1];
                    arr[j-1] = temp;
                }
                else
                    break;
            }
        }
        return comp;
    }

    // Shift arr[start..m-1] one position to the right, freeing index start.
    public static void shiftArr(int start, int m, int[] arr){
        for(int i=m; i>start; i--)
            arr[i] = arr[i-1];
    }

    // In-place merge of arr[n..mid] and arr[mid+1..m] by shifting; returns comparisons.
    public static int merge(int n, int m, int s, int[] arr){
        int comp=0;
        if(m-n<=s)
            return 0;
        int mid = (n+m)/2;
        int temp, i=n, j=mid+1;
        while(i<=mid && j<=m)
        {
            comp++;
            comparison2++;

            if(arr[i] >= arr[j])
            {
                if(i==mid++ && j==m && (arr[i]==arr[j]))
                    break;
                temp = arr[j];
                shiftArr(i,j++,arr);
                arr[i] = temp;
                if(arr[i+1]==arr[i]){
                    i++;
                }
            }
            i++;
        }
        return comp;
    }
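
For reference, this is roughly how the sorts are being timed (a rough sketch rather than my exact harness; the fixed seed and repeated runs are just there to reduce input and JIT noise):

    // Rough timing sketch (illustrative; not the exact harness behind the graphs above).
    public static void timeHybrid(int n, int s) {
        int[] data = new int[n];
        java.util.Random rnd = new java.util.Random(42);   // fixed seed: same input every run
        for (int i = 0; i < n; i++)
            data[i] = rnd.nextInt();

        for (int run = 0; run < 5; run++) {                 // repeat; early runs mostly warm up the JIT
            int[] copy = java.util.Arrays.copyOf(data, n);
            long start = System.nanoTime();
            mergeSort2(0, copy.length - 1, s, copy);        // hybrid sort of copy[0..n-1] with threshold s
            long elapsed = System.nanoTime() - start;
            System.out.println("run " + run + ": " + (elapsed / 1_000_000.0) + " ms");
        }
    }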
Counterpoint answered 29/9, 2017 at 20:46 Comment(5)
Interesting work! I won't speak to whether this is a good question for SO, but I recommend also posting it on Computer Science Stack Exchange for more visibility. – Biologist
Hi @Tyler, yes will do, it says I have to wait another 20 minutes to post it on CS Stack Exchange :) – Counterpoint
3500 elements is not large enough to show the asymptotic runtime. Also, please include your code; Java makes it easy to create flawed benchmarks. – Curative
Hi @ThomasJungblut! I have included the code, but unfortunately I'm new to SO and do not know how to create a fiddle. Currently trying to churn out the results with a sample size of 500,000 :) – Counterpoint
Hey @Tyler, please don't encourage people to cross-post on multiple SE sites. Each community should have an honest shot at answering without anybody's time being wasted. Thank you! – Umbelliferous

The example code isn't a conventional merge sort: the merge function shifts elements within the array instead of merging runs between the original array and a temporary working array and back.
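
For contrast, a conventional merge step copies the two runs into a temporary array and merges back; a minimal sketch (mine, for illustration only, using the same n..m inclusive index convention as the question's code):

    // Conventional merge of arr[n..mid] and arr[mid+1..m] via a temporary array.
    // (Sketch for comparison only; not the code from the question.)
    static void conventionalMerge(int[] arr, int n, int mid, int m) {
        int[] tmp = new int[m - n + 1];
        int i = n, j = mid + 1, k = 0;
        while (i <= mid && j <= m)                     // take the smaller head element
            tmp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
        while (i <= mid) tmp[k++] = arr[i++];          // leftover of the left run
        while (j <= m)   tmp[k++] = arr[j++];          // leftover of the right run
        System.arraycopy(tmp, 0, arr, n, tmp.length);  // copy the merged run back
    }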

I tested top down and bottom up merge sorts and both take about 42 ms == 0.042 seconds to sort 500,000 32 bit integers, versus the apparent results in the graph which are 1000 times slower at about 42 seconds instead of 42 ms. I also tested with 10,000,000 integers and it takes a bit over 1 second to sort.

In the past, using C++, I compared a bottom up merge sort with a hybrid bottom up merge / insertion sort, and for 16 million (2^24 == 16,777,216) 32 bit integers, the hybrid sort was about 8% faster with S == 16. S == 64 was slightly slower than S == 16. Visual Studio std::stable_sort is a variation of bottom up merge sort (the temp array is 1/2 the size of the original array) and insertion sort, and uses S == 32.

For small arrays, insertion sort is quicker than merge sort, due to a combination of cache locality and the smaller number of instructions needed to sort a small array with insertion sort. For pseudo-random data and S == 16 to 64, insertion sort was about twice as fast as merge sort.

The relative gain diminishes as the array size increases. Considering the effect on bottom up merge sort, with S == 16, only 4 merge passes are optimized. In my test case with 2^24 == 16,777,216 elements, that's 4/24 = 1/6 ~= 16.7% of the number of passes, resulting in about an 8% improvement (so the insertion sort is about twice as fast as merge sort for those 4 passes). The total times were about 1.52 seconds for the merge only sort, and about 1.40 seconds for the hybrid sort, a 0.12 second gain on a process that only takes 1.52 seconds. For a top down merge sort, with S == 16, the 4 deepest levels of recursion would be optimized.
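
Making the arithmetic explicit (a back-of-the-envelope check of those numbers):

    passes handled by insertion sort: log2(16) = 4  of  log2(2^24) = 24  ->  4/24 ≈ 16.7% of the merge work
    insertion sort ≈ 2x faster on those passes  ->  overall saving ≈ 16.7% / 2 ≈ 8.3%
    measured: 1.52 s - 1.40 s = 0.12 s, and 0.12 / 1.52 ≈ 7.9%, consistent with ~8%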

Update - Example Java code for a hybrid in-place merge sort / insertion sort with O(n log(n)) time complexity. (Note - auxiliary storage is still consumed on the stack due to recursion.) The in-place part is accomplished during merge steps by swapping the data in the area merged into with the data in the area merged from. This is not a stable sort (the order of equal elements is not preserved, due to the swapping during merge steps). Sorting 500,000 integers takes about 1/8th of a second, so I increased this to 16 million (2^24 == 16777216) integers, which takes a bit over 4 seconds. Without the insertion sort, the sort takes about 4.524 seconds, and with the insertion sort with S == 64, the sort takes about 4.150 seconds, about an 8.8% gain. With essentially the same code in C, the improvement was less: from 2.88 seconds to 2.75 seconds, about a 4.5% gain.

package msortih;
import java.util.Random;

public class msortih {

    static final int S = 64;    // use insertion sort if size <= S

    static void swap(int[] a, int i, int j) {
        int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
    }

    // a[w:] = merged a[i:m]+a[j:n]
    // a[i:] = reordered a[w:]
    static void wmerge(int[] a, int i, int m, int j, int n, int w) {
        while (i < m && j < n)
            swap(a, w++, a[i] < a[j] ? i++ : j++);
        while (i < m)
            swap(a, w++, i++);
        while (j < n)
            swap(a, w++, j++);
    }  

    // a[w:]  = sorted    a[b:e]
    // a[b:e] = reordered a[w:]
    static void wsort(int[] a, int b, int e, int w) {
        int m;
        if (e - b > 1) {
            m = b + (e - b) / 2;
            imsort(a, b, m);
            imsort(a, m, e);
            wmerge(a, b, m, m, e, w);
        }
        else
            while (b < e)
                swap(a, b++, w++);
    }

    // inplace merge sort a[b:e]
    static void imsort(int[] a, int b, int e) {
        int m, n, w, x;
        int t;
        // if <= S elements, use insertion sort
        if (e - b <= S){
            for(n = b+1; n < e; n++){
                t = a[n];
                m = n-1;
                while(m >= b && a[m] > t){
                    a[m+1] = a[m];
                    m--;
                }
                a[m+1] = t;
            }
            return;
        }
        if (e - b > 1) {
            // split a[b:e]
            m = b + (e - b) / 2;
            w = b + e - m;
            // wsort -> a[w:e] = sorted    a[b:m]
            //          a[b:m] = reordered a[w:e]
            wsort(a, b, m, w);
            while (w - b > 2) {
                // split a[b:w], w = new mid point
                n = w;
                w = b + (n - b + 1) / 2;
                x = b + n - w;
                // wsort -> a[b:x] = sorted    a[w:n]
                //          a[w:n] = reordered a[b:x]
                wsort(a, w, n, b);
                // wmerge -> a[w:e] = merged    a[b:x]+a[n:e]
                //           a[b:x] = reordered a[w:n]
                wmerge(a, b, x, n, e, w);
            }
            // insert a[b:w] into a[b:e] using left shift
            for (n = w; n > b; --n) {
                t = a[n-1];
                for (m = n; m < e && a[m] < t; ++m)
                    a[m-1] = a[m];
                a[m-1] = t;
            }
        }
    }

    public static void main(String[] args) {
        int[] a = new int[16*1024*1024];
        Random r = new Random(0);
        for(int i = 0; i < a.length; i++)
            a[i] = r.nextInt();
        long bgn, end;
        bgn = System.currentTimeMillis();
        imsort(a, 0, a.length);
        end = System.currentTimeMillis();
        for(int i = 1; i < a.length; i++){
            if(a[i-1] > a[i]){
                System.out.println("failed");
                break;
            }
        }
        System.out.println("milliseconds " + (end-bgn));
    }
}
Disillusion answered 30/9, 2017 at 0:36 Comment(8)
Hi! Thank you for the clarification. The algorithm that I've been taught tries to implicitly tackle the space complexity issue by not using auxiliary storage. I've noticed this too, which makes the algorithm different from the original mergesort. This could be why I was unable to replicate the results. However, the experiment that I am performing 'requires' the use of the shifting method, which could be why my results differ from the standard hybrid sort. Thank you again for the help! – Counterpoint
@BenjiTan - the example code in the question is using auxiliary storage on the stack, log2(n/s) stack frames due to recursion. A bottom up merge sort would avoid this. – Disillusion
Will include that in my conclusion when I submit the report :) Yeah, due to the smaller number of recursive calls at smaller sizes, hence the reduced log2(n/s) part! However, the problem I now have will be to give a conclusion as to whether this version of mergesort vs hybridsort (with shifting) gives any reduced runtime. Theoretically, the hybridsort should still be faster than mergesort; however, the shifting could be taking up time as well (as shifting is only done during the insertionsort portion of the hybridsort)... – Counterpoint
@BenjiTan - You may want to take a look at this answer for an example of in place merge sort. It's not clear to me if the merge sort versus hybrid sort are supposed to be comparing optimized sorts with the given constraints, or if the example code in the question is to be compared. The insertion sort can be improved by using temp = ... before the loop to shift, and then ... = temp after the loop to shift. – Disillusion
@BenjiTan - I updated my answer to show an example of the in place merge sort algorithm from this answer, modified to be a hybrid in place merge sort / insertion sort. It's recursive, so it is also using auxiliary storage on the stack. In addition, it's not a "stable" sort. – Disillusion
Hi! Thank you for the update to your answer! I have tested with swap and it is indeed faster! However, my findings with shift (instead of swap) yielded no helpful results (around 1-3% improvement). Thank you for the useful insight :) Edit: my timings are very much higher due to the fact that my insertion sort uses shifting instead of swapping, which could be the reason why the results are not obvious. So much for project requirements though.. haha! – Counterpoint
@BenjiTan - The main change I made to the core code from the linked-to answer was to switch the insertion part of that code to shift using one temp variable instead of swapping pairs to accomplish the shift. In the case of C / C++, the result was about a 1.5% to 2% gain in performance for larger arrays. That insertion part is not a full insertion sort; it only moves two elements into their proper sorted locations. This is my example code that follows the comment // insert a[b:w] into a[b:e] using left shift. The other change was to switch to a full insertion sort for size <= S. – Disillusion
@BenjiTan - There is a stable and still faster in-place version of merge sort in this Q&A, with links to Java and C examples of block merge sort. It is a bit more complicated. – Disillusion
