QuickSort and MergeSort performance on sequential data that fits in memory vs. slow-to-access sequential data on disk
The following quote is from the "Comparison with other sort algorithms" section of the Wikipedia Merge Sort page:

On typical modern architectures, efficient quicksort implementations generally outperform mergesort for sorting RAM-based arrays.[citation needed] On the other hand, merge sort is a stable sort and is more efficient at handling slow-to-access sequential media.

My questions:

  1. Why does Quicksort outperform Mergesort when the data to be sorted all fits into memory? If all the data needed is cached or in memory, wouldn't access be fast for both Quicksort and Mergesort?

  2. Why is Mergesort more efficient at handling slow-to-access sequential data (such as from disk in the case where the data to be sorted can't all fit into memory)?

  3. (moved from my comments below) Consider an array arr of n primitive elements (the data is sequential). The pair of elements MergeSort has to read and compare is arr[0] and arr[n/2] (this happens in the final merge). The pair of elements QuickSort has to read and compare can be as far apart as arr[0] and arr[n-1] (this happens in the first partition, assuming we swap the randomly chosen pivot to the front). We know data is read in blocks and loaded into cache, or from disk into memory (correct me if I am wrong), so isn't there a better chance that the needed data loads together in one block when using MergeSort? It seems to me MergeSort should always have the upper hand because it tends to compare elements that are closer together. I know this is false (see graph below), because QuickSort is obviously faster. I know MergeSort is not in place and requires extra memory, and that likely slows things down. Other than that, what pieces am I missing in my analysis?

[image: benchmark graph comparing QuickSort and MergeSort running times]

images are from Princeton CS MergeSort and QuickSort slides


My Motive:

I want to understand the concepts above because they are among the main reasons MergeSort is preferred when sorting a LinkedList (or other non-sequential data) while QuickSort is preferred when sorting an array (or other sequential data), and why MergeSort is used to sort Objects in Java while QuickSort is used to sort primitive types in Java.

update: the Java 7 API actually uses TimSort to sort Objects, which is a hybrid of MergeSort and InsertionSort, and Dual-Pivot QuickSort for primitives. These changes were introduced in Java SE 7 and have to do with the stability of the sorting algorithms. Why does Java's Arrays.sort method use two different sorting algorithms for different types?


Edit:

I will appreciate an answer that addresses the following aspects:

  • I know the two sorting algorithms differ in the number of moves, reads, and comparisons. If those are the reasons for the behaviors listed in my questions (as I suspect), then a thorough explanation of how the steps of each sorting algorithm give it an advantage or disadvantage when seeking data from disk or memory would be much appreciated.
  • Examples are welcome. I learn better with examples.

note: if you are reading @rcgldr's answer, check out our conversation in the chat room; it has lots of good explanations and details. https://chat.stackoverflow.com/rooms/161554/discussion-between-rcgldr-and-oliver-koo

Courtyard answered 19/12, 2017 at 0:8 Comment(1)
On a typical PC, quick sort is not going to be 3 times as fast as merge sort; more like 10% to 20% faster, depending on the checks in quick sort to avoid worst-case behaviors.Bulla

The main difference is that merge sort does more moves but fewer compares than quick sort. Even when sorting an array of native types, quick sort is only around 15% faster, at least when I've tested it on large arrays of pseudo-random 64-bit unsigned integers, which should be quick sort's best case, on my system (Intel 3770K 3.5 GHz, Windows 7 Pro 64-bit, Visual Studio 2015, sorting 16 million pseudo-random 64-bit unsigned integers: 1.32 seconds for quick sort, 1.55 seconds for merge sort, and 1.32/1.55 ≈ 0.85, so quick sort was about 15% faster than merge sort). My test used a quick sort with no checks to avoid worst-case O(n^2) time or O(n) space. As checks are added to quick sort to reduce or prevent worst-case behavior (such as falling back to heap sort if recursion becomes too deep), the speed advantage shrinks to less than 10% (which is the difference I get between VS2015's std::sort, a modified quick sort, and std::stable_sort, a modified merge sort).
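To make the moves-versus-compares trade-off concrete, here is a minimal sketch (not the benchmark above; the class name, counter names, and the simple last-element pivot choice are all illustrative assumptions) that counts both operations for a textbook quick sort and a top-down merge sort on the same random data:

```java
import java.util.Random;

// Sketch: count compares and moves for a textbook quick sort (Lomuto
// partition, last element as pivot) vs. top-down merge sort on the same
// pseudo-random data. On random input, quick sort typically shows more
// compares, merge sort more moves (it copies to a temp array and back).
public class CompareMoveCount {
    static long qCompares, qMoves, mCompares, mMoves;

    static void quickSort(long[] a, int lo, int hi) {
        if (lo >= hi) return;
        long pivot = a[hi];              // simplest pivot choice, for illustration
        int i = lo;
        for (int j = lo; j < hi; j++) {
            qCompares++;
            if (a[j] < pivot) { long t = a[i]; a[i] = a[j]; a[j] = t; qMoves += 2; i++; }
        }
        long t = a[i]; a[i] = a[hi]; a[hi] = t; qMoves += 2;
        quickSort(a, lo, i - 1);
        quickSort(a, i + 1, hi);
    }

    static void mergeSort(long[] a, long[] tmp, int lo, int hi) {
        if (hi - lo < 2) return;
        int mid = (lo + hi) / 2;
        mergeSort(a, tmp, lo, mid);
        mergeSort(a, tmp, mid, hi);
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi) {      // one compare per output element
            mCompares++;
            tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
            mMoves++;
        }
        while (i < mid) { tmp[k++] = a[i++]; mMoves++; }
        while (j < hi)  { tmp[k++] = a[j++]; mMoves++; }
        for (int x = lo; x < hi; x++) { a[x] = tmp[x]; mMoves++; }  // copy back
    }

    public static void main(String[] args) {
        int n = 1 << 16;
        Random rnd = new Random(42);
        long[] a = new long[n];
        for (int i = 0; i < n; i++) a[i] = rnd.nextLong();
        long[] b = a.clone();
        quickSort(a, 0, n - 1);
        mergeSort(b, new long[n], 0, n);
        System.out.println("quick: " + qCompares + " compares, " + qMoves + " moves");
        System.out.println("merge: " + mCompares + " compares, " + mMoves + " moves");
    }
}
```

On random data, the expected counts are roughly 1.39·n·log2(n) compares for quick sort versus at most n·log2(n) for merge sort, and about 2·n·log2(n) moves for merge sort versus far fewer swaps for quick sort, matching the trade-off described above.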

If sorting "strings", it's more likely that what is actually being sorted is an array of pointers (or references) to those strings. This is where merge sort is faster, because its moves involve only pointers, while the compares involve a level of indirection plus a string comparison.

The main reason for choosing quick sort over merge sort is not speed but space. Merge sort normally uses a second array the same size as the original. Quick sort and top-down merge sort also need log(n) stack frames for recursion; for quick sort, stack space is limited to log(n) frames by recursing only on the smaller partition and looping back to handle the larger partition.
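The recurse-on-the-smaller-partition technique can be sketched as follows (the class name and the simple Lomuto partition are illustrative assumptions, not any particular library's implementation):

```java
// Sketch of the stack-bounding technique described above: recurse only
// into the smaller partition, then loop on the larger one, so recursion
// depth is at most log2(n) even on adversarial input.
public class BoundedStackQuickSort {
    public static void sort(int[] a, int lo, int hi) {
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (p - lo < hi - p) {
                sort(a, lo, p - 1);   // smaller left side: at most half the range
                lo = p + 1;           // loop continues with the larger right side
            } else {
                sort(a, p + 1, hi);   // smaller right side
                hi = p - 1;           // loop continues with the larger left side
            }
        }
    }

    // Lomuto partition using the last element as pivot (illustrative choice).
    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++)
            if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;
        return i;
    }
}
```

Usage: `BoundedStackQuickSort.sort(arr, 0, arr.length - 1);`. Because each recursive call handles at most half the current range, at most log2(n) frames are ever live at once.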

In terms of cache issues, most recent processors have 4- or 8-way associative caches. For merge sort, during a merge, the two input runs will end up in 2 of the cache lines and the one output run in a 3rd cache line. Quick sort scans the data before doing swaps, so the scanned data will be in cache, although in separate lines if the two elements being compared or swapped are located far enough apart.


For an external sort, some variation of bottom-up merge sort is used. This is because merge sort's merge operations are sequential (the only random access occurs when starting up a new pair of runs), which is fast in the case of hard drives, or, in legacy times, tape drives (a minimum of 3 tape drives is needed). Each read or write can cover a very large block of data, reducing the average access time per element in the case of a hard drive, since a large number of elements are read or written with each I/O.

It should also be noted that most merge sorts in libraries are also some variation of bottom-up merge sort; top-down merge sort is mostly a teaching implementation.
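For illustration, here is an in-memory sketch of the bottom-up shape (class name assumed; an actual external sort would read and write the runs in large disk blocks rather than a RAM array): runs of width 1, 2, 4, ... are merged in purely sequential passes, with no recursion at all.

```java
// In-memory sketch of bottom-up merge sort: iterative passes over runs of
// doubling width; every access within a pass is sequential, which is the
// property that makes this shape suit slow sequential media.
public class BottomUpMergeSort {
    public static void sort(long[] a) {
        int n = a.length;
        long[] tmp = new long[n];
        for (int width = 1; width < n; width *= 2) {
            for (int lo = 0; lo < n; lo += 2 * width) {
                int mid = Math.min(lo + width, n);
                int hi = Math.min(lo + 2 * width, n);
                merge(a, tmp, lo, mid, hi);
            }
            // Copying back each pass keeps the sketch simple; a tuned version
            // would alternate the roles of a and tmp instead.
            System.arraycopy(tmp, 0, a, 0, n);
        }
    }

    private static void merge(long[] a, long[] tmp, int lo, int mid, int hi) {
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi) tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        while (i < mid) tmp[k++] = a[i++];
        while (j < hi)  tmp[k++] = a[j++];
    }
}
```

Note that both the reads (two runs scanned front to back) and the writes (output appended in order) are sequential, which is exactly the access pattern the answer says a hard drive or tape drive handles well.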


If sorting an array of native types on a processor with 16 registers, such as an x86 in 64-bit mode, with 8 of the registers used as start and end pointers (or references) for 4 runs, a 4-way merge sort is often about the same speed as, or a bit faster than, quick sort, assuming the compiler optimizes the pointers or references into registers. It's a similar trade-off: like quick sort, a 4-way merge sort does more compares (about 1.5×) but fewer moves (about 0.5×) than a traditional 2-way merge sort.
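A minimal sketch of one 4-way merge step (class and method names assumed; a tuned version would keep the four run pointers in registers and use a small comparison tree rather than this simple scan of the four heads):

```java
// Illustrative 4-way merge of four sorted runs into one output array.
// Each output element comes from whichever run currently has the smallest
// head, so each pass consumes four runs at once: half the pass-level moves
// of 2-way merging, at the price of more compares per element.
public class FourWayMerge {
    public static long[] merge4(long[] r0, long[] r1, long[] r2, long[] r3) {
        long[][] runs = {r0, r1, r2, r3};
        int[] idx = new int[4];          // current head position in each run
        int total = r0.length + r1.length + r2.length + r3.length;
        long[] out = new long[total];
        for (int k = 0; k < total; k++) {
            int best = -1;               // run with the smallest unconsumed head
            for (int r = 0; r < 4; r++) {
                if (idx[r] < runs[r].length &&
                    (best < 0 || runs[r][idx[r]] < runs[best][idx[best]])) {
                    best = r;
                }
            }
            out[k] = runs[best][idx[best]++];
        }
        return out;
    }
}
```

Because a 4-way merge halves the number of merge passes, each element is moved half as many times overall as in a 2-way merge sort, which is where the 0.5× moves figure above comes from.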


It should be noted that these sorts are CPU bound, not memory bound. I made a multi-threaded version of a bottom-up merge sort, and with 4 threads the sort was 3 times faster. Link to Windows example code using 4 threads:

https://codereview.stackexchange.com/questions/148025/multithreaded-bottom-up-merge-sort

Bulla answered 19/12, 2017 at 2:28 Comment(15)
But moves still require reads/writes, so skipping a compare is just one step less, right? Why does QuickSort's fewer-moves-more-compares behavior work better for fast-to-access sequential data, while MergeSort's more-moves-fewer-compares behavior works better for slow-to-access sequential data?Courtyard
@Courtyard - if the elements of an array fit in a register, and assuming an optimizing compiler, then after reading each element for a compare, the element is in a register, so that a move or swap after a compare only needs to do the write(s) from register(s). If the element doesn't fit in a register, it's probably still in cache after a compare, so the write(s) are from the cached copy, and don't require reading from main memory for a move or swap.Bulla
ahh @Bulla I see. So the cost of a move and a compare is based on the type of data and how the data is stored. For fast-to-access sequential data the cost of moves is less than the cost of compares; for slow-to-access sequential data the cost of a move is more than the cost of a compare. Am I right? How fast we can access the data influences the cost of moves and compares, which in turn impacts the speed of the sorting algorithms when the two algorithms do different numbers of moves and compares. Am I interpreting this correctly from a high level?Courtyard
@Courtyard - I deleted my prior comments as they are no longer needed, you may want to do the same. The cost of move or swap is only less than a compare when entire elements are not moved or swapped, such as the case when sorting an array of indices or pointers to elements as opposed to sorting an array of elements directly.Bulla
Right, I agree: when comparison is expensive we should use MergeSort. However, I am still trying to wrap my head around this: "because merge sort merge operations are sequential (the only random access occurs when starting up a new pair of runs), which is fast in the case of hard drives.... Each read or write can be for very large blocks of data". Why doesn't this hold true for QuickSort? When using QuickSort the computer can still read and write large blocks of data, right? Same hardware, but why does merge sort have the upper hand? Thank you so much, by the way, for spending time helping me.Courtyard
@Bulla I understand the two sorts differ in moves and comparisons, and that the cost of moves and comparisons differs based on the type of data we are sorting. However, I still fail to understand the reason behind the sentence from the wiki: "more efficient at handling slow-to-access sequential media."Courtyard
@Courtyard - the main issue for quick sort on a hard drive is the swap oriented method of partitioning. Each swap only involves two elements, and this requires two random accesses on a hard drive to do the writes, with only one element being written per hard disk write. For merge sort, during a merge, the "output" elements are appended into a block of memory, and only when the block of memory is full is the write done, writing many elements with a single write operation.Bulla
@Courtyard - continuing with merge sort, the blocks of memory used for reading and writing can be large, 10MB to 100MB. A 2-way merge sort would use 2 blocks for reading and 1 block (perhaps twice as large) for writing. A 16-way merge sort would use 16 blocks for reading and 1 block for writing. Again, the blocks are large, so the cost of random access per element is small since many elements are read or written with each I/O. Take a look at the wiki article for the tape drive example: 4 devices, all sequential I/O. With large blocks, a hard drive can emulate multiple devices efficiently.Bulla
okok. In your scenario above, why is QuickSort writing to the hard drive while MergeSort is writing to memory? Of course it is faster to write to memory. If the data we want to swap or merge is on the hard drive, shouldn't MergeSort write to the hard drive as well? Or, the other way around, why couldn't QuickSort swap elements in memory and then write the result back to the hard drive like MergeSort?Courtyard
ok @Bulla, seems like the key difference is the swap operation vs. the merge operation. When we swap two elements we actually have to swap the values on the hard drive, right? When merging, are we not doing the same? We are constructing a new array by overwriting it with new values. Isn't the overwriting essentially the same as a swap?Courtyard
So I thought more about this during lunch, and I think I got it. Are you saying that in QuickSort we read, compare, and swap two elements at a time, whereas in MergeSort we read, compare, and write (construct) a group of elements into a subarray during the merge? So QuickSort reads/writes a small amount of data to the hard drive at a time, and MergeSort does a block at a time, so when the trip cost is high (going from memory to hard drive) MergeSort wins because it takes fewer trips? That makes sense to me. Am I explaining it right?Courtyard
Let us continue this discussion in chat.Bulla
That will be a wise idea. I have not used chat on SO before but let's give it a try.Courtyard
Thankfully we can use both quick sort and merge sort at the same time; that is, quick sort can be used as the underlying sort within merge sort.Tyrrell
