Why do we use Insertion Sort in the Bucket Sort?
Asked Answered
H

2

5

Bucket sort is a linear-time sort.

Why do we use insertion sort in it? We know that insertion sort takes O(n2) time. Why can we not use any linear sort inside it? As we see, when in each bucket we use insertion sort O(n2). How is bucket sort's total complexity O(n)? Why do we not use a O(nlogn) sort such as merge sort or quick sort?

Hidie answered 29/10, 2015 at 2:51 Comment(0)
S
6

Bucket sort with insertion sort in the buckets is only reasonable when we can expect that there will be only a few items in each bucket. When there are only a few items, insertion sort is fine.

In real life, this doesn't happen too often. Usually its when we expect that data to be uniformly distributed because we're sorting into hash order or something like that.

Bucket sort is most commonly used when it's the entire sort -- i.e., the buckets don't need to be sorted at all and you can just append each item into the bucket list.

Sometimes we do a top-down radix sort, which is like bucket sorting and then bucket sorting each bucket. In combination with keeping a bit-mask of non-emtpy buckets, this can be a very fast way to sort when the sort keys are 32-bit integers.

You can also do a bottom-up radix sort by repeatedly bucket sorting on a different range of bits, and just appending items to each bucket. This kind of bucket sort is stable, so when you're bucket sorting by high bits, ties are broken by the previous ordering, which you got by sorting on the lower bits.

Slipslop answered 29/10, 2015 at 3:48 Comment(0)
H
1

insertion sort takes O(n2) time.

That's not the whole story. For partially-sorted arrays, insertion sort performs well, it has linear time complexity.

An array where each entry is not far from its final position is a typical example of partially-sorted array. That's the case here for Bucket sort.

Hutchison answered 29/10, 2015 at 3:4 Comment(2)
why do we not use any linear sort (e.g, counting, radix, etc) inside the bucket sort?Hidie
How we are sure that they are partially sorted and not far away from each other? If we have arrays size n and number of buckets m. Then in each bucket there will be n/m elements on average. They could end up in the bucket in descending order like this (10, 9, 8, 7, 6), while we need final result in ascending order. So to sort each bucket it will require n/m^2. So overral it will be (n/m)^2 * m which is n^2. Why bucket sort is linear then?Footstone

© 2022 - 2024 — McMap. All rights reserved.