Golang custom sort is faster than native sort
Asked Answered
P

2

10

I was just playing around with sorting in golang and I found a qsort function on stackoverflow. It seems to run about twice as fast as the native sort function in golang. I've tried it with different input sizes and tested that it works.

Could anyone explain why this happens?

Here is the code you can test it on your pc:

package main

import (
    "fmt"
    "math/rand"
    "sort"
    "time"
)

func qsort(a []int) []int {
    if len(a) < 2 {
        return a
    }

    left, right := 0, len(a)-1

    // Pick a pivot
    pivotIndex := rand.Int() % len(a)

    // Move the pivot to the right
    a[pivotIndex], a[right] = a[right], a[pivotIndex]

    // Pile elements smaller than the pivot on the left
    for i := range a {
        if a[i] < a[right] {
            a[i], a[left] = a[left], a[i]
            left++
        }
    }

    // Place the pivot after the last smaller element
    a[left], a[right] = a[right], a[left]

    // Go down the rabbit hole
    qsort(a[:left])
    qsort(a[left+1:])

    return a
}

func main() {
    // Create an array with random integers
    rand.Seed(30)
    size := 1000000
    array1 := make([]int, size)
    start := time.Now()

    for i, _ := range array1 {
        array1[i] = rand.Int()
    }

    fmt.Println("Creating array with ", size, " elements...")
    fmt.Println("--- ", time.Since(start), " ---")
    // Create a copy of the unsorted array
    array2 := make([]int, size)
    copy(array2, array1)

    // Short using native function
    start = time.Now()
    sort.Ints(array1)

    fmt.Println("Sorting with the native sort...")
    fmt.Println("--- ", time.Since(start), " ---")

    // Sort using custom qsort
    start = time.Now()
    qsort(array2)

    fmt.Println("Sorting with custom qsort...")
    fmt.Println("--- ", time.Since(start), " ---")

}
Pooler answered 24/4, 2014 at 17:59 Comment(3)
Does the builtin use qsort? qsort can be little faster, but it can also be incredibly slower (e.g. when sorting an array that's already sorted or almost sorted already (which is quite common in practice)). qsort's worst case is O(N^2), but it's O(N log N) for many other sorts. See this post for a similar experiment in Perl.Pickford
You should calculate time delta before println, and then print it, because maybe println interferes with your time result.Sanguinary
I'm writing a real benchmark right now to try and answer. The first thing that strikes me is that sort uses sort.Interface and thus has to call methods a lot of places where you use builtins.Cronk
C
14

The difference seems to largely be due to the fact that your Quicksort uses builtins. It slices and uses len. Keep in mind that sort.Sort takes in a sort.Interface. So every time you call len it calls slice.Len and every time you do array[i],array[j] = array[j],array[i] it has to call Swap(i,j).

I wrote a comparable version that works on an arbitrary qsort.Interface:

func Qsort(a Interface, prng *rand.Rand) Interface {
    if a.Len() < 2 {
        return a
    }

    left, right := 0, a.Len()-1

    // Pick a pivot
    pivotIndex := prng.Int() % a.Len()
    // Move the pivot to the right
    a.Swap(pivotIndex, right)

    // Pile elements smaller than the pivot on the left
    for i := 0; i < a.Len(); i++ {
        if a.Less(i, right) {

            a.Swap(i, left)
            left++
        }
    }

    // Place the pivot after the last smaller element
    a.Swap(left, right)

    // Go down the rabbit hole
    leftSide, rightSide := a.Partition(left)
    Qsort(leftSide, prng)
    Qsort(rightSide, prng)

    return a
}

Then I used Go's benchmark functionality (which you should always use for Benchmarks where possible).

For reference and transparency, qsort.Interface is defined as:

type Interface interface {
    sort.Interface
    // Partition returns slice[:i] and slice[i+1:]
    // These should references the original memory
    // since this does an in-place sort
    Partition(i int) (left Interface, right Interface)
}

The actual IntSlice implementation for qsort is:

type IntSlice []int

func (is IntSlice) Less(i, j int) bool {
    return is[i] < is[j]
}

func (is IntSlice) Swap(i, j int) {
    is[i], is[j] = is[j], is[i]
}

func (is IntSlice) Len() int {
    return len(is)
}

func (is IntSlice) Partition(i int) (left Interface, right Interface) {
    return IntSlice(is[:i]), IntSlice(is[i+1:])
}

Finally, here's the qsort_test.go file:

package qsort_test

import (
    "math/rand"
    "qsort"
    "sort"
    "testing"
    "time"
)

const size int = 1000000

var list = make([]int, size)
var prng = rand.New(rand.NewSource(int64(time.Now().Nanosecond())))

func BenchmarkQsort(b *testing.B) {
    for n := 0; n < b.N; n++ {
        b.StopTimer()
        for i := range list {
            list[i] = prng.Int()
        }
        b.StartTimer()

        qsort.Qsort(qsort.IntSlice(list), prng)
    }
}

func BenchmarkNativeQsort(b *testing.B) {
    for n := 0; n < b.N; n++ {
        b.StopTimer()
        for i := range list {
            list[i] = prng.Int()
        }
        b.StartTimer()

        qsort.NativeQsort(list, prng)
    }
}

func BenchmarkSort(b *testing.B) {
    for n := 0; n < b.N; n++ {
        b.StopTimer()
        for i := range list {
            list[i] = prng.Int()
        }
        b.StartTimer()

        sort.Sort(sort.IntSlice(list))
    }
}

The results (formatting mine):

PASS

BenchmarkQsort             5     513629360 ns/op
BenchmarkNativeQsort       10    160609180 ns/op
BenchmarkSort              5     292416760 ns/op

As you can see, the standard library's sort massively outperforms your qsort on average with random data. NativeQsort refers to the qsort functions you posted in your actual question, and it outperforms both. The only thing that's changed between that and Qsort is that I swapped the builtin functions for qsort.Interface. It follows, then, that genericity is likely the reason one is slower than the other.

Edit: There aren't many samples because of how expensive sorting is, so here are the results with -benchtime 10s just for slightly more representative results.

BenchmarkQsort                50     524389994 ns/op
BenchmarkNativeQsort         100     161199217 ns/op
BenchmarkSort                 50     302037284 ns/op
Cronk answered 24/4, 2014 at 19:49 Comment(5)
sort.Sort is IMHO not a variant of mergesort but a quicksort/insertion sort combination.Potsherd
@Potsherd is that so? I recall the documentation saying it was Mergesort. It's entirely possible I'm mistaken.Cronk
@Potsherd I must be off my rocker. I just looked at the source and you're right. I deleted the stuff about Mergesort in the answer.Cronk
Quicksort. See tip.golang.org/src/pkg/sort/sort.go?s=4433:4458#L182. Stable sorting is done by mergesort.Potsherd
The golang standard library uses Intro sort which is a hybrid sort switching from quick to heap sort based on a condition.Scarborough
A
1

It seems to run about twice as fast as the native sort function in golang

Note that the native sort for slice will evolve with Go 1.19 (Q4 2022).
See:

sort: use pdqsort

  • Across all benchmarks, pdqsort is never significantly slower than the previous algorithm.
  • In common patterns, pdqsort is often faster (i.e. 10x faster in sorted slices).

The pdqsort is described at Pattern-defeating Quicksort (pdf) by Orson R. L. Peters.

(extract)
Pattern-defeating quicksort is often the best choice of algorithm overall for small to medium input sizes or data type sizes.
It and other quicksort variants suffer from datasets that are too large to fit in cache, where is4o shines.
The latter algorithm however suffers from bad performance on smaller sizes, future research could perhaps combine the best of these two algorithms

This CL is inspired by both C++ implementation and Rust implementation.

Aleris answered 21/4, 2022 at 16:32 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.