Is timsort general-purpose or Python-specific?
Asked Answered
E

7

33

Timsort is an adaptive, stable, natural mergesort. It has supernatural performance on many kinds of partially ordered arrays (less than lg(N!) comparisons needed, and as few as N-1), yet as fast as Python's previous highly tuned samplesort hybrid on random arrays.

Have you seen timsort used outside of CPython? Does it make sense?

Eterne answered 30/9, 2008 at 19:12 Comment(4)
why do you ask? without more context, your question cannot be answered.Largo
Have you noticed "Have you seen timsort used outside of CPython?" part?Eterne
i have noticed it, and it still gives us no context. what would you learn from a simple "no" as answer?Largo
'timsort' is really very specific, although it won't mean much to you unless you actually know what timsort is.Impotence
E
32

Yes, it makes quite a bit of sense to use timsort outside of CPython, in specific, or Python, in general.

There is currently an effort underway to replace Java's "modified merge sort" with timsort, and the initial results are quite positive.

Echinate answered 29/6, 2009 at 20:9 Comment(1)
Java SE 7 uses Timsort as it's sort algorithm now. See docjar.com/docs/api/java/util/Collections.html#sort(List)Scrivener
I
23

The algorithm is pretty generic, but the benefits are rather Python-specific. Unlike most sorting routines, what Python's list.sort (which is what uses timsort) cares about is avoiding unnecessary comparisons, because generally comparisons are a lot more expensive than swapping items (which is always just a set of pointer copies) or even allocating some extra memory (because it's always just an array of pointers, and the overhead is small compared to the average overhead in any Python operation.)

If you're under similar constraints, then it may be suitable. I've yet to see any other case where comparisons are really that expensive, though :-)

Impotence answered 30/9, 2008 at 20:14 Comment(3)
If comparisons are expensive, then a data-specific algorithm will usually perform better than a comparison-based one.Grammalogue
That is a good observation, and indeed probably the main reason you won't see timsort or anything close to it in the wild.Impotence
The benefits are not Python-specific. The same benefit of using Timsort in Python to sort a list of object references exists in any programming language which has pointers or object references. For instance, Java SE 7+, Android and Swift use Timsort to sort objects.Causeuse
S
6

It doesn't look particularly familiar, but "smart" mergesorts are pretty common out in the wide world of software.

As for whether it makes sense, that depends on what you're sorting, and the relative cost of comparisons vs. memory allocation. A sort that requires up to 2*N bytes of extra memory isn't going to be a good choice in a memory-constrained environment.

Scarface answered 30/9, 2008 at 20:2 Comment(3)
Are you talking about Timsort in your last sentence? It does not take 2*N bytes of extra memory.Causeuse
From the linked description in the original question:” + timsort can require a temp array containing as many as N//2 pointers, which means as many as 2*N extra bytes on 32-bit boxes.”Scarface
Thanks for making it clear. As you assumed a 32-bit arch in the sentence in your answer, it'd be nice to explicitly mention that in it.Causeuse
P
4

Answered now on Wikipedia: timsort will be used in Java 7 who copied it from Android.

Photoelectron answered 29/10, 2010 at 7:36 Comment(0)
E
3

Timsort is also in Android now: http://www.kiwidoc.com/java/l/x/android/android/5/p/java.util/c/TimSort

Eterne answered 9/6, 2011 at 6:53 Comment(0)
H
0

The description you linked looks completely general.

Heimdall answered 30/9, 2008 at 19:21 Comment(1)
Yes, but have you seen timsort used outside of CPython?Eterne
C
0

Timsort is not Python-specific. The benefits of using Timsort in Python to sort a list of object references exists in any programming language which has pointers or object references. For instance, Java SE 7+, Android and Swift use Timsort to sort objects.

On the other hand, some variation of quicksort (eg introsort, dual-pivot quicksort) usually sorts primitive types faster, due to cache coherence, and therefore it is usually chosen for this task.

Causeuse answered 29/11, 2019 at 18:20 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.