is it possible to do quicksort of a list with only one passing?

S

3

6

I am learning haskell and the function definition I see is:

quickSort (x : xs) = (quickSort less) ++ (x : equal) ++ (quickSort more)
                 where less = filter (< x) xs
                       equal = filter (== x) xs
                       more = filter (> x) xs

Is it possible to write it with only one traversal of the list, instead of 3?

Stitch answered 3/10, 2011 at 23:51 Comment(7)

Quicksort has a O(n lg n) complexity on average... – Pavlish 3/10, 2011 at 23:56

The complexity is about the number of comparisons made and the above version will have 3 times more comparisons than the one that partitions the list by traversing through it once. – Stitch 4/10, 2011 at 2:43

why does it matter how many times it traverses the list? the complexity is the same – Souterrain 4/10, 2011 at 3:57

@newacct, it is not merely traversing the list; it is comparing every element while traversing; that is why. – Stitch 4/10, 2011 at 4:3

@Salil: right, but O(3n log n) == O(n log n) so the runtime complexity is the same (the actual runtime might not be the same, but that's different from the complexity). – Murrhine 4/10, 2011 at 5:39

@ivanm, Yes the complexity is the same, and it is O(n log n) on average. But it is O(n^2), worst case. However the question is legitimate regardless of this fact, and implementations of algorithms are usually measured by the size of the constant factor. – Fadiman 4/10, 2011 at 6:34

@ivanm, O(n) notation is a way to measure complexity. It does not mean the complexity is the same. Suppose I measure distances by miles to make it easier for me. It does not mean it takes me the same time to travel to the next door and to a mall nearby. – Stitch 4/10, 2011 at 6:55

L

8

Although late, here's a version that's supposed to not leak space as much (and seems to run about twice faster than the other 3-way version here):

qsort3 xs = go xs [] 
  where
    go     (x:xs) zs       = part x xs zs [] [] []
    go     []     zs       = zs
    part x []     zs a b c = go a ((x : b) ++ go c zs)
    part x (y:ys) zs a b c =
        case compare y x of
                  LT -> part x ys zs (y:a) b c
                  EQ -> part x ys zs a (y:b) c
                  GT -> part x ys zs a b (y:c)

This addresses the possible problem with using tuples, where let (a,b) = ... is actually translated into let t= ...; a=fst t; b=snd t which leads to the situation where even after a has been consumed and processed, it is still kept around alive, as part of the tuple t, for b to be read from it - though of course completely unnecessary. This is known as "Wadler pair space leak" problem. Or maybe GHC (with -O2) is smarter than that. :)

Also this apparently uses difference lists approach (thanks, hammar) which also makes it a bit more efficient (about twice faster than the version using tuples). I think part uses accumulator parameters, as it builds them in reversed order.

Liaotung answered 3/3, 2012 at 22:27 Comment(3)

cf. a stable sort variant at the bottom of this answer. – Liaotung 9/10, 2018 at 7:55

Thanks for this solution Will. It helped me with an assignment in an algorithms course which allows submission in Haskell but has a lot of assignments that aren't very Haskell-friendly. This one was meant to use a random pivot to efficiently sort lists with many equal elements, but that wasn't feasible with Haskell since the autograder doesn't allow non-base imports. This technique was still fast enough to pass the grader in less than half a second without a random pivot. – Poler 7/11, 2021 at 3:8

@Poler nice, thanks for sharing. :) – Liaotung 7/11, 2021 at 5:12

M

13

You mean something like this?

quicksort [] = []
quicksort (x:xs) = quicksort less ++ (x : equal) ++ quicksort more
  where (less, equal, more) = partition3 x xs

partition3 _ [] = ([], [], [])
partition3 x (y:ys) =
  case compare y x of
    LT -> (y:less, equal, more)
    EQ -> (less, y:equal, more)
    GT -> (less, equal, y:more)
  where (less, equal, more) = partition3 x ys

Note that this isn't really quicksort, as the real quicksort is in-place.

Mousseline answered 4/10, 2011 at 0:17 Comment(9)

That was neat. The original algorithm I referred to also does not do in-place quicksort. I am assuming that in-place means mutable version where the list is modified in-place. – Stitch 4/10, 2011 at 2:34

isn't this the "Wadler pair" problem that's supposed to leak space? (I've posted another version that's supposed to not have this problem, done according to his technique IIRC). – Liaotung 3/3, 2012 at 22:31

@WillNess: Your version is certainly more efficient, especially since it's using difference lists instead of (++) for appending. Mine might allocate a bit more since it's using tuples, but I don't think it should leak space. I'm not familiar with the "Wadler pair" problem, though. Do you have a link? Google didn't seem to know about it either. – Mousseline 3/3, 2012 at 23:4

google "Wadler pair space leak" basically it's about let (a,b)=span ... kind of thing, being translated into let p=span ...; a=fst$p; b=snd$p which leads to p being kept for b even when a is consumed already. Pairs... :) (I mean, tuples ... IIRC this is exactly what this "leak" stuff was meant to be about). – Liaotung 4/3, 2012 at 0:9

@WillNess: Ah yes, I see what you mean. I'm not sure I'd call it a space leak, though, since it only causes the tails of the three lists to be kept, which it has to do anyway, in addition to the tuples themselves. If I was doing something else with those lists rather than just consing to the front of them, that could definitely be a leak, though. – Mousseline 4/3, 2012 at 0:19

I think it's supposed to cause the lesser to still be kept even after it's consumed and sorted, as part of a triple holding bigger also. Seems like a leak. :) Or maybe GHC with -O2 is smarter than that. :) – Liaotung 4/3, 2012 at 0:25

@WillNess: True. Perhaps you could add some text to your answer summarizing the problems with my solution and how yours improves on that? I think that would be useful to future readers. – Mousseline 4/3, 2012 at 0:31

btw I think your version is true idiomatic Haskell solution; too bad a compiler can't do that transformation automatically that I had to code by hand. We naturally express parallel "assignment" with tuples of vars; it's Haskell's fault for not having an explicit concept and forcing us to use a kludge, and then punishing us for it... – Liaotung 5/3, 2012 at 0:10

@Stitch "in-place" usually refers to array entries' values being modified, as opposed to making array's copy with better-ordered elements. For lists, when "cons" cells are explicitly "re-linked" (e.g. in Common LISP etc.) keeping the cells contents intact, it can also be seen as "in-place" as no new storage is created in the process. – Liaotung 15/5, 2012 at 8:56

I

9

It does not seem to improve anything but:

qs (x:xs) = let (a,b) = partition (< x) xs in (qs a) ++ [x] ++ (qs b)

Impower answered 4/10, 2011 at 0:11 Comment(3)

That was great. Why do you think it is not improving anything? It is traversing the list only once. So, the comparisons are much lesser. – Stitch 4/10, 2011 at 1:57

@qq191: this definition assumes there are no repeated values, which is why the original has three filters! – Murrhine 4/10, 2011 at 5:37

@JonasDuregård: wait, you're right; they go into the b partition. – Murrhine 4/10, 2011 at 10:23

L

8

Although late, here's a version that's supposed to not leak space as much (and seems to run about twice faster than the other 3-way version here):

qsort3 xs = go xs [] 
  where
    go     (x:xs) zs       = part x xs zs [] [] []
    go     []     zs       = zs
    part x []     zs a b c = go a ((x : b) ++ go c zs)
    part x (y:ys) zs a b c =
        case compare y x of
                  LT -> part x ys zs (y:a) b c
                  EQ -> part x ys zs a (y:b) c
                  GT -> part x ys zs a b (y:c)

This addresses the possible problem with using tuples, where let (a,b) = ... is actually translated into let t= ...; a=fst t; b=snd t which leads to the situation where even after a has been consumed and processed, it is still kept around alive, as part of the tuple t, for b to be read from it - though of course completely unnecessary. This is known as "Wadler pair space leak" problem. Or maybe GHC (with -O2) is smarter than that. :)

Also this apparently uses difference lists approach (thanks, hammar) which also makes it a bit more efficient (about twice faster than the version using tuples). I think part uses accumulator parameters, as it builds them in reversed order.

Liaotung answered 3/3, 2012 at 22:27 Comment(3)

cf. a stable sort variant at the bottom of this answer. – Liaotung 9/10, 2018 at 7:55

Thanks for this solution Will. It helped me with an assignment in an algorithms course which allows submission in Haskell but has a lot of assignments that aren't very Haskell-friendly. This one was meant to use a random pivot to efficiently sort lists with many equal elements, but that wasn't feasible with Haskell since the autograder doesn't allow non-base imports. This technique was still fast enough to pass the grader in less than half a second without a random pivot. – Poler 7/11, 2021 at 3:8

@Poler nice, thanks for sharing. :) – Liaotung 7/11, 2021 at 5:12

Recommended topics

Hot tags