How to sort a list using partial order in Haskell?
Asked Answered
L

2

10

I have a procedural EDSL which uses blocks of statements.

These statements are added to the blocks in no particular order although there may be dependencies between statements.

During compilation of the EDSL, however, I need to make sure that the statements are ordered in the order of dependence, e.g.

B := A
C := B
E := D

Since not all statements have dependencies there is no total order (E.g. E := D above is independent and can be placed anywhere). There are no cyclic dependencies so list ordering should be possible.

I have tried to hack a solution by using Data.List.sortBy and defining Ordering which would return EQ to mean that the statements have no dependencies. This worked for some examples but not in the general case, e.g. ordering the following did nothing:

C := B                           B := A
D := C    = should produce =>    C := B
B := A                           D := C

This is because the default sort insertion sort and only makes sure the inserted item is smaller or equal to the next.

I have searched the internets for a Poset implementation but have not found anything applicable:

altfloat:Data.Poset defines Ordering = LT | GT | EQ | NC (NC for Non-comparable) which is good but the provided sort assumes NaN-like non-comparable items and just throws them away.

logfloat:Data.Number.PartialOrd is similar to the above except uses Maybe Ordering and I didn't see a sorting function anywhere in the package.

Math.Combinatorics.Poset I haven't figured out how to use it or whether it's applicable.

Below is a minimal example which has both binding and non-binding statements. The order of non-biniding statements matters and they must maintain the original order (i.e. sorting needs to be stable w.r.t. statements that don't have a dependence relation).

I hope there is a simple solution to this without using a full-blown dependence graph...

module Stmts where

import Data.List ( sortBy )

data Var = A | B | C | D | E | F | G | H deriving (Eq, Show)
data Stmt = Var := Var
          | Inc Var
  deriving (Show)

-- LHS variable
binds :: Stmt -> Maybe Var
binds (v := _) = Just v
binds _        = Nothing

-- RHS variables
references :: Stmt -> [Var]
references (_ := v) = [v]
references (Inc v)  = [v]

order :: [Stmt] -> [Stmt]
order = sortBy orderStmts

orderStmts :: Stmt -> Stmt -> Ordering
orderStmts s1 s2 = ord mbv1 mbv2
 where
  ord Nothing   Nothing   = EQ  -- No dep since they don't bind vars
  ord (Just v1) Nothing   = LT  -- Binding statements have precedence
  ord Nothing   (Just v2) = GT  -- ^^^
  ord (Just v1) (Just v2)       -- Both statements are binding:
    | v1 `elem` refs2 = LT      --  * s2 depends on s1
    | v2 `elem` refs1 = GT      --  * s1 depends on s2
    | otherwise       = EQ      --  * neither

  -- *Maybe* they bind variables
  mbv1  = binds s1
  mbv2  = binds s2

  -- Variables they reference  
  refs1 = references s1
  refs2 = references s2

-- The following should return [B := A, C := B, D := C, Inc F, Inc G]
test = order [Inc F, Inc G, C := B, D := C, B := A]
Leprosy answered 2/10, 2014 at 9:43 Comment(3)
To be absolutely clear: You're looking for a stable sort which leaves the order unchanged for incomparable elements?Chiropodist
I searched around a bit and found this previous SO question. I also found comments on the web doubting whether "stable topological sort" (which this would be if it exists) is a well-defined thing. Certainly Wikipedia's article on topological sorting does not contain the word "stable" anywhere. My own question in that vein: is it actually possible to always keep the order of incomparable terms?Warmonger
Hm I guess if statements only need to be stable if they have no dependency relation to other statements at all, then there is no problem. For the stronger problem of general stable topological sorting I found a counterexample: Nodes A B C D E F with A > C, B > D, E < C, F < D. Then A and E have no relation to B and F, but if C is sorted before D then B must switch order with E, and if D is sorted before C then A must switch order with F.Warmonger
A
8

The problem with your approach is that your orderStmts is neither an ordering nor a partial ordering. In particular, it's not transitive and this is why the attempts at using it for sorting fail.

What you are looking for is topological sorting. You have a graph of vertices (statements) that have edges between them (their dependencies) and you want to ensure that the ordering matches the edges.

I'll focus only on the declarations, as the non-binding statements are easy (we just need to split the list into two, sort the declarations and concatenate again).

Topological sorting is already implemented in Data.Graph, which makes the task very simple:

module Stmts where

import Data.Graph

data Var = A | B | C | D | E | F | G | H deriving (Eq, Ord, Show)

data Decl = Var := Var 
  deriving (Show, Eq)

data Stmt = Decl
          | Inc Var
  deriving (Show, Eq)

sortDecls :: [Decl] -> [SCC Decl]
sortDecls = stronglyConnComp . map triple
  where
    triple n@(x := y)   = (n, x, [y])

-- The following should return [B := A, C := B, D := C]
test = map flattenSCC . sortDecls $ [C := B, D := C, B := A]

Calling flattenSCC is only for testing, as SCC has no Show instance. You'll probably want to check the SCCs for cycles (a cycle would be a language compilation error), and if there is none, extract the sorted sequence.

Arron answered 3/10, 2014 at 12:42 Comment(9)
Topological sort (used directly) ignore a important restriction "sorting needs to be stable w.r.t. statements that don't have a dependence relation" E.g. "[D := C, C := B, B := A, H := G, F := H]".Untread
@Untread As I wrote, this part is easy: we just need to split the list into two, sort the declarations and concatenate again. It only makes sense to compute the dependencies of declarations, the non-binding statements can be just filtered out and appended later.Arron
Excuse me if I'm wrong but, possible topological sorted results are not unique ("lots" in fact) and "and concatenate again" don't preserve statement "prefered" order (just group order like my steps). But (I think) the question is not clear about that "order preservation".Untread
@Untread The question says that the order of non-binding statements needs to be preserved and the implementation of orderStmts in the question implies that they should go only after the binding statements. So since for the non-binding statements there is no sorting involved, I proposed to exclude them from the dependency ordering, just to append them in the original order after sorting the binding statements. I can also post an updated block of code to clarify, if needed.Arron
(remark that your solution is great) "I proposed to exclude" yes but lots of solutions exists. I see your solution one case of a more general question (but my mind is blurring... :D)Untread
@PetrPudlák You were absolutely right that the non-binding statements simply need to appear at the bottom of the block in the original order. I should have been more explicit about that. Thank you for introducing me to topological graph sorting and explaining about transitivity.Leprosy
(from @Leprosy comments on my response) "in that it tries to respect the original order of all types of statements" ok, yes I was trying to solve a more general problem, sorry my noise :(Untread
did you mean perhaps flattenSCCs? and there's no need for Show instance then, it seems.Pr
@WillNess I meant flattenSCC because it also shows if there is a cyclic dependency or not. But yes, for just showing the result (knowing there are no cycles) flattenSCCs might be more convenient.Arron
U
2

I think the only way to sort your statements groups are walking from roots to children

import Data.List

data Var = A | B | C | D | E | F | G | H deriving (Eq, Show)
data Stmt = Var := Var deriving (Show)

parent :: Stmt -> Var
parent (_ := p) = p

child :: Stmt -> Var
child (c := _) = c

steps :: [Stmt] -> [[Stmt]]
steps st = step roots st
  where step _ [] = []
        step r s = let (a, b) = partition (flip elem r . parent) s
                       (v, u) = partition (flip elem (map child b) . child ) a
                   in  if null u then error "Cycle!"
                                 else u : step (r ++ (nub $ map child u)) (v ++ b)

        roots = let cs = map child st
                    rs = nub $ filter (not . flip elem cs) (map parent st)
                in  if null rs then error "No roots!"
                               else rs

main = mapM_ print $ steps [F := H, G := H, C := B, D := C, B := A]

with output

[F := H,G := H,B := A]
[C := B]
[D := C]

when "sort" is over groups (not statements).

(Stability is granted on this code, since is invariant through partition, map, ++, ...)

(Added)

If you really wish some stability property (sorting your statements) you must to add some other restriction (defining "stability").

Let two "sort" direct algorithms (simply reordering statements to front or to back)

orderToFront :: [Stmt] -> [Stmt]
orderToFront [] = []
orderToFront (s@(_ := p):xs) = let (l, r) = splitAtFirst ((==p).child) xs
                               in  if null r then s: orderToFront xs
                                             else head r: s: orderToFront (l ++ tail r)

orderToBack' :: [Stmt] -> [Stmt]
orderToBack' [] = []
orderToBack' (s@(c := _):xs) = let (l, r) = splitAtFirst ((==c).parent) xs
                               in  if null r then s: orderToBack' xs
                                             else orderToBack' (l ++ head r: s: tail r)
orderToBack = reverse . orderToBack'

splitAtFirst :: (a -> Bool) -> [a] -> ([a], [a])
splitAtFirst f xs = let rs = dropWhile (not.f) xs
                    in  (take (length xs - length rs) xs, rs)


main = do

    let q = [F := H, C := B, D := C, G := F, B := A]

    putStrLn "-- orderToFront"
    mapM_ print $ orderToFront q

    putStrLn "-- orderToBack"
    mapM_ print $ orderToBack q

with the same input, orderToFront output is different than orderToBack output but both are valid

-- orderToFront
F := H
B := A
C := B
D := C
G := F
-- orderToBack
B := A
F := H
G := F
C := B
D := C

(with only equality relation your algorithm cannot be lower than O(n^2) but if you define stability restriction it could be reduced)

Untread answered 2/10, 2014 at 10:50 Comment(2)
I may have confused everyone by my definition of stable. I only meant to say that all non-binding statements need to be sorted to the bottom, preserving the original order. I'm sorry for the confusion.Leprosy
Your solution may actually be the most general of all in that it tries to respect the original order of all types of statements. I may have to use a similar solution in the next iteration of the language if I need "effectful" binding statements whose placement w.r.t. other "effectful" statements matters.Leprosy

© 2022 - 2024 — McMap. All rights reserved.