Graph serialization
Asked Answered
L

4

48

I'm looking for a simple algorithm to 'serialize' a directed graph. In particular I've got a set of files with interdependencies on their execution order, and I want to find the correct order at compile time. I know it must be a fairly common thing to do - compilers do it all the time - but my google-fu has been weak today. What's the 'go-to' algorithm for this?

Labanna answered 7/8, 2008 at 0:22 Comment(0)
W
67

Topological Sort (From Wikipedia):

In graph theory, a topological sort or topological ordering of a directed acyclic graph (DAG) is a linear ordering of its nodes in which each node comes before all nodes to which it has outbound edges. Every DAG has one or more topological sorts.

Pseudo code:

L ← Empty list where we put the sorted elements
Q ← Set of all nodes with no incoming edges
while Q is non-empty do
    remove a node n from Q
    insert n into L
    for each node m with an edge e from n to m do
        remove edge e from the graph
        if m has no other incoming edges then
            insert m into Q
if graph has edges then
    output error message (graph has a cycle)
else 
    output message (proposed topologically sorted order: L)
Waterway answered 7/8, 2008 at 10:53 Comment(2)
Eh... copied directly off wikipedia?Coryden
Thanks. Helped me map the fact that dependency tree can be sorted using this method. :-)Hamel
P
1

I would expect tools that need this simply walk the tree in a depth-first manner and when they hit a leaf, just process it (e.g. compile) and remove it from the graph (or mark it as processed, and treat nodes with all leaves processed as leaves).

As long as it's a DAG, this simple stack-based walk should be trivial.

Pronounced answered 7/8, 2008 at 0:27 Comment(1)
Yes, that's how you do it. It's called a depth-first search (DFS). And unless you are certain you have a DAG, you mustcheck for back edges, otherwise a cycle will send you into an infinite loop.Underpass
L
1

I've come up with a fairly naive recursive algorithm (pseudocode):

Map<Object, List<Object>> source; // map of each object to its dependency list
List<Object> dest; // destination list

function resolve(a):
    if (dest.contains(a)) return;
    foreach (b in source[a]):
        resolve(b);
    dest.add(a);

foreach (a in source):
    resolve(a);

The biggest problem with this is that it has no ability to detect cyclic dependencies - it can go into infinite recursion (ie stack overflow ;-p). The only way around that that I can see would be to flip the recursive algorithm into an interative one with a manual stack, and manually check the stack for repeated elements.

Anyone have something better?

Labanna answered 7/8, 2008 at 0:30 Comment(1)
instead of a foreach use a while, maintain two pointers traversing the data a, a leading one that double steps and a trailing one that single steps. The leading pointer handles the acutal resolutions and if it ever laps the trailing pointer there is a cycle.Depriest
B
1

If the graph contains cycles, how can there exist allowed execution orders for your files? It seems to me that if the graph contains cycles, then you have no solution, and this is reported correctly by the above algorithm.

Bamford answered 23/1, 2009 at 15:9 Comment(1)
Yes, a topological sort is not possible if a graph contains cycles. This corresponds to the real world: If I ask you to do A before B, and B before A, there's no way you're gonna satisfy me ;-).Underpass

© 2022 - 2024 — McMap. All rights reserved.