If topological sort uses DFS, how can it succeed on disconnected graphs?
Asked Answered
E

2

13

There's a gap in my knowledge but I'm not sure exactly where. Topological sorting can be done using depth first search, as wikipedia explains. However I've only seen depth first search implemented for trees, where as topological sort is for DAGs.

  1. is a tree a special case of a DAG where the implied direction of the edges is from the root node down
  2. is the algorithm used for topological sort not really doing DFS, only something a lot like it?

For example, topological sort can handle disconnected graphs where as DFS cannot traverse a node with no edges connecting it...can it?

England answered 19/4, 2016 at 8:1 Comment(1)
Don't forget to accept an answer, or explain what is not clear in the anwer you received.Galling
G
13

Because when used for topological sort you do a DFS on every nodes. If one of the children is already visited by a previous DFS (colored black). Then it is already pushed in the output vector and so you the dependency is already done.

Quoting your link (emphasis mine):

The algorithm loops through each node of the graph, in an arbitrary order, initiating a depth-first search ...

As the wikipedia article is a bit confusing (in my opinion) I will try to better describe the algorithm:

     V: set of vertices
     E: set of edges
     E.adj(v): set of vertices adjacent to vertex v

 0 function topological_sort(V,E):
 1   for v in V:
 2     paint v white
 3
 4   for v in V:
 5     if v is white:
 6       dfs(v)

 7 function dfs(v):
 8   paint v grey
 9   for child in E.adj(v)
10     if child is white:
11       dfs(child)
12   paint v black
13   push v to output

We can easily compute the complexity:

  • We paint a vertex white, grey and black once per vertex: O(V)
  • We check the color of the vertex at line 5 once per vertex: O(V)
  • We check the color of the vertex at line 10 once per edge: O(E)
  • We push the vertex to output at line 13 once per vertex: O(V)

So the final complexity is: O(V+E). It is pretty efficient.

One of the strength of this algorithm is that it doesn't need to modify the input graph. We can easily implement coloring by a temporary hashtable with a size in O(V). Some other topological sort algorithm requires to destroy the graph when they proceed (by removing edges). This would require to copy the graph before running the toplogical sort if you still need the graph after the sort.

Galling answered 19/4, 2016 at 8:10 Comment(9)
What do you mean you do a DFS on every node? Are you saying for a graph with n nodes you'd run DFS n times?England
Exactly. But, this is important, you stop going deep if you encounter a node already painted black by a previous DFS calls.Galling
Nonetheless this seems awfully inefficient to even initialize DFS n times.England
You visit each node at most once, and each edge at most once.Galling
@England See my update for the complexity computationGalling
Thanks for the update. Correct me if I'm wrong, but wouldn't DFS implementations normally "reset" the visited state of the nodes after they return? It appears in your example the paint still stays on a node after dfs() ends. I guess what I'm asking/saying is: isn't a DFS implementation specific for topological sorting required, not any old DFS implementation? Let me know if that didn't make sense and I'll reword.England
@England It really depends on how you store the color information. You can very well store colors in a hashtable, local to your topological sort function, in which case there is no need to clear colors ater the sort. If you store the color inside the graph then yes you will have to restore it to their original values.Galling
@Galling Why do you paint v grey in line 8? Can you paint it black straight away?Interlunar
@SamShen In this specific case it doesn't matter. Usually when you do a DFS you want to know the difference between a grey and black node.Galling
O
1

You might wish to try to add a new "source" node to your graph, and connect it with each and every other nodes with a directed edge. Then start your search/traverse from this new node. This approach might or might not suit your needs.

Oriflamme answered 19/4, 2016 at 11:0 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.