Recently I needed to implement non-recursive DFS as part of a more complicated algorithm, Tarjan's algorithm to be precise. The recursive implementation is very elegant, but not suitable for large graphs. When I implemented the iterative version, I was shocked at how inelegant it finally ended up being, and I was wondering if I had done something wrong.
There's two basic approaches to iterative DFS. First, you can push all the children of a node at once onto the stack (seems by far more common). Or you can just push one. I will focus on the first one as that seems how everyone does it.
I had various problems with this algorithm, and eventually I realized that to do it efficiently, I needed not 1, not 2, but 3 boolean flags (I don't necessarily mean you need three explicit boolean variables, you might store the information indirectly via special values of variables that are usually integers, but you need to access those 3 pieces of information one way or another. The three flags were: 1) visited. This was to prevent children from being pushed onto the stack very redundantly. 2) Done. To prevent redundant processing of the same node. 3) Ascending/descending. To indicate whether the children had already been pushed onto the stack. The pseudocode looks something like this:
while(S)
if S.peek().done == True
S.pop()
continue
S.peek().visited = True
if S.peek().descending == True
S.peek().descending = False
for c in S.peek().children
if c.visited == False
S.push(c)
doDescendingStuff()
else
w = S.pop()
w.done = True
doAscendingStuff()
Some notes: 1) You don't need ascending/descending technically, as you could just see if the children are all done or not. But it's pretty inefficient in a dense graph.
2), The main kicker: The visited/done thing might not seem necessary. Here's why (I think) you need it. You can't mark things visited until you visit them on the stack. If you do, you can process things in the wrong order. E.g. suppose A is linked to B and C, B links to D, and D links to C. Then from A, you will push B and C on the stack. From B you push D on the stack... and then what? If you are marking things visited when you push them on the stack, you won't push C on the stack here. But this is wrong, C should be visited from D, not from A in this graph (assuming that A visits B before C). So, you don't mark things visited until you process them. But then, you will have C on the stack twice. So you need another flag to show you are completely done with it, so you don't process C a second time.
I don't see how to avoid all this to having a perfectly correct non-recursive DFS that supports actions both winding and unwinding. But instinctively it feels crufty. Is there a better way? Almost every place that I have consulted online really glosses over how to actually implement non-recursive DFS, saying that it can be done and providing a very basic algorithm. When the algorithm is correct (in terms of properly supporting multiple paths to the same node) which is rare, it rarely properly supports doing stuff both on winding and unwinding.
if S.peek().done == True
withif S.peek().visited == True
). In your example, you wouldn't process C twice, since you'd setvisited = True
when processing C from D. – Marchelle