Will a source-removal sort always return a maximal cycle?

Asked 2/4, 2010 at 17:58 Answered 11/11, 2010 at 18:47

dependencies graph-theory topological-sort

I wrote a source-removal algorithm to sort some dependencies between tables in our database, and it turns out we have a cycle. For simplicity, let's say we have tables A, B, C, and D. The edges are like this:

(A, B)
(B, A)
(B, C)
(C, D)
(D, A)

As you can see, there are two cycles here. One is between A and B and another is between all four of them. Will this type of sort always choke on the largest cycle? Or is that not necessarily the case?

Keffiyeh answered 2/4, 2010 at 17:58 Comment(2)

No idea what you are talking about. But if you had a polynomial time algorithm which always gave you the longest cycle, you have proved P = NP. btw, did you mean longest/maximum instead of maximal? – Vegetal 3/4, 2010 at 17:2

@Moron - possibly. I figured they meant the same thing. :-) – Keffiyeh 9/4, 2010 at 11:37

By source-removal I presume you mean at each step removing a node with no incoming edges.

What I think you are asking for is finding the longest closed trail in your graph (i.e. a cycle with unique edges, while nodes can be repeated). An Euler tour is the special case in which that trail uses every edge.

Obviously, no vertex in a cycle can be removed (every vertex on a cycle always keeps at least one incoming edge), so this algorithm certainly preserves all cycles, including the biggest. Still, it doesn't help you find it: the remaining edges are not guaranteed to be part of any cycle. I can easily construct an example where the algorithm you describe retains all edges while the largest cycle is merely of size two, so the leftover graph is not much help in finding the latter.
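To make that concrete, here is a minimal Python sketch. It assumes an edge (u, v) means u -> v and that "source removal" means repeatedly deleting nodes with no incoming edges. The function name `source_removal` and the second graph (nodes S, A, B, C) are made up for illustration; they are not from the question.

```python
from collections import deque

def source_removal(nodes, edges):
    """Repeatedly remove nodes that have no incoming edges.

    Returns (order, leftover): the removed nodes in removal order, and the
    nodes that could never be removed.
    """
    successors = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        # Removing u deletes its outgoing edges.
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)

    removed = set(order)
    leftover = [n for n in nodes if n not in removed]
    return order, leftover

# The graph from the question: nothing can be removed at all.
question_edges = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "D"), ("D", "A")]
q_order, q_leftover = source_removal(["A", "B", "C", "D"], question_edges)
print(q_order, q_leftover)   # [] ['A', 'B', 'C', 'D']

# Hypothetical graph: S is a source, A <-> B is a two-cycle, C hangs off it.
hyp_edges = [("S", "A"), ("A", "B"), ("B", "A"), ("B", "C")]
h_order, h_leftover = source_removal(["S", "A", "B", "C"], hyp_edges)
print(h_order, h_leftover)   # ['S'] ['A', 'B', 'C']
```

In the second graph C is left over even though it is not on any cycle, and the only cycle has size two, so the leftover set by itself says little about which cycle is the largest.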

Here is how you can do it instead:

Perform a Depth-First Search on your graph.

You are interested in recognizing back edges, i.e. edges that point back to an ancestor of the node being visited. (Ancestors are taken in the DFS tree, which consists of the edges along which a node is reached for the first time.) For example, if the DFS stack holds [A->B->C->D] and while exploring D you find an edge D->B, that's a back edge. Each back edge defines a cycle.

More importantly, the cycles induced by back edges form a basic set of cycles of the graph. By "a basic set of cycles" I mean that you can construct all cycles of the graph just by UNIONing and XORing cycles of the basic set. For example, consider the cycles [A1->A2->A3->A1] and [A2->B1->B2->B3->A2]. You can union them into the cycle [A1->A2->B1->B2->B3->A2->A3->A1]. Since you want the maximal cycle, you don't need to consider XORs.

Construct the maximal cycle by UNIONing all basic cycles that intersect at a node. (If you do it carefully this should also have a linear time complexity).
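The back-edge detection step described above can be sketched in Python roughly like this. It covers only finding the back edges and the cycle each one closes, not the union step. The name `find_back_edges` and the white/gray/black colouring are my own choices for illustration, and an edge (u, v) is assumed to mean u -> v.

```python
def find_back_edges(nodes, edges):
    """Return a list of (back_edge, cycle_nodes) pairs found by a DFS."""
    successors = {n: [] for n in nodes}
    for u, v in edges:
        successors[u].append(v)

    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the DFS stack, finished
    color = {n: WHITE for n in nodes}
    stack = []                     # current DFS path, root first
    found = []

    def visit(u):
        color[u] = GRAY
        stack.append(u)
        for v in successors[u]:
            if color[v] == WHITE:
                visit(v)
            elif color[v] == GRAY:
                # v is an ancestor of u on the stack, so u -> v is a back
                # edge; the cycle is the stack segment from v up to u.
                found.append(((u, v), stack[stack.index(v):]))
        stack.pop()
        color[u] = BLACK

    for n in nodes:
        if color[n] == WHITE:
            visit(n)
    return found

# The graph from the question.
question_edges = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "D"), ("D", "A")]
cycles = find_back_edges(["A", "B", "C", "D"], question_edges)
for back_edge, cycle in cycles:
    print(back_edge, cycle)
# ('B', 'A') ['A', 'B']
# ('D', 'A') ['A', 'B', 'C', 'D']
```

The recursion is fine for a handful of tables; for a very deep graph you would want an explicit stack instead.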

On the other hand, if you required a maximal cycle with no repeating vertex, that's going to be much harder than linear :)

Roumell answered 6/4, 2010 at 22:19 Comment(1)

I am actually not trying to find one type of cycle or another. I'd actually prefer to have no cycles (source removal does require a DAG). However, I noticed this behavior from a program that I wrote and was just curious if that behavior is to always be expected. – Keffiyeh 9/4, 2010 at 11:41

Your source removal algorithm (which, like Dimitris, I will assume means removing nodes with no dependencies one at a time) will choke on any cycle. In fact, the algorithm will remove all nodes that don't depend on a cycle, and the nodes you have left over will either be part of a cycle or depend on a node that is part of a cycle.

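Nodes that lie on a common cycle belong to the same strongly connected component, and overlapping cycles (like your two) merge into one component. If you replaced each strongly connected component with a single node, you would have a DAG.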
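Here is a rough Python sketch of that collapsing step. It uses Kosaraju's two-pass DFS, which is just one standard way to find strongly connected components; the function names and the extra nodes S and E in the second graph are hypothetical, and an edge (u, v) is assumed to mean u -> v.

```python
def strongly_connected_components(nodes, edges):
    """Map each node to a component label (Kosaraju's two-pass DFS)."""
    successors = {n: [] for n in nodes}
    predecessors = {n: [] for n in nodes}
    for u, v in edges:
        successors[u].append(v)
        predecessors[v].append(u)

    # Pass 1: record nodes in order of DFS completion.
    visited = set()
    finish_order = []

    def dfs_forward(u):
        visited.add(u)
        for v in successors[u]:
            if v not in visited:
                dfs_forward(v)
        finish_order.append(u)

    for n in nodes:
        if n not in visited:
            dfs_forward(n)

    # Pass 2: DFS on the reversed graph, latest-finished node first.
    component_of = {}

    def dfs_backward(u, label):
        component_of[u] = label
        for v in predecessors[u]:
            if v not in component_of:
                dfs_backward(v, label)

    label = 0
    for n in reversed(finish_order):
        if n not in component_of:
            dfs_backward(n, label)
            label += 1
    return component_of

def condense(nodes, edges):
    """Collapse each component to one node; return (component_of, dag_edges)."""
    component_of = strongly_connected_components(nodes, edges)
    dag_edges = {(component_of[u], component_of[v])
                 for u, v in edges
                 if component_of[u] != component_of[v]}
    return component_of, dag_edges

# The graph from the question: both cycles end up in a single component.
question_edges = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "D"), ("D", "A")]
q_comp, q_dag = condense(["A", "B", "C", "D"], question_edges)
print(len(set(q_comp.values())), q_dag)   # 1 set()

# Hypothetical extension: S feeds the cycle, E depends on it.
ext_edges = question_edges + [("S", "A"), ("D", "E")]
e_comp, e_dag = condense(["S", "A", "B", "C", "D", "E"], ext_edges)
```

The condensed graph has no cycles, so source removal would run to completion on it.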

Finke answered 11/11, 2010 at 18:47 Comment(0)
