How to detect if adding an edge to a directed graph results in a cycle?
Asked Answered
A

5

42

I came upon wait-for graphs and I wonder, are there any efficient algorithms for detecting if adding an edge to a directed graph results in a cycle?

The graphs in question are mutable (they can have nodes and edges added or removed). And we're not interested in actually knowing an offending cycle, just knowing there is one is enough (to prevent adding an offending edge).

Of course it'd be possible to use an algorithm for computing strongly connected components (such as Tarjan's) to check if the new graph is acyclic or not, but running it again every time an edge is added seems quite inefficient.

Agna answered 27/11, 2013 at 15:26 Comment(7)
Do you have memory or time constraints in mind?Capsulize
@MarkElliot The graphs are used to coordinate locking of many concurrently running threads. I expect the graph to have dozens to hundreds of nodes and the number of edges could be quadratic in the number of nodes. The graph will be frequently updated, and all updates will need to check, if they don't cause a cycle.Agna
A naive starting approach would be to carry a data structure at each node of nodes which can reach the current node. This would make cycle detection cheap on addition and update, but add complexity to remove. I'm wondering if there's a data structure better suited to tracking this (an adjacency matrix?), but am still thinking through stuff.Capsulize
From the description it sounds like edge removal will be about as frequent as adding, which pretty much rules out any data caching since removal invalidates everything (or at least I don't see how you can avoid this). My guess, therefore, is that the simplest solution (when adding edge A->B, flood fill from B and check that A is not reachable) is going to be the most efficient.Hoosegow
If the graph is from a "wait-for graph" then nodes can only be removed when something completes. So deletion only occurs at the leaf nodes. Am I correct?Criseldacrisey
@PetrPudlák can you evolve what you're missing from Thomas's answer? I consider this question answered. You're looking to detect cycles at time of insertion?Swarey
@Criseldacrisey That is not necessarily true. A node can be waiting for another node with a timeout, and when it elapses, the node is removed.Agna
U
45

If I understood your question correctly, then a new edge (u,v) is only inserted if there was no path from v to u before (i.e., if (u,v) does not create a cycle). Thus, your graph is always a DAG (directed acyclic graph). Using Tarjan's Algorithm to detect strongly connected components (http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm) sounds like an overkill in this case. Before inserting (u,v), all you have to check is whether there is a directed path from v to u, which can be done with a simple BFS/DFS.

So the simplest way of doing it is the following (n = |V|, m = |E|):

  • Inserting (u,v): Check whether there is a path from v to u (BFS/DFS). Time complexity: O(m)
  • Deleting edges: Simply remove them from the graph. Time complexity: O(1)

Although inserting (u,v) takes O(m) time in the worst case, it is probably pretty fast in your situation. When doing the BFS/DFS starting from v to check whether u is reachable, you only visit vertices that are reachable from v. I would guess that in your setting the graph is pretty sparse and that the number of vertices reachable by another is not that high.

However, if you want to improve the theoretical running time, here are some hints (mostly showing that this will not be very easy). Assume we aim for testing in O(1) time whether there exists a directed path from v to u. The keyword in this context is the transitive closure of a DAG (i.e., a graph that contains an edge (u, v) if and only if there is a directed path from u to v in the DAG). Unfortunately, maintaining the transitive closure in a dynamic setting seems to be not that simple. There are several papers considering this problem and all papers I found were STOC or FOCS papers, which indicates that they are very involved. The newest (and fastest) result I found is in the paper Dynamic Transitive Closure via Dynamic Matrix Inverse by Sankowski (http://dl.acm.org/citation.cfm?id=1033207).

Even if you are willing to understand one of those dynamic transitive closure algorithms (or even want to implement it), they will not give you any speed up for the following reason. These algorithms are designed for the situation, where you have a lot of connectivity queries (which then can be performed in O(1) time) and only few changes in the graph. The goal then is to make these changes cheaper than recomputing the transitive closure. However, this update is still slower that a single check for connectivity. Thus, if you need to do an update on every connectivity query, it is better to use the simple approach mentioned above.

So why do I mention this approach of maintaining the transitive closure if it does not fit your needs? Well, it shows that searching an algorithm consuming only O(1) query time does probably not lead you to a solution faster than the simple one using BFS/DFS. What you could try is to get a query time that is faster than O(m) but worse than O(1), while updates are also faster than O(m). This is a very interesting problem, but it sounds to me like a very ambitious goal (so maybe do not spend too much time on trying to achieve it..).

Underbodice answered 30/11, 2013 at 8:36 Comment(0)
I
5

As Mark suggested it is possible to use data structure that stores connected nodes. It is the best to use boolean matrix |V|x|V|. Values can be initialized with Floyd–Warshall algorithm. That is done in O(|V|^3).

Let T(i) be set of vertices that have path to vertex i, and F(j) set of vertices where exists path from vertex j. First are true's in i'th row and second true's in j'th column.

Adding an edge (i,j) is simple operation. If i and j wasn't connected before, than for each a from T(i) and each b from F(j) set matrix element (a,b) to true. But operation isn't cheap. In worst case it is O(|V|^2). That is in case of directed line, and adding edge from end to start vertex makes all vertices connected to all other vertices.

Removing an edge (i,j) is not so simple, but not more expensive operation in the worst case :-) If there is a path from i to j after removing edge, than nothing changes. That is checked with Dijkstra, less than O(|V|^2). Vertices that are not connected any more are (a,b):

  • a in T(i) - i - T(j),
  • b in F(j) + j

Only T(j) is changed with removing edge (i,j), so it has to be recalculated. That is done by any kind of graph traversing (BFS, DFS), by going in opposite edge direction from vertex j. That is done in less then O(|V|^2). Since setting of matrix element is in worst case is again O(|V|^2), this operation has same worst case complexity as adding edge.

Inlaid answered 29/11, 2013 at 20:36 Comment(0)
C
1

This is a problem which I recently faced in a slightly different situation (optimal ordering of interdependent compiler instructions).

While I can't improve on O(n*n) theoretical bounds, after a fair amount of experimentation and assuming heuristics for my case (for example, assuming that the initial ordering wasn't created maliciously) the following was the best compromise algorithm in terms of performance.

(In my case I had an acceptable "right side failure": after the initial nodes and arcs were added (which was guaranteed to be possible), it was acceptable for the optimiser to occasionally reject the addition of further arcs where one could actually be added. This approximation isn't necessary for this algorithm when carried to completion, but it does admit such an approximation if you wish to do so, and so limiting its runtime further).

While a graph is topologically sorted, it is guaranteed to be cycle-free. In the first phase when I had a static bulk of nodes and arcs to add, I added the nodes and then topologically sorted them.

During the second phase, adding additional arcs, there are two situations when considering an arc from A to B. If A already lies to the left of B in the sort, an arc can simply be added and no cycle can be generated, as the list is still topologically sorted.

If B is to the left of A, we consider the sub-sequence between B and A and partition it into two disjoint sequences X, Y, where X is those nodes which can reach A (and Y the others). If A is not reachable from B, ie there are no direct arcs from B into X or to A, then the sequence can be reordered XABY before adding the A to B arc, showing it is still cycle-free and maintaining the topological sort. The efficiency over the naive algorithm here is that we only need consider the subsequence between B and A as our list is topologically sorted: A is not reachable from any node to the right of A. For my situation, where localised reorderings are the most frequent and important, this an important gain.

As we don't reorder within the sequences X,A,B,Y, clearly any arcs which start or end within the same sequence are still ordered correctly, and the same in each flank, and any "fly-over" arcs from the left to the right flanks. Any arcs between the flanks and X,A,B,Y are also still ordered correctly as our reordering is restricted to this local region. So we only need to consider arcs between our four sequences. Consider each possible "problematic" arc for our final ordering XABY in turn: YB YA YX BA BX AX. Our initial order was B[XY]A, so AX and YB cannot occur. X reaches A, but Y does not, therefore YX and YA do not occur or A could be reached from the source of the arc in Y (potentially via X) a contradiction. Our criterion for acceptability was that there are no links BX or BA. So there are no problematic arcs, and we are still topologically sorted.

Our only acceptability criterion (that A is not reachable from B) is clearly sufficient to create a cycle on adding the arc A->B: B -(X)-> A -> B, so the converse is also shown.

This can be implemented reasonably efficiently if we can add a flag to each node. Consider the nodes [BXY] going right-to-left from the node immediately to the left of A. If that node has a direct arc to A then set the flag. At an arbitrary such node, we need only consider direct outgoing arcs: the nodes to its right are either after A (and so irrelevant), or else have already been flagged if reachable from A, so the flag on such an arbitrary node is set when any flagged nodes are encountered by direct link. If B is not flagged at the end of the process, the reordering is acceptable and the flagged nodes comprise X.

Though this always yields a correct ordering if carried to completion (as far as I can tell), as I mentioned in the introduction it is particularly efficient if your initial build is approximately correct (in the sense of accommodating of likely additional arcs without reordering).

There also exists an effective approximation, if your context is such that "outrageous" arcs can be rejected (those which would massively reorder) by limiting the A to B distance you are prepared to scan. If you have an initial list of the additional arcs you wish to add, they can be ordered by increasing distance in the initial ordering until you run out of some scanning "credit", and call your optimisation a day at that point.

Cormophyte answered 16/12, 2022 at 11:30 Comment(0)
C
0

If all previous jobs are in Topologically sorted order. Then if you add an edge that appears to brake the sort, and can not be fixed, then you have a cycle.

https://mcmap.net/q/36462/-which-cycle-detection-within-a-directed-graph-are-more-efficient-than-o-n-2-closed

So if we have a sorted list of nodes:

1, 2, 3, ..., x, ..., z, ...
Such that each node is waiting for nodes to its left.

Say we want to add an edge from x->z. Well that appears to brake the sort. So we can move the node at x to position z+1 which will fix the sort iif none of the nodes (x, z] have an edge to the node at x.

Criseldacrisey answered 6/12, 2013 at 16:24 Comment(0)
C
0

If the graph is directed, you would only have to check the parent nodes (navigate up until you reach the root) of the node where the new edge should start. If one of the parent nodes is equal to the end of the edge, adding the edge would create a cycle.

Cowpea answered 6/12, 2013 at 16:45 Comment(1)
The graph need not be a tree. The root may be reachable through different ancestors, and you would have to check all of them. Not that this is particularly costly, but it is a potentially O(|E|) solution -- and exactly what @Underbodice is proposing.Bul

© 2022 - 2024 — McMap. All rights reserved.