Why DFS and not BFS for finding cycle in graphs
Asked Answered
S

12

116

Predominantly DFS is used to find a cycle in graphs and not BFS. Any reasons? Both can find if a node has already been visited while traversing the tree/graph.

Seawright answered 19/5, 2010 at 21:42 Comment(2)
In directed graphs, only DFS can be used to detect a cycle; but in Undirected graphs both can be used.Scutate
are you sure if you are still saying BFS is not applicable on Directed Graph? ref: geeksforgeeks.org/detect-cycle-in-a-directed-graph-using-bfsSocage
A
95

Depth first search is more memory efficient than breadth first search as you can backtrack sooner. It is also easier to implement if you use the call stack but this relies on the longest path not overflowing the stack.

Also if your graph is directed then you have to not just remember if you have visited a node or not, but also how you got there. Otherwise you might think you have found a cycle but in reality all you have is two separate paths A->B but that doesn't mean there is a path B->A. For example,

If you do BFS starting from 0, it will detect as cycle is present but actually there is no cycle.

With a depth first search you can mark nodes as visited as you descend and unmark them as you backtrack. See comments for a performance improvement on this algorithm.

For the best algorithm for detecting cycles in a directed graph you could look at Tarjan's algorithm.

Ashien answered 19/5, 2010 at 21:44 Comment(7)
(Memory efficient because you get to backtrack sooner, and easier to implement because you can just let the stack take care of storing the open list instead of having to explicitly maintain it.)Coady
IMO, it's only easier if you can rely on tail recursion.Palomino
"unmark them as you backtrack" - at your own peril! This can easily lead to O(n^2) behavior, specifically such a DFS would misunderstand cross edges as "tree" edges ("tree" edges would also be a misnomer since they wouldn't actually form a tree anymore)Debenture
@Dimitris Andreo: You can use three visited states instead of two to improve performance. With directed graphs there's a difference between 'I've seen this node before' and 'This node is part of a loop'. With undirected graphs they are equivalent.Ashien
Exactly, you definitely need a third state (to make the algorithm linear), so you should consider revising that part.Debenture
@MarkByers: I have posted a question in this link based on your answer here. can you comment on this plz: #20555791Belle
@DimitrisAndreou: I have posted a question where I do not use backtracking nor a third state is essential, but could detect cycles in directed graph. can you pass you valuable comment here: #20555791Belle
P
35
  1. DFS is easier to implement
  2. Once DFS finds a cycle, the stack will contain the nodes forming the cycle. The same is not true for BFS, so you need to do extra work if you want to also print the found cycle. This makes DFS a lot more convenient.
Protuberancy answered 19/5, 2010 at 21:51 Comment(0)
A
21

I don't know why such an old question popped up in my feed, but all the previous answers are bad, so...

DFS is used to find cycles in directed graphs, because it works.

In a DFS, every vertex is "visited", where visiting a vertex means:

  1. The vertex is started
  2. The subgraph reachable from that vertex is visited. This includes tracing all untraced edges that are reachable from that vertex, and visiting all reachable unvisited vertexes.

  3. The vertex is finished.

The critical feature is that all edges reachable from a vertex are traced before the vertex is finished. This is a feature of DFS, but not BFS. In fact this is the definition of DFS.

Because of this feature, we know that when the first vertex in a cycle is started:

  1. None of the edges in the cycle have been traced. We know this, because you can only get to them from another vertex in the cycle, and we're talking about the first vertex to be started.
  2. All untraced edges reachable from that vertex will be traced before it is finished, and that includes all the edges in the cycle, because none of them has been traced yet. Therefore, if there is a cycle, we will find an edge back to the first vertex after it is started, but before it is finished; and
  3. Since all edges that are traced are reachable from every started-but-unfinished vertex, finding an edge to such a vertex always indicates a cycle.

So, if there is a cycle, then we are guaranteed to find an edge to a started-but-unfinished vertex (2), and if we find such an edge, then we are guaranteed that there is a cycle (3).

That's why DFS is used to find cycles in directed graphs.

BFS provides no such guarantees, so it just doesn't work. (notwithstanding perfectly good cycle-finding algorithms that include BFS or similar as a sub-procedure)

An undirected graph, on the other hand, has a cycle whenever there are two paths between any pair of vertexes, i.e., when it's not a tree. This is easy to detect during either BFS or DFS -- The edges traced to new vertexes form a tree, and any other edge indicates a cycle.

Amey answered 3/10, 2019 at 1:59 Comment(1)
Indeed, this is the most (maybe the only) related answer here, elaborating on the actual reasons.Bay
D
15

A BFS could be reasonable if the graph is undirected (be my guest at showing an efficient algorithm using BFS that would report the cycles in a directed graph!), where each "cross edge" defines a cycle. If the cross edge is {v1, v2}, and the root (in the BFS tree) that contains those nodes is r, then the cycle is r ~ v1 - v2 ~ r (~ is a path, - a single edge), which can be reported almost as easily as in DFS.

The only reason to use a BFS would be if you know your (undirected) graph is going to have long paths and small path cover (in other words, deep and narrow). In that case, BFS would require proportionally less memory for its queue than DFS' stack (both still linear of course).

In all other cases, DFS is clearly the winner. It works on both directed and undirected graphs, and it is trivial to report the cycles - just concat any back edge to the path from the ancestor to the descendant, and you get the cycle. All in all, much better and practical than BFS for this problem.

Debenture answered 20/5, 2010 at 8:53 Comment(0)
F
6

BFS wont work for a directed graph in finding cycles. Consider A->B and A->C->B as paths from A to B in a graph. BFS will say that after going along one of the path that B is visited. When continuing to travel the next path it will say that marked node B has been again found,hence, a cycle is there. Clearly there is no cycle here.

Filose answered 29/1, 2015 at 6:0 Comment(6)
Can you explain how DFS will clearly identify that cycle does not exist in your example.I agree that the cycle does not exist in provided example.But if we go from A->B and then A->C->B we will find that B was already visited and its parent is A not C..and i read that DFS will detect the cycle by comparing the parent of already visited element with current node from which direction we are checking at this moment.am i getting DFS wrong or what?Harlan
All you've shown here is that this particular implementation doesn't work, not that it's impossible with BFS. In fact, it is possible, although it takes more work and space.Donelson
@Prune: All the threads(i think) here are trying to prove that bfs wont work for detecting cycles. If you know how to counter prove you should give the proof. Simply saying that efforts are greater wont sufficeFilose
Since the algorithm is given in the linked postings, I don't feel it's appropriate to repeat the outline here.Donelson
I couldnt find any linked postings, hence asked for the same. I agree with your point about bfs capability and have just thought about the implementation. Thanks for the tip :)Filose
BFS can work for cycle detection in Directed graph too, Look for Kahn's algorithm which uses DAG facts to detect cycle 1. In DAG there must be atleast 1 root(indeg=0) 2. After each predecessor node is processed, there must be atleast 1 new root node. This is done be reducing indegree of direct successor nodes at each root node starting from topmost root node(indeg=0). More info: geeksforgeeks.org/detect-cycle-in-a-directed-graph-using-bfsSocage
F
3

If you place a cycle at a random spot in a tree, DFS will tend to hit the cycle when it's covered about half the tree, and half the time it will have already traversed where the cycle goes, and half the time it will not (and will find it on average in half the rest of the tree), so it will evaluate on average about 0.5*0.5 + 0.5*0.75 = 0.625 of the tree.

If you place a cycle at a random spot in a tree, BFS will tend to hit the cycle only when it's evaluated the layer of the tree at that depth. Thus, you usually end up having to evaluate the leaves of a balance binary tree, which generally results in evaluating more of the tree. In particular, 3/4 of the time at least one of the two links appear in the leaves of the tree, and on those cases you have to evaluate on average 3/4 of the tree (if there is one link) or 7/8 of the tree (if there are two), so you're already up to an expectation of searching 1/2*3/4 + 1/4*7/8 = (7+12)/32 = 21/32 = 0.656... of the tree without even adding the cost of searching a tree with a cycle added away from the leaf nodes.

In addition, DFS is easier to implement than BFS. So it's the one to use unless you know something about your cycles (e.g. cycles are likely to be near the root from which you search, at which point BFS gives you an advantage).

Forcer answered 19/5, 2010 at 21:53 Comment(4)
A lot of magic numbers there. I disagree with the "DFS is faster" arguments. It depends entirely on the input, and no input is more common than another in this case.Protuberancy
@Vlad - The numbers aren't magic. They are means, are are stated as such, and are almost trivial to calculate given the assumptions I stated. If approximating by the mean is a bad approximation, that would be a valid criticism. (And I explicitly stated that if you could make assumptions about the structure, the answer could change.)Forcer
the numbers are magical because they don't mean anything. You took a case DFS does better on and extrapolated those results to the general case. Your statements are unfounded: "DFS will tend to hit the cycle when it's covered about half the tree": prove it. Not to mention that you cannot talk about cycles in a tree. A tree does not have a cycle by definition. I just don't see what your point is. DFS will go one way until it hits a dead end, so you have no way of knowing how much of the GRAPH (NOT tree) it will explore on average. You just picked a random case that proves nothing.Protuberancy
@Vlad - All noncyclic fully connected undirected graphs are (unrooted undirected) trees. I meant "a graph that would be a tree save for one spurious link". Perhaps this isn't the main application for the algorithm--maybe you want to find cycles in some tangled graph that has very many links that make it not a tree. But if it is tree-like, averaged over all graphs, any node is equally likely to be the source of said spurious link, which makes the expected tree coverage 50% when the link is hit. So I accept that the example may not have been representative. But the math should be trivial.Forcer
H
2

You'll have to use BFS when you want to find the shortest cycle containing a given node in a directed graph.

Eg:enter image description here

If the given node is 2, there are three cycles where it is part of - [2,3,4], [2,3,4,5,6,7,8,9] & [2,5,6,7,8,9]. Shortest is [2,3,4]

For implementing this using BFS, you have to explicitly maintain history of visited nodes using proper data structures.

But for all other purposes (eg: to find any cyclical path or to check if a cycle exists or not), DFS is the clear choice for reasons mentioned by others.

Hancock answered 28/5, 2020 at 10:2 Comment(0)
R
1

To prove that a graph is cyclic you just need to prove it has one cycle(edge pointing towards itself either directly or indirectly).

In DFS we take one vertex at a time and check if it has cycle. As soon as a cycle is found we can omit checking other vertices.

In BFS we need to keep track of many vertex edges simultaneously and more often than not at the end you find out if it has cycle. As the size of the graph grows BFS requires more space, computation and time compared to DFS.

Recant answered 28/7, 2015 at 1:32 Comment(0)
P
1

I found that both BFS and DFS can be used to detect a cycle. Some questions mentioned that BFS can not be used with directed graph. I humble disagree.

In BFS-Queue we can keep track of parent node list/set, and if the same node is encountered again(except immediate parent) we can mark it a cycle. So this way BFS should work with directed graph also.

Although this will be highly memory inefficient in comparison to DFS and that is the reason we mainly use DFS.

  1. DFS is memory efficient
  2. DFS is easier to visualize as it already uses a explicit/implicit stack
Prepared answered 21/11, 2021 at 8:17 Comment(0)
G
0

It sort of depends if you are talking about recursive or iterative implementations.

Recursive-DFS visits every node twice. Iterative-BFS visits every node once.

If you want to detect a cycle, you need to investigate the nodes both before and after you add their adjacencies -- both when you "start" on a node and when you "finish" with a node.

This requires more work in Iterative-BFS so most people choose Recursive-DFS.

Note that a simple implementation of Iterative-DFS with, say, std::stack has the same problem as Iterative-BFS. In that case, you need to place dummy elements into the stack to track when you "finish" working on a node.

See this answer for more details on how Iterative-DFS requires additional work to determine when you "finish" with a node (answered in the context of TopoSort):

Topological sort using DFS without recursion

Hopefully that explains why people favor Recursive-DFS for problems where you need to determine when you "finish" processing a node.

Gravimeter answered 9/2, 2016 at 20:28 Comment(1)
This is totally wrong, since it doesn't matter whether you use recursion or you eliminate recursion by iteration. You can implement an iterative DFS which visits every node twice, just like you can implement a recursive variant which visits every node only once.Bay
R
0

you can use either of one both are able to do task. but many people more prefer DSF because it help in further graph algorithms and also DFS in easy to implement. also DFS is space optimized way compare to BFS. but it's totally depend on you what you choose.

Rabush answered 1/3 at 20:59 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Subvene
C
-1

DFS is optimized approach then bfs.

Cachet answered 25/3, 2023 at 13:30 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.