How does a Breadth-First Search work when looking for Shortest Path?
Asked Answered
T

9

180

I've done some research, and I seem to be missing one small part of this algorithm. I understand how a Breadth-First Search works, but I don't understand how exactly it will get me to a specific path, as opposed to just telling me where each individual node can go. I guess the easiest way to explain my confusion is to provide an example:

So for instance, let's say I have a graph like this:

enter image description here

And my goal is to get from A to E (all edges are unweighted).

I start at A, because that's my origin. I queue A, followed by immediately dequeueing A and exploring it. This yields B and D, because A is connected to B and D. I thus queue both B and D.

I dequeue B and explore it, and find that it leads to A (already explored), and C, so I queue C. I then dequeue D, and find that it leads to E, my goal. I then dequeue C, and find that it also leads to E, my goal.

I know logically that the fastest path is A->D->E, but I'm not sure how exactly the breadth-first search helps - how should I be recording paths such that when I finish, I can analyze the results and see that the shortest path is A->D->E?

Also, note that I'm not actually using a tree, so there are no "parent" nodes, only children.

Tammitammie answered 5/12, 2011 at 0:35 Comment(1)
"Also, note that I'm using not actually using a tree, so there are no "parent" nodes, only children" - well you obviously will have to store the parent somewhere. For DFS you're doing it indirectly through the call stack, for BFS you have to do it explicitly. Nothing you can do about it I fear :)Reconstructive
A
116

Technically, Breadth-first search (BFS) by itself does not let you find the shortest path, simply because BFS is not looking for a shortest path: BFS describes a strategy for searching a graph, but it does not say that you must search for anything in particular.

Dijkstra's algorithm adapts BFS to let you find single-source shortest paths.

In order to retrieve the shortest path from the origin to a node, you need to maintain two items for each node in the graph: its current shortest distance, and the preceding node in the shortest path. Initially all distances are set to infinity, and all predecessors are set to empty. In your example, you set A's distance to zero, and then proceed with the BFS. On each step you check if you can improve the distance of a descendant, i.e. the distance from the origin to the predecessor plus the length of the edge that you are exploring is less than the current best distance for the node in question. If you can improve the distance, set the new shortest path, and remember the predecessor through which that path has been acquired. When the BFS queue is empty, pick a node (in your example, it's E) and traverse its predecessors back to the origin. This would give you the shortest path.

If this sounds a bit confusing, wikipedia has a nice pseudocode section on the topic.

Alix answered 5/12, 2011 at 0:52 Comment(10)
Thank you! I had read over the pseudocode previously but was unable to understand it, your explanation made it click for meTammitammie
I'd like to make the following note for people that look at this post in the future: If the edges are unweighted, there is no need to store the "current shortest distance" for each node. All that needs to be stored is the parent for each node discovered. Thus, while you are examining a node and enqueueing all of its successors, simply set the parent of those nodes to the node you are examining (this will double as marking them "discovered").If this parent pointer is NUL/nil/None for any given node, it means either that it has not yet been discovered by BFS or it is the source/root node itself.Agonistic
@Agonistic If we are not maintaining distance then how would we know the shortest distance,please explain more.Ceolaceorl
@gauravsehgal That comment is for graphs with unweighted edges. BFS will find the shortest distance simply because of its radial-search pattern which considers nodes in order of their distance from the starting point.Agonistic
A tip for readers of @Shashank's comment: unweighted and uniformly weighted (e.g. all edges have weight=5) are equivalent.Shuttlecock
It can be shown that, for unweighted graphs, the BFS is equivalent to Dijkstra's algorithm and thus finds the shortest path in this circumstance.Councilman
@dasblinkenlight I understand from definition of Dijkstra is that it finds the shortest path. But in case of a weighted graph it looks for a path that has the minimum weights which could be shortest . So is it safe to say that in weighted graph Dijkstra will not always find the shortest path?Keyhole
@Keyhole Absolutely. Dijkstra's algorithm is not minimizing the number of edges in the path in any way, only the total weight of the path. In an unweighted graph this happens to coincide with the shortest path, because all edges have the same weight.Alix
@Agonistic but if there is more than one parent like in the question example, where E has both C and D as parents..Orjonikidze
@Orjonikidze As Shashank says, "because of its radial-search pattern which considers nodes in order of their distance from the starting point.". So they are ordered appropriately because their distance are $c\cdot k$ where $k$ is the iteration index when the node is first met (prev[v] in the while loop can't changed more than once in wikipedia pseudocode). But in weighed case the parent may be changed due to the smaller distance because they are calculated by summing up different weights (e.g. 3-length edge sequence 1+2+3 will be smaller than 1-length sequence 10).Rhoades
M
74

As pointed above, BFS can only be used to find shortest path in a graph if:

  1. There are no loops

  2. All edges have same weight or no weight.

To find the shortest path, all you have to do is start from the source and perform a breadth first search and stop when you find your destination Node. The only additional thing you need to do is have an array previous[n] which will store the previous node for every node visited. The previous of source can be null.

To print the path, simple loop through the previous[] array from source till you reach destination and print the nodes. DFS can also be used to find the shortest path in a graph under similar conditions.

However, if the graph is more complex, containing weighted edges and loops, then we need a more sophisticated version of BFS, i.e. Dijkstra's algorithm.

Macrophage answered 13/9, 2013 at 5:7 Comment(8)
Dijkstra if no -ve weights else use bellman ford algo if -ve weightsUnific
Does BFS work to find all the shortest paths between two nodes?Upheaval
@javaProgrammer, it is not right. BFS can be used to find shortest path in an unweighted cyclic graph as well. If a graph is unweighted, then BFS can be applied for SP regardless of having loops.Escadrille
To print the path, simple loop through the previous[] array from source till you reach destination. But previous of source is null. I think what you meant is, backtrace from destination using the previous array until you reach the source.Truce
Why do you say DFS will work under similar conditions? Is it not possible for DFS to take a naive round-about route to get from nodes start->end, and therefore give you a path that is not the shortest?Flushing
storing the previous nodes of visited nodes & printing the array doesn't give the shortest path. Did you try it? It has some additional backtracked nodes too. Do you have that code anywhere?Ciceronian
Imagine i doing a BFS in a unweighted graph and everytime that a node is visited first time(because have not null verification) i store the previous node in this node. Obviosly if we backtrace from the end node to start node we will be able to get shortest pathBromleigh
"There are no loops" - this is incorrect. See my answer https://mcmap.net/q/136345/-how-does-a-breadth-first-search-work-when-looking-for-shortest-pathCorydalis
D
31

From tutorial here

"It has the extremely useful property that if all of the edges in a graph are unweighted (or the same weight) then the first time a node is visited is the shortest path to that node from the source node"

Disbranch answered 17/12, 2012 at 5:38 Comment(1)
This is good for directly reachable node (1->2) (2 is reached directly from 1). For non-directly reachable node there is more work (1->2->3, 3 is not reached directly from 1). Of course, it is still true considering individually, i.e. 1->2 & 2->3 individually are shortest paths.Runyon
R
17

I have wasted 3 days
ultimately solved a graph question
used for
finding shortest distance
using BFS

Want to share the experience.

When the (undirected for me) graph has
fixed distance (1, 6, etc.) for edges

#1
We can use BFS to find shortest path simply by traversing it
then, if required, multiply with fixed distance (1, 6, etc.)

#2
As noted above
with BFS
the very 1st time an adjacent node is reached, it is shortest path

#3
It does not matter what queue you use
   deque/queue(c++) or
   your own queue implementation (in c language)
   A circular queue is unnecessary

#4
Number of elements required for queue is N+1 at most, which I used
(dint check if N works)
here, N is V, number of vertices.

#5
Wikipedia BFS will work, and is sufficient.
    https://en.wikipedia.org/wiki/Breadth-first_search#Pseudocode

I have lost 3 days trying all above alternatives, verifying & re-verifying again and again above
they are not the issue.
(Try to spend time looking for other issues, if you dint find any issues with above 5).


More explanation from the comment below:

      A
     /  \
  B       C
 /\       /\
D  E     F  G

Assume above is your graph
graph goes downwards
For A, the adjacents are B & C
For B, the adjacents are D & E
For C, the adjacents are F & G

say, start node is A

  1. when you reach A, to, B & C the shortest distance to B & C from A is 1

  2. when you reach D or E, thru B, the shortest distance to A & D is 2 (A->B->D)

similarly, A->E is 2 (A->B->E)

also, A->F & A->G is 2

So, now instead of 1 distance between nodes, if it is 6, then just multiply the answer by 6
example,
if distance between each is 1, then A->E is 2 (A->B->E = 1+1)
if distance between each is 6, then A->E is 12 (A->B->E = 6+6)

yes, bfs may take any path
but we are calculating for all paths

if you have to go from A to Z, then we travel all paths from A to an intermediate I, and since there will be many paths we discard all but shortest path till I, then continue with shortest path ahead to next node J
again if there are multiple paths from I to J, we only take shortest one
example,
assume,
A -> I we have distance 5
(STEP) assume, I -> J we have multiple paths, of distances 7 & 8, since 7 is shortest
we take A -> J as 5 (A->I shortest) + 8 (shortest now) = 13
so A->J is now 13
we repeat now above (STEP) for J -> K and so on, till we get to Z

Read this part, 2 or 3 times, and draw on paper, you will surely get what i am saying, best of luck


Runyon answered 30/9, 2015 at 16:58 Comment(5)
Could you please elaborate how did you manage to find the shortest path with a breadth first search. A breadth first search mainly searches for a node, there can be n paths to a goal node from the source node & bfs may take any path. How are you determining the best path?Ciceronian
i have added 'more explanation' part to above answer, let me know if that satisfiesRunyon
I see you are trying to run a BFS on weighted graph. Of distances 7 & 8 why did you choose 8? why not 7? what if the 8 node doesn't have further edges to destination? The flow will have to choose 7 then.Ciceronian
good question, upvoted, yes, we are not throwing away, we keep track of all adjacent nodes, until we reach destination. BFS will only work when there are only constant distances like all 7 or all 8. I gave a generic one which has 7 & 8 which is also called as dijkstra's algorithm.Runyon
sorry, not sure what you mean, see en.wikipedia.org/wiki/Dijkstra's_algorithmRunyon
C
7

Breadth-first search will always find the shortest path in an unweighted graph (for weighted graphs, see Dijkstra's algorithm). The graph may be cyclic or acyclic. Pseudocode is below.

Note that the following pseudocode uses a queue to manage which vertices to visit. It also assumes you have vertex objects where each vertex is initialized with

vertex {
    visited  = false
    distance = infinity
    name     = "anything unique" [or, "start_vertex"/"end_vertex"]
    children = [ ... ]
}

start_vertex below is a reference to the vertex with the name start_vertex

Algorithm:

start_vertex.distance = 0
queue.enqueue(start_vertex)

while not queue.empty():    
    curr = queue.dequeue()

    if curr.name == "end_vertex":
        return curr.distance

    # deals with cyclic graphs
    if curr.visited: 
        continue

    curr.visited        = true
    curr_child_distance = curr.distance + 1

    for child in curr.children:
        if not child.visited and child.distance > curr_child_distance:
            child.distance = curr_child_distance
            queue.enqueue(child)

# no path connecting the vertices if we get here
return -1 
Corydalis answered 28/7, 2020 at 21:33 Comment(0)
G
3

Based on acheron55 answer I posted a possible implementation here.
Here is a brief summery of it:

All you have to do, is to keep track of the path through which the target has been reached. A simple way to do it, is to push into the Queue the whole path used to reach a node, rather than the node itself.
The benefit of doing so is that when the target has been reached the queue holds the path used to reach it.
This is also applicable to cyclic graphs, where a node can have more than one parent.

Glissade answered 15/1, 2018 at 9:44 Comment(0)
G
0

I had the same lack of understanding. It took me a while to wrap my head around how BFS finds the shortest path between any two nodes.

Unfortunately, most answers are unhelpful and convolute the matter by bringing in Djikstra's algorithm unnecessarily. I shall expand upon the answer by Himanshu who raised the crucial point that...

Just by the way BFS traverses a graph, any node it reaches is the shortest path to that node from the source node. Suppose we need to find the path D-C in the following simple graph. Follow along using pen and paper.

Smallgraph4nodes

We start from D and explore the graph using BFS. You see that it's certain that we shall reach C through B and not A because B would come first in our traversal. This is why we think of nodes as parents and children and why this information is stored in the form of a dictionary as we move along the graph for later retrieval to construct a path. Suppose this dict is called 'rel_dict', rel for relationship.

Let's make this graph bigger (picture and code added for the graph). Now, we'll see that to find the path between any two nodes, it's important that we start from one of the two nodes. Because if we choose a third node the rel_dict so formed at the end of the BFS traversal would possibly not give the shortest path but instead either no path at all or a longer one.

7nodesgraph

import networkx as nx
import matplotlib.pyplot as plt 

d  = {
    "A": ["B", "C"],
    "B": ["A", "D", "E", "C"],
    "C": ["A", 'B', "F"],
    "D": ["B"],
    "E": ["B", "G"],
    "F": ["C"],
    "G": ["E"]
}

g = nx.Graph(d)
nx.draw_networkx(g)
plt.draw()
plt.show()

To dequeue a node we enqueue its children, if any. Simultaneously for each node visited we map it to its parent in the rel_dict dictionary. We keep doing this till the queue empties out. Then to construct the path between the starting node and any other node we, take the ending node find its parent then find the parent of this parent and so on, till we find the starting node and we will. Do this exercise by pen and paper: you'll have to maintain the queue and the rel_dict. Then construct the path in reverse.

Now, try to find the path between any two nodes neither of which were the starting nodes and you'll see that it doesn't give the correct answer.

Gudren answered 13/2, 2023 at 7:32 Comment(0)
D
-1

A good explanation of how BFS computes shortest paths, accompanied by the most efficient simple BFS algorithm of which I'm aware and also by working code, is provided in the following peer-reviewed paper:

https://queue.acm.org/detail.cfm?id=3424304

The paper explains how BFS computes a shortest-paths tree represented by per-vertex parent pointers, and how to recover a particular shortest path between any two vertices from the parent pointers. The explanation of BFS takes three forms: prose, pseudocode, and a working C program.

The paper also describes "Efficient BFS" (E-BFS), a simple variant of classic textbook BFS that improves its efficiency. In the asymptotic analysis, running time improves from Theta(V+E) to Omega(V). In words: classic BFS always runs in time proportional to the number of vertices plus the number of edges, whereas E-BFS sometimes runs in time proportional to the number of vertices alone, which can be much smaller. In practice E-BFS can be much faster, depending on the input graph. E-BFS sometimes offers no advantage over classic BFS but it's never much slower.

Remarkably, despites its simplicity E-BFS appears not to be widely known.

Delilahdelimit answered 28/10, 2020 at 18:20 Comment(1)
Hi Terrence, welcome to stack overflow. The link looks interesting but your answer itself doesn't seem to attempt to answer the question. As links can and do break, it's always appreciated if an answer tries to include relevant details from an linked resources.Huzzah
A
-13

The following solution works for all the test cases.

import java.io.*;
import java.util.*;
import java.text.*;
import java.math.*;
import java.util.regex.*;

public class Solution {

   public static void main(String[] args)
        {
            Scanner sc = new Scanner(System.in);

            int testCases = sc.nextInt();

            for (int i = 0; i < testCases; i++)
            {
                int totalNodes = sc.nextInt();
                int totalEdges = sc.nextInt();

                Map<Integer, List<Integer>> adjacencyList = new HashMap<Integer, List<Integer>>();

                for (int j = 0; j < totalEdges; j++)
                {
                    int src = sc.nextInt();
                    int dest = sc.nextInt();

                    if (adjacencyList.get(src) == null)
                    {
                        List<Integer> neighbours = new ArrayList<Integer>();
                        neighbours.add(dest);
                        adjacencyList.put(src, neighbours);
                    } else
                    {
                        List<Integer> neighbours = adjacencyList.get(src);
                        neighbours.add(dest);
                        adjacencyList.put(src, neighbours);
                    }


                    if (adjacencyList.get(dest) == null)
                    {
                        List<Integer> neighbours = new ArrayList<Integer>();
                        neighbours.add(src);
                        adjacencyList.put(dest, neighbours);
                    } else
                    {
                        List<Integer> neighbours = adjacencyList.get(dest);
                        neighbours.add(src);
                        adjacencyList.put(dest, neighbours);
                    }
                }

                int start = sc.nextInt();

                Queue<Integer> queue = new LinkedList<>();

                queue.add(start);

                int[] costs = new int[totalNodes + 1];

                Arrays.fill(costs, 0);

                costs[start] = 0;

                Map<String, Integer> visited = new HashMap<String, Integer>();

                while (!queue.isEmpty())
                {
                    int node = queue.remove();

                    if(visited.get(node +"") != null)
                    {
                        continue;
                    }

                    visited.put(node + "", 1);

                    int nodeCost = costs[node];

                    List<Integer> children = adjacencyList.get(node);

                    if (children != null)
                    {
                        for (Integer child : children)
                        {
                            int total = nodeCost + 6;
                            String key = child + "";

                            if (visited.get(key) == null)
                            {
                                queue.add(child);

                                if (costs[child] == 0)
                                {
                                    costs[child] = total;
                                } else if (costs[child] > total)
                                {
                                    costs[child] = total;
                                }
                            }
                        }
                    }
                }

                for (int k = 1; k <= totalNodes; k++)
                {
                    if (k == start)
                    {
                        continue;
                    }

                    System.out.print(costs[k] == 0 ? -1 : costs[k]);
                    System.out.print(" ");
                }
                System.out.println();
            }
        }
}
Anni answered 29/1, 2016 at 19:43 Comment(1)
Downvoted for not answering the question. Simply pasting a code snippet won't work on SO.Lowell

© 2022 - 2024 — McMap. All rights reserved.