Algorithm to find multiple short paths
Asked Answered
P

2

6

Seeking an algorithm that will yield N short paths.

Does anyone have experience with an algorithm to find multiple short paths in a directed graph? My application is for language (finding synonym chains) but logically, this could be for geography, or social networks instead. I want significantly different paths, not just swapping a few nodes along the way. I would really like also if there's a way to avoid "hubs", i.e. huge airports or super-connectors; or in language, words that have a ton of meanings.

(For my application, I've solved this with Dijkstra's and A-star, but I don't have a good way yet to get multiple paths, other than changing weights after I get the first path.)

For example (if a social network), how do I find multiple paths that connect ME and YOU, with mostly different people along the way. Chances are with 4-7 points of separation, that's possible. For both language, and social networks, it's often around 6 degrees of separation. i.e. it rarely takes 20+ nodes.

I've seen Dijkstra's algorithm to find all the shortest paths possible, a variation of shortest path algorithm, What is difference between BFS and Dijkstra's algorithms when looking for shortest path?, Find several shortest paths with A* algorithm -- but none are what I seek.

Surely someone has figured this out, but I don't know what to search for.

Popularize answered 1/3, 2021 at 22:22 Comment(0)
B
2

Network flows

This one is better for the social networks case, however it could be bent to include the synonym chains as well. The algorithm I have in mind is the Dinic's since it's augmenting paths are selected to be the shortest available paths. This will allow us to modify the algorithm to suit your case. Also note that we will be working with integer network flows.

Graph construction:

  • source, sink
  • for every person p, nodes ps, pe and a directed edge (ps, pe) with the capacity of one.1
  • for every edge (u,v) in your original graph a chain of edges (ue, x1), (x1, x2), ... (xk, vs) so that the number of the chain links equals the weight of the (u, v).2 This is to make use of the fact that Dinic finds the shortest improvement to the current flow so this penalizes the longer chains from being used too early.
  • For the person a you want to start with, change the capacity of (xs, xe) to the number of paths you wish to find.3
  • Ad an edge with unlimited capacity from the source to xs
  • Merge your target person with the sink. (Or add the appropriate edges)

Run the Dinic's algorithm. The resulting flow will consist of the maximal number of disjunct paths. If you terminate the algorithm soon enough, these will likely be quite short since that is what the algorithm starts with. However since the network flow in the graph we constructed attempts to maximize the number of disjunct paths, it will start to prefer the longer ones if it improves the count. That is why you might want to limit the capacity of the first node's edge.


1Bigger capacities won't work for this case because it would just increase the flow through the shortest path uniformly. However you can try to tweak some of the edges if you wish to allow at least few hubs or if your branching starts later.
2Note that if you have unweighted graph, then the edge (ue, vs) will be enough.
3Or infinity if you wish to find all of them. This would likely come with the cost of them no longer being reasonably short.

Binkley answered 2/3, 2021 at 10:6 Comment(0)
C
1

This is a variation on the answer that @Shamis already provided. It allows nodes to be reused so that if there is a bottleneck, you'll still find n solutions.

More general than a maximum flow is a minimum cost flow. So try this.

Take your graph, give each edge capacity 1 and cost 1. Now add a second edge for each first with capacity 1 and cost 4. (I'm using 4 as a heuristic to make it avoid that edge if it can.) Add a third edge with capacity 1 and cost 16. And so on up to edges of cost 4^n.

In this book you will find the following non-obvious result about minimum cost flows:

Since basic feasible solutions are integer-valued, when there is an optimal solution, there will be one that is integer-valued. This enables use of linear programming algorithms to solve min-cost-flow problems even when integer valued solutions are required. We discuss some examples in the following subsections.

We now look for a minimum cost flow of capacity n. If there are n independent solutions, you'll find them. If they have to reuse nodes, you'll find solutions that work hard to not reuse nodes. They will go longer if they need to.

If you change the sequence 1, 4, 16, ... to something else you'll get a different tradeoff between accepting longer solutions and avoiding duplicate ones.

I offer no estimates on how efficient this is. Just that it is doable.

Consortium answered 2/3, 2021 at 18:8 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.