Finding the shortest path in a graph without any negative prefixes
Find the shortest path from source to destination in a directed graph with positive and negative edge weights, such that at no point along the path is the running sum of edge weights negative. If no such path exists, report that too.

I tried to use a modified Bellman-Ford, but could not find a correct solution.


I would like to clarify a few points:

  1. Yes, there can be negative-weight cycles.
  2. n is the number of edges.
  3. Assume that an O(n)-length path exists if the problem has a solution.
  4. Edge weights are +1/-1.
Acute answered 1/3, 2012 at 2:2 Comment(13)
This came up in a special interest group on algorithms which I am a member of. It was suggested by a professor, though. – Acute
Can the path contain cycles? Or must it be acyclic? Also, how do you define "shortest"? Fewest number of edges? Least total cost? – Equilateral
Yes, cycles are allowed. Also, the graph is directed; I am editing the question. – Acute
How are you measuring the "shortest" path? Is that the one with the fewest number of edges in it? Or the one with the least total cost? – Equilateral
Least total cost, but obviously positive. – Acute
Are the edge costs integers? Real numbers? Always +1/-1? (Sorry if I'm being a bit pedantic - I really like this problem and just want to make sure I have it down right!) – Equilateral
Also, do you want the length of the path, or the path itself? If you want the path itself, you can't get an efficient algorithm in all cases, since the path might be exponentially long (think of a hugely negative edge with a long cycle before it; you have to go around the cycle many times before you can cross the edge). – Equilateral
I think we need to restrict the problem to the acyclic setting, since if we do have a cycle we can always form the solution around the cycle to reduce the cost. – Psychoneurotic
@templatetypedef, I think it is better to rewrite the question in the format you like (with so many answered and unanswered comments it's not clear what your goal is). – Deathblow
@user1240022: the cost must not cross zero en route but should be as close to zero as possible at the end? That's a somewhat absurd requirement. And I feel it's NP-hard for that reason - the cost dynamics en route don't correlate with the net value, so we can't walk local minima like Bellman-Ford does. We'd have to consider all the paths, as in the travelling salesman problem. – Grisly
Even with "assume that an O(n)-length path exists if the problem has a solution" (where n is the number of edges), it's at least as hard as the binary/discrete knapsack problems. The graph illustrated in that case has O(n) vertices, O(n) edges, and all paths from start to end are of length O(n). In other words, those requirements make it difficult to see how the problem at hand could be NP-hard (like TSP), but it's still NP. Certainly not O(n^3). – Salvatore
For newcomers to this question, please note that the clarified points at the bottom of the question have changed: previously the weights could be any real number; now they're limited to +1/-1. – Salvatore
After all this time, the OP hasn't managed to make the question clear. I no longer believe that the OP knows what the question is. -1 – Metastasis

Admittedly this isn't a constructive answer; however, it's too long to post as a comment...

It seems to me that this problem contains both the binary and the discrete knapsack problem, so its worst-case running time is at best pseudo-polynomial. Consider a graph that is connected and weighted as follows:

Graph with initial edge with weight x and then a choice of -a(i) or 0 at each step

Then the equivalent binary knapsack problem is to choose weights from the set {a0, ..., an} so as to maximize Σ ai subject to Σ ai ≤ x.

As a side note, if we introduce weighted loops it's easy to construct the unbounded knapsack problem instead.
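As a concrete illustration of the gadget above, here is a small sketch (my own code; the vertex names and the helper `knapsack_gadget` are hypothetical, not from the answer). Taking the "-a_i" branch at step i corresponds to putting item i into the knapsack, and keeping the running sum non-negative enforces Σ ai ≤ x.

```python
def knapsack_gadget(x, a):
    """Build the gadget graph: an initial edge of weight x, then at each
    step a choice between an edge of weight -a[i] and one of weight 0.
    Returns (edges, start, end); edges maps (u, v) -> weight."""
    edges = {("s", 0): x}  # initial edge of weight x
    for i, ai in enumerate(a):
        edges[(i, ("take", i))] = -ai    # take item i ...
        edges[(("take", i), i + 1)] = 0  # ... via an intermediate vertex
        edges[(i, i + 1)] = 0            # or skip item i
    return edges, "s", len(a)

edges, start, end = knapsack_gadget(10, [4, 5, 7])
```

A path from start to end has final cost x minus the sum of the chosen a_i; minimizing that final cost while every prefix sum stays non-negative is exactly maximizing Σ ai subject to Σ ai ≤ x.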

Therefore, any practical algorithm you might choose has a running time that depends on what you consider the "average" case. Is there a restriction to the problem that I've either not considered or not had at my disposal? You seem rather sure it's an O(n^3) problem. (Although what is n in this case?)

Salvatore answered 12/3, 2012 at 21:5 Comment(9)
Brilliant, thanks for the edit, Gareth. :) What software did you use to generate the graph? – Salvatore
You're welcome. I used OmniGraffle. – Stomato
Similarly, I think that for an undirected graph with n vertices you can construct a directed graph with n^2 vertices, with weights of -2^i for vertex i (plus a starting weight of 2^n - 1), and get a solution to the NP-complete Hamiltonian path problem. (The minimum-cost path will go through every vertex once to get a total cost of 0.) – Suksukarno
@PeterdeRivaz - Can you elaborate on this? I tried to make that work, but you could always take some weird cyclic path and end up not actually finding the Hamiltonian cycle. – Equilateral
@GarethRees would setting the negative weights to the reciprocals of primes and the positive weight to their sum fix it? – Respectively
...or just use base n+1 instead of 2. Considering pseudopolynomial algorithms is equivalent to considering edge lengths in {-1, 1}. – Fustigate
@Equilateral S->(i,0) with weight 2^n - 1; for each edge i->j in the original graph, add (i,t)->(j,t+1) with weight -2^i, and (j,n-1)->E with weight -2^j. Find a path from S->E. Does this work? – Suksukarno
@Peter: I expanded on your comment here. – Stomato
Although the OP has since changed the question so that this doesn't directly answer it, this answers the original question I was interested in by providing a proof of the NP-hardness of the original question. Thanks for a simple, elegant reduction! – Equilateral

Peter de Rivaz pointed out in a comment that this problem includes HAMILTONIAN PATH as a special case. His explanation was a bit terse, and it took me a while to figure out the details, so I've drawn some diagrams for the benefit of others who might be struggling. I've made this post community wiki.

I'll use the following graph with six vertices as an example. One of its Hamiltonian paths is shown in bold.

Graph with six vertices and seven edges; one of its Hamiltonian paths shown in bold

Given an undirected graph with n vertices for which we want to find a Hamiltonian path, we construct a new weighted directed graph with n^2 vertices, plus START and END vertices. Label the original vertices v_i and the new vertices w_ik for 0 ≤ i, k < n. If there is an edge between v_i and v_j in the original graph, then for 0 ≤ k < n−1 there are edges in the new graph from w_ik to w_j(k+1) with weight −2^j and from w_jk to w_i(k+1) with weight −2^i. There are edges from START to w_i0 with weight 2^n − 2^i − 1 and from w_i(n−1) to END with weight 0.

It's easiest to think of this construction as being equivalent to starting with a score of 2^n − 1 and then subtracting 2^i each time you visit w_ik. (That's how I've drawn the graph below.)

Each path from START to END must visit exactly n + 2 vertices (one from each row, plus START and END), so the only way for the sum along the path to be zero is for it to visit each column exactly once.
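The construction can be written down directly. Here is a sketch in Python (my own code, not part of the original post; vertices (i, k) stand for w_ik):

```python
def hamiltonian_reduction(n, undirected_edges):
    """Build the weighted digraph of the reduction.
    n: number of vertices in the original graph (labelled 0..n-1).
    undirected_edges: set of pairs (i, j).
    Returns a dict mapping (u, v) -> weight; vertices are
    'START', 'END', and (i, k) for 0 <= i, k < n."""
    edges = {}
    for i in range(n):
        # equivalent to starting with 2^n - 1 and subtracting 2^i on entry
        edges[("START", (i, 0))] = 2 ** n - 2 ** i - 1
        edges[((i, n - 1), "END")] = 0
    for i, j in undirected_edges:
        for k in range(n - 1):
            # subtract 2^j when stepping into column j, and vice versa
            edges[((i, k), (j, k + 1))] = -(2 ** j)
            edges[((j, k), (i, k + 1))] = -(2 ** i)
    return edges
```

Since 2^n − 1 is the sum of 2^i over 0 ≤ i < n, a START-to-END path sums to zero exactly when each column's −2^i is subtracted once. For example, for the two-vertex graph with the single edge (0, 1), the path START -> (0,0) -> (1,1) -> END has weights 2, −2, 0, which sum to zero.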

So here's the original graph with six vertices converted to a new graph with 38 vertices. The original Hamiltonian path corresponds to the path drawn in bold. You can verify that the sum along the path is zero.

Same graph converted to shortest-weighted path format as described.

Stomato answered 1/3, 2012 at 2:2 Comment(2)
It's also not too hard to imagine using this technique to solve the travelling salesman problem: in this example, start with 64 instead, and use weights with small fractional parts that represent the original TSP edge weights while still having the power-of-two integer parts that enforce visiting every node. Making it a cycle can be brute-forced by replicating the graph for each potential starting node. (There may be a nicer way.) – Salvatore
Thanks Gareth! I love your diagrams! (I think this at least answers why the question has resisted efforts to solve it...) – Suksukarno

UPDATE: The OP has now given several rounds of clarifications, and it is a different problem now. I'll leave this here to document my ideas for the first version of the problem (or rather my understanding of it). I'll try a new answer for the current version of the problem. End of UPDATE

It's a pity that the OP hasn't clarified some of the open questions. I'll assume the following:

  1. The weights are +/- 1.
  2. n is the number of vertices

The first assumption is no loss of generality, obviously, but it has great impact on the value of n (via the second assumption). Without the first assumption, even a tiny (fixed) graph can have arbitrarily long solutions by varying the weights without limit.

The algorithm I propose is quite simple, and similar to well-known graph algorithms. I'm no graph expert though, so I may use the wrong words in some places. Feel free to correct me.

  1. For the source vertex, remember cost 0. Add (source, 0) to the todo list.
  2. Pop an item from the todo list. Follow all outgoing edges of the vertex, computing the new cost c to reach the new vertex v. If the new cost is valid (c >= 0 and c <= n ^ 2, see below) and not remembered for v, add it to the remembered cost values of v, and add (v, c) to your todo list.
  3. If the todo list is not empty, continue with step 2. (Or break early if the destination can be reached with cost 0).

It's clear that each "step" that's not an immediate dead end creates a new (vertex, cost) combination. At most n * n ^ 2 = n ^ 3 of these combinations will be stored, and thus, in a certain sense, this algorithm is O(n^3).

Now, why does this find the optimal path? I don't have a real proof, but I think the following ideas justify why I believe this suffices, and it may be possible to turn them into a real proof.

I think it is clear that the only thing we have to show is that the condition c <= n ^ 2 is sufficient.

First, let's note that any (reachable) vertex can be reached with cost less than n.

Let (v, c) be part of an optimal path and c > n ^ 2. As c > n, there must be some cycle on the path before reaching (v, c), where the cost of the cycle is 0 < m1 < n, and there must be some cycle on the path after reaching (v, c), where the cost of the cycle is 0 > m2 > -n.

Furthermore, let v be reachable from the source with cost 0 <= c1 < n, by a path that touches the first cycle mentioned above, and let the destination be reachable from v with cost 0 <= c2 < n, by a path that touches the other cycle mentioned above.

Then we can construct paths from source to v with costs c1, c1 + m1, c1 + 2 * m1, ..., and paths from v to destination with costs c2, c2 + m2, c2 + 2 * m2, ... . Choose 0 <= a <= |m2| and 0 <= b <= m1 such that c1 + c2 + a * m1 + b * m2 is minimal and thus the cost of an optimal path. On this optimal path, v would have cost c1 + a * m1 < n ^ 2.

If the gcd of m1 and m2 is 1, then the cost will be 0. If the gcd is > 1, then it might be possible to choose other cycles such that the gcd becomes 1. If that is not possible, it's also not possible for the optimal solution, and there will be a positive cost for the optimal solution.

(Yes, I can see several problems with this attempt of a proof. It might be necessary to take the gcd of several positive or negative cycle costs etc. I would be very interested in a counterexample, though.)

Here's some (Python) code:

def f(vertices, edges, source, dest):
    # vertices: unique hashable objects
    # edges: mapping (u, v) -> cost; u, v in vertices, cost in {-1, 1}
    from collections import deque

    # vertex_costs stores the possible costs for each vertex
    vertex_costs = dict((v, set()) for v in vertices)
    vertex_costs[source].add(0) # source can be reached with cost 0

    # vertex_costs_from maps (vertex, cost) to the previous vertex on a path achieving that cost
    vertex_costs_from = dict()

    # vertex_gotos is a convenience structure mapping a vertex to all ends of outgoing edges and their cost
    vertex_gotos = dict((v, []) for v in vertices)
    for (u, v), c in edges.items():
        vertex_gotos[u].append((v, c))

    max_c = len(vertices) ** 2 # the crucial number: maximal cost that's possible for an optimal path

    todo = deque([(source, 0)]) # which (vertex, cost) states to expand next

    while todo:
        u, c0 = todo.popleft()
        for v, c1 in vertex_gotos[u]:
            c = c0 + c1
            if 0 <= c <= max_c and c not in vertex_costs[v]:
                vertex_costs[v].add(c)
                vertex_costs_from[v, c] = u
                todo.append((v, c))

    if not vertex_costs[dest]: # destination not reachable
        return None # or raise some Exception

    cost = min(vertex_costs[dest])

    path = [(dest, cost)] # build in reverse order
    v, c = dest, cost
    while (v, c) != (source, 0):
        u = vertex_costs_from[v, c]
        c -= edges[u, v]
        v = u
        path.append((v, c))

    return path[::-1] # return the reversed path

And the output for some graphs (the edges with their weights / the path / the cost at each point of the path; sorry, no nice images):

AB+ BC+ CD+ DA+             AX+ XY+ YH+ HI- IJ- JK- KL- LM- MH-
 A  B  C  D  A  X  Y  H  I  J  K  L  M  H
 0  1  2  3  4  5  6  7  6  5  4  3  2  1
AB+ BC+ CD+ DE+ EF+ FG+ GA+ AX+ XY+ YH+ HI- IJ- JK- KL- LM- MH-
 A  B  C  D  E  F  G  A  B  C  D  E  F  G  A  B  C  D  E  F  G  A  X  Y  H  I  J  K  L  M  H  I  J  K  L  M  H  I  J  K  L  M  H  I  J  K  L  M  H
 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
AB+ BC+ CD+ DE+ EF+ FG+ GA+ AX+ XY+ YH+ HI- IJ- JK- KL- LM- MN- NH-
 A  X  Y  H
 0  1  2  3
AB+ BC+ CD+ DE+ EF+ FG+ GA+ AX+ XY+ YH+ HI- IJ- JK- KL- LM- MN- NO- OP- PH-
 A  B  C  D  E  F  G  A  B  C  D  E  F  G  A  B  C  D  E  F  G  A  B  C  D  E  F  G  A  B  C  D  E  F  G  A  B  C  D  E  F  G  A  X  Y  H  I  J  K  L  M  N  O  P  H  I  J  K  L  M  N  O  P  H  I  J  K  L  M  N  O  P  H  I  J  K  L  M  N  O  P  H  I  J  K  L  M  N  O  P  H
 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0

Here's the code to produce that output:

def find_path(edges, source, dest):
    from itertools import chain

    print(edges)
    edges = dict(((u, v), 1 if c == "+" else -1) for u, v, c in edges.split())
    vertices = set(chain(*edges))

    path = f(vertices, edges, source, dest)
    path_v, path_c = zip(*path)
    print(" ".join("%2s" % v for v in path_v))
    print(" ".join("%2d" % c for c in path_c))

source, dest = "AH"

edges = "AB+ BC+ CD+ DA+             AX+ XY+ YH+ HI- IJ- JK- KL- LM- MH-"
# uv+ means edge from u to v exists and has cost 1, uv- = cost -1
find_path(edges, source, dest)

edges = "AB+ BC+ CD+ DE+ EF+ FG+ GA+ AX+ XY+ YH+ HI- IJ- JK- KL- LM- MH-"
find_path(edges, source, dest)

edges = "AB+ BC+ CD+ DE+ EF+ FG+ GA+ AX+ XY+ YH+ HI- IJ- JK- KL- LM- MN- NH-"
find_path(edges, source, dest)

edges = "AB+ BC+ CD+ DE+ EF+ FG+ GA+ AX+ XY+ YH+ HI- IJ- JK- KL- LM- MN- NO- OP- PH-"
find_path(edges, source, dest)
Metastasis answered 13/3, 2012 at 23:57 Comment(5)
The algorithm's complexity cannot be O(n^3); it is more likely O(n^5): imagine the process of increasing the cost for each node up to n^2 in a long cycle of cost 1. The proof for c <= n ^ 2 is not very convincing. Anyway, I like this answer. – Bumpkin
@EvgenyKluev: There can't be more than n * n ^ 2 (vertex, cost) pairs, if the cost is kept below n ^ 2. How can this result in O(n ^ 5)? Each step in a path belongs to exactly one of these pairs, and each of these pairs belongs to each (shortest) path at most once. You are right, the attempt at a proof is a layout of some ideas, nothing more. – Metastasis
You have up to n^3 (vertex, cost) pairs; for each of them you call todo.append, and then in "while todo" you iterate through all the edges (up to n times). Which gives n^4. (Not n^5 - that was my mistake.) – Bumpkin
@EvgenyKluev: You are right; I "cheated" in that respect. However, we now know that n is the number of edges; I think that helps. Unfortunately, I don't have time to do any more work on this :-( – Metastasis
You can change it to c <= 2*n. Look at my answer below. – Borchardt

As Kaganar notes, we basically have to make some assumption in order to get a polynomial-time algorithm. Let's assume that the edge lengths are in {-1, 1}. Given the graph, construct a weighted context-free grammar that recognizes valid paths from source to destination, with weight equal to the number of excess +1 edges (it generalizes the grammar for balanced parentheses). Compute, for each nonterminal, the cost of its cheapest production: initialize everything to infinity or 1, depending on whether the nonterminal has a production whose RHS contains no nonterminal, and then relax n - 1 times, where n is the number of nonterminals.
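The relaxation step at the end can be sketched generically. The encoding below (a nonterminal maps to productions of the form (weight, list of RHS nonterminals), with terminal symbols folded into the weight) and all names are my own illustration, not part of the answer:

```python
INF = float("inf")

def cheapest_derivation(grammar):
    """grammar: dict nonterminal -> list of (weight, [rhs nonterminals]).
    Returns the cost of the cheapest derivation of each nonterminal."""
    cost = {nt: INF for nt in grammar}
    # initialize from productions whose RHS contains no nonterminal
    for nt, prods in grammar.items():
        for w, rhs in prods:
            if not rhs:
                cost[nt] = min(cost[nt], w)
    # relax n - 1 times, where n is the number of nonterminals
    for _ in range(len(grammar) - 1):
        for nt, prods in grammar.items():
            for w, rhs in prods:
                cost[nt] = min(cost[nt], w + sum(cost[x] for x in rhs))
    return cost

# tiny example: S -> A A (weight 1); A -> terminal (weight 2) or A -> S (weight 0)
g = {"S": [(1, ["A", "A"])], "A": [(2, []), (0, ["S"])]}
```

Here `cheapest_derivation(g)` gives cost 2 for A and 5 for S; since no production has negative weight, the cheapest derivation is non-recursive and n − 1 rounds of relaxation suffice, as the comment below explains.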

Fustigate answered 12/3, 2012 at 21:21 Comment(1)
(Why does this work? No nonterminal can produce with negative weight, so the minimum-weight production is not recursive and thus has height at most n - 1.) – Fustigate

I would use brute-force recursion here: something like the following (pseudocode, to keep it language-agnostic).

you will need:

  • a 2D array of bools showing where you CAN and where you CAN'T go; this should NOT include "forbidden" values (like before a negative edge). You can add a vertical and horizontal 'translation' to make sure it starts at [0][0]
  • a (static) integer containing the length of the shortest path
  • a 1D array with 2 slots holding the goal: [0] = x, [1] = y

you will do:

function(int xPosition, int yPosition, int steps)
{
    if(You are at target AND steps < Shortest Path)
        Shortest Path = steps
    if(This Position is NOT legal)
        return /* exit function */
    else
    {
        /* try to move in every DIRECTION, not caring whether the result
           is legal or not, but always adding 1 to steps: */
        function(xPosition+1, yPosition, steps+1);
        function(xPosition-1, yPosition, steps+1);
        function(xPosition, yPosition+1, steps+1);
        function(xPosition, yPosition-1, steps+1);
    }
}

then just run it with function(StartingX, StartingY, 0);

the length of the shortest path will be contained in the static external int

Paver answered 1/3, 2012 at 2:13 Comment(5)
This is an exponential-time algorithm that does an enormous amount of work. Moreover, it only works on 2D grids rather than arbitrary graphs. On top of this, it doesn't take the edge costs into account. – Equilateral
I realize that; it was never meant to be a perfect solution, but more of a nudge in the right direction. I'm just a student right now (read my bio if you're curious) and I don't know any better way of doing something similar; the OP doesn't seem to know ANY way, so I thought some way is better than none. If you know a better one, please post it so both I and the OP can learn a better method. – Paver
I need something like O(n^3). – Acute
O(n^3) is very big! Then again, my solution is O(n^x), x being the number of legal moves; please correct me if I'm mistaken. – Paver
It can be done in O(n log n) if I'm not mistaken; unfortunately I don't know of any way to do this that's more effective than what I wrote. templatetypedef seems to be preparing an answer, so I'll make sure to look at it as soon as it's there. – Paver

I would like to clarify a few points:

  1. Yes, there can be negative-weight cycles.
  2. n is the number of edges.
  3. Weights are arbitrary, not just +1/-1.
  4. Assume that an O(n)-length path exists if the problem has a solution. (n is the number of edges.)
Acute answered 14/3, 2012 at 19:8 Comment(3)
"weights are arbitrary not just +1/-1" - then there's no O(n^3) algorithm unless P = NP. – Fustigate
@Acute those clarifications should be made in the question text; I will edit it. – Hotien
Is the path referred to in point 4 an optimal path, or just some path? Anyway, point 4 is an extreme change to the prerequisites; it excludes many graphs. – Metastasis

Although people have shown that no fast solution exists (unless P=NP)..

I think for most graphs (95%+) you should be able to find a solution fairly quickly.

I take advantage of the fact that if there are cycles then there are usually many solutions and we only need to find one of them. There are probably some glaring holes in my ideas so please let me know.

Ideas:

1. Find the negative cycle that is closest to the destination. Denote the shortest distance between the cycle and the destination by d(end,negC).

(I think this is possible; one way might be to use Floyd-Warshall to detect a pair (i,j) on a negative cycle, and then breadth-first search from the destination until you hit something that is connected to a negative cycle.)

2. Find the closest positive cycle to the start node; denote the distance from the start by d(start,posC).

(I argue that in 95% of graphs you can find these cycles easily.)

Now we have cases:

a) Both a positive and a negative cycle were found:
the answer is d(end,negC).

b) No cycles were found:
simply use a shortest-path algorithm?

c) Only one of the cycles was found. Note that both of these cases are the same problem, by symmetry (e.g. if we swap the weights and the start/end we get the same problem). I'll just consider the case where a positive cycle was found.

Find the shortest path from start to end without going around the positive cycle (perhaps using a modified breadth-first search or something). If no such path exists (without the running sum going negative)... then it gets a bit tricky: we have to do laps of the positive cycle (and perhaps some fraction of a lap).

If you just want an approximate answer, work out the shortest path from the positive cycle to the end node, which should usually be some negative number. Calculate the number of laps required to overcome this negative value, plus the distance from the cycle's entry point to its exit point. To do better, perhaps there was another node you should have exited the cycle from... To work that out you would need to calculate the smallest negative distance from every node in the cycle to the end node, and then it sort of turns into a group theory / random number generator type problem: do as many laps of the cycle as you want until you get just above one of these numbers.

Good luck, and hopefully my solution works for most cases.

Hilburn answered 15/3, 2012 at 21:2 Comment(0)

The current assumptions are:

  1. yes there can be negative weight cycles.
  2. n is the number of edges.
  3. Assume that a O(n) length path exists if the problem has a solution.
  4. +1/-1 edge weights.

We may assume without loss of generality that the number of vertices is at most n. Recursively walk the graph, remembering the cost values for each vertex. Stop if the cost has already been remembered for the vertex, or if the cost would become negative.

Either the destination is never reached, and there is no solution; otherwise, for each of the O(n) vertices we have remembered at most O(n) different cost values, and for each of these O(n ^ 2) combinations there may have been up to n unsuccessful attempts to walk to other vertices. All in all, it's O(n ^ 3). q.e.d.

Update: Of course, there is something fishy again. What does assumption 3 mean: an O(n)-length path exists if the problem has a solution? Any algorithm has to detect that, because it also has to report when there is no solution. But it's impossible to detect, because it is not a property of the individual graph the algorithm works on (it is asymptotic behaviour).

(It is also clear that not all graphs for which the destination can be reached have a solution path of length O(n): take a chain of m edges of weight -1, preceded by a simple cycle of m edges with total weight +1.)

[I now realize that most of the Python code from my other answer (attempt for the first version of the problem) can be reused.]

Metastasis answered 15/3, 2012 at 22:36 Comment(0)

Step 1: Note that your answer will be at most 2*n (if it exists).
Step 2: Create a new graph whose vertices are pairs [vertex][cost]. (2*n^2 vertices)
Step 3: Note that the new graph will have all edges equal to one, and at most 2*n edges out of each [vertex][cost] pair.
Step 4: Do a DFS over this graph, starting from [start][0].
Step 5: Find the minimum k such that [finish][k] is accessible.

Total complexity is at most O(n^2)*O(n) = O(n^3)

EDIT: Clarification on Step 1.
If there is a positive cycle accessible from the start, you can pump the cost all the way up to n. From there you can walk to any accessible vertex over no more than n edges, each of which is either +1 or -1, leaving you in the range [0; 2n]. Otherwise, you'll walk either through negative cycles, or over no more than n edges of weight +1 that aren't in a negative cycle, leaving you in the range [0; n].
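Steps 2-5 can be sketched as follows. This is my own code, not the answerer's, searching the product graph of (vertex, cost) states with the claimed 2*n bound (assuming +1/-1 edge weights; note the comment on this answer questioning whether intermediate costs can exceed that bound):

```python
from collections import deque

def min_final_cost(edges, start, finish):
    """Search the product graph of (vertex, cost) states.
    edges: dict mapping (u, v) -> weight in {+1, -1}.
    Returns the minimum k such that [finish][k] is reachable with the
    running cost staying in [0, 2*n], or None if none is."""
    n = len(edges)  # n = number of edges, as in the question
    out = {}
    for (u, v), w in edges.items():
        out.setdefault(u, []).append((v, w))
    seen = {(start, 0)}
    todo = deque([(start, 0)])
    while todo:  # plain reachability search over at most 2*n^2 states
        u, c = todo.popleft()
        for v, w in out.get(u, ()):
            c2 = c + w
            if 0 <= c2 <= 2 * n and (v, c2) not in seen:
                seen.add((v, c2))
                todo.append((v, c2))
    costs = [c for v, c in seen if v == finish]
    return min(costs) if costs else None
```

For example, with edges {("A", "B"): +1, ("B", "C"): -1}, the minimum final cost from A to C is 0, and C cannot reach A at all.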

Borchardt answered 17/3, 2012 at 20:18 Comment(2)
1) How do you get Step 1? We are only told that an O(n) path exists (whatever that means). (Also, the OP hasn't clarified whether that refers to an optimal path, or just any path.) 2) How does that differ from the algorithm in my answer from yesterday? – Metastasis
Thanks for the clarification. OK, the minimal total weight will be at most 2*n. However, intermediate steps of the optimal solution might have much higher weight, so the graph of Step 2 does not necessarily allow for the optimal solution. – Metastasis
