Route problem in a graph: minimize average edge cost instead of total cost
Asked Answered
Y

4

14

I have a weighted graph, no negative weights, and I would like to find the path from one node to another, trying to minimize the cost for the single step. I don't need to minimize the total cost of the trip (as e.g. Dijkstra does) but the average step-cost. However, I have a constraint: K, the maximum number of nodes in the path.

So for example to go from A to J maybe Dijkstra would find this path (between parenthesis the weight)

A (4) D (6) J -> total cost: 10

and the algorithm I need, setting K = 10, would find something like

A (1) B (2) C (2) D (1) E (3) F (2) G (1) H (3) J -> total cost: 15

Is there any well known algorithm for this problem?

Thanks in advance.

Eugenio

Edit as answer to templatetypedef. Some questions:

1) The fact that it can happen to take a cycle multiple times to drive down the average is not good for my problem: maybe I should have mentioned it but I don' want to visit the same node more than once

2) Is it possible to exploit the fact that I don't have negative weights?

3) When you said O(kE) you meant for the whole algorithm or just for the additional part?

Let's take this simple implementation in C where n=number of nodes e=number of edges, d is a vector with the distances, p a vector with the predecessor and a structure edges (u,v,w) memorize the edges in the graphs

for (i = 0; i < n; ++i)
    d[i] = INFINITY;

    d[s] = 0;

    for (i = 0; i < n - 1; ++i)
        for (j = 0; j < e; ++j)
            if (d[edges[j].u] + edges[j].w < d[edges[j].v]){
                d[edges[j].v] = d[edges[j].u] + edges[j].w;
                p[edges[j].v] = u;
                }

I'm not sure how I should modify the code according to your answer; to take into consideration the average instead of the total cost should this be enough?

for (i = 0; i < n; ++i)
        d[i] = INFINITY;

    d[s] = 0;

    for (i = 0; i < n - 1; ++i)
        steps = 0;
        for (j = 0; j < e; ++j)
            if ( (d[edges[j].u]+ edges[j].w)/(steps+1) < d[edges[j].v]/steps){
                d[edges[j].v] = d[edges[j].u] + edges[j].w;
                p[edges[j].v] = u;
                steps++;
            }

But anyway I don't know how take into consideration the K limit at the same time...Thanks again in advance for your help.

Edit Since I can afford some errors I'm thinking about this naif solution:

  • precompute all the shortest paths and memorize in A
  • precompute all the shortest paths on a modified graph, where I cut the edges over a certain weight and memorize them in B

When I need a path, I look in A, e.g. from x to y this is the path x->z->y then for each step I look in B, so for x > z I see if there is a connection in B, if not I keep x > z otherwise I fill the path x > z with the subpath provided by B, that could be something like x->j->h->z; then I do the same for z->y. Each time I will also check if I'm adding a cyclic path.

Maybe I will get some weird paths but it could work in most of the case. If I extend the solution trying with different "cut thresholds" maybe I can also be close to respect the K constrain.

Yuhas answered 25/8, 2011 at 18:27 Comment(0)
T
6

I believe that you can solve this using a modified version of the Bellman-Ford algorithm.

Bellman-Ford is based on the following dynamic programming recurrence that tries to find the shortest path from some start node s to each other node that's of length no greater than m for some m. As a base case, when you consider paths of length zero, the only reachable node is s and the initial values are

BF(s, t, 0) = infinity
BF(s, s, 0) = 0

Then, if we know the values for a path of length m, we can find it for paths of length m + 1 by noting that the old path may still be valid, or we want to extend some path by length one:

BF(s, t, m + 1) = min {
                     BF(s, t, m),
                     BF(s, u, m) + d(u, t) for any node u connected to t
                   }

The algorithm as a whole works by noting that any shortest path must have length no greater than n and then using the above recurrence and dynamic programming to compute the value of BF(s, t, n) for all t. Its overall runtime is O(EV), since there are E edges to consider at each step and V total vertices.

Let's see how we can change this algorithm to solve your problem. First, to limit this to paths of length k, we can just cut off the Bellman-Ford iteration after finding all shortest paths of length up to k. To find the path with lowest average cost is a bit trickier. At each point, we'll track two quantities - the length of the shortest path reaching a node t and the average length of that path. When considering new paths that can reach t, our options are to either keep the earlier path we found (whose cost is given by the shortest path so far divided by the number of nodes in it) or to extend some other path by one step. The new cost of that path is then given by the total cost from before plus the edge length divided by the number of edges in the old path plus one. If we take the cheapest of these and then record both its cost and number of edges, at the end we will have computed the path with lowest average cost of length no greater than k in time O(kE). As an initialization, we will say that the path from the start node to itself has length 0 and average cost 0 (the average cost doesn't matter, since whenever we multiply it by the number of edges we get 0). We will also say that every other node is at distance infinity by saying that the average cost of an edge is infinity and that the number of edges is one. That way, if we ever try computing the cost of a path formed by extending the path, it will appear to have average cost infinity and won't be chosen.

Mathematically, the solution looks like this. At each point we store the average edge cost and the total number of edges at each node:

BF(s, t, 0).edges = 1
BF(s, t, 0).cost  = infinity

BF(s, s, 0).edges = 0
BF(s, s, 0).cost  = 0

BF(s, t, m + 1).cost = min {
    BF(s, t, m).cost,
    (BF(s, u, m).cost * BF(s, u, m).edges + d(u, t)) / (BF(s, u, m).edges + 1)
}

BF(s, t, m + 1).edges = {
    BF(s, t, m).edges         if you chose the first option above. 
    BF(s, u, m).edges + 1     else, where u is as above
}

Note that this may not find a simple path of length k, since minimizing the average cost might require you to take a cycle with low (positive or negative) cost multiple times to drive down the average. For example, if a graph has a cost-zero loop, you should just keep taking it as many times as you can.

EDIT: In response to your new questions, this approach won't work if you don't want to duplicate nodes on a path. As @comestibles has pointed out, this version of the problem is NP-hard, so unless P = NP you shouldn't expect to find any good polynomial-time algorithm for this problem.

As for the runtime, the algorithm I've described above runs in total time O(kE). This is because each iteration of computing the recurrence takes O(E) time and there are a total of k iterations.

Finally, let's look at your proposed code. I've reprinted it here:

for (i = 0; i < n - 1; ++i) {
    steps = 0;
    for (j = 0; j < e; ++j) {
        if ( (d[edges[j].u]+ edges[j].w)/(steps+1) < d[edges[j].v]/steps){
            d[edges[j].v] = d[edges[j].u] + edges[j].w;
            p[edges[j].v] = u;
            steps++;
        }
    }
}

Your first question was how to take k into account. This can be done easily by rewriting the outer loop to count up to k, not n - 1. That gives us this code:

for (i = 0; i < k; ++i) {
    steps = 0;
    for (j = 0; j < e; ++j) {
        if ( (d[edges[j].u]+ edges[j].w)/(steps+1) < d[edges[j].v]/steps){
            d[edges[j].v] = d[edges[j].u] + edges[j].w;
            p[edges[j].v] = u;
            steps++;
        }
    }
}

One problem that I'm noticing is that the modified Bellman-Ford algorithm needs to have each candidate best path store its number of edges independently, since each node's optimal path might be reached by a different number of edges. To fix this, I would suggest having the d array store two values - the number of edges required to reach the node and the average cost of a node along that path. You would then update your code by replacing the steps variable in these equations with the cached path lengths.

Hope this helps!

Tensile answered 25/8, 2011 at 18:43 Comment(4)
Hi @Tensile thanks a lot for your answer; I have some doubts about, I've explained them editing the original question.Yuhas
@Eugenio- I've updated my answer in response to your new questions. Can you take a look at this and see if it answers your new questions? Also, it might be a good idea to open a new question with these separate concerns, since what you're asking now ends up being a completely different question than the original.Tensile
thanks again @Tensile for the answer but yes, I need acyclic paths, I should have specified that since the beginning. Since a suboptimal solution is enough for me, I'm thinking about an hack I've just described as edit of the question.Yuhas
What about the case where there are two paths from A to B both of the same average cost? It would matter which you choose when adding edges from B to another node.Graft
P
3

For the new version of your problem, there's a reduction from Hamilton path (making your problem intractable). Take an instance of Hamilton path (i.e., a graph whose edges are assumed to have unit weight), add source and sink vertices and edges of weight 2 from the source to all others and from the sink to all others. Set K = |V| + 2 and request a path from source to sink. There exists a Hamilton path if and only if the optimal mean edge length is (|V| + 3)/(|V| + 2).

Care to tell us why you want these paths so that we can advise you of a reasonable approximation strategy?

Phenothiazine answered 25/8, 2011 at 21:18 Comment(0)
I
1

I feel like it is impossible to use dynamic programming as the problem does not have an optimal substructure.

Consider the following 2 paths from A -> B:

     path_num total_cost number_of_edges average_cost
     1        2          2               1
     2        10         9               1.11...

Then apparently, the second path has a higher average cost. Then let's add another point C and an edge B -> C of cost 4, we would like to minimize the average cost from A -> C.

However, as (2+4)/(2+1) = 2 > (10+4)/(9+1) = 1.4 Now we see the path that has a minimum average cost is actually path2 + edge(B, C)

This, in particular, contradicts the optimal substructure for DP.

Information answered 3/3 at 8:51 Comment(0)
C
0

You can slightly modify Bellman-Ford algorithm to find minimum path using at most K edges/nodes. If the number of edges is fixed than you have to minimize total cost, because average cost would be TotalCost/NumberOfEdges.

One of the solutions would be to iterate NumberOfEdges from 1 to K, find minimal total cost and choose minimum TotalCost/NumberOfEdges.

Crevice answered 25/8, 2011 at 18:38 Comment(3)
You have to be careful with this because the Bellman-Ford iteration tracks paths of lengths up to k, so just finding the min-cost path after k Bellman-Ford iterations doesn't necessarily give you the correct answer. See my solution for how to fix this.Tensile
Hi @Crevice thanks for the answer, I'm not sure if I have well formulated the question. At the moment I'm actually using Bellman-Ford but the paths I get are too short and the cost of each edge too high...I would like to get a longest trip (more edges, less cost per edge), but if I just set "use at most K edges" as rule I would get the same results I'm getting now, right?Yuhas
Sorry I'm using en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm not Bellman-FordYuhas

© 2022 - 2024 — McMap. All rights reserved.