Optimizations for longest path problem in cyclic graph
Asked Answered
K

1

15

What optimizations exist for trying to find the longest path in a cyclic graph?

Longest path in cyclic graphs is known to be NP-complete. What optimizations or heuristics can make finding the longest path faster than DFSing the entire graph? Are there any probabilistic approaches?

I have a graph with specific qualities, but I'm looking for an answer to this in the general case. Linking to papers would be fantastic. Here is a partial answer:

  1. Confirm it is cyclic. Longest path in acyclic graphs is easily computed using dynamic programming.

  2. Find out if the graph is planar (which algorithm is best?). If it is, you might see if it is a block graph, ptolemaic graph, or cacti graph and apply the methods found in this paper.

  3. Find out how many simple cycles there are using Donald B Johnson's algorithm (Java implementation). You can change any cyclic graph into an acyclic one by removing an edge in a simple cycle. You can then run the dynamic programming solution found on the Wikipedia page. For completeness, you would have to do this N times for each cycle, where N is the length of the cycle. Thus, for an entire graph, the number of times you have to run the DP solution is equal to the product of the lengths of all cycles.

  4. If you have to DFS the entire graph, you can prune some paths by computing the "reachability" of each node in advance. This reachability, which is mainly applicable to directed graphs, is the number of nodes each node can reach without repetitions. It is the maximum the longest path from that node could possibly be. With this information, if your current path plus the reachability of the child node is less than the longest you've already found, there is no point in taking that branch as it is impossible that you would find a longer path.

Kelleher answered 23/11, 2010 at 2:30 Comment(5)
Does (3) relate to finding strongly connected components? (en.wikipedia.org/wiki/Strongly_connected_component) - if not, I'd have thought you could do something with those. I found Tarjans algorithm fairly easy to understand.Kelcey
I honestly don't see how identifying strongly connected components or vertex contraction helps solve LPP, but I could be wrong. To coerce it into an acyclic graph, I believe you need the simple cycles (Johnson's), as there may still be cycles in the strongly connected components.Kelleher
maybe it doesn't help. My thought was that the SCCs are typically smaller than the whole graph, so finding the longest routes through them for each entry/exit pair should be relatively quick. Remove the SCCs and add a weighted edge instead for each of these longest-route-through-SCC solutions and you should end up with an acyclic digraph that you can do dynamic programming on. Each longest-route-through-SCC edge actually replaces an edge that entered the SCC and an edge that left it, as well as the route through the SCC itself. I may be missing something, though.Kelcey
@Steve314: The longest path may go through an SCC more than once (but still without visiting any node more than once). I don't think there is a way to account for this without expanding the SCC.Kelleher
How can the longest path go through an SCC more than once? When you factor out the SCCs, the result is an acyclic graph - so once you exit an SCC, there's no way to return to it. If there were such a route, you would have discovered a larger SCC. Maybe I should have pointed out that algorithms such as Tarjans find the maximal SCCs, not just any SCCs?Kelcey
A
6

Here is a O(n*2^n) dynamic programming approach that should be feasible for up to say 20 vertices:

m(b, U) = the maximum length of any path ending at b and visiting only (some of) the vertices in U.

Initially, set m(b, {b}) = 0.

Then, m(b, U) = max value of m(x, U - x) + d(x, b) over all x in U such that x is not b and an edge (x, b) exists. Take the maximum of these values for all endpoints b, with U = V (the full set of vertices). That will be the maximum length of any path.

The following C code assumes a distance matrix in d[N][N]. If your graph is unweighted, you can change every read access to this array to the constant 1. A traceback showing an optimal sequence of vertices (there may be more than one) is also computed in the array p[N][NBITS].

#define N 20
#define NBITS (1 << N)

int d[N][N];       /* Assumed to be populated earlier.  -1 means "no edge". */
int m[N][NBITS];   /* DP matrix.  -2 means "unknown". */
int p[N][NBITS];   /* DP predecessor traceback matrix. */

/* Maximum distance for a path ending at vertex b, visiting only
   vertices in visited. */
int subsolve(int b, unsigned visited) {
    if (visited == (1 << b)) {
        /* A single vertex */
        p[b][visited] = -1;
        return 0;
    }

    if (m[b][visited] == -2) {
        /* Haven't solved this subproblem yet */
        int best = -1, bestPred = -1;
        unsigned i;
        for (i = 0; i < N; ++i) {
            if (i != b && ((visited >> i) & 1) && d[i][b] != -1) {
                int x = subsolve(i, visited & ~(1 << b));
                if (x != -1) {
                    x += d[i][b];
                    if (x > best) {
                        best = x;
                        bestPred = i;
                    }
                }
            }
        }

        m[b][visited] = best;
        p[b][visited] = bestPred;
    }

    return m[b][visited];
}

/* Maximum path length for d[][].
   n must be <= N.
   *last will contain the last vertex in the path; use p[][] to trace back. */
int solve(int n, int *last) {
    int b, i;
    int best = -1;

    /* Need to blank the DP and predecessor matrices */
    for (b = 0; b < N; ++b) {
        for (i = 0; i < NBITS; ++i) {
            m[b][i] = -2;
            p[b][i] = -2;
        }
    }

    for (b = 0; b < n; ++b) {
        int x = subsolve(b, (1 << n) - 1);
        if (x > best) {
            best = x;
            *last = b;
        }
    }

    return best;
}

On my PC, this solves a 20x20 complete graph with edge weights randomly chosen in the range [0, 1000) in about 7s and needs about 160Mb (half of that is for the predecessor trace).

(Please, no comments about using fixed-size arrays. Use malloc() (or better yet, C++ vector<int>) in a real program. I just wrote it this way so things would be clearer.)

Aspiration answered 24/11, 2010 at 6:58 Comment(2)
Did you come up with this algorithm, or did you read it somewhere?Kelleher
@Craig: I'm certain other people have come up with it before, but yes I did come up with it myself. I started by thinking about the Floyd-Warshall algorithm (which finds the shortest path between each pair of vertices in O(n^3) time) -- I initially thought you would be able to find longest paths by just flipping the signs. But that doesn't work because the path found by F-W might visit some vertices twice. BTW it took a bit of messing around: my 1st try used a function m(a, b, U) that recorded the longest distance between a and b via U. Turns out we don't need to care about a.Aspiration

© 2022 - 2024 — McMap. All rights reserved.