Directed probability graph - algorithm to reduce cycles?

Consider a directed graph which is traversed from the first node 1 to some final nodes (which have no outgoing edges). Each edge in the graph has a probability associated with it, and the probabilities of all possible paths to all final nodes sum to 1. (Which means we are guaranteed to arrive at one of the final nodes eventually.)

The problem would be simple if the graph contained no loops. Unfortunately, rather convoluted loops can arise, and they can be traversed an arbitrary number of times (the probability decreases multiplicatively with each traversal, obviously).

Is there a general algorithm to find the probabilities to arrive at each of the final nodes?

A particularly nasty example:

We can represent the edges as a matrix (the probability to go from node x (row) to node y (column) is in the entry (x,y)):

{{0, 1/2, 0, 1/14, 1/14, 0, 5/14}, 
 {0, 0, 1/9, 1/2, 0, 7/18, 0}, 
 {1/8, 7/16, 0, 3/16, 1/8, 0, 1/8}, 
 {0, 1, 0, 0, 0, 0, 0}, 
 {0, 0, 0, 0, 0, 0, 0}, 
 {0, 0, 0, 0, 0, 0, 0}, 
 {0, 0, 0, 0, 0, 0, 0}}

Or as a directed graph:

[Figure: the directed graph corresponding to the matrix above]

The starting node 1 is blue, the final nodes 5,6,7 are green. All edges are labelled by the probability to traverse them when starting from the node where they originate.

This has eight different paths from starting node 1 to the final nodes:

{{1/14, {1, 5}}, {5/14, {1, 7}}, {7/36, {1, 2, 6}}, 
 {1/144, {1, 2, 3, 5}}, {1/144, {1, 2, 3, 7}}, 
 {1/36, {1, 4, 2, 6}}, {1/1008, {1, 4, 2, 3, 5}}, {1/1008, {1, 4, 2, 3, 7}}}

(The notation for each path is {probability,sequence of nodes visited})

And there are five distinct loops:

{{1/144, {2, 3, 1}}, {7/144, {3, 2}}, {1/2, {4, 2}}, 
{1/48, {3, 4, 2}}, {1/1008, {4, 2, 3, 1}}}

(Notation for loops is {probability to traverse loop once,sequence of nodes visited}).

If only these cycles could be resolved to obtain an effectively tree-like graph, the problem would be solved.

Any hint on how to tackle this?

I'm familiar with Java, C++ and C, so suggestions in these languages are preferred.

Alicealicea answered 8/2, 2017 at 22:18 Comment(9)
You avoid infinite loops by blocking a node already visited. Nothing nasty about that at all, whether you are using a recursive solution or something like Dijkstra's algorithm.Alfaro
@WeatherVane Are you saying that the eight different paths I listed are already the final result? I believe having infinite loops is actually allowed, since it ends up being sort of a geometric series in the loop traversal probability (in simplest case of 1 loop), which converges to a finite number. My trouble is when there are many loops and several intersecting. Then I don't see how to generalize the geometric series.Alicealicea
You asked how to remove cycles.Alfaro
@WeatherVane Yes, I'd like to remove cycles and replace them with altered probabilities at the tree edges.Alicealicea
The subgraph without loops is called "spanning tree". Google from here.Derby
He doesn't want simple cycle removal, he wants to evaluate the asymptotic probability of arriving at each end node, given there is an infinite set of cyclic paths through the graph.Lundberg
@ChristopherOicles Yes! Thank you. That is my question.Alicealicea
You are describing graphs of certain kinds of Markov chains. I've added SO's tag for that, which might draw attention from people better-suited to answer than I am.Groceries
@FauChristian when I asked the question, I already solved the problems you mentioned and was at the stage of dealing with loops (As you can see, the spanning tree paths, as well as the cycles, are already found and provided for the example in the question). The probabilistic analysis is summarized in John Bollinger's answer. If you find that trivial, then I'm impressed. For me it was not easy, but I got it to work eventually.Alicealicea

I'm not an expert in the area of Markov chains, and although I think it's likely that algorithms are known for the kind of problem you present, I'm having difficulty finding them.

If no help comes from that direction, then you can consider rolling your own. I see at least two different approaches here:

  1. Simulation.

Examine how the state of the system evolves over time by starting with the system in state 1 at 100% probability, and performing many iterations in which you apply your transition probabilities to compute the probabilities of the state obtained after taking a step. If at least one final ("absorbing") node can be reached (at non-zero probability) from every node, then over enough steps, the probability that the system is in anything other than a final state will decrease asymptotically toward zero. You can estimate the probability that the system ends in final state S as the probability that it is in state S after n steps, with an upper bound on the error in that estimate given by the probability that the system is in a non-final state after n steps.

As a practical matter, this is the same as computing Tr^n, where Tr is your transition probability matrix, augmented with self-edges at 100% probability for all the final states.
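
For illustration, a minimal C++ sketch of this simulation (my own code, not part of the original answer): the matrix from the question is hard-coded with probability-1 self-edges added at the final nodes, and the fixed step count is illustrative.

#include <cstdio>
#include <vector>

int main() {
    // The matrix from the question, augmented with probability-1 self-edges
    // at the final (absorbing) nodes 5, 6, 7 (indices 4, 5, 6 here).
    const std::vector<std::vector<double>> Tr = {
        {0, 1/2., 0, 1/14., 1/14., 0, 5/14.},
        {0, 0, 1/9., 1/2., 0, 7/18., 0},
        {1/8., 7/16., 0, 3/16., 1/8., 0, 1/8.},
        {0, 1, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 1, 0, 0},
        {0, 0, 0, 0, 0, 1, 0},
        {0, 0, 0, 0, 0, 0, 1}};
    const int n = (int)Tr.size();

    // Start in state 1 (index 0) with probability 1 and apply Tr repeatedly.
    std::vector<double> p(n, 0.0);
    p[0] = 1.0;
    for (int step = 0; step < 200; ++step) {
        std::vector<double> next(n, 0.0);
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                next[j] += p[i] * Tr[i][j];   // mass at i flows to successors j
        p = next;
    }

    // p[4], p[5], p[6] now estimate the probabilities of ending at nodes 5, 6, 7;
    // the mass still on the non-final nodes bounds the error of that estimate.
    for (int f = 4; f < n; ++f)
        std::printf("P(end at node %d) ~= %f\n", f + 1, p[f]);
    std::printf("non-final residual = %g\n", p[0] + p[1] + p[2] + p[3]);
    return 0;
}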

  2. Exact computation.

Consider a graph, G, such as you describe. Given two vertices i and f, such that there is at least one path from i to f, and f has no outgoing edges other than self-edges, we can partition the paths from i to f into classes characterized by the number of times they revisit i prior to reaching f. There may be an infinite number of such classes, which I will designate Cif(n), where n represents the number of times the paths in Cif(n) revisit node i. In particular, Cii(0) contains all the simple loops in G that contain i (clarification: as well as other paths).

The total probability of ending at node f given that the system traverses graph G starting at node i is given by

Pr(f|i, G) = Pr(Cif(0)|G) + Pr(Cif(1)|G) + Pr(Cif(2)|G) ...

Now observe that if n > 0 then each path in Cif(n) has the form of a union of two paths c and t, where c belongs to Cii(n-1) and t belongs to Cif(0). That is, c is a path that starts at node i and ends at node i, passing through i n-1 times between, and t is a path from i to f that does not pass through i again. We can use that to rewrite our probability formula:

Pr(f|i,G) = Pr(Cif(0)|G) + Pr(Cii(0)|G) * Pr(Cif(0)|G) + Pr(Cii(1)|G) * Pr(Cif(0)|G) + ...

But note that every path in Cii(n) is a composition of n+1 paths belonging to Cii(0). It follows that Pr(Cii(n)|G) = Pr(Cii(0)|G)^(n+1), so we get

Pr(f|i) = Pr(Cif(0)|G) + Pr(Cii(0)|G) * Pr(Cif(0)|G) + Pr(Cii(0)|G)^2 * Pr(Cif(0)|G) + ...

And now, a little algebra gives us

Pr(f|i,G) - Pr(Cif(0)|G) = Pr(Cii(0)|G) * Pr(f|i,G)

, which we can solve for Pr(f|i,G) to get

Pr(f|i,G) = Pr(Cif(0)|G) / (1 - Pr(Cii(0)|G))
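
As a quick sanity check on a toy example (my own, not the graph from the question): suppose node i has a self-loop with probability 1/4, an edge to absorbing node f with probability 1/2, and an edge to absorbing node g with probability 1/4. Then Pr(Cii(0)|G) = 1/4 and Pr(Cif(0)|G) = 1/2, so the formula gives Pr(f|i,G) = (1/2) / (1 - 1/4) = 2/3 and likewise Pr(g|i,G) = (1/4) / (1 - 1/4) = 1/3, which sum to 1 as expected.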

We've thus reduced the problem to one in terms of paths that do not return to the starting node, except possibly as their end node. These do not preclude paths that contain loops that don't include the starting node, but we can nevertheless rewrite this problem in terms of several instances of the original problem, computed on a subgraph of the original graph.

In particular, let S(i, G) be the set of successors of vertex i in graph G -- that is, the set of vertices s such that there is an edge from i to s in G, and let X(G,i) be the subgraph of G formed by removing all edges that start at i. Furthermore, let p_is be the probability associated with edge (i, s) in G.

Pr(Cif(0)|G) = Sum over s in S(i, G) of p_is * Pr(f|s,X(G,i))

In other words, the probability of reaching f from i through G without revisiting i in between is the sum over all successors of i of the product of the probability of reaching s from i in one step with the probability of reaching f from s through G without traversing any edges outbound from i. That applies for all f in G, including i.

Now observe that S(i, G) and all the p_is are known, and that the problem of computing Pr(f|s,X(G,i)) is a new, strictly smaller instance of the original problem. Thus, this computation can be performed recursively, and such a recursion is guaranteed to terminate. It may nevertheless take a long time if your graph is complex, and it looks like a naive implementation of this recursive approach would scale exponentially in the number of nodes. There are ways you could speed the computation in exchange for higher memory usage (i.e. memoization).
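
For illustration, here is a minimal recursive sketch of this computation in C++ (my own code, not part of the original answer): a bitmask of blocked vertices stands in for the subgraph X(G, i), memoization is keyed on (target, current vertex, blocked set), and the matrix from the question is hard-coded.

#include <cstdio>
#include <map>
#include <tuple>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Memo = std::map<std::tuple<int, int, unsigned>, double>;

// Pr(target | cur) in the subgraph obtained by removing all edges that leave
// the vertices whose bits are set in `blocked` -- the X(G, i) of the answer.
double pr(const Matrix& P, int target, int cur, unsigned blocked, Memo& memo) {
    bool absorbing = true;
    for (double p : P[cur])
        if (p > 0.0) { absorbing = false; break; }
    if (absorbing || ((blocked >> cur) & 1u))
        return cur == target ? 1.0 : 0.0;       // stuck at cur: success iff cur == target

    const auto key = std::make_tuple(target, cur, blocked);
    const auto it = memo.find(key);
    if (it != memo.end()) return it->second;

    const unsigned sub = blocked | (1u << cur); // drop the edges leaving cur
    double toTarget = 0.0;                      // Pr(Cif(0) | G)
    double backHere = 0.0;                      // Pr(Cii(0) | G)
    for (int s = 0; s < (int)P.size(); ++s) {
        if (P[cur][s] == 0.0) continue;
        toTarget += P[cur][s] * pr(P, target, s, sub, memo);
        backHere += P[cur][s] * pr(P, cur, s, sub, memo);
    }
    // Geometric-series closure: Pr(f|i,G) = Pr(Cif(0)|G) / (1 - Pr(Cii(0)|G)).
    return memo[key] = toTarget / (1.0 - backHere);
}

int main() {
    const Matrix P = {                          // the matrix from the question
        {0, 1/2., 0, 1/14., 1/14., 0, 5/14.},
        {0, 0, 1/9., 1/2., 0, 7/18., 0},
        {1/8., 7/16., 0, 3/16., 1/8., 0, 1/8.},
        {0, 1, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0}};
    Memo memo;
    for (int f = 4; f < 7; ++f)                 // final nodes 5, 6, 7
        std::printf("P(end at node %d) = %f\n", f + 1, pr(P, f, 0, 0u, memo));
    return 0;
}

The blocked bitmask grows at every level of the recursion, so termination is guaranteed, and the memoization key is what keeps the running time manageable, as discussed above.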


There are likely other possibilities as well. For example, I'm suspicious that there may be a bottom-up dynamic programming approach to a solution, but I haven't been able to convince myself that loops in the graph don't present an insurmountable problem there.

Groceries answered 9/2, 2017 at 15:32 Comment(14)
I believe memoization will make it almost linear (polynomial for sure). You are awesome! Thank you.Alicealicea
@FauChristian, what the question asks for is "a general algorithm to find the probabilities to arrive at each of the final nodes," (emphasis added) and this answer provides two. If you do not consider those to be hints for implementation then hints for implementation are indeed not what the question asks for. Certainly the OP seems entirely satisfied that this answer responds to the question.Groceries
This answer completely disassembles the problem. Nothing left to add. Just take it and implement it literally and it will work.Alicealicea
@FauChristian, whether or not you accept that the conclusions presented satisfy the definition of an "algorithm", one of the advantages of developing it in the way I did is that the logical process by which the conclusions are reached is crystal clear. I assert that the approach accommodates cycle overlap just fine. If you disagree, then surely you can show where a dependency on non-overlap enters the derivation.Groceries
@FauChristian, (1) there are no dropped terms in the derivation for which any "correction" would be required. The infinite series is reduced to a sum of two terms by use of the recurrence relation inherent in it. (2) The presence or absence of vertex i in S(i,G) is irrelevant because at that point we are computing paths in subgraph X(G,i). That graph contains all the vertices of G, but no paths from (or through) i to any other vertex. I described it in terms of removing edges outbound from i, but if you prefer, the Markov property is retained by replacing them all with one self edge.Groceries
@FauChristian, (1) yes. The series has form P = p0 + p0 * q + p0 * q^2 + p0 * q^3 + .... If we subtract p0 from each side and factor out a q, we get P - p0 = q * (p0 + p0 * q + p0 * q^2 + ...). The infinite series in the parentheses is the same one we started with, so we know its value: P. Therefore we get P - p0 = q * P. No terms are dropped. No correction is required.Groceries
@FauChristian, (2) to the extent that any further proof is required, it is summed up in the observations that X(G, i) contains no paths from any of the S(i, G) to f that are not also in G; that all paths from i to f in G that do not revisit i consist of the union edge from i to an element of S(i, G) with a path in X(G, i); and that all edges G and X(i, G) have in common have the same probability in each. It's just a formalization of looking at all the same paths as we were looking at in G, starting at the immediate successors of i.Groceries
@FauChristian, in any case, you still have yet to substantiate your claim that this approach fails to handle overlapping cycles.Groceries
@FauChristian, you having accepted my finite formula for Pr(f|i,G), your remaining objection seems to be to my last formula, giving Pr(Cif(0)|G) as a finite sum. But this formula follows directly from the definitions of the terms on the left-hand side. It is as I said, the sum on the right is over all the paths in Cif(0) -- including those that contain cycles. The point you seem to be stuck on is already accounted for by (earlier) using the recurrence to reduce the infinite sum to a finite one.Groceries
@FauChristian, I have indeed been reading your comments carefully, but I nevertheless am having difficulty seeing where your objection lodges. In my view, reducing the problem to one over a subgraph is justified by the fact that the set of paths to be explored in the parent graph and the set of paths to be explored in the subgraph are identical. It's simply a reorganization of terms. I have asserted this already, but you seem to have ignored or rejected it without explanation.Groceries
@FauChristian There is really no point in a lengthy discussion since, as I already pointed out, a direct implementation of the steps John Bollinger suggests leads to the exact solution. One criterion to verify this is the fact that the resulting overall probabilities to arrive at final nodes properly sum up to 1. This is a highly non-trivial check, considering that various loops with several infinite summations were taken into account. Also, this might interest you: it describes the infinite summations happening here in more detail en.wikipedia.org/wiki/Geometric_seriesAlicealicea
@FauChristian actually, I wrote a script that generates random adjacency matrices M in various sizes (up to 12 nodes) and tested John Bollingers' algorithm on thousands of them. In all cases the asymptotic final node probabilities sum up to 1. Besides, one can check numerically that matrix multiplication M^n for large n makes the component M[1,i] (transition node 1-> some final node i) approach the analytically obtained value. That's further evidence in each case. It would be great to compare to other treatments. The fact that they can't be found makes this answer even more valuable!Alicealicea
@FauChristian, the problem here is that it was never my intention to provide a rigorous proof, but only an overview of a proof sufficient to explain where the conclusion comes from. I confess that the primary reason I have continued the discussion is my irritation at what seem to me to be unjustified assertions that the logic presented and therefore the conclusion are incorrect. Your patronizing tone contributes to that irritation. I am certainly capable of making mistakes, but at this point, if you still maintain that I have made one here then I ask you to prove it.Groceries
This should be published as a paper! I had the same problem and difficulties finding anything in the Markov chain literature or lecture notes, especially when cycles overlap. The solution is derived beautifully and is straightforward to implement in a short recursive function.Engross

Problem Clarification

The input data is a set of m rows of n columns of probabilities, essentially an m by n matrix, where m = n = number of vertices on a directed graph. Rows are edge origins and columns are edge destinations. We will assume, on the basis of the mention of cycles in the question, that the graph is cyclic, i.e. that at least one cycle exists in the graph.

Let's define the starting vertex as s. Let's also define a terminal vertex as a vertex for which there are no exiting edges, and the set of them as set T with size z. Therefore we have z sets of routes from s to a vertex in T, and the set sizes may be infinite due to cycles [1]. In such a scenario, one cannot conclude that a terminal vertex will be reached in an arbitrarily large number of steps.

In the input data, probabilities for rows that correspond with vertices not in T are normalized to total to 1.0. We shall assume the Markov property, that the probabilities at each vertex do not vary with time. This precludes the use of probability to prioritize routes in a graph search [2].

Finite math texts sometimes name example problems similar to this question as Drunken Random Walks to underscore the fact that the walker forgets the past, referring to the memory-free nature of Markovian chains.

Applying Probability to Routes

The probability of arriving at a terminal vertex can be expressed as an infinite series sum of products.

P_t = lim_{s → ∞} Σ ∏ P_{i,j},

where s is the step index, t is a terminal vertex index, i ∈ [1 .. m], and j ∈ [1 .. n].

Reduction

When two or more cycles intersect (sharing one or more vertices), analysis is complicated by an infinite set of patterns involving them. It appears, after some analysis and review of relevant academic work, that arriving at an accurate set of terminal vertex arrival probabilities with today's mathematical tools may best be accomplished with a converging algorithm.

A few initial reductions are possible.

  1. The first consideration is to enumerate the terminal vertices, which is easy since their corresponding rows contain only zero probabilities (a minimal scan for such rows is sketched below).

  2. The next consideration is to differentiate any further reductions from what the academic literature calls irreducible sub-graphs. The depth-first algorithm below remembers which vertices have already been visited while constructing a potential route, so it can easily be retrofitted to identify which vertices are involved in cycles. However, it is recommended to use existing, well tested, peer reviewed graph libraries to identify and characterize sub-graphs as irreducible.

Mathematical reduction of irreducible portions of the graph may or may not be plausible. Consider starting vertex A and sole terminating vertex B in the graph represented as {A->C, C->A, A->D, D->A, C->D, D->C, C->B, D->B}.

Although one can reduce the graph to probability relations free of cycles through vertex A, vertex A cannot be removed for further reduction without either modifying the probabilities of edges exiting C and D or allowing the totals of the probabilities of edges exiting C and D to be less than 1.0.
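
As a minimal sketch of the terminal-vertex scan from step 1 above (the function name and matrix layout are my own assumptions, matching the question's row-to-column convention):

#include <vector>

// A terminal (absorbing) vertex is one whose row in the probability matrix
// contains only zeros, i.e. it has no exiting edges.
std::vector<int> terminalVertices(const std::vector<std::vector<double>>& P) {
    std::vector<int> terminals;
    for (int i = 0; i < (int)P.size(); ++i) {
        bool allZero = true;
        for (double p : P[i])
            if (p != 0.0) { allZero = false; break; }
        if (allZero) terminals.push_back(i);
    }
    return terminals;
}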

Convergent Breadth-First Traversal

A breadth-first traversal that ignores revisiting and allows cycles can iterate step index s, not to some fixed s_max but to some sufficiently stable and accurate point in a convergent trend. This approach is especially called for if cycles overlap, creating bifurcations in the simpler periodicity caused by a single cycle.

Σ P_s Δs.

To establish a reasonable convergence as s increases, one must determine the desired accuracy as a criterion for terminating the convergence algorithm, and a metric for measuring accuracy by looking at longer term trends in results at all terminal vertices. It may be important to provide a criterion where the sum of terminal vertex probabilities is close to unity in conjunction with the trend convergence metric, as both a sanity check and an accuracy criterion. Practically, four convergence criteria may be necessary [3].

  1. Per terminal vertex probability trend convergence delta
  2. Average probability trend convergence delta
  3. Convergence of total probability on unity
  4. Total number of steps (to cap depth for practical computing reasons)

Even beyond these four, the program may need to contain a trap for an interrupt that permits the output to be written and examined after a long wait in which not all four of the above criteria have been satisfied.

An Example Cycle-Resistant Depth-First Algorithm

There are more efficient algorithms than the following one, but it is fairly comprehensible, it compiles without warnings under -Wall, and it produces the desired output for all finite and legitimate directed graphs and all possible start and destination vertices [4]. It is easy to load a matrix in the form given in the question using the addEdge method [5].

#include <iostream>
#include <list>

class DirectedGraph {

    private:
        int miNodes;
        std::list<int> * mnpEdges;
        bool * mpVisitedFlags;

    private:
        void initAlreadyVisited() {
            for (int i = 0; i < miNodes; ++ i)
                mpVisitedFlags[i] = false;
        }

        void recurse(int iCurrent, int iDestination,
               int route[], int index,
               std::list<std::list<int> *> * pnai) {

            mpVisitedFlags[iCurrent] = true;
            route[index ++] = iCurrent;

            if (iCurrent == iDestination) {
                auto pni = new std::list<int>;
                for (int i = 0; i < index; ++ i)
                    pni->push_back(route[i]);
                pnai->push_back(pni);

            } else {
                auto it = mnpEdges[iCurrent].begin();
                auto itBeyond = mnpEdges[iCurrent].end();
                while (it != itBeyond) {
                    if (! mpVisitedFlags[* it])
                        recurse(* it, iDestination,
                                route, index, pnai);
                    ++ it;
                }
            }

            -- index;
            mpVisitedFlags[iCurrent] = false;
        } 

    public:
        DirectedGraph(int iNodes) {
            miNodes = iNodes;
            mnpEdges = new std::list<int>[iNodes];
            mpVisitedFlags = new bool[iNodes];
        }

        ~DirectedGraph() {
            delete [] mnpEdges;        // arrays allocated with new[] need delete[]
            delete [] mpVisitedFlags;
        }

        void addEdge(int u, int v) {
            mnpEdges[u].push_back(v);
        }

        std::list<std::list<int> *> * findRoutes(int iStart,
                int iDestination) {
            initAlreadyVisited();
            auto route = new int[miNodes];
            auto pnpi = new std::list<std::list<int> *>();
            recurse(iStart, iDestination, route, 0, pnpi);
            delete [] route;
            return pnpi;
        }
};

int main() {

    DirectedGraph dg(5);

    dg.addEdge(0, 1);
    dg.addEdge(0, 2);
    dg.addEdge(0, 3);
    dg.addEdge(1, 3);
    dg.addEdge(1, 4);
    dg.addEdge(2, 0);
    dg.addEdge(2, 1);
    dg.addEdge(4, 1);
    dg.addEdge(4, 3);

    int startingNode = 2;
    int destinationNode = 3;

    auto pnai = dg.findRoutes(startingNode, destinationNode);

    std::cout
            << "Unique routes from "
            << startingNode
            << " to "
            << destinationNode
            << std::endl
            << std::endl;

    bool bFirst;
    std::list<int> * pi;
    auto it = pnai->begin();
    auto itBeyond = pnai->end();
    std::list<int>::iterator itInner;
    std::list<int>::iterator itInnerBeyond;
    while (it != itBeyond) {
        bFirst = true;
        pi = * it ++;
        itInner = pi->begin();
        itInnerBeyond = pi->end();
        while (itInner != itInnerBeyond) {
            if (bFirst)
                bFirst = false;
            else
                std::cout << ' ';
            std::cout << (* itInner ++);
        }
        std::cout << std::endl;
        delete pi;
    }

    delete pnai;

    return 0;
}

Notes

[1] Improperly handled cycles in a directed graph algorithm will cause it to hang in an infinite loop. (Note the trivial case where the number of routes from A to B for the directed graph represented as {A->B, B->A} is infinite.)

[2] Probabilities are sometimes used to reduce the CPU cycle cost of a search. Probabilities, in that strategy, are input values for meta rules in a priority queue to reduce the computational challenge of very tedious searches (even for a computer). The early literature in production systems termed the exponential character of unguided large searches Combinatory Explosions.

[3] It may be practically necessary to detect the breadth-first probability trend at each vertex and specify satisfactory convergence in terms of four criteria:

  1. Δ(Σ∏P)_t <= Δ_max ∀ t
  2. Σ_{t=0..T} Δ(Σ∏P)_t / T <= Δ_ave
  3. |Σ_t Σ∏P - 1| <= u_max, where u_max is the maximum allowable deviation from unity for the sum of final probabilities
  4. s < s_max

[4] Provided there are enough computing resources available to support the data structures and ample time to arrive at an answer for the given computing system speed.

[5] You can load DirectedGraph dg(7) with the input data using two nested loops that iterate through the rows and columns enumerated in the question. The body of the inner loop would simply be a conditional edge addition.

if (prob != 0) dg.addEdge(i, j);

Variable prob is the matrix entry P_{m,n}. Route existence is only concerned with its zero/nonzero status.
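
Assuming prob is a double[7][7] holding the question's matrix (the array name is illustrative), the two loops might look like:

DirectedGraph dg(7);
for (int i = 0; i < 7; ++i)
    for (int j = 0; j < 7; ++j)
        if (prob[i][j] != 0) dg.addEdge(i, j);  // add only edges that exist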

Chaffer answered 15/2, 2017 at 4:11 Comment(3)
I ended up doing exact resummation for the loop traversals. The recursive algorithm with memoization runs very quickly for systems of e.g. 10 states, even under very low resources. Did you test your code on the example in my question? Maybe we can compare if the final state probabilities we get agree.Alicealicea
The / denotes division. Basically, the entries of that matrix are rational numbers. Each row adds up to 1 (so the probability to go to some other state (or stay at the original) is 100%). The rows with all zeros are final states (probability to leave them is 0%). Node id is the row number. (First row: id=0, second row: id=1 etc.) Correspondingly, column number is destination node id. When there is non-zero probability in a column, a transition to that node from the respective row (starting node) is possible.Alicealicea
Thank you! I will take a closer look.Alicealicea

I found this question while researching directed cyclic graphs. The probability of reaching each of the final nodes can be calculated using absorbing Markov chains.

The video Markov Chains - Part 7 (+ parts 8 and 9) explains absorbing states in Markov chains and the math behind it.
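
Concretely (a sketch of my own, not taken from the video): order the states so the transient nodes 1-4 come first, split the matrix into Q (transient to transient) and R (transient to absorbing), and the absorption probabilities are B = (I - Q)^-1 R. Solving (I - Q)B = R for the question's matrix with Gauss-Jordan elimination:

#include <cmath>
#include <cstdio>
#include <utility>

int main() {
    // Transient part of the question's matrix: Q = transitions among nodes 1-4,
    // R = transitions from nodes 1-4 into the absorbing nodes 5, 6, 7.
    const int nT = 4, nA = 3;
    double Q[nT][nT] = {{0, 1/2., 0, 1/14.},
                        {0, 0, 1/9., 1/2.},
                        {1/8., 7/16., 0, 3/16.},
                        {0, 1, 0, 0}};
    double R[nT][nA] = {{1/14., 0, 5/14.},
                        {0, 7/18., 0},
                        {1/8., 0, 1/8.},
                        {0, 0, 0}};

    // Build the augmented system [I - Q | R] and run Gauss-Jordan elimination;
    // for a proper absorbing chain, I - Q is nonsingular.
    double M[nT][nT + nA];
    for (int i = 0; i < nT; ++i) {
        for (int j = 0; j < nT; ++j) M[i][j] = (i == j ? 1.0 : 0.0) - Q[i][j];
        for (int j = 0; j < nA; ++j) M[i][nT + j] = R[i][j];
    }
    for (int col = 0; col < nT; ++col) {
        int pivot = col;                        // partial pivoting for stability
        for (int r = col + 1; r < nT; ++r)
            if (std::abs(M[r][col]) > std::abs(M[pivot][col])) pivot = r;
        for (int j = 0; j < nT + nA; ++j) std::swap(M[col][j], M[pivot][j]);
        double d = M[col][col];
        for (int j = 0; j < nT + nA; ++j) M[col][j] /= d;
        for (int r = 0; r < nT; ++r) {
            if (r == col) continue;
            double factor = M[r][col];
            for (int j = 0; j < nT + nA; ++j) M[r][j] -= factor * M[col][j];
        }
    }

    // Row 0 of B holds the probabilities of ending at nodes 5, 6, 7 from node 1.
    for (int f = 0; f < nA; ++f)
        std::printf("P(end at node %d | start at node 1) = %f\n",
                    f + 5, M[0][nT + f]);
    return 0;
}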

Milson answered 23/1, 2021 at 19:54 Comment(0)

I understand this as the following problem:

You are given an initial distribution over the nodes as a vector b, and a matrix A that stores the probability of jumping from node i to node j in each time step, somewhat resembling an adjacency matrix.

Then the distribution b_1 after one time step is A x b. The distribution b_2 after two time steps is A x b_1. Likewise, the distribution b_n is A^n x b.

For an approximation of b_infinite, we can do the following:

Vector final_probability(Matrix A, Vector b,
    Function Vector x Vector -> Scalar distance, Scalar threshold){
    b_old = b
    b_current = A x b
    while(distance(b_old,b_current) > threshold){ // iterate until the distribution stops changing
        b_old = b_current
        b_current = A x b_current
    }
    return b_current
}

(I used mathematical variable names for convenience)

In other words, we assume that the sequence of distributions has converged nicely once successive iterates are within the given threshold. That might not hold true, but it will usually work.

You might want to add a maximum number of iterations to that.

Euclidean distance should work well as distance.

(This uses the concept of a Markov chain but is more of a pragmatic solution)

Desex answered 9/2, 2017 at 15:31 Comment(0)
