Algorithm to find lowest common ancestor in directed acyclic graph?
Asked Answered
L

11

34

Imagine a directed acyclic graph as follows, where:

  • "A" is the root (there is always exactly one root)
  • each node knows its parent(s)
  • the node names are arbitrary - nothing can be inferred from them
  • we know from another source that the nodes were added to the tree in the order A to G (e.g. they are commits in a version control system)

Directed Acyclic Graph

What algorithm could I use to determine the lowest common ancestor (LCA) of two arbitrary nodes, for example, the common ancestor of:

  • B and E is B
  • D and F is B

Note:

Latton answered 13/2, 2013 at 23:14 Comment(9)
Do you mean 'acylic'. And by 'parents' do you mean all nodes that have a directed edge into the node in question?Ramose
All nodes have directed edges to their parents, if any (e.g. A has no parents). AFAIK the graph is cyclic because of the cycle G-F-E-B-C-D-G.Latton
If you post this question here: cs.stackexchange.com, you definitely get more and better answers.Crump
The problem then becomes understanding the answers... ;-)Latton
@AndrewSwan: The graph would be cyclic if it was undirected. In its current state it's acyclic.Walt
Is the number of nodes in the graph limited? Is it known in advance?Leatherman
So did you settle on a solution for this pb?Auriscope
No, alas, but it's now only of academic interest to me.Latton
I may be naive, but why not use topologial sort here?Brynnbrynna
B
14

Den Roman's link (Archived version) seems promising, but it seemed a little bit complicated to me, so I tried another approach. Here is a simple algorithm I used:

Let say you want to compute LCA(x,y) with x and y two nodes. Each node must have a value color and count, resp. initialized to white and 0.

  1. Color all ancestors of x as blue (can be done using BFS)
  2. Color all blue ancestors of y as red (BFS again)
  3. For each red node in the graph, increment its parents' count by one

Each red node having a count value set to 0 is a solution.

There can be more than one solution, depending on your graph. For instance, consider this graph:

DAG example where LCA can have two values

LCA(4,5) possible solutions are 1 and 2.

Note it still work if you want find the LCA of 3 nodes or more, you just need to add a different color for each of them.

Ballerina answered 4/12, 2014 at 3:14 Comment(1)
The algorithm you've described seems to have some gratuitous complexity that masks what's really going on. Why a count when you're just using the count as a flag? Why N colors when it seems that you only need one color for "ancestor of all nodes previously considered" and a second color for "ancestor of node currently being considered"?Snowman
A
4

I was looking for a solution to the same problem and I found a solution in the following paper:

http://dx.doi.org/10.1016/j.ipl.2010.02.014

In short, you are not looking for the lowest common ancestor, but for the lowest SINGLE common ancestor, which they define in this paper.

Abeokuta answered 18/2, 2013 at 3:8 Comment(0)
L
3

I know it's and old question and pretty good discussion, but since I had some similar problem to solve I came across JGraphT's Lowest Common Ancestor algorithms, thought this might be of help:

Letta answered 31/7, 2018 at 13:19 Comment(1)
JGraphT NaivaLcaFinder is the way to go. Tarjan works only for trees.Drambuie
E
2

Just some wild thinking. What about using both input nodes as roots, and doing two BFS simultaneously step by step. At a certain step, when there are overlapping in their BLACK sets (recording visited nodes), algorithm stops and the overlapped nodes are their LCA(s). In this way, any other common ancestors will have longer distances than what we have discovered.

Eldaelden answered 22/5, 2014 at 22:48 Comment(0)
P
2

Assume that you want to find the ancestors of x and y in a graph.

Maintain an array of vectors- parents (storing parents of each node).

  1. Firstly do a bfs(keep storing parents of each vertex) and find all the ancestors of x (find parents of x and using parents, find all the ancestors of x) and store them in a vector. Also, store the depth of each parent in the vector.

  2. Find the ancestors of y using same method and store them in another vector. Now, you have two vectors storing the ancestors of x and y respectively along with their depth.

  3. LCA would be common ancestor with greatest depth. Depth is defined as longest distance from root(vertex with in_degree=0). Now, we can sort the vectors in decreasing order of their depths and find out the LCA. Using this method, we can even find multiple LCA's (if there).

Palpable answered 17/5, 2018 at 8:53 Comment(0)
S
1

This link (Archived version) describes how it is done in Mercurial - the basic idea is to find all parents for the specified nodes, group them per distance from the root, then do a search on those groups.

Scow answered 27/10, 2014 at 22:51 Comment(0)
R
0

If the graph has cycles then 'ancestor' is loosely defined. Perhaps you mean the ancestor on the tree output of a DFS or BFS? Or perhaps by 'ancestor' you mean the node in the digraph that minimizes the number of hops from E and B?

If you're not worried about complexity, then you could compute an A* (or Dijkstra's shortest path) from every node to both E and B. For the nodes that can reach both E and B, you can find the node that minimizes PathLengthToE + PathLengthToB.

EDIT: Now that you've clarified a few things, I think I understand what you're looking for.

If you can only go "up" the tree, then I suggest you perform a BFS from E and also a BFS from B. Every node in your graph will have two variables associated with it: hops from B and hops from E. Let both B and E have copies of the list of graph nodes. B's list is sorted by hops from B while E's list is sorted by hops from E.

For each element in B's list, attempt to find it in E's list. Place matches in a third list, sorted by hops from B + hops from E. After you've exhausted B's list, your third sorted list should contain the LCA at its head. This allows for one solution, multiple solutions(arbitrarily chosen among by their BFS ordering for B), or no solution.

Ramose answered 14/2, 2013 at 0:9 Comment(5)
The ancestor of a node must be reachable by going "up" the graph as drawn, i.e. by traversing edges in the direction of the arrow.Latton
@AndrewSwan: Yes, but the answer still isn't unique. Consider A>C, B>D, C>E, C>F, D>E, D>F. If I ask for LCA(A,B), do you want E or F?Dairymaid
That graph isn't valid for this scenario because it has two roots, E and F. There should be exactly one root, meaning any two nodes always have exactly one LCA. I've edited the question to clarify this.Latton
Add E>G, F>G to @tmyklebu's example and you've got exactly one root and two LCAs, E and F. This is a direct consequence of allowing a node to have multiple parents.Atmo
@AndrewSwan: I've made an edit to my post. Have I understood your problem correctly?Ramose
V
0

I also need exactly same thing , to find LCA in a DAG (directed acyclic graph). LCA problem is related to RMQ (Range Minimum Query Problem).

It is possible to reduce LCA to RMQ and find desired LCA of two arbitrary node from a directed acyclic graph.

I found THIS TUTORIAL detail and good. I am also planing to implement this.

Vanhouten answered 14/3, 2013 at 7:16 Comment(0)
S
0

I am proposing O(|V| + |E|) time complexity solution, and i think this approach is correct otherwise please correct me.

Given directed acyclic graph, we need to find LCA of two vertices v and w.

Step1: Find shortest distance of all vertices from root vertex using bfs http://en.wikipedia.org/wiki/Breadth-first_search with time complexity O(|V| + |E|) and also find the parent of each vertices.

Step2: Find the common ancestors of both the vertices by using parent until we reach root vertex Time complexity- 2|v|

Step3: LCA will be that common ancestor which have maximum shortest distance.

So, this is O(|V| + |E|) time complexity algorithm.

Please, correct me if i am wrong or any other suggestions are welcome.

Sarina answered 7/6, 2014 at 7:23 Comment(1)
How do you find common ancestors for both vertices by using parent? Can you elaborate on that?Dow
S
0
package FB;

import java.util.*;

public class commomAnsectorForGraph {
    public static void main(String[] args){
        commomAnsectorForGraph com = new commomAnsectorForGraph();
        graphNode g = new graphNode('g');
        graphNode d = new graphNode('d');
        graphNode f = new graphNode('f');
        graphNode c = new graphNode('c');
        graphNode e = new graphNode('e');
        graphNode a = new graphNode('a');
        graphNode b = new graphNode('b');

        List<graphNode> gc = new ArrayList<>();
        gc.add(d);
        gc.add(f);
        g.children = gc;

        List<graphNode> dc = new ArrayList<>();
        dc.add(c);
        d.children = dc;

        List<graphNode> cc = new ArrayList<>();
        cc.add(b);
        c.children = cc;

        List<graphNode> bc = new ArrayList<>();
        bc.add(a);
        b.children = bc;

        List<graphNode> fc = new ArrayList<>();
        fc.add(e);
        f.children = fc;

        List<graphNode> ec = new ArrayList<>();
        ec.add(b);
        e.children = ec;

        List<graphNode> ac = new ArrayList<>();
        a.children = ac;

        graphNode gn = com.findAncestor(g, c, d);
        System.out.println(gn.value);
    }

    public graphNode findAncestor(graphNode root, graphNode a, graphNode b){
        if(root == null) return null;
        if(root.value == a.value || root.value == b.value) return root;

        List<graphNode> list = root.children;

        int count = 0;
        List<graphNode> temp = new ArrayList<>();

        for(graphNode node : list){
            graphNode res = findAncestor(node, a, b);
            temp.add(res);
            if(res != null) {
                count++;
            }
        }

        if(count == 2) return root;

        for(graphNode t : temp){
            if(t != null) return t;
        }

        return null;
    }
}
class graphNode{
    char value;
    graphNode parent;
    List<graphNode> children;
    public graphNode(char value){
        this.value = value;
    }
}
Selsyn answered 10/6, 2021 at 20:47 Comment(0)
S
-2

Everyone. Try please in Java.

static String recentCommonAncestor(String[] commitHashes, String[][] ancestors, String strID, String strID1)
{
    HashSet<String> setOfAncestorsLower = new HashSet<String>();
    HashSet<String> setOfAncestorsUpper = new HashSet<String>();
    String[] arrPair= {strID, strID1};
    Arrays.sort(arrPair);
    Comparator<String> comp = new Comparator<String>(){
        @Override
        public int compare(String s1, String s2) {
           return s2.compareTo(s1);
        }};
    int indexUpper = Arrays.binarySearch(commitHashes, arrPair[0], comp);
    int indexLower = Arrays.binarySearch(commitHashes, arrPair[1], comp);
    setOfAncestorsLower.addAll(Arrays.asList(ancestors[indexLower]));
    setOfAncestorsUpper.addAll(Arrays.asList(ancestors[indexUpper]));
    HashSet<String>[] sets = new HashSet[] {setOfAncestorsLower, setOfAncestorsUpper};
    for (int i = indexLower + 1; i < commitHashes.length; i++)
    {
        for (int j = 0; j < 2; j++)
        {
            if (sets[j].contains(commitHashes[i]))
            {
                if (i > indexUpper)
                    if(sets[1 - j].contains(commitHashes[i]))
                        return commitHashes[i];
                sets[j].addAll(Arrays.asList(ancestors[i]));
            }
        }
    }
    return null;
}

The idea is very simple. We suppose that commitHashes ordered in downgrade sequence. We find lowest and upper indexes of strings(hashes-does not mean). It is clearly that (considering descendant order) the common ancestor can be only after upper index (lower value among hashes). Then we start enumerating the hashes of commit and build chain of descendent parent chains . For this purpose we have two hashsets are initialised by parents of lowest and upper hash of commit. setOfAncestorsLower, setOfAncestorsUpper. If next hash -commit belongs to any of chains(hashsets), then if current index is upper than index of lowest hash, then if it is contained in another set (chain) we return the current hash as result. If not, we add its parents (ancestors[i]) to hashset, which traces set of ancestors of set,, where the current element contained. That is the all, basically

Shortwave answered 2/8, 2017 at 20:50 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.