Is there an option to automatically remove "redundant" edges in a dot graph?

Asked 9/7, 2019 at 8:39 Answered 8/12, 2023 at 12:10

I create a dot graph of dependencies for my Debian projects (see picture). The dependencies include redundant edges. I'd like to have a simpler graph without those redundant edges. I could calculate those on my own, but it's not too easy since I generate the .dot file in my CMakeLists.txt and .cmake extensions.

So I'm wondering whether there would be an option in dot or Graphviz to remove edges that are not required. So for example, the top snapwebsites project depends on csspp and advgetopt. Since the cspp package already depends on advgetopt, there is no need for the edge between snapwebsites and advgetopt.

In the digraph, this would mean:

"snapwebsites" -> "advgetopt";     <-- "auto-remove" this one
"snapwebsites" -> "csspp";

"csspp" -> "advgetopt";

So, is there such an option?

Zygophyllaceous answered 9/7, 2019 at 8:39 Comment(1)

Voting up, because I don't see a comment why this question was voted down. At least I came here with the same question. – Drawback 22/12, 2020 at 15:26

There is also the tred program distributed with graphviz.

tred computes the transitive reduction of directed graphs, and prints the resulting graphs to standard output. This removes edges implied by transitivity. Nodes and subgraphs are not otherwise affected. The ``meaning'' and validity of the reduced graphs is application dependent. tred is particularly useful as a preprocessor to dot to reduce clutter in dense layouts.

Chiffchaff answered 8/12, 2023 at 12:10 Comment(1)

Wow! Nothing to do! Just tred input.dot > output.dot. Done! Excellent. – Zygophyllaceous 8/12, 2023 at 15:48

Based on @marapet's answer, I created a script and I thought maybe others would benefit from having a copy. It's also in Snap! C++ as clean-dependencies.gvpr.

# Run with:
#
#    /usr/bin/gvpr -o clean-dependencies.dot -f clean-dependencies.gvpr dependencies.dot
#
# Clean up the dependencies.svg from double dependencies
# In other words if A depends on B and C, and B also depends on C, we
# can remove the link between A amd C, it's not necessary in our file.

BEG_G {
    edge_t direct_edges[int];
    node_t children[int];

    node_t n = fstnode($G);
    while(n != NULL) {

        // 1. extract the current node direct children
        //
        int direct_pos = 0;
        edge_t e = fstout_sg($G, n);
        while(e != NULL) {
            direct_edges[direct_pos] = e;
            children[direct_pos] = opp(e, n);
            direct_pos = direct_pos + 1;
            e = nxtout_sg($G, e);
        }

        // 2. find all of the grand children
        //    and see whether some are duplicates, if so delete the
        //    original (direct) edge
        //
        int child_pos = direct_pos;
        int c = 0;
        for(c = 0; c < child_pos; ++c) {
            e = fstout_sg($G, children[c]);
            while(e != NULL) {
                node_t o = opp(e, children[c]);
                int idx;
                for(idx = 0; idx < direct_pos; ++idx) {
                    if(children[idx] == o) {
                        if(direct_edges[idx] != NULL) {
                            delete($G, direct_edges[idx]);
                            direct_edges[idx] = NULL;
                        }
                        break;
                    }
                }
                e = nxtout_sg($G, e);
            }
        }

        n = nxtnode_sg($G, n);
    }
}

END_G {
    $O = $G;
}

A couple of things I want to mention: the gvpr functions do not seem to accept arrays as input. Also I first tried using a recursive approach and the local variables are smashed by further calls (i.e. on return of a recursive call, the value of a variable is the one from the child call... so the variables are "local" to a function, but still only one instance, no stack!)

Hopefully later versions will fix these problems.

$ gvpr -V
gvpr version 2.38.0 (20140413.2041)

It was already a much easier way to fix my graph than trying to do the same in CMake.

Update

I actually ended up writing a Python script to do the work. It's easy to remove the duplicates in a tree defined as a makefile. It's much harder to read the .dot data and fix it. However, in my case I have access to that data which you may not have.

Zygophyllaceous answered 18/7, 2019 at 16:0 Comment(0)

There is also the tred program distributed with graphviz.

tred computes the transitive reduction of directed graphs, and prints the resulting graphs to standard output. This removes edges implied by transitivity. Nodes and subgraphs are not otherwise affected. The ``meaning'' and validity of the reduced graphs is application dependent. tred is particularly useful as a preprocessor to dot to reduce clutter in dense layouts.

Chiffchaff answered 8/12, 2023 at 12:10 Comment(1)

Wow! Nothing to do! Just tred input.dot > output.dot. Done! Excellent. – Zygophyllaceous 8/12, 2023 at 15:48

There is no such option built-in as far as I know (and I could be wrong…).

The easiest way usually is to only include in the graphviz script the edges needed in the first place. If this is not possible, you could process your graph with gvpr (graphviz pattern scanning and processing language) before outputting its output to dot for the layout.

This of course means you'd have to implement the detection and suppression of unneeded edges with gvpr, a script you then could reuse whenever needed.

Wyandotte answered 9/7, 2019 at 13:30 Comment(2)

Right, you can also use some Graphviz library for one of the programming languages (e.g. pygraphviz for Python) to process and modify the graph. – Armijo 10/7, 2019 at 7:8

@Dany, that might have been better than gvpr. I ran in various problems (see my answer) which would probably not exist in pygraphviz. – Zygophyllaceous 18/7, 2019 at 16:2

Update

Recommended topics

Hot tags