How can I influence Graphviz/dot to make nicer control-flow graphs by removing snaking and better edge crossings?
Asked Answered
N

1

3

I am drawing control-flow graphs for Python programs and would like to influence which kind of edges should not be crossed over. Is there a way to do this?

Consider this simple Python program:

try:
    a += 1
except:
    a += 2
else:
    a = 3

And a dot program to represent the control flow for that generated via https://github.com/rocky/python-control-flow/

digraph G {
  mclimit=1.5;
  rankdir=TD; ordering=out;
  graph[fontsize=10 fontname="Verdana"];
  color="#efefef";
  node[shape=box style=filled fontsize=8 fontname="Verdana" fillcolor="#efefef"];
  edge[fontsize=8 fontname="Verdana"];

  node_0 [shape = "oval"][label="Basic Block 0\loffsets: 0..12\lflags=entry, block, unconditional, try\ljumps=[34]\l"];
  node_1 [label="Basic Block 1\loffsets: 14..30\lflags=except, unconditional\ljumps=[38]\l"];
  node_2 [label="Basic Block 2\loffsets: 32..32\lflags=end finally\l"];
  node_3 [label="Basic Block 3\loffsets: 34..36\l"];
  node_4 [label="Basic Block 4\loffsets: 38..40\lflags=no fallthrough\l"];

  node_0 -> node_2 [weight=1][color="red"];
  node_3 -> node_4 [weight=10];
  node_0 -> node_1 [weight=1][color="red"];
  node_2 -> node_3 [weight=10];
  node_0 -> node_1 [weight=10][style="invis"];
  node_1 -> node_2 [weight=10][style="invis"];
  node_1 -> node_4 [weight=1];
  node_0 -> node_3 [weight=1];
}

The image that dot produces for the above is enter image description here

Notice how one line snakes around and crosses a straight arrow down. Instead I would prefer if none of the straight downward arrows would be crossed. Splined edges would make better crossing places.

If you look at the dot, I have two invisible downward edges which I use for alignment. (In the bytecode these follow the linear sequence of instructions).

So if a downward straight line needs to be crossed (and here it doesn't), invisible edges would be preferred over those that are visible.

Thoughts?

Edit

The one excellent answer so far suggests changing the order that the edges are defined and specifying in certain situations where edge attachments should be made.

In this application, I do have hierarchical nesting information from a dominator tree, and I can classify the edges: those that loop, those that jump to the end of a compound structure, those that break out a loop, and so on.

So now the problem becomes exactly how one uses this information to avoid those snaking arrows, and ensures that breaks out of loops are preferred to cross over edges over say "if"/"else" jump edges.

This feels like VLSI design: come up with a set of patterns that work for each kind of (control-flow) structure and those will then nest and sequence properly.

I have experimented with the edge ordering and placement, and I just don't have an intuitive sense of when to put something earlier or later.

Guidance, or better, a design rule for structured control flow edges would be greatly appreciated.

Nevis answered 25/11, 2018 at 15:6 Comment(0)
W
3

You need to do two things to improve the situation:

  • draw (one of) the edges that you want to control earlier than the others,
  • tell graphviz where you want them to be attached (North, East... )

I have edited your code accordingly

digraph G {
  mclimit=1.5;
  rankdir=TD; ordering=out;
  graph[fontsize=10 fontname="Verdana"];
  color="#efefef";
  node[shape=box style=filled fontsize=8 fontname="Verdana" fillcolor="#efefef"];
  edge[fontsize=8 fontname="Verdana"];

  node_0 [shape = "oval"][label="Basic Block 0\loffsets: 0..12\lflags=entry, block, unconditional, try\ljumps=[34]\l"];
  node_1 [label="Basic Block 1\loffsets: 14..30\lflags=except, unconditional\ljumps=[38]\l"];
  node_2 [label="Basic Block 2\loffsets: 32..32\lflags=end finally\l"];
  node_3 [label="Basic Block 3\loffsets: 34..36\l"];
  node_4 [label="Basic Block 4\loffsets: 38..40\lflags=no fallthrough\l"];

  node_0 -> node_3:nw [weight=1];           /* moved up and added directions*/
  node_0 -> node_2 [weight=1][color="red"];
  node_3 -> node_4 [weight=10];
  node_0 -> node_1 [weight=1][color="red"];
  node_2 -> node_3 [weight=10];
  node_0 -> node_1 [weight=10][style="invis"];
  node_1 -> node_2 [weight=10][style="invis"];
  node_1:se -> node_4:ne [weight=1];            /* added directions */
}

which gives you

enter image description here

There's a bit of trial and error involved here but I'm sure this should help.

Waldon answered 25/11, 2018 at 22:59 Comment(4)
This is interesting. I can use the program structure (specifically the dominator tree) to assist me in determining the inner-to-outer structure. But I need to understand how the order changes things. Do I do the inner blocks first, and outer blocks last, or the other way around? Those things that are more okay to cross edges (like a break inside a loop which might cross the inner "if/try" statements) do I render those last? I suppose I could put all of the loop back edges on one side.Nevis
In my prior experiments I was using [headport=nw, tailport=sw] instead of the colon thing you have on the node which works better. What's the difference?Nevis
To your first point: I don't have a good answer, and the topic seems to be difficult for insiders as well. In general, my experience is that edges that are defined earlier have a higher "priority" but that's not real knowledge. To the second point, I believe the "colon thing" is just a shorthand, but again cannot confirm definitely.Waldon
Ok. This is helpful, even if it is what it is. I have been playing around with the layout manually, and I really don't have any good insight as to an algorithmic way to get good output. So I'll leave this unaccepted for a couple of days in case someone else has further detail. Thanks.Nevis

© 2022 - 2024 — McMap. All rights reserved.