The short answer is that gvpack
does not declare edges between the sub-graphs. Indeed, when there are common node names between sub-graphs, gvpack
renames them to avoid clashes. However, that is fixable.
For example, given three .dot
files 1.dot
:
digraph {
A -> B
A -> C
}
2.dot
:
digraph {
D -> E
E -> F
}
... and 3.dot
:
digraph {
D -> G
G -> A
}
... running gvpack -u 1.dot 2.dot 3.dot | dot -Tjpg -ogvp1.jpg
gives the following graph gvp1.jpg
:
As you can see, gvpack
has relabelled the duplicate node names. However, we can easily reverse the re-labelling using gvpack -u 1.dot 2.dot 3.dot | sed 's/_gv[0-9]\+//g' | dot -Tjpg -ogvsub.jpg
, which produces the following graph gvsub.jpg
:
This approach relies on the subgraphs having node names in common, so it might be necessary to insert additional nodes in the sub-graph .dot
files to achieve this.
(EDIT: The solution above showed the graph with nodes merged but not with the sub-graphs in clusters. The following solution shows the sub-graphs in clusters.)
Given .dot
files 1.dot
(these are the same as the files above, except I've given each digraph a name):
digraph g1 {
A -> B
A -> C
}
2.dot
:
digraph g2 {
D -> E
E -> F
}
... and 3.dot
:
digraph g3 {
D -> G
G -> A
}
... along with hdr.dot
:
digraph GMaster {
compound = true;
g1 [style=invisible, height = 0, width = 0, label=""];
g2 [style=invisible, height = 0, width = 0, label=""];
g3 [style=invisible, height = 0, width = 0, label=""];
g1 -> g2 [lhead=clusterg2, ltail=clusterg1];
g1 -> g3 [lhead=clusterg3, ltail=clusterg1]
... and tail.dot
:
}
... we can run cat 1.dot 2.dot 3.dot | sed 's/digraph \(\w*\) *{/subgraph cluster\1 { \1/' | cat hdr.dot - tail.dot | dot -Tjpg -oclust1.jpg
to give the file clust1.jpg
:
Thus, in the header file, I've added an invisble node to each subgraph, with the same name as the sub-graph, used compound=true
to allow edges between clusters. I've specified the edges to draw between the clusters and I've set lhead
and ltail
for each of the edges between the invisible nodes to ensure that the right cluster is used as the head and tail of each of those edges. I've also added the appropriate invisible node to each subgraph in the process of transforming each subgraph into a cluster using sed
.
The edges between nodes D, G and A are shown because those nodes are common between the clusters. Also, each of them is only shown in one cluster. If the nodes were unique to the clusters, the only edges that would be shown between clusters would be the edges between the invisible nodes. This can be seen in the following graph, where I've renamed the nodes in 3.dot
:
There is one remaining flaw that I haven't quite been able to fix. The invisible nodes still take up a little bit of space so the cluster boxes look lopsided because the invisble nodes sit alongside the visible nodes. This also means that the heads of the edges between the clusters point at one side of the cluster box rather than the middle. At present, I can't see what could be done about that, unless we are prepared to look at each subgraph and find a node that is already in that subgraph/cluster to serve as the representative node for that subgraph/cluster (i.e. the one to or from which we draw edges for that cluster). That could be done easily enough by hand for a few subgraphs but it would be tedious if there were many subgraphs.
In contrast, the approach I've used above requires only that we know the name of the cluster and can insert it into the hdr.dot
file.
I've constructed the hdr.dot
file by hand for this case but the content of the hdr.dot
file could be extracted from the other .dot
files with sed
, awk
, perl
or python
if there was a need. The script could also insert the edges to link the clusters into hdr.dot
if information about which clusters should be connected was available somewhere.