Generating strongly-connected, uniformly-distributed, random di-graphs
Asked Answered
T

2

10

So I've been building a program that uses Monte Carlo simulations to find properties of evolutionary graph theory. One of the key functions of this is to be able to generate uniformly-distributed random graphs, so that we can determine the generalised properties of graphs. For the case of connected undirected graphs I have implemented the solution outlined in this answer.

However for directed graphs, generating the one-directional uniform spanning tree you get from Wilson's algorithm doesn't ensure that the graph is strongly-connected, and it seems that adding extra edges to make the spanning tree bi-directional would introduce a bias into the graphs that you generate.

I feel like I might be missing something obvious/misunderstanding something, but essentially my request is, can someone recommend to me a high-level scheme that allows me to generate strongly-connected, uniformly-distributed, random di-graphs?

Tervalent answered 14/4, 2015 at 15:32 Comment(2)
Have you tested that that answer actually generates uniform undirected graphs? I'm skeptical.Hammons
Are the graphs unlabeled? Which two graphs do you consider equal, e.g. if they differ only in they labeling? This has a big impact on what "uniformly-distributed" actually means.Hag
A
3

Can someone recommend to me a high-level scheme that allows me to generate strongly-connected, uniformly-distributed, random di-graphs?

I had a similar problem generating expression trees for test data. I found that if you find out how to count unique trees then the problem becomes easy. What I mean by that is that I found for full binary trees with N internal nodes the number of unique trees based on N is the Catalan Numbers. Then for binary trees that have unary branches with N total nodes the number of unique trees based on N is the Motzkin Numbers.

Then I found The On-Line Encyclopedia of Integer Sequences®. So if you know a value, N, that can uniquely identify a graph and you know the corresponding count of unique graphs for that N and put those counts into a the OEIS search you should get back a page that will help you in your search. e.g. Catalan Numbers for full binary trees or Motzkin Numbers for regular binary tress. Along the way I found that one of the keys to generating them was a recurrence relation.

Or you can use keywords in the search but this may not get an exact hit. I only found Motzkin numbers using the number sequence and not via keywords.

Here is an OEIS query for strongly connected digraph

Now if you know the count for a given N and you either generate all graphs for a given N or can have a one to one correspondence between a value and the graph then you just generate random integers and get/generate the corresponding graph. If I understand your problem correctly this should solve it.

My best guess to the OEIS sequence for this question is:

Number of acyclic digraphs with n unlabeled nodes. A003087

Which has a reference to Uniform random generation of large acyclic digraphs

TL;DR

For some related history see my question: Algorithm improvement for enumerating binary trees

Ales answered 30/3, 2017 at 0:34 Comment(1)
@DaveGalvin I don't know which sequence is correct because some information is missing, e.g. rooted, labeled, cyclic, etc. I wish the person posting the bounty would make it clear which is the correct OEIS sequence. If they did that then I would work on this more.Ales
H
3

The most straightforward solution I can think of is to randomly generate uniformly-distributed digraphs and reject any that are not strongly connected. That will preserve uniform distribution and guarantee the property that you want. It's probably not terribly efficient but you'll know for sure if you run some tests.

Haymo answered 14/4, 2015 at 20:14 Comment(5)
Is it that easy to generate uniformly distributed digraphs? What if they are unlabeled? How can we make sure, that there is no bias?Hag
@Hag If unlabeled, accept each generated labeled graph with probability equal to one over the number of automorphisms (compute this with, e.g., nauty). Since almost all digraphs have only the trivial automorphism, the difference between distributions is perhaps negligible anyway for graphs of moderate size.Hammons
@Hag Generating uniformly distributed digraphs is the only real problem here, but it's not hard. Some thought should be given to the measure in which the distribution is uniform: in the Erdos-Renyi model, we start with a given number n vertices, and generate an edge from vertex a to vertex b with probability p (=0.5 for uniform case). Do that for all n^2 possible edges. The algorithm runs in time O(n^2), which is the best you can do anyway. Since almost all digraphs are strongly connected (paper by Moon and Moser) rejection will discard only a tiny proportion.Haymo
Almost all, yes, but not almost all if the density is controlled to be sufficiently sparse (either by dialing down p or by fixing the number of arcs). @EdwardDoolittle probably knows this already, though.Hammons
@DavidEisenstat Yes, that's what I meant by "thought should be given to the measure in which the distribution is uniform". "Almost all" applies to the distribution in which the probability of an edge is 0.5. That is what the poster meant, I assume, in the absence of other information.Haymo
A
3

Can someone recommend to me a high-level scheme that allows me to generate strongly-connected, uniformly-distributed, random di-graphs?

I had a similar problem generating expression trees for test data. I found that if you find out how to count unique trees then the problem becomes easy. What I mean by that is that I found for full binary trees with N internal nodes the number of unique trees based on N is the Catalan Numbers. Then for binary trees that have unary branches with N total nodes the number of unique trees based on N is the Motzkin Numbers.

Then I found The On-Line Encyclopedia of Integer Sequences®. So if you know a value, N, that can uniquely identify a graph and you know the corresponding count of unique graphs for that N and put those counts into a the OEIS search you should get back a page that will help you in your search. e.g. Catalan Numbers for full binary trees or Motzkin Numbers for regular binary tress. Along the way I found that one of the keys to generating them was a recurrence relation.

Or you can use keywords in the search but this may not get an exact hit. I only found Motzkin numbers using the number sequence and not via keywords.

Here is an OEIS query for strongly connected digraph

Now if you know the count for a given N and you either generate all graphs for a given N or can have a one to one correspondence between a value and the graph then you just generate random integers and get/generate the corresponding graph. If I understand your problem correctly this should solve it.

My best guess to the OEIS sequence for this question is:

Number of acyclic digraphs with n unlabeled nodes. A003087

Which has a reference to Uniform random generation of large acyclic digraphs

TL;DR

For some related history see my question: Algorithm improvement for enumerating binary trees

Ales answered 30/3, 2017 at 0:34 Comment(1)
@DaveGalvin I don't know which sequence is correct because some information is missing, e.g. rooted, labeled, cyclic, etc. I wish the person posting the bounty would make it clear which is the correct OEIS sequence. If they did that then I would work on this more.Ales

© 2022 - 2024 — McMap. All rights reserved.