Random simple connected graph generation with given sparseness
Asked Answered
D

5

22

I'm trying to find an efficient algorithm to generate a simple connected graph with given sparseness. Something like:

Input:
    N - size of generated graph
    S - sparseness (numer of edges actually; from N-1 to N(N-1)/2)
Output:
    simple connected graph G(v,e) with N vertices and S edges
Devout answered 11/1, 2010 at 11:37 Comment(1)
This paper claims to provide an efficient algorithm for generating random connected graphs. I haven't gone through the details but they claim better performance, especially for large networks. complexnetworks.fr/wp-content/uploads/2011/01/random.pdfArnettaarnette
T
23

For each node you need at least one edge.

Start with one node. In each iteration, create a new node and a new edge. The edge is to connect the new node with a random node from the previous node set.

After all nodes are created, create random edges until S is fulfilled. Make sure not to create double edges (for this you can use an adjacency matrix).

Random graph is done in O(S).

Triquetrous answered 11/1, 2010 at 11:42 Comment(4)
Just made something like this in Java. In addition, it's better to add random nodes while creating minimum connected graph. Otherwise, you'll always get the only vertical covering last node, no more than two verticals covering previous node and so on. Alternatively, you can just get isomorphic graph of generated one by matrix of adjacency manipulations.Amylolysis
This is good and simple, but it is not O(S) if the graph is dense, because of the double edges check. I mean, the worst case is (almost) never O(S), but if the graph is dense even the expected time complexity if not O(S).Tanto
Sadly this graph is not chosen uniformly at random in the set of all graphs with n nodes and m edges. It will generate a certain type of graph. Since some nodes are present in the graph earlier, these nodes will tend to have higher degrees.Greenlet
Prufer code can be used to generate a uniform random tree, not sure how you can extend it to connected graphs. The implementation for trees is very simple: en.wikipedia.org/wiki/Pr%C3%BCfer_sequenceSergio
L
47

High-Level Idea

  1. Generate a (uniformly chosen) random spanning tree with N nodes and N - 1 edges.
  2. Until the requested number of edges has been reached, add an edge between any two random nodes.

Creating the Spanning Tree

The partition-based answer by ypnos is a good start, but bias is introduced by always selecting a visited node for one end of a new edge. By randomly selecting a visited node at each iteration, nodes that are visited towards the beginning have more iterations from which they have a chance to be chosen. Therefore, earlier nodes are more likely to have a high degree (number of edges) than those picked later.

Example of Bias

As an example, for a 4 node connected graph rather than generating a linear path graph, which is what 75% of the possible spanning trees are, this kind of bias will cause the star graph to be generated with greater than the 25% probability that it should be.

the possible spanning trees for a graph of size 2, 3, and 4 nodes

Bias isn't always a bad thing. It turns out this kind of bias is good for generating spanning trees that are similar to real world computer networks. However, in order to create a truly random connected graph the initial spanning tree must be picked uniformly from the set of possible spanning trees (see Wikipedia's Uniform Spanning Tree article).

Random Walk Approach

One approach to generating a uniform spanning tree is through a random walk. Below is a quote from the paper Generating Random Spanning Trees More Quickly than the Cover Time by Wilson describing simple random walk algorithm.

Start at any vertex and do a simple random walk on the graph. Each time a vertex is first encountered, mark the edge from which it was discovered. When all the vertices are discovered, the marked edges form a random spanning tree. This algorithm is easy to code up, has small running time constants, and has a nice proof that it generates trees with the right probabilities.

This works well for a simple connected graph, however if you need an algorithm for a directed graph then read the paper further as it describes Wilson's Algorithm. Here is another resource for random spanning trees and Wilson's Algorithm.

Implementation

As I was also interested in this problem, I coded Python implementations of various approaches, including the random walk approach. Feel free to take a look at the Gist of the code on GitHub.

Below is an excerpt from the code of the random walk approach:

# Create two partitions, S and T. Initially store all nodes in S.
S, T = set(nodes), set()

# Pick a random node, and mark it as visited and the current node.
current_node = random.sample(S, 1).pop()
S.remove(current_node)
T.add(current_node)

graph = Graph(nodes)

# Create a random connected graph.
while S:
    # Randomly pick the next node from the neighbors of the current node.
    # As we are generating a connected graph, we assume a complete graph.
    neighbor_node = random.sample(nodes, 1).pop()
    # If the new node hasn't been visited, add the edge from current to new.
    if neighbor_node not in T:
        edge = (current_node, neighbor_node)
        graph.add_edge(edge)
        S.remove(neighbor_node)
        T.add(neighbor_node)
    # Set the new node as the current node.
    current_node = neighbor_node

# Add random edges until the number of desired edges is reached.
graph.add_random_edges(num_edges)
Lewls answered 31/1, 2013 at 4:29 Comment(4)
What is the complexity of the random walk approach? Doesn't the "pick a node randomly; ignore it if it was already used" pattern lead to potentially unbound running time?Aldosterone
@JulesOlléon As the paper by Wilson discusses, the Andrei Broder algorithm runs within the cover time of an undirected graph, which is bounded by O(n^3) for the worst graphs, but is as small as O(n lg n) for almost all graphs---where n is the number of vertices.Lewls
What is the use of graph.add_random_edges(num_edges)? Because in the edge is already added in graph.add_edge(edge).Rennin
@want_to_be_calm: The earlier code handles part-1 of the high-level idea by creating a graph with N-1 edges. The second part of the original question asks how to create a graph of a requested sparseness i.e., a specific number of edges. If the ` |E| < (|N| - 1)` then additional edges will need to be added (point-2 of the high-level idea). The value of num_edges in the example code is ambiguous on its own. It depends on the implementation of add_random_edged() as to whether or not the argument specifies the total number of edges requested or the number of new edges that should be added.Lewls
T
23

For each node you need at least one edge.

Start with one node. In each iteration, create a new node and a new edge. The edge is to connect the new node with a random node from the previous node set.

After all nodes are created, create random edges until S is fulfilled. Make sure not to create double edges (for this you can use an adjacency matrix).

Random graph is done in O(S).

Triquetrous answered 11/1, 2010 at 11:42 Comment(4)
Just made something like this in Java. In addition, it's better to add random nodes while creating minimum connected graph. Otherwise, you'll always get the only vertical covering last node, no more than two verticals covering previous node and so on. Alternatively, you can just get isomorphic graph of generated one by matrix of adjacency manipulations.Amylolysis
This is good and simple, but it is not O(S) if the graph is dense, because of the double edges check. I mean, the worst case is (almost) never O(S), but if the graph is dense even the expected time complexity if not O(S).Tanto
Sadly this graph is not chosen uniformly at random in the set of all graphs with n nodes and m edges. It will generate a certain type of graph. Since some nodes are present in the graph earlier, these nodes will tend to have higher degrees.Greenlet
Prufer code can be used to generate a uniform random tree, not sure how you can extend it to connected graphs. The implementation for trees is very simple: en.wikipedia.org/wiki/Pr%C3%BCfer_sequenceSergio
M
4

Based on Wesley Baugh's answer I came up with the following javascript implementation with cytoscape.js to handle graphs:

function generateRandomGraph(cy, numNode, avgDegree, weightMin, weightMax) {
  // create nodes
  for (var i = 0; i < numNode; i++) {
    cy.add({
      group: "nodes",
      data: {
        id: "n" + i
      }
    });
  }

  // perform random walks to connect edges
  var nodes = cy.nodes(),
    S = nodes.toArray(),
    T = []; // visited

  var currNodeIdx = randomIntBetween(0, S.length);
  var currNode = S[currNodeIdx];
  S.splice(currNodeIdx, 1);
  T.push(currNode);

  while (S.length > 0) {
    var neighbourNodeIdx = randomIntBetween(0, S.length);
    var neighbourNode = S[neighbourNodeIdx];
    cy.add({
      group: "edges",
      data: {
        weight: randomIntBetweenInclusive(weightMin, weightMax),
        source: currNode.id(),
        target: neighbourNode.id()
      }
    });
    S.splice(neighbourNodeIdx, 1);
    T.push(neighbourNode);
    currNode = neighbourNode;
  }

  // add random edges until avgDegree is satisfied
  while (nodes.totalDegree() / nodes.length < avgDegree) {
    var temp = sampleInPlace(nodes, 2);
    if (temp[0].edgesWith(temp[1]).length === 0) {
      cy.add({
        group: "edges",
        data: {
          weight: randomIntBetweenInclusive(weightMin, weightMax),
          source: temp[0].id(),
          target: temp[1].id()
        }
      })
    }
  }
}

generateRandomGraph(cy, 20, 2.8, 1, 20);

For complete example source code, visit my github repo :)

Minne answered 6/4, 2016 at 1:21 Comment(0)
F
1

Generate a minimum spanning tree using something like Prim's algorithm, and from there randomly generate additional links to the according to the sparseness you want.

Frustule answered 2/9, 2014 at 5:13 Comment(0)
T
0

There is an optimal solution to this task that works efficiently even for worst case scenario (dense graphs).

First step is to start with a random spanning tree, generating a random spanning tree of N nodes is possible in O(N log N), You can even do it in O(N) using Prufer's Sequence and Fisher–Yates shuffle, I'll be discussing the O(N log N) approach.

We use a Divide and Conquer approach, indirectly, let us assume we've 2 random trees, we can combine these to form a new random tree, only valid edge is from any node of first tree to any node of second tree, so we pick 2 random nodes from both and create an edge. Since each node is a valid tree in itself we can assume we start with N random trees, in each step we combine any 2 random trees to form a new tree, decreasing the number of trees by 1, performing this operation N - 1 times results in a random spanning tree, actual implementation requires optimisation such as small to large merging.

Now that we've a random spanning tree of N nodes our task is to add additional X = M - (N - 1) edges to our tree, where M is the number of edges we want graph to have, let us rephrase the task, we have a universal set of all edges U, we've already picked N - 1 edges, let's call this set E, and we want to pick additional X edges from U \ E, ie. remaining edges.

The size of U is O(N^2) and thus it's not feasible to traverse all remaining edges to pick, had that been feasible we could use Reservoir Sampling algorithm to pick X edges without repetition, but now we need another sampling method that doesn't require traversing the universal set. We use Floyd's Sampling algorithm which is a modification of Reservoir Sampling that works in O(X) instead of O(N^2).

Once we've enumerated X random edges from remaining edges, we map them back to original graph, overall algorithm works in O((N + M) log(N + M)) and does not rely on sparsity of graph.

Implementation in Python

Tinatinamou answered 24/1, 2024 at 10:35 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.