Finding separate graphs within a graph object in networkx

E

3

31

I have an enormous graph dataset - let's say it is like this, but on a much bigger level:

1 -> 2
3 -> 4

1,2,3,4 are nodes and the arrows are directed edges. Let's say that they are all in a single graph object:

import networkx as nx
G = nx.DiGraph()
G.add_nodes_from([1,2,3,4])
G.add_edge(1,2)
G.add_edge(3,4)

Given an object like this, which has two mini graphs within a graph, how can we pull out each mini graph? I feel like there must be some word for this? My end result would look like:

for mini_graph in G:
    print mini_graph.nodes()

...
[1,2]
[3,4]

Expletive answered 12/2, 2014 at 21:4 Comment(2)

I think you can use weakly_connected_component_subgraphs and if so this is a duplicate of this: #18644289 – Chondrule 13/2, 2014 at 8:18

Also related: #13915420. It depends how you define subgraphs here – Chondrule 13/2, 2014 at 8:19

Z

30

If the parts of the graph are truly disjoint (as per your small example), then consider extracting the subgraphs with connected_component_subgraphs().

This only works on an undirected graph, so if you are using a directed graph then you'll need to convert to undirected first.

import networkx as nx
G = nx.DiGraph()
G.add_nodes_from([1,2,3,4])
G.add_edge(1,2)
G.add_edge(3,4)

# make an undirected copy of the digraph
UG = G.to_undirected()

# extract subgraphs
sub_graphs = nx.connected_component_subgraphs(UG)

for i, sg in enumerate(sub_graphs):
    print "subgraph {} has {} nodes".format(i, sg.number_of_nodes())
    print "\tNodes:", sg.nodes(data=True)
    print "\tEdges:", sg.edges()

which yields:

subgraph 1 has 2 nodes
    Nodes: [(1, {}), (2, {})]
    Edges: [(1, 2)]
subgraph 1 has 2 nodes
    Nodes: [(3, {}), (4, {})]
    Edges: [(3, 4)]

and you could use the subgraph node labels to operate on your data in the initial graph,

sg.nodes()[0] in G
>>>  True

Reading the answer linked by EdChum, it appears that weakly_connected_component_subgraphs() operates on a directed graph but treats it as undirected, so saving the copy might be crucial. However, the docs on this and the related function weakly_connected_components() are a bit thin at present.

Zebec answered 13/2, 2014 at 10:30 Comment(0)

S

36

As of 2018 the above answer is deprecated (link to docs). You are advised to use:

(G.subgraph(c) for c in connected_components(G))

or

(G.subgraph(c).copy() for c in connected_components(G))

Spitter answered 9/4, 2018 at 13:29 Comment(1)

Is there any reason you used list generators and not list comprehentions ? – Overwhelm 28/10, 2022 at 13:46

Z

30

If the parts of the graph are truly disjoint (as per your small example), then consider extracting the subgraphs with connected_component_subgraphs().

This only works on an undirected graph, so if you are using a directed graph then you'll need to convert to undirected first.

import networkx as nx
G = nx.DiGraph()
G.add_nodes_from([1,2,3,4])
G.add_edge(1,2)
G.add_edge(3,4)

# make an undirected copy of the digraph
UG = G.to_undirected()

# extract subgraphs
sub_graphs = nx.connected_component_subgraphs(UG)

for i, sg in enumerate(sub_graphs):
    print "subgraph {} has {} nodes".format(i, sg.number_of_nodes())
    print "\tNodes:", sg.nodes(data=True)
    print "\tEdges:", sg.edges()

which yields:

subgraph 1 has 2 nodes
    Nodes: [(1, {}), (2, {})]
    Edges: [(1, 2)]
subgraph 1 has 2 nodes
    Nodes: [(3, {}), (4, {})]
    Edges: [(3, 4)]

and you could use the subgraph node labels to operate on your data in the initial graph,

sg.nodes()[0] in G
>>>  True

Reading the answer linked by EdChum, it appears that weakly_connected_component_subgraphs() operates on a directed graph but treats it as undirected, so saving the copy might be crucial. However, the docs on this and the related function weakly_connected_components() are a bit thin at present.

Zebec answered 13/2, 2014 at 10:30 Comment(0)

F

6

As previous answers are made for undirected graphs, we will lose vital information of the direction, because of the conversion to an undirected graph. I've had the same issue and finally the method weakly_connected_components did it.

>>> G = nx.DiGraph()
>>> G.add_nodes_from([1,2,3,4])
>>> G.add_edge(1,2)
>>> G.add_edge(3,4)
>>> list(nx.weakly_connected_components(G))
[{1, 2}, {3, 4}]

It works with directed graphs and its performance is quite decent. If you like to split your graph and continue computation (like me), then you can also build subgraphs of the result above with:

h = nx.weakly_connected_component_subgraphs(G)

j = []
for foo in h:
    j.append(foo)

(Very explicit, to show how this can be accessed). For whatever reason, h seems to be destroyed by listing it?! The above way is stable instead.

Fruitarian answered 18/3, 2019 at 7:43 Comment(1)

It appears that nx.weakly_connected_component_subgraphs is no longer maintained in version 2.8. I would guess that it was replaced by weakly_connected_components which returns a generator. Assuming weakly_connected_component_subgraphs also returned a generator, you can define h = list(nx.weakly_connected_component_subgraphs(G)) and do away with the for-loop. – Edwardoedwards 12/5, 2022 at 14:58

Recommended topics

Hot tags