How can I extract all possible induced subgraphs from a given graph with networkx
Asked Answered
M

2

10

I am wondering whether can I use networkx to extract all possible induced subgraphs (graphlets) with specific number of nodes in the subgraphs from an input large graph, or is there another package that can do the job? For example, if I have a large graph, which is illustrated in networkx adjacency list format,

graph G:

1 2 3 7
2 1 4
3 1 4 6 5
4 2 3 5
5 3 4 6
6 3 5 7
7 1 6

which will be look like

enter image description here

if I want to extract graphlet with 3 nodes the algorithm should return me

subgraph1:

1 2 3
2 1
3 1

[(1,2),(1,3)] enter image description here subgraph2:

1 3 7
3 1
7 1

[(1,3),(1,7)] enter image description here subgraph3:

3 4 5
4 3 5
5 3 4

[(3,4),(3,5),(4,5)] enter image description here

subgraph4,subgraph5,subgraph6...

The following is the code of the question suggested by @Hooked. Let's say n=3

import itertools
target = nx.complete_graph(3)
for sub_nodes in itertools.combinations(g.nodes(),len(target.nodes())):
    subg = g.subgraph(sub_nodes)
    if nx.is_connected(subg):
        print subg.edges()

the the output will look like

[(1, 2), (1, 3)]
[(1, 2), (2, 4)]
[(1, 2), (1, 7)]
[(1, 3), (3, 4)]
[(1, 3), (3, 5)]
[(1, 3), (3, 6)]
[(1, 3), (1, 7)]
[(1, 7), (6, 7)]
[(2, 4), (3, 4)]
[(2, 4), (4, 5)]
[(3, 4), (3, 5), (4, 5)]
[(3, 4), (3, 6)]
[(3, 5), (3, 6), (5, 6)]
[(3, 6), (6, 7)]
[(4, 5), (5, 6)]
[(5, 6), (6, 7)]
Menu answered 4/3, 2014 at 18:50 Comment(11)
Do you mean non-isomorphic induced subgraph ? Why not triangle on 3 4 5 ?Tsana
@Tsana Graphlets was a new term for me too. Wikipedia's says that they are "... small connected non-isomorphic induced subgraphs of a large network."Percheron
@Percheron : triangle 3 4 5 is connected.Tsana
@Tsana Sorry I wasn't clear, I was responding to your question if OP meant non-isomorphic induced subgraphs but trying to interpret the meaning of a "graphlet". Yes 3,4,5 is connected and yes I think it should be included if I'm understanding the question.Percheron
@Percheron : since the word connected doesn't appear anywhere in the question, I'm not sure OP mean it as well !Tsana
@Tsana ... "possible induced subgraphs (graphlets)"...Percheron
@Percheron Sorry, I am new for posting the question on stackoverflow. I just list a few example subgraphs that would be generated by the algoithm.Menu
@Tsana I think OP intends to ask for network motifs however...Percheron
@tohnperfect Your question is fine and welcome to Stack Overflow! I've attempted to answer your question, but I think (as the discussion shows) that there were a few parts that were unclear. Once you get enough rep you'll be able to post images yourself - be sure to upvote good answers/questions and accept an answer when it's complete.Percheron
@Percheron Thank you for your answer. Bare with me because I will spend sometime, I am new to networkx also, to try whether your solution is work for me. Thanks againMenu
@tohnperfect Feel free to ask questions (but do so as a comment to my answer). Explore the networkx manual for each of the functions I call and make sure you understand what itertools.combinations does.Percheron
P
13

This assumes you want all matching subgraphs of a given target which you'll have to define. The native way is to loop over all combinations of nodes, find those connected then check for an isomorphism. It's unclear if you want a network motif or a graphlet. In a graphlet all edges present in the original graph must be there - this would exclude 3-4-5 from your target. This method finds graphlets, to find motifs you'll have to check for each combination if there is an induced subgraph (and how many!).

import networkx as nx

g = nx.Graph()
g.add_edge(1,2);g.add_edge(1,3)
g.add_edge(1,7);g.add_edge(2,4)
g.add_edge(3,4);g.add_edge(3,5)
g.add_edge(3,6);g.add_edge(4,5)
g.add_edge(5,6);g.add_edge(6,7)

import itertools

target = nx.Graph()
target.add_edge(1,2)
target.add_edge(2,3)

for sub_nodes in itertools.combinations(g.nodes(),len(target.nodes())):
    subg = g.subgraph(sub_nodes)
    if nx.is_connected(subg) and nx.is_isomorphic(subg, target):
        print subg.edges()

For me, this gives the edge set matches of:

[(1, 2), (1, 3)]
[(1, 2), (2, 4)]
[(1, 2), (1, 7)]
[(1, 3), (3, 4)]
[(1, 3), (3, 5)]
[(1, 3), (3, 6)]
[(1, 3), (1, 7)]
[(1, 7), (6, 7)]
[(2, 4), (3, 4)]
[(2, 4), (4, 5)]
[(3, 4), (3, 6)]
[(3, 6), (6, 7)]
[(4, 5), (5, 6)]
[(5, 6), (6, 7)]

Your examples are listed in here.

Percheron answered 4/3, 2014 at 19:20 Comment(8)
Thank for the answer but my puropose is that for the given large graph G and number of nodes in subgraph n, I want to extract all possible 3-node-connected-subgraphs which contain the same edges as the original large graph.Menu
@tohnperfect I'm not sure I understand. Each set listed is a proper induced subgraph. Your two examples are [(1, 2), (1, 3)], [(1, 3), (1, 7)] in that list. [(3, 4), (4, 5)] is NOT in that listed as the comments suggested as it wouldn't be an induced subgraph since the edge (5,3) would need to be present.Percheron
[(3,4),(4,5),(3,5)] should be in the list also. There will be a lot of subgraphs that generated by the algorithm. My example is not clear enough because there is only one subgraph style.Menu
As my understanding, let's say, for graphlet, a subgraph with 3 nodes extract from a large graph has to preserve all the links show in the original large graph.Menu
@tohnperfect If you change target to the complete graph on three nodes (K3), this algorithm will match it up. Your example only had matches against the line graph on three nodes which is what I was looking at. If you remove the check against target, nx.is_isomorphic(subg, target) you'll get all matches for a given size n.Percheron
I got it now. Thank you very much. This is exactly what I want. If you don't mind, could you please edit the code to match to my question and post again or i can post the edited version; so that I can check the matched solution to my question. Thank you.Menu
everything works fine with extracting up to 4 nodes graphlets from the 100-nodes-graph but for bigger graphlet there is an error in subg = g.subgraph(sub_nodes) function which the compiler said about ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(). I think it might be limitation of networkx itself, any idea?Menu
If your graphlet is larger than a few nodes, you are will start running into other problems. itertools.combinations is a combinatorial procedure that tries every combination of nodes, with for example, 100 master nodes and a 10 node subgraph this is approximately 10^13 combinations. There are smarter ways to do this problem, but this is outside the scope of the question. Try math stack exchange for some ideas as this is straying outside a programing domain.Percheron
S
2

For people who ended up here having the same problem but have too many nodes, here are few simple improvements on @Hooked's answer (although I am sure there are better solutions out there as @Hooked mentioned in comments, this is just a quick copy-paste fix for people who ended up here with the same reason as I did and had scaling problems)

1) igraph scales way better than networkx

2) we can only take a neighborhood of a node to eliminate most of the unnecessary combinations

For example if we are looking for a motif in larger network (both igraph objects)

    motif_rank = max(max(motif.shortest_paths_dijkstra()))
    result = collections.OrderedDict.fromkeys(network.vs['label'], 0)

    for node in self.network.vs:
        # Get relevant nodes around node of interest that might create the motif of interest
        nodes_to_expand = {node}
        for rank in range(motif_rank):
            nodes_expanded = nodes_to_expand
            for node_to_expand in nodes_to_expand:
                nodes_expanded = set.union(nodes_expanded, set(node_to_expand.neighbors()))
            nodes_to_expand = nodes_expanded

        # Look at all combinations
        for sub_nodes in itertools.combinations(nodes_to_expand, motif.vcount()):
            subg = network.subgraph(sub_nodes)
            if subg.is_connected() and subg.isomorphic(motif):
                result[node['label']] = result[node['label']]+1
Shane answered 19/6, 2018 at 11:0 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.