What is the best way to count the cliques of size k in an undirected graph using networkx in python?
Asked Answered
D

3

5

I am surprised that networkx does not seem to have a built in function to do this, but maybe I am missing some clever way to do this using the built-in algorithms?

Docilla answered 9/11, 2019 at 2:21 Comment(0)
C
5

When using the find_cliques function you need to be carfull when you are going through all the possibilities (itertools.combinations) - in some cases you will count the same clique more than once. For example, if you have a graph of six nodes (let's name them A-G). Four of them are fully connected (A-D) and E is connected to A-D, and G is also connected to A-D (but E is not connected to G). In this situation you have two 5-cliques that share 4 nodes (A,B,C,D,E and A,B,C,D,G). Now let's say that you are looking for 4-cliques in this suggested garph, by using find_cliques you will go over the two 5-cliques, and in each one of them you will count every 4-clique, which includes the 4-clique A,B,C,D, so it will be counted twice (!).

here is a version of the suggested function that fix this problem by using set so you will count each clique only once:

def find_cliques_size_k(G, k):
    all_cliques = set()
    for clique in nx.find_cliques(G):
        if len(clique) == k:
            all_cliques.add(tuple(sorted(clique)))
        elif len(clique) > k:
            for mini_clique in itertools.combinations(clique, k):
                all_cliques.add(tuple(sorted(mini_clique)))
    return len(all_cliques)

(If you want the cliques themselves you can return the 'all_cliques' itself)

Cotton answered 30/3, 2021 at 7:12 Comment(1)
It doesn't seem like there is anyway to speed this up by only counting similar to what @Cooker suggests since you are relying on the add to only add unique elements. Any thoughts?Docilla
B
7

You can use one of these built in functions: enumerate_all_cliques or find_cliques in order to get all k-clique in an un-directed graph.

The difference between these functions is that enumerate_all_cliques goes over all possible cliques and find_cliques goes over only the maximal cliques. We will see in the end it affects the run time.

Option 1 using enumerate_all_cliques:

import networkx as nx

def enumerate_all_cliques_size_k(G, k):
    i = 0
    for clique in nx.enumerate_all_cliques(G):
        if len(clique) == k:
            i += 1
        elif len(clique) > k:
            return i
    return i

Option 2 using find_cliques:

import networkx as nx
import itertools

def find_cliques_size_k(G, k):
    i = 0
    for clique in nx.find_cliques(G):
        if len(clique) == k:
            i += 1
        elif len(clique) > k:
            i += len(list(itertools.combinations(clique, k)))
    return i

The first option is more straight forward but it's run time is problematic since we go over all possible sub-sets of the maximal cliques, even if the maximal clique size is less than k. We can see enumerate_all_cliques_size_k takes 10 times longer to run on a complete graph of size 20:

G = nx.complete_graph(20)


@timing
def test_enumerate_all_cliques_size_k(G,k):
    print(enumerate_all_cliques_size_k(G, k))

@timing
def test_find_cliques_size_k(G, k):
    print(find_cliques_size_k(G, k))

test_enumerate_all_cliques_size_k(G,5)
test_find_cliques_size_k(G,5)

# --------------------Result-----------------------

15504
test_enumerate_all_cliques_size_k function took 616.645 ms
15504
test_find_cliques_size_k function took 56.967 ms
Bartels answered 9/11, 2019 at 18:9 Comment(5)
As OP is interested only in counting, your second approach could also be simplified as len(itertools.combinations(clique, k)) is easy to compute.Cooker
You are right, the only thing len won't work on an iterator but it is possible to use cardinality.count or to implement n choose k functionBartels
Fair enough; I was actually just thinking of using the exact formula.Cooker
@MichalYanko As expected there is a significant speed up using the comb function to compute number of combinations directly from scipy.special import comb and i += comb(len(clique), k, exact=True)Docilla
Not sure why but I don't get the same result from these two functions from a random graph H=nx.gnp_random_graph(30,.5)Docilla
C
5

When using the find_cliques function you need to be carfull when you are going through all the possibilities (itertools.combinations) - in some cases you will count the same clique more than once. For example, if you have a graph of six nodes (let's name them A-G). Four of them are fully connected (A-D) and E is connected to A-D, and G is also connected to A-D (but E is not connected to G). In this situation you have two 5-cliques that share 4 nodes (A,B,C,D,E and A,B,C,D,G). Now let's say that you are looking for 4-cliques in this suggested garph, by using find_cliques you will go over the two 5-cliques, and in each one of them you will count every 4-clique, which includes the 4-clique A,B,C,D, so it will be counted twice (!).

here is a version of the suggested function that fix this problem by using set so you will count each clique only once:

def find_cliques_size_k(G, k):
    all_cliques = set()
    for clique in nx.find_cliques(G):
        if len(clique) == k:
            all_cliques.add(tuple(sorted(clique)))
        elif len(clique) > k:
            for mini_clique in itertools.combinations(clique, k):
                all_cliques.add(tuple(sorted(mini_clique)))
    return len(all_cliques)

(If you want the cliques themselves you can return the 'all_cliques' itself)

Cotton answered 30/3, 2021 at 7:12 Comment(1)
It doesn't seem like there is anyway to speed this up by only counting similar to what @Cooker suggests since you are relying on the add to only add unique elements. Any thoughts?Docilla
F
1

Welcome to SO.

Based on this reference, I think currently there is no existing function to do this. If you want to use nx functions you can do something like this:

def count_k_cliques(G, k):
    k_cliques_count = 0
    for clique in nx.enumerate_all_cliques(G): 
        if len(clique) > k: 
            break 
        elif len(clique) == k: 
            k_cliques_count += 1
    return k_cliques_count

Edit: I recommend considering option 2 in Michal's answer

Faizabad answered 9/11, 2019 at 17:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.