What can cause different results in communities detected after scaling edge weights of a graph with python-louvain?
Asked Answered
F

1

6

I noticed that if I change all the edge weights in the graph with the same value, community.best_partition doesn't always result in the same communities.

I used the same random state in all cases and the graph is exactly the same just that instead of all edge weights to be equal to 1 for example they may be equal to 5. The definition of modularity cancels out this factor that multiplies the adjacency matrix and as I read about the algorithm I can't find a step that should change the result. Is there a reason that causes that difference?

import networkx as nx
import community

from sklearn.metrics import adjusted_rand_score


def main():
    g = nx.davis_southern_women_graph()
    nodes = g.nodes()
    clusters_init = community.best_partition(g, random_state=10)
    print("modularity with initial clusters = %.15f" % community.modularity(clusters_init, g))

    labels_init = [clusters_init[n] for n in nodes]
    for num in range(1, 9):

        for u, v in g.edges():
            g[u][v]["weight"] = num

        clusters = community.best_partition(g, random_state=10)
        labels = [clusters[n] for n in nodes]
        print("value of edge weight = %d," % num, "modularity = %.15f," % community.modularity(clusters, g),
              "modularity with initial clusters = %.15f," % community.modularity(clusters_init, g),
              "adjusted rand score = %.3f" % adjusted_rand_score(labels_pred=labels, labels_true=labels_init))


if __name__ == "__main__":
    main()

modularity with initial clusters = 0.334869334679965

value of edge weight = 1, modularity = 0.334869334679965, modularity with initial clusters = 0.334869334679965, adjusted rand score = 1.000

value of edge weight = 2, modularity = 0.334869334679965, modularity with initial clusters = 0.334869334679965, adjusted rand score = 1.000

value of edge weight = 3, modularity = 0.334869334679965, modularity with initial clusters = 0.334869334679965, adjusted rand score = 1.000

value of edge weight = 4, modularity = 0.334869334679965, modularity with initial clusters = 0.334869334679965, adjusted rand score = 1.000

value of edge weight = 5, modularity = 0.332470647645499, modularity with initial clusters = 0.334869334679965, adjusted rand score = 0.676

value of edge weight = 6, modularity = 0.334869334679965, modularity with initial clusters = 0.334869334679965, adjusted rand score = 1.000

value of edge weight = 7, modularity = 0.332470647645499, modularity with initial clusters = 0.334869334679965, adjusted rand score = 0.676

value of edge weight = 8, modularity = 0.334869334679965, modularity with initial clusters = 0.334869334679965, adjusted rand score = 1.000

Fortunia answered 19/8, 2019 at 16:44 Comment(2)
This seems like an interesting issue. Perhaps you could report this to the package owners here: github.com/taynaud/python-louvain/issuesAltissimo
Maybe you observe numerical instability. You'll likely need to study the implementation in detail for numerical problems.Sigurd
S
0

Louvain, which is a modularity-based approach, is inherently non-deterministic. Many implementations of Louvain play tricks to try and make Louvain seem deterministic. Just a guess is that the code could be including edge weights in how they order the edges for processing. Changing that order can change the clusters detected. For example, just changing vertex ID (which does not change the structure of the graph) results in different clustering. The non-deterministic nature of Louvain, Leiden, and modularity in general is well studied. A google search will uncover a number of papers.

Saylor answered 25/4, 2023 at 21:22 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.