Different results in eigenvector centrality numpy

The following example gives different results with eigenvector_centrality and eigenvector_centrality_numpy. Is there a way to make this calculation more robust? I'm using networkx 2.4, numpy 1.18.5, and scipy 1.5.0.

import numpy as np
import networkx as nx

AdjacencyMatrix = {
    0: {
        1: 0.6,
    },
    1: {
        2: 0,
        3: 0,
    },
    2: {
        4: 0.5,
        5: 0.5,
    },
    3: {
        6: 0.5,
        7: 0.5,
        8: 0.5,
    },
    4: {},
    5: {},
    6: {},
    7: {},
    8: {},
}

G = nx.DiGraph()

G.add_nodes_from(AdjacencyMatrix)

for k1 in AdjacencyMatrix.keys():
    for k2 in AdjacencyMatrix[k1]:
        weight = AdjacencyMatrix[k1][k2]
        split_factor = len(AdjacencyMatrix[k1])
        G.add_edge(
            k1,
            k2,
            weight=weight / split_factor,
            reciprocal=1.0 / (split_factor * weight) if weight != 0 else np.inf,
        )

eigenvector_centrality = {
    node: score
    for node, score in sorted(
        nx.eigenvector_centrality(
            G.reverse() if G.is_directed() else G,
            max_iter=10000,
            weight="weight",
        ).items(),
        key=lambda x: x[1],
        reverse=True,
    )
}
print(eigenvector_centrality)

eigenvector_centrality_numpy = {
    node: score
    for node, score in sorted(
        nx.eigenvector_centrality_numpy(
            G.reverse() if G.is_directed() else G,
            max_iter=10000,
            weight="weight",
        ).items(),
        key=lambda x: x[1],
        reverse=True,
    )
}
print(eigenvector_centrality_numpy)

Here's my output:

{0: 0.6468489798823026, 3: 0.5392481399595738, 2: 0.5392481399595732, 1: 0.0012439403459275048, 4: 0.0012439403459275048, 5: 0.0012439403459275048, 6: 0.0012439403459275048, 7: 0.0012439403459275048, 8: 0.0012439403459275048}
{3: 0.9637027924175013, 0: 0.0031436862826891288, 6: 9.593026373266866e-11, 8: 3.5132785569658154e-11, 4: 1.2627565659784068e-11, 1: 9.433263632036004e-14, 7: -2.6958851817582286e-11, 5: -3.185304797703736e-11, 2: -0.26695888283266833}
Wilen answered 24/6, 2020 at 18:13 Comment(0)

Edit: see the response by dshult. He's one of the main people who maintain and update networkx.


I think this may be a bug, but not in the way you think. This graph is directed and acyclic, so its adjacency matrix is nilpotent: I don't think there is a nonzero eigenvalue at all.

The algorithm seems to implicitly assume an undirected graph, or at least a directed graph with cycles, and I would expect it to break when there's no cycle.

I'm going to encourage the networkx people to look at this in more detail.

I'm actually surprised that it converges for the non-numpy version.
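The claim above can be checked numerically. The sketch below (assuming numpy and networkx as in the question) rebuilds the asker's graph with the per-edge weights the question's loop produces, and shows that the spectrum of the adjacency matrix is zero up to floating-point noise, for the reversed graph as well:

```python
import numpy as np
import networkx as nx

# The asker's graph, with the per-edge weights the question's loop
# produces (original weight divided by the number of outgoing edges).
G = nx.DiGraph()
G.add_weighted_edges_from([
    (0, 1, 0.6),
    (1, 2, 0.0), (1, 3, 0.0),
    (2, 4, 0.25), (2, 5, 0.25),
    (3, 6, 0.5 / 3), (3, 7, 0.5 / 3), (3, 8, 0.5 / 3),
])

# A DAG's adjacency matrix is nilpotent, so every eigenvalue is exactly
# zero in exact arithmetic; numerically, the largest magnitude is tiny.
for H in (G, G.reverse()):
    A = nx.to_numpy_array(H, weight="weight")
    print(np.abs(np.linalg.eigvals(A)).max())
```

With no strictly dominant eigenvalue, the power iteration inside eigenvector_centrality has no well-defined eigenvector to converge to, which is consistent with the disagreement in the question's output.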

Phoney answered 25/6, 2020 at 3:24 Comment(9)
So are you saying that the eigencentrality can't be calculated for directed acyclic graphs? What centrality measure would make more sense, then?Wilen
My expectation is that the larger the sum of a node's outgoing edges, the more important it is. I'm trying to capture that with centrality measures.Wilen
I'm experimenting with nx.pagerank_numpy and it seems to be working fine and robustly for my application so far. While nx.eigenvector_centrality also didn't have convergence problems, it seems to be more sensitive to the weight values. For example, I get the same solution with pagerank if I set AdjacencyMatrix[0][1] to 0.5 or 0.6, whereas eigenvector_centrality converges to different values.Wilen
If you're interested in outward edges mattering, pagerank doesn't really count that; it's more about the edges/paths coming in. PageRank measures how likely you are to end up somewhere by following edges, but the edges out don't affect this. This is one of the important features that makes the algorithm hard for websites to cheat: increasing links out (which you control) doesn't help your rank.Phoney
What if I reverse the directed graph? I'm calling it as nx.pagerank_numpy(G.reverse(), ...). I only care about outward edges in my application.Wilen
What you'd be measuring is the following concept: Many individuals start at random pages and follow randomly chosen edges backwards. Every once in a while they just jump to a random page. Then the pagerank measure tells you how many people are expected to be at each page. Depending on your application, this might not be the right measure. It will tend to give nodes higher weight if they have more edges out, especially if they link to nodes that have many edges out.Phoney
But can't that be controlled with the edge weights? For example, node i has 2 outward edges whose weights are larger than the 10 outward edges from node j. In the end, both characteristics (number of outgoing edges and their weights) are important to me. Any comments? I also looked into betweenness centrality, since it has a shortest-path component, which is interesting to me. I understand it doesn't have the same philosophy as PageRank. I'm trying to figure out what makes more sense in my application.Wilen
It depends on what you mean by controlled with the edge weights. I'm not positive that networkx's pagerank allows weights, but if so, then presumably it's like walkers doing a weighted random walk.Phoney
It does. So if the graph structure remains the same, but the edge values change, I'd like to see node centralities changing too.Wilen
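As a concrete version of the idea discussed in these comments, here is a minimal sketch of weighted PageRank on the reversed graph; the toy graph and its weights are illustrative, not taken from the question. networkx's pagerank does accept a weight keyword:

```python
import networkx as nx

# Illustrative toy DAG: node 0 has one heavy outgoing edge, node 1 has two.
G = nx.DiGraph()
G.add_weighted_edges_from([
    (0, 1, 0.6),
    (1, 2, 0.3), (1, 3, 0.3),
    (2, 4, 0.25), (3, 5, 0.25),
])

# On G.reverse(), random walkers follow edges backwards, so score flows
# toward nodes that have many / heavy outgoing edges in the original graph.
scores = nx.pagerank(G.reverse(), weight="weight")
print(sorted(scores, key=scores.get, reverse=True))
```

Because the edge weights enter the transition probabilities of the random walk, changing weights while keeping the structure fixed does change the resulting centralities, which matches what the asker wants.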

Joel is right to say that eigenvector_centrality isn't a useful measure for directed acyclic graphs. See this nice description of centrality. The measure is useless for such graphs with both the numpy and non-numpy versions of the code.

Celsacelsius answered 25/6, 2020 at 19:0 Comment(4)
Yes, I found that link after his comment. Thanks!Wilen
Wondering if PageRank is a safe and robust option...Wilen
PageRank is pretty robust (it's designed for directed networks and has techniques that specifically deal with the problems I highlighted for your network).Phoney
The link is broken, but I found this one that seems to be the same resource: sites.pitt.edu/~kpele/Materials15/module2.pdfStricken