Python/NetworkX: Add Weights to Edges by Frequency of Edge Occurrence

I have a MultiDiGraph created in networkx and want to weight its edges: first assign every edge a weight, then reassign that weight based on how often the edge occurs. I used the following code to create the graph and add the initial weights, but I'm not sure how to reassign the weights based on count:

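import ast
import csv

import networkx as nx
import pandas as pd

g = nx.MultiDiGraph()

# Raw strings keep the backslashes in the Windows paths literal
df = pd.read_csv(r'G:\cluster_centroids.csv', delimiter=',')
df['pos'] = list(zip(df.longitude, df.latitude))
dict_pos = dict(zip(df.cluster_label, df.pos))

with open(r'G:\edges.csv', 'r') as f:
    for row in csv.reader(f):
        if '[' in row[1]:  # skips the header row, which has no edge list
            # literal_eval parses the list without executing arbitrary code
            g.add_edges_from(ast.literal_eval(row[1]))

for u, v, d in g.edges(data=True):
    d['weight'] = 1
for u, v, d in g.edges(data=True):
    print(u, v, d)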

Edit

I was able to assign a weight to each edge (the first part of my original question) with the following:

for u, v, d in g.edges(data=True):
    d['weight'] = 1
for u, v, d in g.edges(data=True):
    print(u, v, d)

However, I am still unable to reassign weights based on the number of times an edge occurs (a single edge in my graph can occur multiple times). I need this in order to visualize edges with a higher count differently from edges with a lower count (using edge color or width). Please advise on how to proceed. Below are sample data and links to my full data set.

Data

Sample Centroids(nodes):

cluster_label,latitude,longitude
0,39.18193382,-77.51885109
1,39.18,-77.27
2,39.17917928,-76.6688633
3,39.1782,-77.2617
4,39.1765,-77.1927
5,39.1762375,-76.8675441
6,39.17468,-76.8204499
7,39.17457332,-77.2807235
8,39.17406072,-77.274685
9,39.1731621,-77.2716502
10,39.17,-77.27

Sample Edges:

user_id,edges
11011,"[[340, 269], [269, 340]]"
80973,"[[398, 279]]"
608473,"[[69, 28]]"
2139671,"[[382, 27], [27, 285]]"
3945641,"[[120, 422], [422, 217], [217, 340], [340, 340]]"
5820642,"[[458, 442]]"
6060732,"[[291, 431]]"
6912362,"[[68, 27]]"
7362602,"[[112, 269]]"
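
The edges column stores each user's edge list as a string, so it has to be parsed before edges can be counted. A minimal sketch, assuming a few of the sample rows above and using io.StringIO in place of the real file (the names sample, edge_list and counts are illustrative, not from the original code):

```python
import ast
import csv
import io
from collections import Counter

# A few rows copied from the sample above; StringIO stands in for edges.csv.
sample = io.StringIO(
    'user_id,edges\n'
    '11011,"[[340, 269], [269, 340]]"\n'
    '2139671,"[[382, 27], [27, 285]]"\n'
    '3945641,"[[120, 422], [422, 217], [217, 340], [340, 340]]"\n'
)

edge_list = []
for row in csv.DictReader(sample):
    # literal_eval accepts only Python literals, unlike eval
    pairs = ast.literal_eval(row["edges"])
    # Tuples are hashable, so they can be counted directly
    edge_list.extend((u, v) for u, v in pairs)

# Maps each directed (u, v) pair to the number of times it appears
counts = Counter(edge_list)
```

Because the pairs are ordered, (340, 269) and (269, 340) are counted separately, which matches a directed graph.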

Full data:

Centroids (nodes): https://drive.google.com/open?id=0B1lvsCnLWydEdldYc3FQTmdQMmc

Edges: https://drive.google.com/open?id=0B1lvsCnLWydEdEtfM2E3eXViYkk

UPDATE

I was able to solve, at least temporarily, the issue of disproportionate edge widths caused by high edge weights by setting a minLineWidth and multiplying it by the weight:

minLineWidth = 0.25

# c is the Counter of edge frequencies from the answer below
for u, v, d in g.edges(data=True):
    d['weight'] = c[u, v]*minLineWidth
edges,weights = zip(*nx.get_edge_attributes(g,'weight').items())

and using width=[d['weight'] for u,v, d in g.edges(data=True)] in nx.draw_networkx_edges() as provided in the solution below.

Additionally, I was able to scale color using the following:

# Imports assumed from the aliases used below
import matplotlib.pyplot as plt
import matplotlib.colors as colors
import matplotlib.cm as cmx

# Set Edge Color based on weight
n_edges = 7958  # number of edges in the graph; use print(len(g.edges())) to determine this
values = range(n_edges)
cmap = plt.get_cmap('YlOrRd')
cNorm = colors.Normalize(vmin=0, vmax=values[-1])
scalarMap = cmx.ScalarMappable(norm=cNorm, cmap=cmap)
colorList = []

for i in range(n_edges):
    colorVal = scalarMap.to_rgba(values[i])
    colorList.append(colorVal)

And then using the argument edge_color=colorList in nx.draw_networkx_edges().
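
Note that the loop above maps each edge's position in the edge list onto the colormap. A variation that maps the edge weight itself might look like the sketch below; the weights list is hypothetical and stands in for [d['weight'] for u, v, d in g.edges(data=True)], and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib import colors

# Hypothetical edge weights, in the same order as g.edges(data=True)
weights = [1, 1, 2, 5, 40, 800]

cmap = plt.get_cmap("YlOrRd")
# Normalize rescales the weights linearly onto [0, 1]
norm = colors.Normalize(vmin=min(weights), vmax=max(weights))

# One RGBA tuple per edge; heavier edges land further along the colormap
color_list = [cmap(norm(w)) for w in weights]
```

The resulting list can be passed as edge_color=color_list in the same way as colorList. With a linear norm, a few very heavy edges push most of the others toward the pale end of the colormap, so scaling the weights first may help.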


Introject answered 26/4, 2017 at 20:45 Comment(4)
Please provide a minimal reproducible example. We don't have your input files, so it's harder for us to work with your code. Extract out just enough code to isolate the problem.Representative
I apologize, I meant to provide samples initially. I provided samples of both files as well as links to the full data on Google Drive.Introject
Let's say an edge between node A and node B appears 3 times in your data. Do you want to have multiple weighted edges between the nodes (i.e. there are 3 edges between A and B and each has a weight of 3), or do you want to have a single weighted edge (i.e. one edge between A and B with weight 3)?Tal
Is it possible to see both? I would prefer to have multiple weighted edges since the graph is directed. However, I'm not sure whether that will complicate the edge visualization, since the edge width or color will depend on the weight and the edges between two nodes will then overlap.Introject

Try this on for size.

Note: I added a duplicate of an existing edge, just to show the behavior when there are repeats in your multigraph.

from collections import Counter
c = Counter(g.edges())  # Contains frequencies of each directed edge.

for u, v, d in g.edges(data=True):
    d['weight'] = c[u, v]

print(list(g.edges(data=True)))
#[(340, 269, {'weight': 1}),
# (340, 340, {'weight': 1}),
# (269, 340, {'weight': 1}),
# (398, 279, {'weight': 1}),
# (69, 28, {'weight': 1}),
# (382, 27, {'weight': 1}),
# (27, 285, {'weight': 2}),
# (27, 285, {'weight': 2}),
# (120, 422, {'weight': 1}),
# (422, 217, {'weight': 1}),
# (217, 340, {'weight': 1}),
# (458, 442, {'weight': 1}),
# (291, 431, {'weight': 1}),
# (68, 27, {'weight': 1}),
# (112, 269, {'weight': 1})]

Edit: To visualize the graph with edge weights as thicknesses, use this:

nx.draw_networkx(g, width=[d['weight'] for _, _, d in g.edges(data=True)])
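
The comments on the question also raise the alternative of a single weighted edge per node pair, which avoids overlapping parallel edges when drawing. A minimal sketch of that variant, assuming a small toy multigraph rather than the asker's data (the graph h and the toy edge list are illustrative):

```python
from collections import Counter

import networkx as nx

# Toy multigraph; the (27, 285) edge is added twice on purpose.
g = nx.MultiDiGraph()
g.add_edges_from([(382, 27), (27, 285), (27, 285), (285, 27)])

# Calling g.edges() without keys yields plain (u, v) pairs,
# so parallel edges collapse onto the same Counter key.
counts = Counter(g.edges())

# One edge per ordered pair, carrying the count as its weight.
h = nx.DiGraph()
h.add_nodes_from(g.nodes(data=True))
for (u, v), n in counts.items():
    h.add_edge(u, v, weight=n)
```

Direction is preserved, so (27, 285) and (285, 27) remain separate edges in h, each with its own count.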
Representative answered 27/4, 2017 at 20:39 Comment(9)
That worked, thank you. As an aside, I mentioned I'm trying to use the weight to determine edge width or color. How can I use your solution to do so? I tried the solutions provided here #17632651 and here #22967586, but I'm not having any luck adapting your solution to them. Thanks again!Introject
Updated my answer; should address your question.Representative
I received a value error: need more than 2 values to unpackIntroject
Did you include data=True?Representative
Ah, that's it! Do you know if this works the same for color?Introject
In principle, it's the same idea. It'll use whatever colormap you provide, or the default, and assign the values you pass in to values in the colormap. See the docs.Representative
Okay, I believe I got that to work and visualize properly. One last question related to the weights. Some edges have a weight of around 700-800. Is it possible to normalize the weights so those edges aren't enormously disproportionate? I'm not sure if I've explained that clearly, so I'm adding an image in the question.Introject
You can always scale it, e.g. with np.log(d['weight']), or order them by rank with scipy's stats.rankdata.Representative
If you have follow-ups, I encourage you to post them as new questions. More people will see those than just me.Representative
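
The two scaling ideas from the comments above can be sketched as follows. This assumes a hypothetical array of raw edge counts, and it uses np.unique as a NumPy-only stand-in for scipy.stats.rankdata with method="dense"; the names min_line_width, log_widths and rank_widths are illustrative.

```python
import numpy as np

# Hypothetical raw edge counts, including a few very heavy edges
weights = np.array([1, 1, 2, 5, 40, 700, 800])
min_line_width = 0.25  # same role as minLineWidth in the question

# Option 1: logarithmic compression. np.log(1) is 0, so adding 1 keeps
# weight-1 edges at min_line_width instead of giving them zero width.
log_widths = min_line_width * (1.0 + np.log(weights))

# Option 2: dense ranks (ties share a rank, ranks start at 1).
# return_inverse gives each weight's index among the sorted unique values.
_, inverse = np.unique(weights, return_inverse=True)
rank_widths = min_line_width * (inverse + 1)
```

Either array can be passed as the width argument when drawing, provided the weights were collected in the same order as g.edges(data=True). The log version keeps some sense of magnitude, while the rank version only preserves ordering.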
