I am using R and the igraph library to learn about network graph data. In particular, I am trying to understand the concept of a "weighted graph". From what I have read, the "weights" are generally associated with the edges of the graph. But can weights ever be associated with the nodes? (I sometimes see nodes referred to as "vertices".)
Suppose I have two datasets: one for the nodes and one for the edges.
library(igraph)
library(visNetwork)
Nodes <-data.frame(
"Source" = c("123","124","125","122","111", "126"),
"Salary" = c("100","150","200","200","100", "100"),
"Debt" = c("10","15","20","20","10", "10"),
"Savings" = c("1000","1500","2000","2000","1000", "1000")
)
Nodes$Salary= as.numeric(Nodes$Salary)
Nodes$Debt = as.numeric(Nodes$Debt)
Nodes$Savings = as.numeric(Nodes$Savings)
mydata <-data.frame(
"source" = c("123","124","123","125","123"),
"target" = c("126", "123", "125", "122", "111"),
"color" = c("red","red","green","blue","red"),
"food" = c("pizza","pizza","cake","pizza","cake")
)
Normally, I would make a simple unweighted graph from this data, in which the entire analysis involves only the two edge columns (source and target):
#make graph
graph <- graph_from_data_frame(mydata[,c(1:2)], directed=FALSE)
simple_graph<- simplify(graph)
plot(simple_graph)
#do some clustering on the graph#
fc <- fastgreedy.community(simple_graph)
V(simple_graph)$community <- fc$membership
nodes <- data.frame(id = V(simple_graph)$name, title = V(simple_graph)$name, group = V(simple_graph)$community)
nodes <- nodes[order(nodes$id, decreasing = F),]
edges <- get.data.frame(simple_graph, what="edges")[1:2]
visNetwork(nodes, edges) %>%
visOptions(highlightNearest = TRUE, nodesIdSelection = TRUE)
Now, I want to explore the concept of a "weighted graph". I want to make a graph such that I can use the financial information (salary, debt, savings) for each node in the analysis. The way I see it, this would assign a notion of "weight" to the nodes and not the edges, correct?
A very basic way to approach this problem would be to take the average of salary, debt and savings for each node and treat this average as the node's weight. This way, we could begin to ask questions such as "are nodes with larger average financial amounts more likely to form relationships with one another than with nodes that have smaller average financial amounts?" (in network science, I believe this concept is referred to as "homophily").
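To make the homophily question concrete, here is a minimal sketch of one way it might be measured. It assumes the averages have already been computed by hand from the Nodes table and uses igraph's `assortativity()`, which reports the correlation between the node values at the two ends of each edge. The object names `g` and `r` are only illustrative.

```r
library(igraph)

# node table: ID plus the averaged financial amount (worked out from the Nodes data)
nodes_avg <- data.frame(
  ID    = c("123", "124", "125", "122", "111", "126"),
  Means = c(370, 555, 740, 740, 370, 370)
)

edges <- data.frame(
  source = c("123", "124", "123", "125", "123"),
  target = c("126", "123", "125", "122", "111")
)

# vertices = ... attaches every extra column of nodes_avg as a vertex attribute
g <- graph_from_data_frame(edges, directed = FALSE, vertices = nodes_avg)

# correlation of the node values across the two ends of each edge:
# close to +1 means similar nodes tend to connect, close to -1 means the opposite
r <- assortativity(g, V(g)$Means, directed = FALSE)
print(r)
```

With only five edges the coefficient is not very informative, but the same call should carry over to a larger network.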
Thus, we can derive a new node table that holds the average financial amount for each node:
nodes_avg = data.frame(ID=Nodes[,1], Means=rowMeans(Nodes[,-1]))
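As a quick check of what this average contains, the sketch below recomputes it with the columns entered as numbers directly. It also shows one possible variation, which is an assumption on my part and not something the data requires: standardising each column with `scale()` before averaging, so that Savings (the largest numbers) does not dominate the mean. The name `nodes_std` is hypothetical.

```r
Nodes <- data.frame(
  Source  = c("123", "124", "125", "122", "111", "126"),
  Salary  = c(100, 150, 200, 200, 100, 100),
  Debt    = c(10, 15, 20, 20, 10, 10),
  Savings = c(1000, 1500, 2000, 2000, 1000, 1000)
)

# plain average of the three financial columns, one value per node
nodes_avg <- data.frame(ID = Nodes$Source, Means = rowMeans(Nodes[, -1]))

# variation (assumption): convert each column to z-scores first, then average,
# so that every variable contributes on the same scale
nodes_std <- data.frame(ID = Nodes$Source, Means = rowMeans(scale(Nodes[, -1])))

print(nodes_avg)
print(nodes_std)
```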
Now, we need to create a new graph in which this averaged financial information is considered as a "weight". This is where I begin to get confused.
This way does not work:
set_vertex_attr(simple_graph, Weight, index = V(graph), nodes_avg$Means)
Error in as.igraph.vs(graph, index) :
Cannot use a vertex sequence from another graph.
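For comparison, here is a minimal sketch of a call that should avoid this error. It differs in four ways: the attribute name is a quoted string, the index comes from the same graph that is being modified, the values are looked up by vertex name (vertex order in the graph need not match the row order of nodes_avg), and the result is assigned back, because `set_vertex_attr()` returns a modified copy. The attribute name "weight" is just a label chosen here; igraph does not give a vertex attribute of that name any special meaning.

```r
library(igraph)

nodes_avg <- data.frame(
  ID    = c("123", "124", "125", "122", "111", "126"),
  Means = c(370, 555, 740, 740, 370, 370)
)

mydata <- data.frame(
  source = c("123", "124", "123", "125", "123"),
  target = c("126", "123", "125", "122", "111")
)

simple_graph <- simplify(graph_from_data_frame(mydata, directed = FALSE))

# find the row of nodes_avg that belongs to each vertex, matching on the name
idx <- match(V(simple_graph)$name, nodes_avg$ID)

# quoted attribute name, index taken from simple_graph itself, result assigned back
simple_graph <- set_vertex_attr(simple_graph,
                                name  = "weight",
                                index = V(simple_graph),
                                value = nodes_avg$Means[idx])

print(data.frame(name = V(simple_graph)$name, weight = V(simple_graph)$weight))
```

The shorter form `V(simple_graph)$weight <- nodes_avg$Means[idx]` should be equivalent.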
I tried the following command, but I got a warning message:
E(simple_graph)$weight <- nodes_avg$Means
Warning message:
In eattrs[[name]][index] <- value :
number of items to replace is not a multiple of replacement length
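The warning seems to come from a length mismatch: `E(...)$weight` expects one value per edge, while nodes_avg$Means has one value per node. If edge weights are wanted, they have to be derived from the two endpoints of each edge. The sketch below does this with `ends()`; the averaging rule is purely a hypothetical choice for illustration.

```r
library(igraph)

nodes_avg <- data.frame(
  ID    = c("123", "124", "125", "122", "111", "126"),
  Means = c(370, 555, 740, 740, 370, 370)
)

edges <- data.frame(
  source = c("123", "124", "123", "125", "123"),
  target = c("126", "123", "125", "122", "111")
)

g <- graph_from_data_frame(edges, directed = FALSE, vertices = nodes_avg)

vcount(g)  # 6 vertices
ecount(g)  # 5 edges, so a vector of 6 node values cannot fill E(g)$weight

# endpoints of every edge as vertex indices, one row per edge
el <- ends(g, E(g), names = FALSE)

# hypothetical rule: an edge's weight is the mean of its two endpoint values
E(g)$weight <- (V(g)$Means[el[, 1]] + V(g)$Means[el[, 2]]) / 2

print(as_data_frame(g, what = "edges"))
```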
Finally, I tried this command, but I don't think it is using the averaged financial amounts as node weights:
weighted_graph <- graph_from_data_frame(mydata, directed=TRUE, vertices=nodes_avg)
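One way to check what this call actually stored is to list the attributes of the resulting graph. As far as I understand `graph_from_data_frame()`, the first column of `vertices` becomes the vertex name, every other column becomes a vertex attribute, and the extra columns of the edge table become edge attributes. So the averages should be present as a vertex attribute called "Means", just not under the name "weight". A small sketch:

```r
library(igraph)

nodes_avg <- data.frame(
  ID    = c("123", "124", "125", "122", "111", "126"),
  Means = c(370, 555, 740, 740, 370, 370)
)

mydata <- data.frame(
  source = c("123", "124", "123", "125", "123"),
  target = c("126", "123", "125", "122", "111"),
  color  = c("red", "red", "green", "blue", "red"),
  food   = c("pizza", "pizza", "cake", "pizza", "cake")
)

weighted_graph <- graph_from_data_frame(mydata, directed = TRUE, vertices = nodes_avg)

# which attributes ended up on the vertices and on the edges
print(vertex_attr_names(weighted_graph))
print(edge_attr_names(weighted_graph))

# the averaged amounts, shown next to the vertex names
print(data.frame(name = V(weighted_graph)$name, Means = V(weighted_graph)$Means))
```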
Does anyone know how I can make a "weighted_graph" with the averaged financial amounts, and then run a clustering algorithm on the network graph that takes the node weights into consideration? Something like this:
simple_weighted_graph<- simplify(weighted_graph)
plot(simple_weighted_graph)
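#do some clustering on the weighted_graph#
#(fast greedy clustering only accepts undirected graphs, so edge directions are dropped here)
fc <- fastgreedy.community(as.undirected(simple_weighted_graph))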
V(simple_weighted_graph)$community <- fc$membership
nodes <- data.frame(id = V(simple_weighted_graph)$name, title = V(simple_weighted_graph)$name, group = V(simple_weighted_graph)$community)
nodes <- nodes[order(nodes$id, decreasing = F),]
edges <- get.data.frame(simple_weighted_graph, what="edges")[1:2]
visNetwork(nodes, edges) %>%
visOptions(highlightNearest = TRUE, nodesIdSelection = TRUE)
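As far as I can tell, igraph's community detection functions such as `cluster_fast_greedy()` accept edge weights but have no argument for node weights. One possible workaround, sketched below, is to turn the node values into edge weights so that edges between financially similar nodes count for more. The similarity formula is a hypothetical choice made for illustration, and it assumes that at least one edge joins nodes with different values (otherwise the division by the largest gap fails).

```r
library(igraph)

nodes_avg <- data.frame(
  ID    = c("123", "124", "125", "122", "111", "126"),
  Means = c(370, 555, 740, 740, 370, 370)
)

edges <- data.frame(
  source = c("123", "124", "123", "125", "123"),
  target = c("126", "123", "125", "122", "111")
)

# undirected graph with the averages stored as the vertex attribute "Means"
g <- simplify(graph_from_data_frame(edges, directed = FALSE, vertices = nodes_avg))

# absolute difference between the node values at the two ends of each edge
el  <- ends(g, E(g), names = FALSE)
gap <- abs(V(g)$Means[el[, 1]] - V(g)$Means[el[, 2]])

# hypothetical similarity: 1 for identical endpoints, 0.5 for the largest gap
E(g)$weight <- 1 / (1 + gap / max(gap))

# with weights = NULL (the default) the "weight" edge attribute is used
fc <- cluster_fast_greedy(g)
V(g)$community <- fc$membership

print(data.frame(name = V(g)$name, Means = V(g)$Means, community = V(g)$community))
```

Whether such a similarity weight is appropriate depends on the question being asked; it pulls similar neighbours into the same community, which is not the same thing as testing for homophily.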
Or is this not possible? That is, are weighted graphs only ever built from edge weights and never from node weights, so that graph clustering cannot be done on a graph weighted by its nodes?
Thanks