How to use subgraph function to get a graph that would include only vertexes and edges from the specific connected component? Let's say I know the connected component id, the final goal is to create a new graph based on the connected component. I'd like to keep the vertex attributes from the original graph.
You have to join the graph with the component IDs to the original graph, filter (take the subgraph) by the component ID, and then discard the component ID.
import scala.reflect._
import org.apache.spark.graphx._
import org.apache.spark.graphx.lib.ConnectedComponents
def getComponent[VD: ClassTag, ED: ClassTag](
g: Graph[VD, ED], component: VertexId): Graph[VD, ED] = {
val cc: Graph[VertexId, ED] = ConnectedComponents.run(g)
// Join component ID to the original graph.
val joined = g.outerJoinVertices(cc.vertices) {
(vid, vd, cc) => (vd, cc)
}
// Filter by component ID.
val filtered = joined.subgraph(vpred = {
(vid, vdcc) => vdcc._2 == Some(component)
})
// Discard component IDs.
filtered.mapVertices {
(vid, vdcc) => vdcc._1
}
}
ConnectedComponents
. I guess I just wanted to include ConnectedComponents
somewhere. I'll rework the code to be more sensible. For your other question: VertexId
is an alias for Long
. A plain old number should work there. What do you mean by "fails" here? Do you get an exception? –
Workable cc.vertices.values.distinct.count
. –
Workable I take your question to be, given a VertexId in a source graph, create a new graph with the nodes and edges connected to this VertexId from the source graph.
Given that, here's what I would do:
val targetVertexId = ...
val graph = Graph(..., ...)
val newGraph = Graph(
graph.vertices.filter{case (vid,attr) => vid == targetVertexId} ++
graph.collectNeighbors(EdgeDirection.Either)
.filter{ case (vid,arr) => vid == targetVertexId}
.flatMap{ case (vid,arr) => arr},
graph.edges
).subgraph(vpred = { case (vid,attr) => attr != null})
Couple of things to note:
You can change EdgeDirection.Either
to EdgeDirection.In
or EdgeDirection.Out
as needed.
The .subgraph
at the end removes all Vertices where the attribute is set to null
. If the original val graph
has Vertices with attributes set to null
this isn't going to work. Otherwise this works, without having to know the Vertex attribute type in advance.
© 2022 - 2024 — McMap. All rights reserved.
getComponent
's arguments saving the potentially expensive task of recomputing. I am curious as to why this function works, but when I did the steps manually putting aLong
or whatever in as thecomponent
it fails. Must it be castSome
orVertexId
? – Dg