Finding a Minimum Spanning Tree from an Adjacency List where the Adjacency List is in a string array using Prim's Algorithm

I need some help coming up with a way to find a minimum spanning tree. Suppose I have my graph in the form of an adjacency list:

A 2 B 12 I 25
B 3 C 10 H 40 I 8
C 2 D 18 G 55
D 1 E 44
E 2 F 60 G 38
F 0
G 1 H 35
H 1 I 35

The first letter names the node you are looking at, and the number after it tells how many connections that node has to other nodes. For example, A has two connections, one each to B and I. After that, the number following each letter gives the weight of that edge: the edge to B has weight 12 and the edge to I has weight 25. I had originally planned to represent this entire thing as a String array called Graph[8], with each line in a different slot of the array. I am having difficulty figuring out how to accomplish this with either Prim's or Kruskal's algorithm.

Contexture answered 26/2, 2012 at 2:17 Comment(6)
Looks like edges, not nodes, have a weight. - Enwrap
The "Algorithms" book has a description of the algorithm and code in Java. - Kaitlin
Strongly related: #1764149 - Rhomboid
Also: do you have difficulties understanding the algorithms, or specific problems with the implementation? - Rhomboid
It's more of a problem with the implementation. I know this can be done another way by building the tree and using parent and child nodes. But I want to know if a solution can be found using the format listed here. - Contexture
Using structured strings directly, without parsing them into an object structure first, is extremely tedious. For every operation, you have to find the corresponding token, extract it, and maybe convert it into a number. You will end up parsing the string multiple times and never reusing the result. It is highly recommended to parse the strings into an object model once, and use that instead. - Psychographer

Suppose I have my tree in the form of an adjacency list

It is important (for your understanding) to note that what you have in this kind of adjacency list is a connected graph, not a tree, though I think that was just a typo. I will propose this as an edit, but I want you to be aware of it. That it is a graph and not a tree can be seen from these lines:

A 2 B 12 I 25
B 3 C 10 H 40 I 8

Here you already have a cycle at A-B-I:

     A
  12/_\25
   B 8 I

Having a cycle means, by definition, that it cannot be a tree! There is one more thing you can see from the way I presented this subgraph: the edges have weights, not the nodes.


Now let's take a look at your provided example:

A 2 B 12 I 25
B 3 C 10 H 40 I 8
C 2 D 18 G 55
D 1 E 44
E 2 F 60 G 38
F 0
G 1 H 35
H 1 I 35

First of all, we can see that this adjacency list is either incorrect or already optimized for your needs. This can be seen, for example, from the lines

E 2 F 60 G 38
F 0

The first line here shows an edge from E to F, yet the second line says F has a degree of 0, even though the degree of a node is defined by the edges incident to it!

This is what the adjacency list would look like if it were 'complete':

A 2 B 12 I 25
B 4 A 12 C 10 H 40 I 8
C 3 B 10 D 18 G 55
D 2 C 18 E 44
E 3 D 44 F 60 G 38
F 1 E 60
G 3 C 55 E 38 H 35
H 3 B 40 G 35 I 35
I 3 A 25 B 8 H 35

We can see a lot of redundancy because every edge occurs twice. This is why your 'optimized' adjacency list suits your needs better: with the 'complete' list, one would do many useless checks. However, you should only rely on this if you can be sure that all data given to your code is already 'optimized' (or rather, in this format)! I will take this as a given from now on.
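
As a minimal sketch of what reading that format once could look like, the following parses each line into edge objects, assuming every undirected edge is listed exactly once as in the question. The names `Edge` and `GraphParser` are made up for illustration and are not part of the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value type for one undirected, weighted edge.
final class Edge {
    final String from;
    final String to;
    final int weight;

    Edge(String from, String to, int weight) {
        this.from = from;
        this.to = to;
        this.weight = weight;
    }

    @Override
    public String toString() {
        return from + "-" + to + "(" + weight + ")";
    }
}

// Hypothetical helper that turns the string array into a list of edges.
final class GraphParser {
    // Parses lines such as "B 3 C 10 H 40 I 8": node name, neighbor count,
    // then one (neighbor, weight) pair per listed connection.
    static List<Edge> parse(String[] lines) {
        List<Edge> edges = new ArrayList<>();
        for (String line : lines) {
            String[] tokens = line.trim().split("\\s+");
            String from = tokens[0];
            int count = Integer.parseInt(tokens[1]);
            for (int i = 0; i < count; i++) {
                String to = tokens[2 + 2 * i];
                int weight = Integer.parseInt(tokens[3 + 2 * i]);
                edges.add(new Edge(from, to, weight));
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        String[] graph = {
            "A 2 B 12 I 25", "B 3 C 10 H 40 I 8", "C 2 D 18 G 55", "D 1 E 44",
            "E 2 F 60 G 38", "F 0", "G 1 H 35", "H 1 I 35"
        };
        List<Edge> edges = parse(graph);
        System.out.println(edges.size() + " edges: " + edges);
    }
}
```

After this single pass, the rest of the work can operate on `Edge` objects instead of re-tokenizing strings.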


Let's talk about your data structure. You may of course use your proposed array of strings, but to be honest, I would rather recommend something like @AmirAfghani's proposed data structure. Using his approach would make your work easier (as he already pointed out, it would be closer to your mental representation) and probably more efficient (this is a guess, so don't rely on it ;)), as you would otherwise be doing many operations on strings. In the title you asked about Prim's algorithm, but in your actual question you said either Prim's or Kruskal's. I will go with Kruskal's simply because it is much easier, and you seem to accept both.


Kruskal's algorithm

Kruskal's algorithm is fairly simple. It's basically:

  • Start with every node, but no edges

Repeat the following as often as possible:

  • From all of the unused/unchosen edges, choose the one with the lowest weight (if there is more than one, just pick one of them)
  • Check whether this edge would cause a cycle. If it does, mark it as chosen and 'discard' it.
  • If it doesn't cause a cycle, use it and mark it as used/chosen.

That's all. It's really that simple. However, I would like to mention one thing at this point, since I think it fits best here: in general there is no "the minimum spanning tree", only "a minimum spanning tree", as there can be many, so your results may vary!
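
The steps above could be sketched roughly as follows. This is only one possible implementation: it sorts the edges once instead of repeatedly searching for the cheapest one, and it swaps in a small union-find map for the cycle test rather than the search discussed further down in this answer. `KruskalSketch`, `WeightedEdge` and `questionGraph` are hypothetical names, and the edges are hard-coded from the question's list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class KruskalSketch {
    // Hypothetical edge type; each undirected edge is stored once.
    static final class WeightedEdge {
        final String from;
        final String to;
        final int weight;

        WeightedEdge(String from, String to, int weight) {
            this.from = from;
            this.to = to;
            this.weight = weight;
        }
    }

    // The twelve edges of the graph from the question.
    static List<WeightedEdge> questionGraph() {
        return Arrays.asList(
            new WeightedEdge("A", "B", 12), new WeightedEdge("A", "I", 25),
            new WeightedEdge("B", "C", 10), new WeightedEdge("B", "H", 40),
            new WeightedEdge("B", "I", 8), new WeightedEdge("C", "D", 18),
            new WeightedEdge("C", "G", 55), new WeightedEdge("D", "E", 44),
            new WeightedEdge("E", "F", 60), new WeightedEdge("E", "G", 38),
            new WeightedEdge("G", "H", 35), new WeightedEdge("H", "I", 35));
    }

    // Follows parent links until reaching the representative of node's component.
    private static String find(Map<String, String> parent, String node) {
        while (!parent.get(node).equals(node)) {
            node = parent.get(node);
        }
        return node;
    }

    static List<WeightedEdge> minimumSpanningTree(List<WeightedEdge> edges) {
        // Start with every node in its own component and no edges chosen.
        Map<String, String> parent = new HashMap<>();
        for (WeightedEdge e : edges) {
            parent.putIfAbsent(e.from, e.from);
            parent.putIfAbsent(e.to, e.to);
        }
        List<WeightedEdge> sorted = new ArrayList<>(edges);
        sorted.sort(Comparator.comparingInt((WeightedEdge e) -> e.weight));

        List<WeightedEdge> tree = new ArrayList<>();
        for (WeightedEdge e : sorted) {
            String a = find(parent, e.from);
            String b = find(parent, e.to);
            if (!a.equals(b)) {      // endpoints not yet connected: no cycle
                parent.put(a, b);    // merge the two components
                tree.add(e);
            }                        // otherwise the edge is discarded
        }
        return tree;
    }

    public static void main(String[] args) {
        int total = 0;
        for (WeightedEdge e : minimumSpanningTree(questionGraph())) {
            System.out.println(e.from + " - " + e.to + " : " + e.weight);
            total += e.weight;
        }
        System.out.println("total weight: " + total);
    }
}
```

Because G-H and H-I both have weight 35, a different tie-breaking order may list the edges differently, in line with the remark that results may vary.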


Back to your data structure! You may now see why it would be a bad idea to use an array of strings as a data structure. You would repeat many operations on strings (for example, checking the weight of every edge), and strings are immutable, which means you cannot simply 'throw away' one of the edges or mark one of the edges in any way. Using a different approach, you could just set the weight to -1, or even remove the entry for the edge!


Depth-First Search

From what I have seen, I think your main problem is deciding whether an edge would cause a cycle or not, right? To decide this, you will probably have to call a depth-first search algorithm and use its return value. The basic idea is this: start from one of the nodes (I will call this node the root) and try to find a way to the node at the other end of the chosen edge (I will call this node the target node), searching of course in the graph which doesn't have this edge yet.

  • Choose one unvisited edge incident to your root
  • Mark this chosen edge as visited (or something like that, whatever you like)
  • Repeat the last two steps for the node on the other side of the visited edge
  • Whenever there are no unvisited edges, go back one edge
  • If you are back at your root and have no unvisited edges incident to it remaining, you are done. Return false.
  • If at any point you visit your target node, you are done. Return true.

Now, if this method returns true, this edge would cause a cycle.
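
A minimal sketch of such a check, under a few assumptions: the edges chosen so far are kept in an undirected adjacency map, and the search marks visited nodes rather than visited edges, which gives the same answer for this purpose. `CycleCheck`, `addChosenEdge` and `wouldCreateCycle` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CycleCheck {
    // Undirected adjacency of the edges chosen so far (illustrative representation).
    private final Map<String, List<String>> chosen = new HashMap<>();

    // Records an accepted edge in both directions.
    void addChosenEdge(String a, String b) {
        chosen.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        chosen.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    // True if target is already reachable from root through chosen edges,
    // which means adding the edge root-target would close a cycle.
    boolean wouldCreateCycle(String root, String target) {
        return reaches(root, target, new HashSet<>());
    }

    // Recursive depth-first search; returning from a call is the "go back one edge" step.
    private boolean reaches(String current, String target, Set<String> visited) {
        if (current.equals(target)) {
            return true;
        }
        visited.add(current);
        for (String next : chosen.getOrDefault(current, Collections.<String>emptyList())) {
            if (!visited.contains(next) && reaches(next, target, visited)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        CycleCheck check = new CycleCheck();
        check.addChosenEdge("B", "I");
        check.addChosenEdge("A", "B");
        // A and I are already connected via B, so A-I would close the A-B-I cycle.
        System.out.println(check.wouldCreateCycle("A", "I"));
        // C has no chosen edges yet, so B-C is safe to add.
        System.out.println(check.wouldCreateCycle("B", "C"));
    }
}
```

The check has to run before the candidate edge is recorded, since the search must happen in the graph that doesn't contain that edge yet.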

Seeseebeck answered 9/4, 2012 at 6:5 Comment(0)

This isn't a direct answer to your question per se (it seems like you're doing schoolwork), but I think it will help you get started. Why not create a data structure that more closely matches your mental model and build up from there?

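import java.util.ArrayList;
import java.util.List;

// A node of the graph together with its outgoing weighted edges.
class GraphNode {

    final String name;
    final List<GraphEdge> adjacentNodes;

    public GraphNode(String name) {
        this.name = name;
        adjacentNodes = new ArrayList<GraphEdge>();
    }

    // Adds an edge from this node to the given node with the given weight.
    public void addAdjacency(GraphNode node, int weight) {
        adjacentNodes.add(new GraphEdge(node, weight));
    }

}

// A weighted edge, stored on its source node and pointing at its target node.
class GraphEdge {

    final GraphNode node;
    final int weight;

    // The first parameter must be a GraphNode (the edge's target), not a GraphEdge.
    public GraphEdge(GraphNode node, int weight) {
        this.node = node;
        this.weight = weight;
    }

}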
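
Since the title asks about Prim's algorithm, here is a rough sketch of how a structure along these lines might be used for it. It assumes the graph is undirected and registers every edge on both endpoints so it can be followed from either side. `PrimSketch`, its nested `Node` and `Edge` classes (compact stand-ins for the classes above) and the `connect` helper are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

final class PrimSketch {
    static final class Node {
        final String name;
        final List<Edge> edges = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    static final class Edge {
        final Node target;
        final int weight;
        Edge(Node target, int weight) { this.target = target; this.weight = weight; }
    }

    private final Map<String, Node> nodes = new HashMap<>();

    // Registers the undirected edge on both endpoints.
    void connect(String a, String b, int weight) {
        Node na = nodes.computeIfAbsent(a, Node::new);
        Node nb = nodes.computeIfAbsent(b, Node::new);
        na.edges.add(new Edge(nb, weight));
        nb.edges.add(new Edge(na, weight));
    }

    // Total weight of a minimum spanning tree of the component containing start.
    // Assumes start names a node that was added through connect.
    int minimumSpanningTreeWeight(String start) {
        Set<Node> inTree = new HashSet<>();
        PriorityQueue<Edge> frontier =
            new PriorityQueue<>(Comparator.comparingInt((Edge e) -> e.weight));
        Node first = nodes.get(start);
        inTree.add(first);
        frontier.addAll(first.edges);
        int total = 0;
        while (!frontier.isEmpty()) {
            Edge cheapest = frontier.poll();
            // add() returns false when the target is already part of the tree.
            if (inTree.add(cheapest.target)) {
                total += cheapest.weight;
                frontier.addAll(cheapest.target.edges);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        PrimSketch g = new PrimSketch();
        g.connect("A", "B", 12); g.connect("A", "I", 25); g.connect("B", "C", 10);
        g.connect("B", "H", 40); g.connect("B", "I", 8);  g.connect("C", "D", 18);
        g.connect("C", "G", 55); g.connect("D", "E", 44); g.connect("E", "F", 60);
        g.connect("E", "G", 38); g.connect("G", "H", 35); g.connect("H", "I", 35);
        System.out.println("MST weight: " + g.minimumSpanningTreeWeight("A"));
    }
}
```

Storing each edge twice costs some memory, but it lets the algorithm grow the tree from whichever node it reaches without scanning all edges for predecessors.
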
Inbreed answered 26/2, 2012 at 2:57 Comment(2)
Moved the weights from GraphNode to the new GraphEdge. - Psychographer
The problem with this structure is that it implies a directed graph: going from one node to its successors is done via adjacentNodes, but going to a predecessor would require a full scan over all edges. At least Prim's algorithm will require "symmetric" access to the graph. - Influence
