Most efficient way of creating tree from adjacency list
Asked Answered
L

2

7

I have an adjacency list of objects (rows loaded from SQL database with the key and it's parent key) that I need to use to build an unordered tree. It's guaranteed to not have cycles.

This is taking wayyy too long (processed only ~3K out of 870K nodes in about 5 minutes). Running on my workstation Core 2 Duo with plenty of RAM.

Any ideas on how to make this faster?

public class StampHierarchy {
    private StampNode _root;
    private SortedList<int, StampNode> _keyNodeIndex;

    // takes a list of nodes and builds a tree
    // starting at _root
    private void BuildHierarchy(List<StampNode> nodes)
    {
        Stack<StampNode> processor = new Stack<StampNode>();
        _keyNodeIndex = new SortedList<int, StampNode>(nodes.Count);

        // find the root
        _root = nodes.Find(n => n.Parent == 0);

        // find children...
        processor.Push(_root);
        while (processor.Count != 0)
        {
            StampNode current = processor.Pop();

            // keep a direct link to the node via the key
            _keyNodeIndex.Add(current.Key, current);  

            // add children
            current.Children.AddRange(nodes.Where(n => n.Parent == current.Key));

            // queue the children
            foreach (StampNode child in current.Children)
            {
                processor.Push(child);
                nodes.Remove(child); // thought this might help the Where above
            }
        }
    }
}

    public class StampNode {
         // properties: int Key, int Parent, string Name, List<StampNode> Children
    }
Libido answered 16/4, 2010 at 16:34 Comment(2)
Do you absolutely have to do this in C#? Because it's going to be a lot faster to order the nodes by path in SQL, with which you can then build a tree in O(N) time.Nard
how can I order by path in SQL? My data is like an org chart... lots of children and lots of ragged levels.Libido
C
5
  1. Put the nodes into a sorted list or dictionary.

  2. Scan that list, pick up each node, find its parent node in the same list (binary search or dictionary lookup), add it to the Children collection of the parent node.

There's no need for a Stack to put this into a tree.

Critta answered 16/4, 2010 at 16:48 Comment(1)
It's worth to note that sorting the nodes by key before putting them into a sorted list makes a huge difference in speed. Go with dictionary is also another alternative if memory is not a primary constraint.Richmal
G
2

SortedList is not a good container to use in this context. It is O(n) for insertion operations (the repeated calls to Add()), as it is internally represented as a flat list. Using Dictionary instead of SortedList will be a large improvement, as it is O(1) amortized insertion time.

Gulch answered 16/4, 2010 at 16:48 Comment(1)
Ah, also I missed the current.Children.AddRange line. You don't want to re-scan your entire node list searching for each parent. As Hightechrider suggested, putting the nodes into a Dictionary first would speed things up dramatically, as again, you change an O(n) operation into an O(1) operation.Gulch

© 2022 - 2024 — McMap. All rights reserved.