A fast algorithm for minimum spanning trees when edge lengths are constrained?
Suppose that you have an undirected graph with nonnegative, integer edge lengths that are in the range 0 to U - 1, inclusive. What is the fastest algorithm for computing a minimum spanning tree of this graph? We can still use existing minimum spanning tree algorithms, such as Kruskal's algorithm (O(m log n)) or Prim's algorithm (O(m + n log n)). However, for cases where U is small, I think it should be possible to do much better than this.

Are there any algorithms that are competitive with more traditional MST algorithms that are able to exploit the fact that the edge lengths are constrained to be in some range?

Thanks!

Selfridge answered 15/1, 2012 at 23:25 Comment(7)
Are the lengths also restricted to be integers, or just restricted to that range? - Flooring
@harold- They're integers. I'll post a correction. - Selfridge
Several sources mention there's a linear time algorithm for that, but link to something I can't view. - Flooring
If the edge weights are restricted to integers, can you use a hash-table based priority queue rather than a heap (or whatever comparison-based O(log(n))-style structure)? As usual with a hash queue, all items with the same integer cost are placed in the same bucket, and you can get worst-case O(1) operations. I assume this would remove the O(log(n)) factor from the algorithms you've mentioned, leaving something like O(m + n)... - Francophile
@DarrenEngwirda is spot on here - if U is generally small, you can bucketize in linear time, then scan through the buckets, adding an edge from the lowest-weight bucket if and only if it connects a new vertex. Excellent (in terms of both improvement and simplicity) optimization. - Score
@DarrenEngwirda- Can you elaborate on this? The fact that the runtime is independent of U is very surprising and I'd need to see the full details of what you mean. - Selfridge
@templatetypedef: I'll post as an answer... - Francophile

Fredman–Willard gave an O(m + n) algorithm for integer edge lengths on the unit-cost RAM.

That's arguably not much of an improvement: without the restriction on edge lengths (i.e., the lengths are an opaque data type that supports only comparisons), Chazelle gave an O(m alpha(m, n) + n) algorithm (alpha is the inverse Ackermann function) and Karger–Klein–Tarjan gave a randomized O(m + n) algorithm.

I don't think Darren's idea leads to an O(m + n + U)-time algorithm. Jarník's algorithm ("Prim's") does not use its priority queue monotonically, so buckets may be scanned multiple times; Kruskal's algorithm requires a disjoint-set data structure, which cannot be implemented in O(m + n) time.
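To make the Kruskal half of this concrete, here is a sketch (mine, not from the thread; all names are illustrative) of Kruskal's algorithm with the comparison sort replaced by a counting sort over the U possible weights. Sorting drops to O(m + U), but the disjoint-set structure still contributes the near-linear factor, so the total is roughly O(m alpha(n) + U) rather than O(m + n):

```python
# Sketch: Kruskal's algorithm with counting sort on integer weights in [0, U).
# Sorting the edges becomes O(m + U); the union-find still costs
# O(m * alpha(n)), so the whole thing is O(m * alpha(n) + U).

def mst_bucketed_kruskal(n, edges, U):
    """n vertices (0..n-1), edges = [(w, u, v)] with 0 <= w < U."""
    # Counting sort: one bucket per possible integer weight.
    buckets = [[] for _ in range(U)]
    for w, u, v in edges:
        buckets[w].append((u, v))

    parent = list(range(n))

    def find(x):                      # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w in range(U):                # scan buckets in increasing weight order
        for u, v in buckets[w]:
            ru, rv = find(u), find(v)
            if ru != rv:              # edge joins two components: keep it
                parent[ru] = rv
                mst.append((u, v, w))
                total += w
    return mst, total
```

For example, on a triangle with weights 1, 2, 3, this keeps the two lightest edges for a total weight of 3.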

Vicechairman answered 16/1, 2012 at 18:22 Comment(0)

With integer edge weights you can use bucketing to achieve a priority queue with worst-case O(1) operations, at the cost of O(U) additional space.

Within the MST algorithms you've mentioned you should be able to replace the comparison-based priority queues with this integer structure, and hence remove the O(log(n)) dependence in the complexity requirements. I expect you'd end up with an overall complexity in the style of O(n + m).

Essentially you set up a set of compressed linked lists, where each list is indexed by the (integer!) cost associated with that bucket:

struct bucket_list
{
    int *_cost; // array[0..N-1] holding the current cost of each item

    int *_head; // array[0..U-1] holding the index of the first item in each bucket

    int *_next; // array[0..N-1] where _next[i] is the item after
                // item i in its list

    int *_prev; // array[0..N-1] where _prev[i] is the item before
                // item i in its list
};

This structure is based on the fact that each item can only be in a single bucket list at once.

Based on this structure you can achieve worst-case O(1) complexity for these operations:

push(item, cost); // push an item onto the head of the appropriate bucket list

pop(item, cost);  // pop an item from (anywhere!) within a bucket list

update(item, old_cost, new_cost); // move an item between buckets by combining
                                  // pop and push

To use this structure as a priority queue, you simply maintain an index pointing at the current minimum bucket to scan. When you want the next lowest-cost item, pop the head item from this bucket. If the bucket is empty, increment your bucket index until you find a non-empty one.
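Here is a minimal Python sketch of the structure described above (the class and method names are mine, not from the answer). One detail worth noting: the scan pointer must be reset whenever a push lands below it, which is exactly the non-monotone case that makes the Prim analysis tricky:

```python
# Sketch of the bucket-based priority queue described above: doubly linked
# lists threaded through arrays, one list head per integer cost in [0, U).
# push/pop/update are worst-case O(1); extract_min advances a scan pointer.

NIL = -1

class BucketQueue:
    def __init__(self, n_items, U):
        self.cost = [NIL] * n_items        # current cost of each item
        self.head = [NIL] * U              # first item in each bucket
        self.next = [NIL] * n_items
        self.prev = [NIL] * n_items
        self.min_bucket = 0                # scan pointer for extract_min

    def push(self, item, cost):
        self.cost[item] = cost
        self.prev[item] = NIL
        self.next[item] = self.head[cost]  # splice onto the head of the list
        if self.head[cost] != NIL:
            self.prev[self.head[cost]] = item
        self.head[cost] = item
        if cost < self.min_bucket:         # non-monotone push: reset pointer
            self.min_bucket = cost

    def pop(self, item):
        cost = self.cost[item]             # unlink from anywhere in its list
        if self.prev[item] != NIL:
            self.next[self.prev[item]] = self.next[item]
        else:
            self.head[cost] = self.next[item]
        if self.next[item] != NIL:
            self.prev[self.next[item]] = self.prev[item]
        self.cost[item] = NIL

    def update(self, item, new_cost):      # change key = pop + push, O(1)
        self.pop(item)
        self.push(item, new_cost)

    def extract_min(self):                 # assumes the queue is non-empty
        while self.head[self.min_bucket] == NIL:
            self.min_bucket += 1           # O(U) total if used monotonically
        item = self.head[self.min_bucket]
        self.pop(item)
        return item
```

If the costs are only ever extracted in non-decreasing order, the scan pointer moves forward O(U) times in total over the whole run; with non-monotone use, the same bucket range can be rescanned, which is the caveat raised in the other answer.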

Of course, if U becomes very large, the extra space complexity and the possibility of a sparse distribution of items over the buckets may make this kind of approach unattractive.

Francophile answered 16/1, 2012 at 0:39 Comment(2)
The complexity of this implementation also includes U, since you have to iterate across O(U) buckets. - Selfridge
You could say that the total complexity would be O(n + m + U) - the buckets are only traversed once throughout the whole algorithm, not at each step. - Francophile

© 2022 - 2024 — McMap. All rights reserved.