Why is the Inverse Ackermann function used to describe complexity of Kruskal's algorithm?
In a class for analysis of algorithms, we are presented with this pseudocode for Kruskal's algorithm:

Kruskal's Algorithm Pseudocode
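(The pseudocode image is not reproduced here. Judging from the step numbers referenced below, it appears to be the standard CLRS MST-KRUSKAL, whose usual form is:)

```
MST-KRUSKAL(G, w)
1  A = ∅
2  for each vertex v ∈ G.V
3      MAKE-SET(v)
4  sort the edges of G.E into nondecreasing order by weight w
5  for each edge (u, v) ∈ G.E, taken in nondecreasing order
6      if FIND-SET(u) ≠ FIND-SET(v)
7          A = A ∪ {(u, v)}
8          UNION(u, v)
9  return A
```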

The lecturer then states the following, for disjoint-set forests:

A sequence of m MAKE-SET, UNION, and FIND-SET operations, n of which are MAKE-SET operations, can be performed on a disjoint-set forest with union by rank and path compression in worst-case time O(m α(n)).

This is used to compute the complexity of Step 2 and Steps 5-8:

For connected G: |E| ≥ |V| - 1; m = O(V + E), n = O(V);

So Steps 2, 5-8: O((V + E) α(V)) = O(E α(V))

α(V) = O(lg V) = O(lg E); so we obtain O(E lg E) ----- // how is α(V) equal here?

Kruskal: Steps 3, 5-8, and step 4: O(E lg E)

Observe: |E| < |V|² → lg E = O(lg V)

So, Kruskal complexity: O(E lg V)

I have attempted to understand the logic behind this "alpha(n)"/"α(n)" function, and from what I've read it seems that, simplistically, the Ackermann function is one that grows incredibly fast (far faster than exponentially), and its inverse is one that grows incredibly slowly (far slower than logarithmically).

If my interpretation is correct, what does "α(n)" represent? Does it mean that MAKE-SET operations are at most O(lg n)? How and why is using the inverse Ackermann function necessary? I was under the impression this operation is performed V times (once for each vertex). Following this, α(V) is also simplified to O(lg V) = O(lg E); does this mean that, at a maximum, α(V) may be represented by O(lg V)?

Also, why is the |E| < |V|² → lg E = O(lg V) statement made? How is it known that |E| < |V|²?

I think my question really boils down to: why does a "forest" representation of disjoint sets seem to be more efficient than one implemented with linked lists, when my lecturer states they are both O(E log V)? Is there any point, then, to the increased difficulty of implementing disjoint sets with forests?

Tricorn answered 4/6, 2017 at 14:2 Comment(0)

α(V) = O(lg V) is a common abuse of notation; really we have α(V) ∈ O(lg V) (inverse-Ackermann of V is a member of the set of functions O(lg V)). They're not equal; they're not even the same type: one is a function and the other is a set of functions.

how is it known that that |E| < |V|²?

How many edges does a complete undirected graph have? |V|(|V| - 1)/2, which is less than |V|², and you can't have more than that. You could in a multigraph, but that's not what the algorithm operates on, and it's useless to extend it to multigraphs: just throw out all but the best edge between each pair of nodes.

why is it that a "forest" representation of disjoint sets seems to be more efficient than those implemented with linked lists when my lecturer states they are both O(E log V)?

This is a weird thing to ask for several reasons. First, you're effectively measuring the efficiency of disjoint sets through Kruskal's algorithm, not on its own. The "they" in your question is two implementations of Kruskal's algorithm. Secondly, as you surely realized, the derivation of the upper bound used α(V) ∈ O(lg V), so it deliberately ignores a significant difference. That makes sense, because the time complexity is asymptotically dominated by the sorting step, but just because a difference is invisible in a big O doesn't mean it isn't there.

Therefore is there a point in the increased difficulty of implementing disjoint sets with forests?

There is no increased difficulty, really. It's a super easy data structure that you can write in 5 minutes: just two arrays and some simple code. Linked lists may actually be harder, especially if you have to do manual memory management. Note that outside the context of Kruskal's algorithm, the difference is huge in terms of both asymptotic time and actual time.
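To make the "two arrays and some simple code" concrete, here is a minimal sketch of a disjoint-set forest with union by rank and path compression (names are illustrative, not from the lecture):

```python
class DisjointSet:
    def __init__(self, n):
        # The two arrays: parent pointers and ranks, one slot per element.
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # Path compression (halving variant): every visited node is
        # re-pointed closer to the root as we walk up.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        # Union by rank: attach the shallower tree under the deeper one.
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False  # already in the same set
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```

With both optimizations, a sequence of m operations takes O(m α(n)) time, which is the bound quoted in the lecture.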

But even in the context of Kruskal's algorithm, improving the second stage of the algorithm obviously makes the total time better, even if it doesn't show in the worst-case asymptotic time. FWIW you can improve the first stage too: you can use a heap (or one of its fancier drop-in replacements) and heapify the edges in linear time. The second stage of the algorithm will then extract them one by one but, crucially, you typically don't have to extract every edge. You can keep track of how many disjoint sets are left and stop when it drops to 1, potentially leaving many (even most) edges unused. In the worst case that doesn't help, but in real life it does. And in special cases you can sort the edges faster than O(E log E), when any of the fast sorts (counting sort, bucket sort, etc.) apply.
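The heap-plus-early-stopping variant described above can be sketched as follows (a simplified illustration with my own names; union by rank is omitted for brevity):

```python
import heapq

def kruskal_heap(n, edges):
    """n: number of vertices; edges: list of (weight, u, v) tuples.
    Returns a list of MST edges as (u, v, weight)."""
    parent = list(range(n))

    def find(x):
        # Path compression via halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    heap = list(edges)
    heapq.heapify(heap)          # O(E), cheaper than a full O(E log E) sort
    mst, components = [], n
    while heap and components > 1:   # stop once a single set remains
        w, u, v = heapq.heappop(heap)
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            mst.append((u, v, w))
            components -= 1
    return mst
```

On dense graphs with a cheap spanning tree, most edges are never popped from the heap, which is exactly the practical saving described above.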

Lowboy answered 4/6, 2017 at 14:49 Comment(1)
Cheers for the detailed answer Harold and for stepping through my various questions. 👍 I think I've identified a few flawed thought processes of mine by reading your explanation; I didn't know about the "set of functions" concept till you mentioned it. Thanks for the time! – Tricorn
