The formula is not magic, the AVL tree was specifically designed for the purpose of having logarithmic height, search, insertion and deletion.
The minimum height is the best case for AVL the tree, when it's densely packed. In particular, a perfect and complete AVL tree has all balance factors of all nodes equal to zero and it has a simple exact formula for the height: h = lg(N+1). This can be proved by finite induction, or more simply, by using the formula of the finite sum of a geometric progression, because each level in the tree has twice the number of nodes of the level above, forming a geometric progression of factor 2: 1, 2, 4, 8, 16, ...
1 + 2 + 4 + 8 + ... = ∑ a.qⁱ = a₁ (1 - qⁿ) / (1 - q)
In the equation, a₁ = 1, q = 2 and n is the tree height, h, so:
N = 1.(1 - 2ⁿ)/(1 - 2) = 2ⁿ - 1
h = lg(N+1) ∎
As the AVL tree starts to become less dense, the absolute value of the balance factor starts increasing to 1 in some nodes. Due to the "holes" in the node allocation, the height will start increasing relative to a dense tree with the same number of elements. In the worst case, all internal nodes have a balance factor different from zero and the tree height reaches its maximum when this happens. It's possible to recursively construct trees from smaller trees, all having internal nodes with balance factor equal to -1 and each time a new tree is constructed, the height increases by 1. We start with trees with height 0 and 1, T0 and T1. For tree T_h, we build it by placing T_h-1 as the left child of the root and T_h-2 as the right child of the root. All such trees T_h have their internal nodes balance factor equal to -1, which is the worst case. Such trees are called Fibonacci trees and the number of elements in them T(h) is given by the following series:
T(0) = 0, T(1) = 1
T(h) = T(h-1) + T(h-2) + 1 (#elements left tree + #elements right tree + root)
This series are very similar to the Fibonacci sequence, but the difference is the additional element +1. In can be shown by finite induction that T(h) = F(h+2) - 1, where F(n) is the nth Fibonacci number. This gives a closed formula for the number of elements in a Fibonacci tree of height h:
N = T(h) = (ϕⁿ⁺² - ψⁿ⁺²) / √5 - 1
Where ϕ is the golden ratio and ψ its conjugate. This formula already shows an exponential relation between the number of elements in the tree N and its height, but it's not possible to rewrite it in terms of the height h. However, -ψⁿ⁺² / √5 is a small number that tends to zero when h tends to infinity. We can replace this term by a constant ε that is assumed to be the largest value that this term can assume:
N = T(h) = ϕⁿ⁺² / √5 + ε - 1
ϕⁿ⁺² = √5.(N + 1 - ε)
h+2 = log_ϕ (√5.(N + 1 - ε))
h = log_ϕ (√5.(N + 1 - ε)) - 2
The constants can be rearranged and simplified to give:
h = log_ϕ (N + a) + b
Where a = 1 - ε and b = log_ϕ (√5) - 2. Replacing ε with the maximum possible constant, the upper limit for the height h can be found:
h ≤ log_ϕ (N + a) + b
Where a = ψ² / √5 + 1 and b = log_ϕ (√5) - 2. If desired, the logarithm in base ϕ can be transformed to base 2 by adding a constant:
h ≤ c.lg (N + a) + b ∎
Where a = ψ² / √5 + 1 and b = log_ϕ (√5) - 2 and c = 1/lg ϕ. c is approximatelly equal to 1,44042, which means that, compared to the formula for the minimum height, the worst case height of the AVL tree is around 44% bigger.