How can std::make_heap be implemented while making at most 3N comparisons?
Asked Answered
A

2

29

I looked in to the C++0x standard and found the requirement that make_heap should do no more than 3*N comparisons.

I.e. heapify an unordered collection can be done in O(N)

   /*  @brief  Construct a heap over a range using comparison functor.

Why is this?

The source gives me no clues (g++ 4.4.3)

The while (true) + __parent == 0 are not clues but rather a guess for O(N) behaviour

template<typename _RandomAccessIterator, typename _Compare>
void
make_heap(_RandomAccessIterator __first, _RandomAccessIterator __last,
          _Compare __comp)
{

  const _DistanceType __len = __last - __first;
  _DistanceType __parent = (__len - 2) / 2;
  while (true)
    {
      _ValueType __value = _GLIBCXX_MOVE(*(__first + __parent));
      std::__adjust_heap(__first, __parent, __len, _GLIBCXX_MOVE(__value),
                 __comp);
      if (__parent == 0)
        return;
      __parent--;
    }
}

__adjust_heap looks like a log N method:

while ( __secondChild < (__len - 1) / 2)
{
    __secondChild = 2 * (__secondChild + 1);

Is a bog standard log N to me.

  template<typename _RandomAccessIterator, typename _Distance,
       typename _Tp, typename _Compare>
    void
    __adjust_heap(_RandomAccessIterator __first, _Distance __holeIndex,
          _Distance __len, _Tp __value, _Compare __comp)
    {
      const _Distance __topIndex = __holeIndex;
      _Distance __secondChild = __holeIndex;
      while (__secondChild < (__len - 1) / 2)
      {
        __secondChild = 2 * (__secondChild + 1);
          if (__comp(*(__first + __secondChild),
             *(__first + (__secondChild - 1))))
          __secondChild--;
          *(__first + __holeIndex) = _GLIBCXX_MOVE(*(__first + __secondChild));
          __holeIndex = __secondChild;
      }
      if ((__len & 1) == 0 && __secondChild == (__len - 2) / 2)
      {
        __secondChild = 2 * (__secondChild + 1);
        *(__first + __holeIndex) = _GLIBCXX_MOVE(*(__first
                             + (__secondChild - 1)));
        __holeIndex = __secondChild - 1;
      }
      std::__push_heap(__first, __holeIndex, __topIndex, 
               _GLIBCXX_MOVE(__value), __comp);      
      }

Any clues to why this is O <= 3N will be appreciated.
EDIT:

Experimental results:

This actual implementation uses

  • <2N comparisons for heapifying heaps
  • <1.5N for heapifying heaps in the reverse order.
Afterbirth answered 9/6, 2011 at 22:3 Comment(7)
Indicate and explain why the source you posted supports your conjecture. I'm not saying that it doesn't, just that I can't be bothered to wade through it.Add
You might want to look at the explanation of why heapify is O(n) on Wikipedia.Detour
@Detour O(n) isn't mentioned in that article, have I missed something. < very likely.Proboscis
@truth Where does the factor 3 come from?Afterbirth
@Mike: Itis in the "Building a Heap" section.Detour
@Detour ahh the big picture of the equation.... oh dear. +1 for not taking the p*ssProboscis
@Captain: I think that the three comes from the fact that at each step you compare a node to its children and its parent (2+1).Detour
W
53

A binary heap over n elements can be created in O(n) time using a clever algorithm and a clever analysis. In what follows I'm just going to talk about how this works assuming that you have explicit nodes and explicit left and right child pointers, but this analysis is still perfectly valid once you compress it into an array.

The algorithm works as follows. Start off by taking about half of the nodes and treating them as singleton max-heaps - since there's only one element, the tree containing just that element must automatically be a max-heap. Now, take these trees and pair them off with one another. For each pair of trees, take one of the values that you haven't used yet and execute the following algorithm:

  1. Make the new node the root of the heap, having its left and right child pointers refer to the two max-heaps.

  2. While this node has a child that's larger than it, swap the child with its larger child.

My claim is that this procedure ends up producing a new max heap containing the elements of the two input max-heaps, and it does so in time O(h), where h is the height of the two heaps. The proof is an induction on the height of the heaps. As a base case, if the subheaps have size zero, then the algorithm terminates immediately with a singleton max-heap, and it does so in O(1) time. For the inductive step, assume that for some h, this procedure works on any subheaps of size h and consider what happens when you execute it on two heaps of size h + 1. When we add a new root to join together two subtrees of size h + 1, there are three possibilities:

  1. The new root is larger than the roots of both subtrees. Then in this case we have a new max-heap, since the root is larger than any of the nodes in either subtree (by transitivity)

  2. The new root is larger than one child and smaller than the other. Then we swap the root with the larger subchild and recursively execute this procedure again, using the old root and the child's two subtrees, each of which are of height h. By the inductive hypothesis, this means that the subtree we swapped into is now a max-heap. Thus the overall heap is a max-heap, since the new root is larger than everything in the subtree we swapped with (since it's larger than the node we added and was already larger than everything in that subtree), and it's also larger than everything in the other subtree (since it's larger than the root and the root was larger than everything in the other subtree).

  3. The new root is smaller than both its children. Then using a slightly modified version of the above analysis, we can show that the resulting tree is indeed a heap.

Moreover, since at each step the heights of the child heaps decreases by one, the overall runtime for this algorithm must be O(h).


At this point, we have a simple algorithm for making a heap:

  1. Take about half the nodes and create singleton heaps. (You can compute explicitly how many nodes will be needed here, but it's about half).
  2. Pair those heaps off, then merge them together by using one of the unused nodes and the above procedure.
  3. Repeat step 2 until a single heap remains.

Since at each step we know that the heaps we have so far are valid max-heaps, eventually this produces a valid overall max-heap. If we're clever with how we pick how many singleton heaps to make, this will end up creating a complete binary tree as well.

However, it seems like this should run in O(n lg n) time, since we do O(n) merges, each of which runs in O(h), and in the worst case the height of the trees we're merging is O(lg n). But this bound is not tight and we can do a lot better by being more precise with the analysis.

In particular, let's think about how deep all the trees we merge are. About half the heaps have depth zero, then half of what's left has depth one, then half of what's left has depth two, etc. If we sum this up, we get the sum

0 * n/2 + 1 * n/4 + 2 * n/8 + ... + nk/(2k) = Σk = 0⌈log n⌉ (nk / 2k) = n Σk = 0⌈log n⌉ (k / 2k+1)

This upper-bounds the number of swaps made. Each swap requires at most two comparisons. Therefore, if we multiply the above sum by two, we get the following summation, which upper-bounds the number of swaps made:

n Σk = 0 (k / 2k)

The summation here is the summation 0 / 20 + 1 / 21 + 2 / 22 + 3 / 23 + ... . This is a famous summation that can be evaluated in multiple different ways. One way to evaluate this is given in these lecture slides, slides 45-47. It ends up coming out to exactly 2n, which means that the number of comparisons that end up getting made is certainly bounded from above by 3n.

Hope this helps!

Wallboard answered 9/6, 2011 at 22:26 Comment(13)
There is also a proof in chapter 6 of CLRS, 2nd edition.Friable
Excellent! But why "If we're clever with how we pick how many singleton heaps to make, this will end up creating a complete binary tree as well." and Why the factor 3?Afterbirth
@Captain Giraffe: The proof in CLRS shows that __adjust_heap is O(h), where h is the height of the node in the tree. It then shows that due to the distribution of the heights of the nodes this works out to O(n) total. Now, the version in the book uses 2 comparisons per iteration (to find the largest of parent, left child and right child), so you could use the same argument to say that if __adjust_heap requires at most 2h comparisons, make_heap should require at most 2n. I'm guessing the 3 is there to give implementers some slack although it seems only 2 is needed.Friable
@Friable It's probably the first time c++ std docs would give implementors "some slack", but I have been on the same train of thought.Afterbirth
@Captain Giraffe: On further investigation it appears that a factor of 2 is indeed not quite enough. I've hacked together a calculation of the worst case number of comparisons. It starts giving results larger than 2n at n = 16.Friable
@Captain Giraffe: Actually, there was a bug in my code. I was counting comparisons against the left and right children even if they were not present. The corrected version stays below 2n.Friable
@Friable This seems like a "perfect Log2(N) = int" tree, is the max 3N for ceil( log 2(N) )?Afterbirth
@Captain Giraffe: My code gives less than 2n for any n I've tried. These lecture notes I stumbled upon while searching around contain a proof that the number of comparisons needed are 2(n-⌈log n⌉).Friable
@Friable Those notes shows why I get my measurements. If you would be kind enough to supply that as an answer I'd be happy to mark this as solved.Afterbirth
Can someone elaborate on the step from summing k/2^k to summing sqrt(2)^k/2^k+O(1)? That's the only step I don't really understand.Titular
@Mason- The idea is that for sufficiently large k, k < sqrt(2)^k, because k grows linearly and sqrt(2)^k grows exponentially. The "for sufficiently large k" here means that you can't just say that k < sqrt(2)^k for all k, and for some smaller terms it might be true that k > sqrt(2)^k. Since there are only finitely many smaller terms, the "+ O(1)" term here ensures that we absorb the times that k > sqrt(2)^k. Does that make sense?Wallboard
Maybe I'm missing something, but that sum accounts for the number of swaps that we need to do, however for each swap we need to check which of the children (left or right) is larger, which introduces a factor of 2x the number of comparisons. I don't see that taken into account anywhere.Medico
You're right that I'm off by a factor of two. At the same time, the summation here is slightly off and can be reduced by a factor of two. I'm going to update this answer later today to correct for this. Thanks for spotting that!Wallboard
F
17

@templatetypedef has already given a good answer for why the asymptotic run time of build_heap is O(n). There is also a proof in chapter 6 of CLRS, 2nd edition.

As for why the C++ standard requires that at most 3n comparisons are used:

From my experiments (see code below), it appears that actually less than 2n comparisons are needed. In fact, these lecture notes contain a proof that build_heap only uses 2(n-⌈log n⌉) comparisons.

The bound from the standard seems to be more generous than required.


def parent(i):
    return i/2

def left(i):
    return 2*i

def right(i):
    return 2*i+1

def heapify_cost(n, i):
    most = 0
    if left(i) <= n:
        most = 1 + heapify_cost(n, left(i))
    if right(i) <= n:
        most = 1 + max(most, heapify_cost(n, right(i)))
    return most

def build_heap_cost(n):
    return sum(heapify_cost(n, i) for i in xrange(n/2, 1, -1))

Some results:

n                     10  20  50  100  1000  10000
build_heap_cost(n)     9  26  83  180  1967  19960
Friable answered 11/6, 2011 at 0:5 Comment(2)
It is great to see Python used in unexpected places. Thanks for the example code!Talesman
Thanks for posting the lecture notes. Page 7 shows a very concise proof.Medico

© 2022 - 2025 — McMap. All rights reserved.