Find all keys smaller than x in an array min-heap
Asked Answered
N

3

6

Can some one describe an algorithm that finds all keys smaller than x in an array implementation of a min-heap. I want the running time to be at least O(k) where k is the number of keys reported.

I've been scratching my head for a while now with this.

Nostology answered 20/10, 2010 at 17:49 Comment(0)
V
4

There is a simple recursive algorithm for a tree min-heap:

void smaller_than(Node *node, int x)
{
    if (node->value >= x)
    {
        /* Skip this node and its descendants, as they are all >= x . */
        return;
    }

    printf("%d\n", node->value);

    if (node->left != NULL)
        smaller_than(node->left, x);
    if (node->right != NULL)
        smaller_than(node->right, x);
}

If a subtree's root has a value that is greater than or equal to x, then, by definition of a min-heap, all of its descendants will also have values greater than or equal to x. The algorithm need not explore deeper than the items its traversing, hence it is O(k).

Of course, it's a trivial matter translating this to an array algorithm:

#define ROOT       0
#define LEFT(pos)  ((pos)*2 + 1)
#define RIGHT(pos) ((pos)*2 + 2)

void smaller_than(int x, int pos, int heap[], int count)
{
    /* Make sure item exists */
    if (pos >= count)
        return;

    if (heap[pos] >= x)
    {
        /* Skip this node and its descendants, as they are all >= x . */
        return;
    }

    printf("%d\n", heap[pos]);

    smaller_than(x, LEFT(pos), heap, count);
    smaller_than(x, RIGHT(pos), heap, count);
}
Valuer answered 20/10, 2010 at 18:18 Comment(2)
is the pos variable necessary? and is count the length of the array?Nostology
pos is used to refer to the current "node" and is also needed in the calculations of the left and right "nodes". And yes, count is the length of the array. Of course, the array must be in heap form for this algorithm to work.Valuer
R
4

To get a running time of 'at least' something is not so difficult, I assume you mean 'at most'.

Unfortunately, a min-heap is not very good at finding anything else than the smallest value.

You can do a breadth-first depth-first scan of the tree-view of your heap, and terminate every branch where you have reached X. This will be O(k), but complicated.

To find all Y where Y <= X you might as well scan the entire array, this will be O(n) but with much less overhead.

The choice depends on the ratio n/k

Ruthy answered 20/10, 2010 at 17:59 Comment(0)
V
4

There is a simple recursive algorithm for a tree min-heap:

void smaller_than(Node *node, int x)
{
    if (node->value >= x)
    {
        /* Skip this node and its descendants, as they are all >= x . */
        return;
    }

    printf("%d\n", node->value);

    if (node->left != NULL)
        smaller_than(node->left, x);
    if (node->right != NULL)
        smaller_than(node->right, x);
}

If a subtree's root has a value that is greater than or equal to x, then, by definition of a min-heap, all of its descendants will also have values greater than or equal to x. The algorithm need not explore deeper than the items its traversing, hence it is O(k).

Of course, it's a trivial matter translating this to an array algorithm:

#define ROOT       0
#define LEFT(pos)  ((pos)*2 + 1)
#define RIGHT(pos) ((pos)*2 + 2)

void smaller_than(int x, int pos, int heap[], int count)
{
    /* Make sure item exists */
    if (pos >= count)
        return;

    if (heap[pos] >= x)
    {
        /* Skip this node and its descendants, as they are all >= x . */
        return;
    }

    printf("%d\n", heap[pos]);

    smaller_than(x, LEFT(pos), heap, count);
    smaller_than(x, RIGHT(pos), heap, count);
}
Valuer answered 20/10, 2010 at 18:18 Comment(2)
is the pos variable necessary? and is count the length of the array?Nostology
pos is used to refer to the current "node" and is also needed in the calculations of the left and right "nodes". And yes, count is the length of the array. Of course, the array must be in heap form for this algorithm to work.Valuer
L
1

The implementation as array is irrelevant, you can still do a top-down tree-search. Just instead of using the "classic" pointers, you need to calculate the respective child-node indexes.

With that out of the way do a recursive search from the top and stop recursing on each branch where the current node is greater than x. That will in general prune away a lot of values you don't need to check.

With k return values O(k) is obviously your best case. If your top node is <= x, you start getting results right away. If its greater, you are done - the result is empty.

From there you get results all the way down the subtree until you hit branches with values > x. You need to do at most 2*k Checks to prune those branches, so it looks like O(k) in total to me.

Leanaleanard answered 20/10, 2010 at 18:18 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.