Removing an element from a known heap array position has O(log n)
complexity (which is optimal for a heap). Thus, this operation has the same complexity as extracting (i.e. removing) the root element.
The basic steps for removing the i-th element (where 0<=i<n
) from heap A (with n
elements) are:
- swap element
A[i]
with element A[n-1]
- set
n=n-1
- possibly fix the heap such that the heap-property is satisfied for all elements
Which is pretty similar to how the extraction of the root element works.
Remember that the heap-property is defined in a max-heap as:
A[parent(i)] >= A[i], for 0 < i < n
Whereas in a min-heap it's:
A[parent(i)] <= A[i], for 0 < i < n
In the following we assume a max-heap to simplify the description. But everything works analogously with a min-heap.
After the swap we have to distinguish 3 cases:
- new key in
A[i]
equals the old key - nothing changes, done
- new key in
A[i]
is greater than the old key. Nothing changes for the sub-trees l
and r
of i
. If previously A[parent(i)] >= A[j]
was true then now A[parent(i)]+c >= A[j]
must be true as well (for j in (l, r)
and c>=0
). But the ancestors of element i
might need fixing. This fix-up procedure is basically the same as when increasing A[i]
.
- new key in
A[i]
is smaller than the old key. Nothing changes for the ancestors of element i, because if the previous value already satisfied the heap property, a smaller value values does it as well. But the sub-trees might now need fixing, i.e. in the same way as when extracting the maximum element (i.e. the root).
An example implementation:
void heap_remove(A, i, &n)
{
assert(i < n);
assert(is_heap(A, i));
--n;
if (i == n)
return;
bool is_gt = A[n] > A[i];
A[i] = A[n];
if (is_gt)
heapify_up(A, i);
else
heapify(A, i, n);
}
Where heapifiy_up()
basically is the textbook increase()
function - modulo writing the key:
void heapify_up(A, i)
{
while (i > 0) {
j = parent(i);
if (A[i] > A[j]) {
swap(A, i, j);
i = j;
} else {
break;
}
}
}
And heapify()
is the text-book sift-down function:
void heapify(A, i, n)
{
for (;;) {
l = left(i);
r = right(i);
maxi = i;
if (l < n && A[l] > A[i])
maxi = l;
if (r < n && A[r] > A[i])
maxi = r;
if (maxi == i)
break;
swap(A, i, maxi);
i = maxi;
}
}
Since the heap is an (almost) complete binary tree, its height is in O(log n)
. Both heapify functions have to visit all tree levels, in the worst case, thus the removal by index is in O(log n)
.
Note that finding the element with a certain key in a heap is in O(n)
. Thus, removal by key value is in O(n)
because of the find complexity, in general.
So how can we keep track of the array position of an element we've inserted? After all, further inserts/removals might move it around.
We can keep track by also storing a pointer to an element record next to the key, on the heap, for each element. The element record then contains a field with the current position - which thus has to be maintained by modified heap-insert and heap-swap functions. If we retain the pointer to the element record after insert, we can get the element's current position in the heap in constant time. Thus, in that way, we can also implement element removal in O(log n)
.