I'm trying to implement an algorithm to solve the skyline problem that involves removing specific elements from the middle of a max heap. The way I currently do it is maxheap.remove(index)
but I have to follow up with a heapify(maxheap)
otherwise the order is thrown off. I know in java you can use something like a treemap
to do that. Is there anyway to do that in python more efficiently than calling two separate methods each of which takes O(n) time?
Removing an arbitrary item from a heap is an O(log n) operation, provided you know where the item is in the heap. The algorithm is:
Move the last item in the heap to the position that contains the item to remove.
Decrement heap count.
If the item is smaller than its parent
bubble it up the heap
else
sift it down the heap
The primary problem is finding the item's position in the heap. As you've noted, doing so is an O(n) operation unless you maintain more information.
An efficient solution to this is to create a dictionary that contains the item key, and the value is the index of that item in the heap. You have to maintain the dictionary, however:
- When you insert an item into the heap, add a dictionary entry
- When you remove an item from the heap, remove the dictionary entry
- Whenever you change the position of an item in the heap, update that item's value in the dictionary.
With that dictionary in place, you have O(1) access to an item's position in the heap, and you can remove it in O(log n).
I would have have each element be a data structure with a flag for whether to ignore it. When you heappop, you'll just pop again if it is an element that got flagged. This is very easy, obvious, and involves knowing nothing about how the heap works internally. For example you don't need to know where the element actually is in the heap to flag it.
The downside of this approach is that the flagged elements will tend to accumulate over time. Occasionally you can just filter them out then heapify.
If this solution is not sufficient for your needs, you should look for a btree implementation in Python of some sort. That will behave like the treemap that you are used to in Java.
Yes, there is a more efficient way - if you have its index or pointer (depending on implementation method).
Replace the number on the index/pointer you need to remove with its largest child and repeat the process recursively (replace the child with its largest child,etc...) until you get to a node that has no children, which you remove easily.
The complexity of this algorithm is O(log n).
This is an old question, but it feels incomplete without code. I have implemented a happypath heap with the ability to remove an element based on its value.
Hope this will be useful to someone who looks here later.
from collections import defaultdict
class HeapSet:
def __init__(self):
self.heap = []
self.positions = defaultdict(set)
def _swap(self, i: int, j: int):
obj = self.heap[i]
swap_with_obj = self.heap[j]
if obj == swap_with_obj:
return
self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
self.positions[obj].add(j)
self.positions[obj].remove(i)
self.positions[swap_with_obj].add(i)
self.positions[swap_with_obj].remove(j)
def _sink(self, i: int):
while i < self.size():
swap_with = i
if i * 2 + 1 < self.size() and self.heap[swap_with] > self.heap[i * 2 + 1]:
swap_with = i * 2 + 1
if i * 2 + 2 < self.size() and self.heap[swap_with] > self.heap[i * 2 + 2]:
swap_with = i * 2 + 2
if swap_with == i:
break
else:
self._swap(i, swap_with)
i = swap_with
def _swim(self, i: int):
while i > 0:
swap_with = i
if (i - 1) // 2 >= 0 and self.heap[swap_with] < self.heap[(i - 1) // 2]:
swap_with = (i - 1) // 2
if swap_with == i:
break
else:
self._swap(i, swap_with)
i = swap_with
def add(self, obj):
self.heap.append(obj)
self.positions[obj].add(self.size() - 1)
self._swim(self.size() - 1)
def remove(self, obj):
pos = next(iter(self.positions[obj]))
self._swap(pos, self.size() - 1)
self.positions[obj].remove(self.size() - 1)
self.heap.pop()
if pos != self.size():
self._sink(pos)
self._swim(pos)
def get_top(self):
if not self.heap:
return None
return self.heap[0]
def size(self):
return len(self.heap)
© 2022 - 2024 — McMap. All rights reserved.
remove
operator (that denotes explicit access to heap elements), it might also containchange priority
one. – Relationship