How can I implement decrease-key functionality in Python's heapq?
Asked Answered
W

7

47

I know it is possible to realize decrease-key functionality in O(log n) but I don't know how?

Weatherproof answered 23/9, 2009 at 12:23 Comment(1)
See #9256120 and my answer on why it's not necessaryPelops
B
47

To implement "decrease-key" effectively, you'd need to access the functionality "decrement this element AND swap this element with a child until heap condition is restore". In heapq.py, that's called _siftdown (and similarly _siftup for INcrementing). So the good news is that the functions are there... the bad news is that their names start with an underscore, indicating they're considered "internal implementation details" and should not be accessed directly by application code (the next release of the standard library might change things around and break code using such "internals").

It's up to you to decide whether you want to ignore the warning leading-_, use O(N) heapify instead of O(log N) sifting, or reimplement some or all of heapq's functionality to make the sifting primitives "exposed as public parts of the interface". Since heapq's data structure is documented and public (just a list), I think the best choice is probably a partial-reimplementation -- copy the sifting functions from heapq.py into your application code, essentially.

Bustup answered 23/9, 2009 at 14:41 Comment(3)
The link to heapq.py seems to be stale. For convenience here is another link to the python implementation: hg.python.org/cpython/file/2.7/Lib/heapq.pyVoice
do you mean "swap this element with its parent until heap condition is restored"? (i assumed if there were elements, [2, 3, 5], then 2 would be the parent, and 3 and 5 would be its two children)Autobus
It should be noted that even if you can implement "decrease-key" or more generically "update-key", that functionality presumes that you have a way to track indices on the heap, so that you can pinpoint which item you want to operate on (otherwise you might have to search for it in linear time). The first obvious solution would be to augment your heap structure with a key-to-index hashmap. From then, heap changing operations (such as _siftup and _siftdown) should trigger an update of the map.Mccary
S
14

Decrease-key is a must-have operation for many algorithms (Dijkstra's Algorithm, A*, OPTICS), i wonder why Python's built-in priority queue does not support it.

Unfortunately, i wasn't able to download math4tots's package.

But, i was able to find this implementation by Daniel Stutzbach. Worked perfectly for me with Python 3.5.

hd = heapdict()
hd[obj1] = priority
hd[obj1] = lower_priority
# ...
obj = hd.pop()
Sanson answered 3/5, 2016 at 15:12 Comment(1)
It's not a must-have as there are workarounds #46996564Pelops
B
7

The heapq documentation has an entry on exactly how to do this.

However, I have written a heap package that does exactly this (it is a wrapper around heapq). So if you have pip or easy_install you could do something like

pip install heap

Then in your code write

from heap.heap import heap

h = heap()

h['hello'] = 4 # Insert item with priority 4.

h['hello'] = 2 # Update priority/decrease-key has same syntax as insert. 

It is pretty new though, so might be full of bugs.

Bogosian answered 17/5, 2014 at 3:24 Comment(1)
Admittedly, the solution mentioned in the docs is pretty hacky. It suggest that when changing a value in the heap, one keeps the old one and sets a certain marker on it to indicate that it is invalid, while at the same time inserting the new element. When you change values frequently (without any pop in between) your heap will just grow and grow, containing lots of "dead" junk. Then there is the overhead of the "invalid" markers themselves: When storing e.g. just integers, these "invalid" markers are a significant overhead. Not a great solution.Tother
A
6

Imagine you are using a heap as a priority queue, where you have a bunch of tasks represented by strings and each task has a key. For concreteness, look at: task_list = [[7,"do laundry"], [3, "clean room"], [6, "call parents"]] where each task in task_list is a list with a priority and description. If you run heapq.heapify(task_list), you get your array to maintain the heap invariant. However, if you want to change the priority of "do laundry" to 1, you have no way of knowing where "do laundry" is in the heap without a linear scan through the heap (hence can't do decrease_key in logarithmic time). Note decrease_key(heap, i, new_key) requires you to know the index of the value to change in the heap.

Even if you maintain a reference to each sublist and actually change the key, you still can't do it in log time. Since a list is just a reference to a bunch of mutable objects, you could try doing something like remember the original order of the task: (in this case that "do laundry" was the 0th task in your original task_list):

task_list = [[7, "do laundry"], [3, "clean room"], [6, "call parents"]]
task_list_heap = task_list[:] # make a non-deep copy
heapq.heapify(task_list_heap)
# at this point:
# task_list = [[7, 'do laundry'], [3, 'clean room'], [6, 'call parents']]
# task_list_heap = [3, 'clean room'], [7, 'do laundry'], [6, 'call parents']]
# Change key of first item of task_list (which was "do laundry") from 7 to 1.
task_list[0][0] = 1
# Now:
# task_list = [[1, 'do laundry'], [3, 'clean room'], [6, 'call parents']]
# task_list_heap = [3, 'clean room'], [1, 'do laundry'], [6, 'call parents']]
# task_list_heap violates heap invariant at the moment

However, you now need to call heapq._siftdown(task_list_heap, 1) to maintain the heap invariant in log time (heapq.heapify is linear time), but unfortunately we don't know the index of "do laundry" in task_list_heap (the heap_index in this case is 1).

So we need to implement our heap keeps track of the heap_index of each object; e.g., have an list (for the heap) and a dict mapping each object to its index in the heap/list (that gets updated as the heap positions get swapped adding a constant factor to each swap). You could read through heapq.py and implement yourself as the procedure is straightforward; however, others have implement this sort of HeapDict already.

Angloirish answered 10/12, 2012 at 15:47 Comment(0)
M
4

It might be unnecessary to have the decrease_key function (albeit it's nice to have it).

You can just push your (priority, item) into the priority queue anyway, and use a set to check whether you have seen it. For example:

pq = []  # heapq is a min heap
seen = set()
heappush(pq, (2, "item1"))
heappush(pq, (3, "item2"))
heappush(pq, (1, "item3"))
heappush(pq, (4, "item4"))
heappush(pq, (2, "item2"))

while pq:
    p, item = heappop(pq)
    if item not in seen:
        seen.add(item)
        print(item, p)
    else:
        print(item, "is already handled with a higher priority!")

The output is:

item3 1
item1 2
item2 2
item2 is already handled with a higher priority!
item4 4
Microtome answered 3/2, 2021 at 9:46 Comment(2)
Dijkstra can also be implemented without decrease key (gist.github.com/kachayev/5990802), it's nice to have, but not necessary. Additionally, decrease key introduces extra time complexity and saves some space, it's more or less a trade-offMicrotome
Your "not a solution not a good idea" assumption is not valid at all. As another answer has pointed out, not having decrease key does not only impact runtime in theory and in practice but also works better ("our results show that using a standard priority queue without the decrease-key operation results in better performance than using one with the decrease-key operation in most cases" in www3.cs.stonybrook.edu/%7Erezaul/papers/TR-07-54.pdf)Microtome
P
3

This functionality is also missing from C++ and Java standard library priority queues. The standard workaround is to push a new key-value pair and either implicitly or explicitly mark the original key-value pair as invalid.

See How to update elements within a heap? (priority queue) and Why does Dijkstra's algorithm use decrease-key? (the conclusion there is that not having decrease-key won't significantly impact runtime in theory and in practice; in particular see the paper https://www3.cs.stonybrook.edu/~rezaul/papers/TR-07-54.pdf)

Pelops answered 7/10, 2021 at 5:13 Comment(0)
O
-1

Priority Queue Simple Implementation with Min heap with unique keys. This implementation updates the priority of the key during push() operation if key already exists in priority queue.

import heapq

class HeapPQ:
    """
    Only hashable key type is supported
    """

    def __init__(self):
        self.pq = []
        self.pq_set = set()
    
    def push(self, priority, key):
        if key not in self.pq_set:
            heapq.heappush(self.pq, (priority, key))
            self.pq_set.add(key)
        else:
            index = list(map(lambda x:x[1], self.pq)).index(key)
            self.pq[index] = (priority, key)
            heapq.heapify(self.pq)
    
    def pop(self):
        priority, key = heapq.heappop(self.pq)
        self.pq_set.remove(key)
        return priority, key
    
    def empty(self) -> bool:
        return len(self.pq) == 0

Example Usage:

pq = HeapPQ()

pq.push(5, "A")
pq.push(3, "B")
pq.push(1, "A")

while not pq.empty():
    print(pq.pop())
Odie answered 10/5, 2022 at 3:16 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.