How can I use a binary heap in Dijkstra's algorithm?
I am writing code for Dijkstra's algorithm. For the part where we are supposed to find the node with the minimum distance from the node currently being processed, I am using an array and traversing it fully to find that node.

This part can be replaced by a binary heap, which lets us read off the minimum node in O(1) time. But we also update the distances of nodes in later iterations; how will I incorporate that into the heap?

In the case of an array, all I have to do is go to the (i-1)th index and update the value of that node, but the same thing can't be done in a binary heap: I would have to search the whole heap to figure out the position of the node and then update it.

What is a workaround for this problem?

Emissive answered 10/1, 2013 at 7:11 Comment(1)
possible duplicate of Does a binary heap support the decrease-key operation? – Shackleford

This is just some information I found while doing this in a class, that I shared with my classmates. I thought I'd make it easier for folks to find it, and I had left this post up so that I could answer it when I found a solution.

Note: I'm assuming for this example that your graph's vertices have a unique ID to keep track of which is which. This could be a name, a number, whatever, just make sure you change the type in the struct below. If you have no such means of distinction, then you can use pointers to the vertices and compare their pointed-to addresses.

The problem you are faced with here is the fact that, in Dijkstra's algorithm, we are asked to store the graph's vertices and their keys in this priority queue, then update the keys of the ones left in the queue. But... heap data structures have no way of getting at any particular node that is not the minimum or the last node!
The best we'd be able to do is traverse the heap in O(n) time to find it, then update its key and bubble-it-up, at O(Logn). That makes updating all vertices O(n) for every single edge, making our implementation of Dijkstra O(mn), way worse than the optimal O(mLogn).

Bleh! There has to be a better way!

So, what we need to implement isn't exactly a standard min-heap-based priority queue. We need one more operation than the standard 4 pq operations:

  1. IsEmpty
  2. Add
  3. PopMin
  4. PeekMin
  5. and DecreaseKey

In order to DecreaseKey, we need to:

  • find a particular vertex inside the Heap
  • lower its key-value
  • "heap-up" or "bubble-up" the vertex

Essentially, since you were (I'm assuming it has been implemented sometime in the past 4 months) probably going to use an "array-based" heap implementation, this means that we need the heap to keep track of each vertex and its index in the array in order for this operation to be possible.

Devising a struct like: (c++)

struct VertLocInHeap
{
    int vertex_id;
    int index_in_heap;
};

would allow you to keep track of it, but storing those in an array would still give you O(n) time for finding the vertex in the heap. No complexity improvement, and it's more complicated than before. >.<
My suggestion (if optimization is the goal here):

  1. Store this info in a Binary Search Tree whose key value is the `vertex_id`
  2. do a binary-search to find the vertex's location in the Heap in O(Logn)
  3. use the index to access the vertex and update its key in O(1)
  4. bubble-up the vertex in O(Logn)

I actually used a std::map declared as std::map<int, int> m_locations; in the heap instead of using the struct. The first template parameter (Key) is the vertex_id, and the second (Value) is the index in the heap's array. Since std::map guarantees O(Logn) searches, this works nicely out-of-the-box. Then, whenever you insert or bubble, you just do m_locations[vertexID] = newLocationInHeap;
Easy money.
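A minimal sketch of that scheme in C++, assuming the heap is an array of (key, vertex_id) pairs and m_locations maps vertex_id to its index in that array. The function names bubbleUp and decreaseKey are mine, not from the original implementation:

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>
using namespace std;

// Restore the min-heap property upward from index i, keeping m_locations
// (vertex_id -> array index) in sync after every swap.
void bubbleUp(vector<pair<int, int>>& heap, map<int, size_t>& m_locations, size_t i)
{
    while (i > 0)
    {
        size_t parent = (i - 1) / 2;
        if (heap[parent].first <= heap[i].first)
            break;
        swap(heap[i], heap[parent]);
        m_locations[heap[i].second] = i;           // O(log n) map update per swap
        m_locations[heap[parent].second] = parent;
        i = parent;
    }
}

void decreaseKey(vector<pair<int, int>>& heap, map<int, size_t>& m_locations,
                 int vertex_id, int newKey)
{
    size_t i = m_locations[vertex_id]; // O(log n) lookup instead of an O(n) scan
    heap[i].first = newKey;
    bubbleUp(heap, m_locations, i);
}
```

Each swap inside the bubble-up pays an extra O(log n) map update, which is where the O(log^2 n) per decrease-key in the analysis below comes from.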

Analysis:
Upside: we now have O(Logn) for finding any given vertex in the p-q. For the bubble-up we do O(Log(n)) movements, and each swap does an O(Log(n)) search in the map of array indexes, resulting in an O(Log^2(n)) operation for bubble-up.
So, we have a Log(n) + Log^2(n) = O(Log^2(n)) operation for updating the key values in the Heap for a single edge. That makes our Dijkstra alg take O(mLog^2(n)). That's pretty close to the theoretical optimum, at least as close as I can get it. Awesome Possum!
Downside: We are storing literally twice as much information in-memory for the heap. Is it a "modern" problem? Not really; my desktop can store over 8 billion integers, and many modern computers come with at least 8GB of RAM; however, it is still a factor. If you did this implementation with a graph of 4 billion vertices, which can happen a lot more often than you'd think, then it causes a problem. Also, all those extra reads/writes, which may not affect the complexity in analysis, may still take time on some machines, especially if the information is being stored externally.

I hope this helps someone in the future, because I had a devil of a time finding all this information, then piecing the bits I got from here, there, and everywhere together to form this. I'm blaming the internet and lack of sleep.

Rover answered 14/4, 2013 at 16:45 Comment(4)
Actually, the time-analysis is wrong. I found this out a few days later and haven't been back. It actually ends up being a total of O(log^2(n)), because the bubble-up function also uses the O(log(n)) search to update the index in the std::map as it's performing O(log(n)) operations. That's a O(log(n)) operation, O(log(n)) times = O(log^2(n)). That's my bad, and I'll eventually edit the actual answer to reflect this... when I've had a few fewer martinis. – Rover
Just noting that I fixed the aforementioned time-analysis mistake in the actual body of the answer. Hopefully that helps. – Rover
One huge thing you forgot to mention is that if you use a hash table, you can no longer store duplicate elements inside the heap, due to the fact that keys in the hash table must be unique. – Pleven
@Pleven I suppose I did fail to mention in my top note there that the ID I assume you have is unique, didn't I? Thanks! I'll edit that in momentarily. – Rover

The problem I ran into with using any form of heap is that you need to reorder the nodes in the heap. To do that, you would have to keep popping everything off the heap until you found the node you need, then change its weight and push it back in (along with everything else you popped). Honestly, just using an array would probably be more efficient and easier to code than that.

The way I got around this was I used a Red-Black tree (in C++ it's just the set<> data type of the STL). The data structure contained a pair<> element which had a double (cost) and string (node). Because of the tree structure, it is very efficient to access the minimum element (I believe C++ makes it even more efficient by maintaining a pointer to the minimum element).

Along with the tree, I also kept an array of doubles that contained the distance for a given node. So, when I needed to reorder a node in the tree, I simply used the old distance from the dist array along with the node name to find it in the set. I would then remove that element from the tree and re-insert it with the new distance. Searching for a node is O(log n) and inserting a node is O(log n), so the cost to reorder a node is O(2 * log n) = O(log n). A binary heap also has O(log n) insert and delete, but it doesn't support search, so you would pay the cost of popping all of the nodes until you find the one you want, changing its weight, and pushing them all back in. Once the node has been reordered, I would then change the distance in the array to reflect the new distance.
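The erase-and-reinsert step described above can be sketched as follows; this is an illustrative fragment, with `frontier` standing in for the set and a map standing in for the distance array (the names are assumptions, not from the original code):

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
using namespace std;

// Change the stored cost of `node` in the ordered set: locate the element
// via its old cost, erase it, and reinsert it with the new cost.
void reorder(set<pair<double, string>>& frontier, map<string, double>& dist,
             const string& node, double newCost)
{
    frontier.erase(make_pair(dist[node], node)); // find via old cost: O(log n)
    frontier.insert(make_pair(newCost, node));   // reinsert with new cost: O(log n)
    dist[node] = newCost;
}
```

Because the set is ordered by cost, `*frontier.begin()` is always the minimum-cost node, which is what Dijkstra's main loop needs.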

I honestly can't think of a way to modify a heap in such a way to allow it to dynamically change the weights of a node, because the whole structure of the heap is based on the weights the nodes maintain.

Boneset answered 10/1, 2013 at 17:29 Comment(1)
You can modify the heap to contain a hash-table that can give you the index of the nodes in the min-heap for key decrease in O(1) time. You need to do some additional bookkeeping in the min-heap methods, but their asymptotic running time is still the same. While your method achieves the same asymptotic running time as well, the constants will be higher. See my answer for a full explanation. – Lemay

I would do this using a hash table in addition to the Min-Heap array.

The hash table is keyed on the node objects, and its values are the indices of where those nodes sit in the min-heap array.

Then, anytime you move something in the min-heap, you just need to update the hash table accordingly. Since at most 2 elements are moved per operation in the min-heap (they are exchanged), and our cost per move is O(1) to update the hash table, we have not damaged the asymptotic bound of the min-heap operations. For example, minHeapify is O(lgn); we just added 2 O(1) hash table operations per minHeapify call, so the overall complexity is still O(lgn).

Keep in mind you would need to modify any method that moves your nodes in the min-heap to do this tracking! For example, minHeapify() requires a modification that looks like this using Java:

Node[] nodes;
Map<Node, Integer> indexMap = new HashMap<>();

// Assumes 1-based heap indexing and a heapSize field, CLRS-style.
private void minHeapify(Node[] nodes, int i) {
    int smallest;
    int l = 2*i;     // left child index
    int r = 2*i + 1; // right child index
    if(l <= heapSize && nodes[l].getTime() < nodes[i].getTime()) {
        smallest = l;
    }
    else {
        smallest = i;
    }
    if(r <= heapSize && nodes[r].getTime() < nodes[smallest].getTime()) {
        smallest = r;
    }
    if(smallest != i) {
        Node temp = nodes[smallest];
        nodes[smallest] = nodes[i];
        nodes[i] = temp;
        indexMap.put(nodes[smallest], smallest); // index tracking in O(1)
        indexMap.put(nodes[i], i);               // index tracking in O(1)
        minHeapify(nodes, smallest);
    }
}

buildMinHeap and heapExtract depend on minHeapify, so those are mostly fixed, but you do need the extracted key to be removed from the hash table as well. You'd also need to modify decreaseKey to track these changes. Once that's fixed, insert should also be fixed, since it should be using the decreaseKey method. That should cover all your bases, and you will not have altered the asymptotic bounds of your algorithm while still getting to use a heap for your priority queue.

Note that a Fibonacci Min Heap is actually preferred to a standard Min Heap in this implementation, but that's a totally different can of worms.

Lemay answered 19/7, 2013 at 17:11 Comment(0)

Another solution is "lazy deletion". Instead of a decrease-key operation, you simply insert the node into the heap again with its new priority. So there will be another copy of the node in the heap, but that copy will sit higher in the heap than any previous one. Then, when popping the next minimum node, you check whether the node has already been finalized; if it has, you skip that iteration and continue (lazy deletion).

This has slightly worse performance and higher memory usage due to the copies inside the heap. But the number of copies is still bounded (by the number of edges), and it may be faster than other implementations for some problem sizes.
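A sketch of lazy deletion with std::priority_queue; the adjacency-list format of (neighbor, weight) pairs is an assumption of this example:

```cpp
#include <limits>
#include <queue>
#include <utility>
#include <vector>
using namespace std;

// Dijkstra with lazy deletion: duplicates are pushed instead of decrease-key,
// and stale entries are skipped when popped.
vector<long long> dijkstra(int src, const vector<vector<pair<int, int>>>& adj)
{
    const long long INF = numeric_limits<long long>::max();
    vector<long long> dist(adj.size(), INF);
    // min-heap of (distance, node)
    priority_queue<pair<long long, int>,
                   vector<pair<long long, int>>,
                   greater<pair<long long, int>>> pq;
    dist[src] = 0;
    pq.push(make_pair(0LL, src));
    while (!pq.empty())
    {
        long long d = pq.top().first;
        int u = pq.top().second;
        pq.pop();
        if (d > dist[u])
            continue; // stale copy: this is the lazy deletion
        for (size_t k = 0; k < adj[u].size(); ++k)
        {
            int v = adj[u][k].first, w = adj[u][k].second;
            if (dist[u] + w < dist[v])
            {
                dist[v] = dist[u] + w;
                pq.push(make_pair(dist[v], v)); // duplicate insert, no decrease-key
            }
        }
    }
    return dist;
}
```

Since each edge relaxation pushes at most one copy, the heap holds O(E) entries and the algorithm runs in O(E log E) = O(E log V).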

Wouldst answered 23/11, 2017 at 20:2 Comment(0)

This algorithm: http://algs4.cs.princeton.edu/44sp/DijkstraSP.java.html works around this problem by using an "indexed heap": http://algs4.cs.princeton.edu/24pq/IndexMinPQ.java.html which essentially maintains a mapping from each key to its array index.

Winker answered 23/6, 2013 at 15:40 Comment(0)

I believe the main difficulty is being able to achieve O(log n) time complexity when we have to update vertex distance. Here are the steps on how you could do that:

  1. For heap implementation, you could use an array.
  2. For indexing, use a Hash Map, with Vertex number as the key and its index in heap as the value.
  3. When we want to update a vertex, search its index in the Hash Map in O(1) time.
  4. Reduce the vertex distance in the heap and then keep traversing up (check its new distance against its parent; if the parent's value is greater, swap parent and current vertex). This step would also take O(log n).
  5. Update the vertex's index in Hash Map as you make changes while traversing up the heap.

I think this should work and the overall time complexity would be O((E+V)*log V), just as the theory implies.

Gressorial answered 26/5, 2021 at 15:34 Comment(0)

I am using the following approach. Whenever I insert something into the heap, I pass a pointer to an integer (this memory location is owned by me, not the heap) which should contain the position of the element in the array managed by the heap. So if the sequence of elements in the heap is rearranged, the heap is supposed to update the values pointed to by these pointers.

So for the Dijkstra algorithm, I am creating a posInHeap array of size N.

Hopefully, the code will make it more clear.

#include <limits>
#include <utility>
#include <vector>
using namespace std;

template <typename T, class Comparison = std::less<T>> class cTrackingHeap
{
public:
    cTrackingHeap(Comparison c) : m_c(c), m_v() {}
    cTrackingHeap(const cTrackingHeap&) = delete;
    cTrackingHeap& operator=(const cTrackingHeap&) = delete;

    void DecreaseVal(size_t pos, const T& newValue)
    {
        m_v[pos].first = newValue;
        while (pos > 0)
        {
            size_t iPar = (pos - 1) / 2;
            if (newValue < m_v[iPar].first)
            {
                swap(m_v[pos], m_v[iPar]);
                *m_v[pos].second = pos;
                *m_v[iPar].second = iPar;
                pos = iPar;
            }
            else
                break;
        }
    }

    void Delete(size_t pos)
    {
        *(m_v[pos].second) = numeric_limits<size_t>::max();// indicate that the element is no longer in the heap

        m_v[pos] = m_v.back();
        m_v.resize(m_v.size() - 1);

        if (pos == m_v.size())
            return;

        *(m_v[pos].second) = pos;

        bool makingProgress = true;
        while (makingProgress)
        {
            makingProgress = false;
            size_t exchangeWith = pos;
            if (2 * pos + 1 < m_v.size() && m_c(m_v[2 * pos + 1].first, m_v[pos].first))
                exchangeWith = 2 * pos + 1;
            if (2 * pos + 2 < m_v.size() && m_c(m_v[2 * pos + 2].first, m_v[exchangeWith].first))
                exchangeWith = 2 * pos + 2;
            if (pos > 0 && m_c(m_v[pos].first, m_v[(pos - 1) / 2].first))
                exchangeWith = (pos - 1) / 2;

            if (exchangeWith != pos)
            {
                makingProgress = true;
                swap(m_v[pos], m_v[exchangeWith]);
                *m_v[pos].second = pos;
                *m_v[exchangeWith].second = exchangeWith;
                pos = exchangeWith;
            }
        }
    }

    void Insert(const T& value, size_t* posTracker)
    {
        m_v.push_back(make_pair(value, posTracker));
        *posTracker = m_v.size() - 1;

        size_t pos = m_v.size() - 1;

        bool makingProgress = true;
        while (makingProgress)
        {
            makingProgress = false;

            if (pos > 0 && m_c(m_v[pos].first, m_v[(pos - 1) / 2].first))
            {
                makingProgress = true;
                swap(m_v[pos], m_v[(pos - 1) / 2]);
                *m_v[pos].second = pos;
                *m_v[(pos - 1) / 2].second = (pos - 1) / 2;
                pos = (pos - 1) / 2;
            }
        }
    }

    const T& GetMin() const
    {
        return m_v[0].first;
    }

    const T& Get(size_t i) const
    {
        return m_v[i].first;
    }

    size_t GetSize() const
    {
        return m_v.size();
    }

private:
    Comparison m_c;
    vector< pair<T, size_t*> > m_v;
};
Hookup answered 3/4, 2017 at 6:49 Comment(0)
A
0

You can also use Fibonacci or pairing heaps, which support decrease-key in O(1) amortized time. Boost provides implementations of both.

Angelinaangeline answered 16/10, 2023 at 13:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.