I'm wondering if a max or min heap tree is allowed to have duplicate values? I've been unsuccessful in trying to find information regarding this with online resources alone.
Yes, they can. You can read about this in 'Introduction to Algorithms' (by Charles E. Leiserson, Clifford Stein, Thomas H. Cormen, and Ronald Rivest). According to the definition of binary heaps in Wikipedia:
All nodes are either [greater than or equal to](max heaps) or [less than or equal to](min heaps) each of its children, according to a comparison predicate defined for the heap.
Yes they can have duplicates. From wikipedia definition of Heap:
Either the keys of parent nodes are always greater than or equal to those of the children and the highest key is in the root node (this kind of heap is called max heap) or the keys of parent nodes are less than or equal to those of the children and the lowest key is in the root node (min heap)
So if they have children nodes that are equal means that they can have duplicated.
Yes, but I would say no. For efficiency they shouldn't have different nodes with duplicate values or it loses it's purpose a bit (you would have to search child nodes and such). However, you could design each node to contain a variable that declares how many copies of that value you have in your data.
Again this is my opinion. If this is a bad way of doing it I would love if someone could explain why. I might just have to do some efficiency testing. If you are storing simple data types like ints then I would see it being less efficient but for larger object nodes that have ids it's been nice, it seems.
My short answer is that heaps can have duplicate keys but no duplicate elements. What does that mean?
In heaps, we have two different entities: (i) the element (which Cormen calls an object); and (ii) the key. Each element has a key associated with it. The element is an application object stored in the heap, and the key is a comparable value used in heap organization.
In Cormen's book, in the section about Heaps, he uses integers as keys in his examples. In this case, the elements are implicit, so we can have duplicate values because these values are keys.
On the other hand, in the section about Priority Queues, he explains:
"When you use a heap to implement a priority queue within a given application, elements of the priority queue correspond to objects in the application. Each object contains a key".
One example of this is the Prim algorithm. In this algorithm, the heap (priority queue) stores vertices as elements, and each vertex has a key value (the weight of a light edge connecting this vertex to the tree). In other words, the vertices are elements, and the keys are numbers associated with these vertices. Different vertices can have the same key value.
Ok! But why can't we have duplicate elements? Cormen explains:
"If the priority queue is implemented by a heap, you need to determine which application object corresponds to a given heap element, and vice versa. Because the heap elements are stored in an array, you need a way to map application objects to and from array indices. One way to map between application objects and heap elements uses handles... As an alternative to incorporating handles in application objects, you can store within the priority queue a mapping from application objects to array indices in the heap… One option for the mapping is a hash table."
In other words, given an element, the heap must find its heap index in constant time. We can do this using a hash table. How can we have the same element in two different places on the heap if the hash function will generate the same index for it!? The answer is that we cannot. Using handles, we have the same limitation.
© 2022 - 2025 — McMap. All rights reserved.