What is the difference between chaining and probing in hash tables?
Asked Answered
V

1

6

How do they work? What is their major differences? What are their respective trade-offs? What are their types (if any)? When is one preferred to another (if at all)?

PS: I've already gone through Anagrams - Hashing with chaining and probing in C and Why do we use linear probing in Hash tables when there is separate chaining linked with lists?, but neither seems to draw a contrast between the two methods.

Vesting answered 10/4, 2016 at 14:38 Comment(1)
Possible duplicate of Chained Hash Tables vs. Open-Addressed Hash TablesAmalamalbena
A
18

Chaining and open-addressing (a simple implementation of which is based on linear-probing) are used in Hashtables to resolve collisions. A collision happens whenever the hash function for two different keys points to the same location to store the value.

In order to store both values, with different keys that would have been stored in the same location, chaining and open-addressing take different approaches: while chaining resolves the conflict by created a linked list of values with the same hash; open-addressing tries to attempts to find a different location to store the values with the same hash.

An interesting alternative to linear-probing for open-addressing conflict resolution is what is known as double-hashing.

The main difference that arises is in the speed of retrieving the value being hashed under different conditions.

Let's start with chaining as collision resolution. Notice here that after calculating the hash function for Lisa, you need to get the first element from the list to get the value required. Therefore, you access the pointer to the head of the list and then the value: 2 operations.

On the other hand, with open-addressing, such as linear-probing, when there is no collision, you immediately obtain the value you are seeking. That is, you require only 1 operation, which is faster.

However, when your HashTable starts to get full, and you have a high load factor, due to collisions happening more often, probing will require you to check more Hashtable locations before you find the actual value that you want. At about a load factor of 0.8, chaining starts to become more efficient due to multiple collisions: you would have to probe a lot of empty cells in order to find the actual value you want with probing, while with chaining you have a list of values that have the same hash key.

This is just a quick overview, as the actual data, the distribution of the keys, the hash function used and the precise implementation of collision resolution will make a difference in your actual speed.

Amalamalbena answered 10/4, 2016 at 14:54 Comment(4)
thanks for the answer. Can you please add a little detail about how exactly they work, and their types? TIA.Vesting
Sure, but I'll have to add that information later. Any particular information you are looking for? From your question I gathered that you know how they work.Amalamalbena
I've edited the question to containt how do they work? :D thanks for pointing out.Vesting
Added the requested extra information. I kept the description introductory: more information on hash tables can be found in the book by Cormen et al., Introduction to Algorithms.Amalamalbena

© 2022 - 2024 — McMap. All rights reserved.