hash table lookup time

Asked 17/1, 2013 at 5:54 Answered 10/4, 2020 at 13:48

When we are insert/lookup an key in a hash table, textbook said it is O(1) time. Yet, how is possible to have an O(1) lookup time? If the hash table store the key in a vector, it will cost O(N), if in a binary tree, it will be O(logN). I just can't image some data structure with O(1) accessing time.

Thanks!

Hewett answered 17/1, 2013 at 5:54 Comment(0)

The hashtable hashes your key and put it in array.

For example, hash(x) = 3, where x is your key. The table then puts it into array[3]. Accessing from array is O(1).

Unclassical answered 17/1, 2013 at 5:59 Comment(1)

The worst case for lookup on a hash table is O(n). Consider: hash(x) = 3, hash(y) = 3 hash(z) = 3, etc etc – Tiepolo 28/10, 2013 at 4:55

At a minimum, hash tables consist of an array and hash function. When an object is added to the table, the hash function is computed on the object and it is stored in the array at the index of the corresponding value that was computed. e.g., if hash(obj) = 2 then arr[2] = obj.

The average insert/lookup on a hash table is O(1).

However it is possible to have collisions when objects compute the same hash value.

In the general case there are "buckets" at each index of the array to handle these collisions. Meaning, all three objects are stored in some other data structure (maybe a linked list or another array) at the index of the hash table.

Therefore, the worst case for lookup on a hash table is O(n) because it is possible that all objects stored in the hash table have collided and are stored in the same bucket.

Tiepolo answered 28/10, 2013 at 4:55 Comment(0)

Technically speaking, hash table lookup, if there is no collision, is O(logn). This is because hashing time is linear with respect to the size (in bytes) of the identifier, and the smallest that a new identifier added to the hash table can be, for that identifier to be unique, is O(logn).

However, the log of all the computer memory in the world is just such a small number, which means that we have very good upper bounds on hash table identifier size. Case in point, log10 of the number of particles in the observable universe is estimated at slightly over 80; in log2 it's about 3.3 times as much. Logarithms grow very slowly.

As a result, most log terms can be treated as constant terms. It's just that traditionally we only apply this fact to hash tables, but not to search trees, in order to teach recurrence relations to students.

Bonnett answered 10/4, 2020 at 13:48 Comment(0)

Recommended topics

Hot tags