I am trying to get my head around the details of a HAMT. I'd have implemented one myself in Java just to understand. I am familiar with Tries and I think I get the main concept of the HAMT.
Basically,
Two types of nodes:
Key/Value
Key Value Node:
K key
V value
Index
Index Node:
int bitmap (32 bits)
Node[] table (max length of 32)
- Generate a 32-bit hash for an object.
- Step through the hash 5-bits at a time.
(0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-31)
note: the last step (7th) is only 2 bits. - At each step, find the position of that 5-bit integer in the bitmap. e.g.
integer==5 bitmap==00001
- If the bit is a 1 then that part of the hash exist.
- If the bit is a 0 then key doesn't exist.
- If the key exists, find it's index into the table by counting the number of 1s in the bitmap between 0 and the position. e.g.
integer==6 bitmap==0101010101 index==3
- If the table points to a key/value node then compare the keys.
- If the table points to a index node then go to 2 moving ahead one step.
The part I don't quite understand is collision detection and mitigation. In the linked paper he alludes to it:
The existing key is then inserted in the new sub-hash table and the new key added. Each time 5 more bits of the hash are used the probability of a collision reduces by a factor of 1/32. Occasionally an entire 32 bit hash may be consumed and a new one must be computed to differentiate the two keys.
If I were to compute a "new" hash and store the object at that new hash; how would you ever be able to look-up the object in the structure? When doing a look-up, wouldn't it generate the "initial" hash and not the "re-computed hash".
I must be missing something.....
BTW: The HAMT performs fairly well, it's sits between a hash map and tree map in my tests.
Data Structure Add time Remove time Sorted add time Sorted remove time Lookup time Size
Java's Hash Map 38.67 ms 18 ms 30 ms 15 ms 16.33 ms 8.19 MB
HAMT 68.67 ms 30 ms 57.33 ms 35.33 ms 33 ms 10.18 MB
Java's Tree Map 86.33 ms 78 ms 75 ms 39.33 ms 76 ms 8.79 MB
Java's Skip List Map 111.33 ms 106 ms 72 ms 41 ms 72.33 ms 9.19 MB