How can I evenly distribute distinct keys in a hashtable?
Asked Answered
S

2

7

I have this formula:

index = (a * k) % M

which maps a number 'k', from an input set K of distinct numbers, into it's position in a hashtable. I was wondering how to write a non-brute force program that finds such 'M' and 'a' so that 'M' is minimal, and there are no collisions for the given set K.

Smoke answered 9/8, 2015 at 16:34 Comment(13)
I'm assuming you have the set of all k available when finding these numbers? Otherwise I don't see how it would be possible to find the numbers under your constraints (namely no collisions).Cateyed
@JohnPirie I need answer for this specific question and not how to construct a good hash function.Smoke
Look up Perfect hash function.Czerny
Is a a constant given as an input to the algorithm?Czerny
@MarkPeters the input is {k1,k2,k3,k4,k5,...} so k refers to every element from that set.Smoke
@dasblinkenlight no, 'a' needs to be found tooSmoke
Then I must not understand your question. Is k a positive integer? If so, the answer is trivial, with a = 1 and M = k. Beyond integers, it depends on the domain.Dairen
@JohnPirie then why is M minimal in that case?Smoke
Your problem is not completely stated. If you have a fixed set of keys k, then it's a complete problem. As @dasblinkenlight says, perfect hash functions are very well studied.Clearstory
@Clearstory k is a key, from a set of keys, which happens to get an integer.Smoke
I edited your question to make this clearer. Please check out drdobbs.com/architecture-and-design/…. A simple and good heuristic. I'm pretty sure true minimization is NP hard. There are also some references there.Clearstory
Why is minimising M desirable? Surely minimising the maximum index is a better goal as it indicates better compaction of the hash values....Ane
@TonyD if M is minimal less space is being used, I guess.Smoke
U
1

If, instead of a numeric multiplication you could perform a logic computation (and / or /not), I think that the optimal solution (minimum value of M) would be as small as card(K) if you could get a function that related each value of K (once ordered) with its position in the set.

Theoretically, it must be possible to write a truth table for such a relation (bit a bit), and then simplify the minterms through a Karnaugh Table with a proper program. Depending on the desired number of bits, the computational complexity would be affordable... or not.

Unknown answered 9/8, 2015 at 19:11 Comment(0)
P
0

If a is co-prime to M then a * k = a * k' mod M if, and only if, k = k' mod M, so you might as well use a = 1, which is always co-prime to M. This also covers all the cases in which M is prime, because all the numbers except 0 are then co-prime to M.

If a and M are not co-prime, then they share a common factor, say b, so a = x * b and M = y * b. In this case anything multiplied by a will also be divisible by b mod M, and you might as well by working mod y, not mod M, so there is nothing to be gained by using an a not co-prime to M.

So for the problem you state, you could save some time by leaving a=1 and trying all possible values of M.

If you are e.g. using 32-bit integers and really calculating not (a * k) mod M but ((a * k) mod 2^32) mod M you might be able to find cases where values of a other than 1 do better than a=1 because of what happens in (a * k) mod 2^32.

Psychoneurotic answered 9/8, 2015 at 18:38 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.