If you are going to call the distance function many times during one execution of your program, you can gain some speed by using a precomputed table of bit counts. Here's (yet another) version of the Hamming distance function:
import numpy as np

# _nbits[k] is the number of 1s in the binary representation of k for 0 <= k < 256.
_nbits = np.array(
[0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, 1, 2, 2, 3, 2, 3, 3,
4, 2, 3, 3, 4, 3, 4, 4, 5, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4,
4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 1, 2, 2, 3, 2,
3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5,
4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4,
5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3,
3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2,
3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6,
4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5,
6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5,
5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6,
7, 7, 8], dtype=np.uint8)
def hamming_distance1(a, b):
    # XOR gives a 1 bit wherever a and b differ; look up the bit count
    # of each byte in the table and sum.
    c = np.bitwise_xor(a, b)
    n = _nbits[c].sum()
    return n
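By the way, you don't have to type out the table literally; it can be computed when the module is loaded. Here's one sketch that builds an equivalent table:

# Equivalent construction of the lookup table: count the 1 bits in
# the binary representation of each byte value.
_nbits = np.array([bin(k).count("1") for k in range(256)], dtype=np.uint8)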
In the following, a and b are the Python lists of length 32 given in a comment to the question. divakar_hamming_distance() and divakar_hamming_distance_v2() are from @Divakar's answer.
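(If you want to reproduce the timings without digging the exact lists out of the comments, stand-in inputs with the same length and value range work just as well; the data below is a made-up substitute, not the original lists:)

rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=32).tolist()  # stand-in for the list from the question
b = rng.integers(0, 256, size=32).tolist()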
Here are the timings of @Divakar's functions:
In [116]: %timeit divakar_hamming_distance(a, b)
The slowest run took 5.57 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 11.3 µs per loop
In [117]: %timeit divakar_hamming_distance_v2(a, b)
The slowest run took 5.35 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 10.3 µs per loop
hamming_distance1(a, b) is a bit faster:
In [118]: %timeit hamming_distance1(a, b)
The slowest run took 6.04 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.42 µs per loop
On my computer, initializing _nbits takes about 11 µs, so there is no advantage to using hamming_distance1 if you only call the function once. The per-call saving over @Divakar's functions is roughly 3 to 4 µs, so if you call it three or more times, there is a net gain in performance.
If the inputs are already numpy arrays, all the functions are significantly faster:
In [119]: aa = np.array(a)
In [120]: bb = np.array(b)
In [121]: %timeit divakar_hamming_distance_v2(aa, bb)
The slowest run took 8.22 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.72 µs per loop
In [122]: %timeit hamming_distance1(aa, bb)
The slowest run took 12.67 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.77 µs per loop
Of course, if you always convert the lists immediately before you compute the Hamming distance, the time for the conversion must be included in the overall timing. However, if you write the code that generates a and b to take advantage of numpy earlier, you might already have them as numpy arrays by the time you compute the Hamming distance.
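A minimal sketch of that idea: convert once, up front, and reuse the arrays (the uint8 dtype is my assumption; it guarantees the XOR result is a valid index into _nbits):

aa = np.array(a, dtype=np.uint8)   # one-time conversion
bb = np.array(b, dtype=np.uint8)
# Every subsequent call skips the list-to-array overhead:
d = hamming_distance1(aa, bb)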
(I also experimented a bit with a 2-d array of precomputed Hamming distances between 8-bit values, an array with shape (256, 256), but the initialization cost is higher and the performance gains are small.)
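For reference, that variant looks something like the following sketch, reconstructed here from the description above rather than the exact code I timed:

# _hdist2[i, j] is the Hamming distance between the byte values i and j.
_hdist2 = _nbits[np.arange(256)[:, None] ^ np.arange(256)]

def hamming_distance2(a, b):
    # Assumes a and b are uint8 arrays (values in range(256)).
    return _hdist2[a, b].sum()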