I'm attempting to calculate the Hamming distance between an input hash and database-stored hashes. These are perceptual hashes, so the Hamming distance between them are important to me and tell me how similar two different images are (see http://en.wikipedia.org/wiki/Perceptual_hashing, http://jenssegers.com/61/perceptual-image-hashes, http://stackoverflow.com/questions/21037578/). Hashes are 16 hexadecimal characters long, and look like this:
b1d0c44a4eb5b5a9
1f69f25228ed4a31
751a0b19f0c2783f
My database looks like this:
CREATE TABLE `hashes` (
`id` int(11) NOT NULL,
`hash` binary(8) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1;
INSERT INTO `hashes` (`id`, `hash`) VALUES
(1, 0xb1d0c44a4eb5b5a9),
(2, 0x1f69f25228ed4a31),
(3, 0x751a0b19f0c2783f);
Now, I know I can query for a Hamming distance like so:
SELECT BIT_COUNT(0xb1d0c44a4eb5b5a9 ^ 0x751a0b19f0c2783f)
Which will output 38, as expected. However, I can't seem to reference a column name for this comparison. The following does not work as expected.
SELECT BIT_COUNT(hash ^ 0x751a0b19f0c2783f) FROM hashes
Does anyone know how I can calculate a Hamming distance like in my first SELECT
query above using the columns in my database? I've tried a myriad of scenarios using hex()
, unhex()
, conv()
, and cast()
in different ways. This is in MySQL.
Update My query above appears to work as expected when running in MySQL v8 (thanks to @LukStorms for pointing this out). You can use my fiddle below and change the version in the top left. My question now is: how can I ensure the behavior works in all versions of MySQL?
Fiddle: https://www.db-fiddle.com/f/mpqsUpZ1sv2kmvRwJrK5xL/0