I have some hashes stored in mysql, which I would fetch with comparison by hamming distance.
Hashes stored are these:
qw 1 ffe71b001820a1fd
qw 2 ffffb81c1c3838a0
qw 3 fff8381c1c3e3828
qw 4 fffa181c3c2e3920
qw 5 fffa981c1c3e2820
qw 6 ff5f1c38387c1c04
qw 7 fff1e0c1c38387ef
qw 8 fffa181c1c3e3820
qw 9 fffa381c1c3e3828
I normally fetch like:
SELECT product_id, HAMMING_DISTANCE(phash, 'phashfromuserinput') ;
But in mysql hamming distance is bitwise operator which I can do if strings were only numbers:
SELECT pagedata,BIT_COUNT(pagecontent^'$encrypted')searchengine WHERE pagecontent > 2 ; ")
It only works in integer (number) but my requirement is work with numbers and alphabets, for example:
74898fababfbef46 and 95efabfeba752545
From my little research I know that first I have to convert field to binary
and then use bitcount
by using CAST
or CONVERT
like:
SELECT BIT_COUNT( CONV( hash, 2, 10 ) ^
0b0000000101100111111100011110000011100000111100011011111110011011 )
or
SELECT BIT_COUNT(CAST(hash AS BINARY)) FROM data;
This is ok as converting data to binary
and using bitcount
. Now question arises that varbinary
characters/hashes stored in mysql
already are alphanumeric and if I convert field to varbinary
and bitcount
then it will not work as stored hashes are not binary strings.
What should I do?
I was refering as php hamming distance matching example of:
function HammingDistance($bin1, $bin2) {
$a1 = str_split($bin1);
$a2 = str_split($bin2);
$dh = 0;
for ($i = 0; $i < count($a1); $i++)
if($a1[$i] != $a2[$i]) $dh++;
return $dh;
}
echo HammingDistance('10101010','01010101'); //returns 8
But I'm not understanding how to match with mysql and fetch, because I can't implement it in mysql.
'v'
and'g'
, and those aren't valid hex digits. What in the plastic? Before your question can be answered, you need to explain what binary value95gfgdgd75425456
is supposed to represent. – Reinhardt74898acvdf566556
and95gfgdgd7542545
are supposed to represent. (Those are sixteeen characters, and most of the characters are valid hex digits, but the charactersv
andg
are not valid hexadecimal digits.) As to your "main doubt is it work in alphanumeric or not"... No, it doesn't work. Hamming distance works on binary values. It's easy to convert hexadecimal string representation to binary... – ReinhardtCONVERT(CONV('ffe71b001820a1fd',16,10),UNSIGNED)
Reference: CONV() function and CONVERT() function. – ReinhardtBIGINT UNSIGNED
. You can use either theCAST
orCONVERT
function, and specifyUNSIGNED
as the target datatype. The MySQL expression you use to do that conversion depends on the datatype/representation you are converting from. See the answer from Rick James to an example of converting a literal string or varchar column that contains 16 hexadecimal digits. – Reinhardt