Generating numbers with high hamming distance
Asked Answered
C

2

6

I am looking for a fast way to generate k non-negative integers smaller than 2^64, of which, in base 2, the minimal Hamming distance between any two of the numbers is as high as possible.

For example, if I were looking for k=4 numbers and they should be smaller than 2^4 they could be:
0000
0011
1100
1111
And the minimal Hamming distance would be 2.

Is there a fast algorithm to generate those numbers for a given k? I will have k's in the order of 10^4.

Alternatively an algorithm generating a set of numbers, which have all pairwise an Hamming distance greater than a given value would work fine too.

Cymene answered 16/10, 2015 at 16:36 Comment(1)
This is a vast area. You are in effect looking for an optimal error-correcting code that can code 10^4 pieces of data using 64 bits. See this: en.wikipedia.org/wiki/Hamming_bound (with related links) as a gateway into the topic.Septi
G
4

Here's a fairly trivial way. Find the smallest number of bits = b that can express k different numbers. E.g. for k=4 use b = 2 bits. Divide 64 bits up into chunks of size 2. For each chunk, give each number to be generated a different number among the 2^b >= k available.

E.g. for k=4 numbers the b bits are 00, 01, 10, 11 and here are 4 possibilities:

0000

0101

1010

1111

If you have c chunks, then each number is different from each other number in at least one bit of each of c chunks, so the minimum guaranteed hamming separation is c.

You could also permute the selection of numbers within each chunk, which would produce more random looking examples such as

0011

0101

1000

1110

Galven answered 16/10, 2015 at 17:3 Comment(0)
S
4

The answer due to mcdowella is a very good way to rapidly generate numbers with a certain minimum Hamming distance from each other. But, it doesn't guarantee that the resulting Hamming distances are particularly large (if I understand it correctly it would guarantee a Hamming distance of at least 4 between any two of 10^4 64-bit numbers, although the actual minimum Hamming distance could be larger). If you really want to get the minimum Hamming distance as large as possible, and are willing to expend some more CPU cycles in doing so, I would recommend using a Reed-Solomon code. If you need 10^4 numbers, you can interpret each number in 1,2,...,10000 as a binary message of length 14 to be coded in binary code words of length 64. For that you would use the Reed-Solomon code [64,14,51]_2, which would guarantee a Hamming distance of at least 51 between any two of the resulting 64-bit numbers. As you can see from the linked Wikipedia article, the construction is fairly complex, though you should be able to use an open-source implementation (the Wikipedia article gives some links) so that you wouldn't have to reinvent the wheel.

Septi answered 16/10, 2015 at 23:12 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.