UPDATED ANSWER
Looking at the example output of Walter Tross's code, I think that generating a random solution can be simplified to this:
Take any vector to start with, e.g. for n=8, m=3, k=5:
A: 01001100
After every step, sum the vectors to get the number of times each position has been used:
SUM: 01001100
Then, for the next vector, place the ones at positions that have been used least (in this case zero times), e.g.:
B: 00110001
to get:
A: 01001100
B: 00110001
SUM: 01111101
Then, there are 2 least-used positions left, so for the 3 ones in the next vector, use those 2 positions, and then put the third one anywhere:
C: 10010010
to get:
A: 01001100
B: 00110001
C: 10010010
SUM: 11121111 (or reset to 00010000 at this point)
Then for the next vector, you have 7 least-used positions (the ones in the sum), so choose any 3, e.g.:
D: 10100010
to get:
A: 01001100
B: 00110001
C: 10010010
D: 10100010
SUM: 21221121
And for the final vector, choose any of the 4 least-used positions, e.g.:
E: 01000101
To generate all solutions, simply generate every possible vector in each step:
A: 11100000, 11010000, 11001000, ... 00000111
Then, e.g. when A and SUM are 11100000:
B: 00011100, 00011010, 00011001, ... 00000111
Then, e.g. when B is 00011100 and SUM is 11111100:
C: 10000011, 01000011, 00100011, 00010011, 00001011, 00000111
Then, e.g. when C is 10000011 and SUM is 21111111:
D: 01110000, 01101000, 01100100, ... 00000111
And finally, e.g. when D is 01110000 and SUM is 22221111:
E: 00001110, 00001101, 00001011, 00000111
This would result in C(8,3) × C(5,3) × C(8,1) × C(7,3) × C(4,3) = 56 × 10 × 8 × 35 × 4 = 627,200 solutions for n=8, m=3, k=5.
Actually, you need to add a method to avoid repeating the same vector, and avoid painting yourself into a corner; so I don't think this will be simpler than Walter's answer.
INITIAL ANSWER - HAS MAJOR ISSUES
(I will assume than m is not greater than n/2, i.e. the number of ones is not greater than the number of zeros. Otherwise, use a symmetrical approach.)
When k×m is not greater than n, there obviously are optimal solutions, e.g.:
n=10, m=3, k=3:
A: 1110000000
B: 0001110000
C: 0000001110
where the Hamming distances are all 2×m:
|AB|=6, |AC|=6, |BC|=6, total=18
When k×m is greater than n, solutions where the difference in Hamming distances between consecutive vectors are minimized offer the greatest total distance:
n=8, m=3, k=4:
A: 11100000
B: 00111000
C: 00001110
D: 10000011
|AB|=4, |AC|=6, |AD|=4, |BC|=4, |BD|=6, |CD|=4, total=28
n=8, m=3, k=4:
A: 11100000
B: 00011100
C: 00001110
D: 00000111
|AB|=6, |AC|=6, |AD|=6, |BC|=2, |BD|=4, |CD|=2, total=26
So, practically, you take m×k and see how much greater it is than n, let's call it x = m×k−n, and this x is the number of overlaps, i.e. how often a vector will have a one in the same position as the previous vector. You then spread out the overlap over the different vectors as evenly as possible to maximize the total distance.
In the example above, x = 3×4−8 = 4 and we have 4 vectors, so we can spread out the overlap evenly and every vector has 1 one in the same position as the previous vector.
To generate all unique solutions, you could:
- Calculate x = m×k−n and generate all partitions of x into k parts, with the lowest possible maximum value:
n=8, m=3, k=5 -> x=7
22111, 21211, 21121, 21112, 12211, 12121, 12112, 11221, 11212, 11122
(discard partitions with value 3)
- Generate all vectors to be used as vector A, e.g.:
A: 11100000, 11010000, 11001000, 11000100, ... 00000111
- For each of these, generate all vectors B, which are lexicographically smaller than vector A, and have the correct number of overlapping ones with vector A (in the example that is 1 and 2), e.g.:
A: 10100100
overlap=1:
B: 10011000, 10010010, 10010001, 10001010, 10001001, 10000011, 01110000, ... 00000111
overlap=2:
B: 10100010, 10100001, 10010100, 10001100, 10000110, 10000101, 01100100, ... 00100101
- For each of these, generate all vectors C, and so on, until you have sets of k vectors. When generating the last vector, you have to take into account the overlapping with the previous as well as the next (i.e. first) vector.
I assume it's best to treat the partitions of x into k as a binary tree:
1 2
11 12 21 22
111 112 121 122 211 212 221
1112 1121 1122 1211 1212 1221 2111 2112 2121 2211
11122 11212 11221 12112 12121 12211 21112 21121 21211 22111
and traverse this tree while creating solutions, so that each vector only needs to be generated once.
I think this method only works for some values of n, m and k; I'm not sure it can be made to work for the general case.
1001
,0101
,0110
, and1010
, would they all be the same "distance" away when compared to1100
as they are all different in two of the bits? – Distichous