How to generates a list which elements are at a fix distance from a desired list

Asked 2/7, 2018 at 9:1 Answered 2/7, 2018 at 11:12

Solved python combinations iterable hamming-distance

I have a list of possibilities and a desired input:

possibles = [20, 30, 40, 50, 60, 70, 80, 100]
desired = [20, 30, 40]

I want to generate the close by lists. Example:

# Distance of 1 (i.e. 1 element changes to a close-by)
[30, 30, 40]
[20, 40, 40]
[20, 30, 30]
[20, 30, 50]

# Distance of 2:
[40, 30, 40]
[30, 30, 50]
[30, 40, 40]
...

My current version is only varying one element at a time, thus, as soon as the distance is above 1, I'm missing a lot of combination.

def generate_close_by(possibles, desired):
    for k in range(1, 4):
        for i, elt in enumerate(desired):
            id = possibles.index(elt)

            new = desired[:]
            if id < len(possibles)-k-1:
                new[i] = possibles[id+k]
                yield (new)

            if id > k:
                new[i] = possibles[id-k]
                yield (new)

# Output
[30, 30, 40]
[20, 40, 40]
[20, 30, 50]
[20, 30, 30]
[40, 30, 40]
[20, 50, 40]
[20, 30, 60]
[50, 30, 40]
[20, 60, 40]
[20, 30, 70]

I'm quite sure a module should already exist to do this kind of iteration (itertools?), could you point me to the write function?

Thanks.

EDIT:

Update on the tries...

I'm trying to generate a list the same size of desired in which each element correspond to how much I have to move the element of desired.

desired = [20, 30, 40]
# Distance of 1:
distance = [1, 0, 0]
distance = [0, 1, 0]
distance = [0, 0, 1]
distance = [-1, 0, 0]
distance = [0, -1, 0]
distance = [0, 0, -1]

And then plan was to try to create the new list, and if it can't (out of bounds), it just continues. Not working yet, but might be a good approach.

Fletcher answered 2/7, 2018 at 9:1 Comment(8)

I don't fully understand the logic behind the desired output – Kwei 2/7, 2018 at 9:4

@U9-Forward Order doesn't matter, and I just wrote 3 examples hand made... Basicly I want all combinations at a Hammnig distance of 1, then all combinations at a Hammnig distance of 2, and then of 3. The distance corresponds to: sum of how much the elements were "moved". That's my variable k – Fletcher 2/7, 2018 at 9:5

is desired always subset of possibles ? – Janenejanenna 2/7, 2018 at 9:9

@SeljukGülcan Yes desired is always a subset of possibles. Size 2 to 6. And possibles is always ordered. – Fletcher 2/7, 2018 at 9:10

is distance always up to 2, or can it change? – Janenejanenna 2/7, 2018 at 9:26

@SeljukGülcan You can set a max if you want, I don't expect that I will need combinations with a distance above 3. – Fletcher 2/7, 2018 at 9:27

What about [20, 20, 40]? – Interfere 2/7, 2018 at 9:28

@Interfere Good remark, didn't even noticed it. Distanc of 1, should have been outputted. – Fletcher 2/7, 2018 at 9:29

I guess I will show a more long winded approach that can be more easily generalizable.

I first write down the problem.

possible_pts = [20, 30, 40, 50, 60, 70, 80, 100]
starting_pt_in_idx = [0, 1, 2]
distance = 2

There are 3 axes that can "change". I first find the combinations of axis changes.

N = len(starting_pt_in_idx)
axis = list(range(N))

import itertools
axismoves = list(itertools.combinations_with_replacement(axis, distance))
print(axismoves)

Next, we bin it. For example, if I see axis-0 appearing twice, it becomes [2,0,0].

abs_movements = []
for combi in axismoves:
    move_bin = [0] * N
    for i in combi:
        move_bin[i] += 1
    abs_movements.append(move_bin)
print(abs_movements)

The above gave the absolute movements. To find the actual movement, we must take into consideration the change can be positive or negative along that axis.

import copy
actual_movements = []
for movement in abs_movements:
    actual_movements.append(movement)
    for i, move in enumerate(movement):
        if move != 0:
            _movement = copy.deepcopy(movement)
            _movement[i] = - move
            actual_movements.append(_movement)
print(actual_movements)

The final step is to translate the index into actual positions. So first we write this helper function.

def translate_idx_to_pos(idx_vect, points):
    idx_bounds = [0, len(points) - 1]
    pos_point = [0] * len(idx_vect)
    for i, idx_pos in enumerate(idx_vect):
        if idx_pos < idx_bounds[0] or idx_pos > idx_bounds[1]:
            return None
        else:
            pos_point[i] = points[idx_pos]
    return pos_point

Using the actual movements to act on the starting point index, then translating it back to the positions.

from operator import add
final_pts = []
for movement in actual_movements:
    final_pt_in_idx = list(map(add, starting_pt_in_idx, movement))
    final_point = translate_idx_to_pos(final_pt_in_idx, possible_pts)
    if final_point is not None:
        final_pts.append(final_point)

print(final_pts)

This gives

[40, 30, 40]
[30, 40, 40]
[30, 20, 40]
[30, 30, 50]
[30, 30, 30]
[20, 50, 40]
[20, 40, 50]
[20, 20, 50]
[20, 40, 30]
[20, 30, 60]
[20, 30, 20]

Porphyrin answered 2/7, 2018 at 11:12 Comment(4)

I like this approch on the ids and movement :) Thanks – Fletcher 2/7, 2018 at 11:14

Actually what I was trying to achieve (c.f. Edit). Works great, and fast! – Fletcher 2/7, 2018 at 12:20

Glad you like it. And indeed it is fast because it enumerates only the needed combinations and builds from there. – Porphyrin 2/7, 2018 at 15:34

Yes :) Exactly what I was trying to achieve. I did some slight changes to turn that into a generator, and it's working perfectly :) – Fletcher 2/7, 2018 at 19:46

Yes you are right, itertools is gonna be really useful here. What you want is find all subsets of desired length of the possibles list WITH duplicates, and the function that does that is itertools.product

from itertools import product

possibles = [20, 30, 40, 50, 60, 70, 80, 100]
desired = [20, 30, 40]

def fake_hamming(cur, desired, possibles):
    assert len(cur) == len(desired)

    hamm = 0
    for i in range(len(cur)):
        assert cur[i] in possibles
        assert desired[i] in possibles
        hamm += abs(possibles.index(cur[i]) - possibles.index(desired[i]))

    return hamm

def generate_close_by(desired, possibles, dist):
    all_possible_lists = product(possibles, repeat=len(desired))
    return [l for l in all_possible_lists if fake_hamming(l, desired, possibles) == dist]

print(generate_close_by(desired, possibles,1))
>>> [(20, 20, 40), (20, 30, 30), (20, 30, 50), (20, 40, 40), (30, 30, 40)]

Edit There you go, changed combinations for product (see @tobias_k comment below), and also here is the fake_hamming function xD Also it is true that it will be slow for big lists, but this is the most generic way to do it

Dalliance answered 2/7, 2018 at 9:29 Comment(12)

That is the part I've got, I've got some issues with the hamming distance sadly. – Fletcher 2/7, 2018 at 9:30

For small lists for combinations, this is probably the simplest, but for longer lists (say 100 possibilities and 5 elements in desired) this quickly becomes infeasible. – Interfere 2/7, 2018 at 9:43

Oh wait I just re read your question, that's not a real hamming distance that you want – Dalliance 2/7, 2018 at 9:44

@Dalliance maybe... I'm not really sure about what it is ^^' – Fletcher 2/7, 2018 at 9:44

@Interfere possibles is quite small, < 20 elts, and desired up to 6 elts. Issue is that I'm going to perform this computation of close-by on thousands of elements. – Fletcher 2/7, 2018 at 9:45

Hamming distance counts the number of different elements in two sequences ;) – Dalliance 2/7, 2018 at 9:45

@Dalliance So it's a bit more complex indeed. – Fletcher 2/7, 2018 at 9:46

And your outputs can contain several times the same element ? – Dalliance 2/7, 2018 at 9:46

@Fletcher Even then it's quite a list to filter: 20**6 is 64,000,000. (Also, it should not be combinations but product, as elements can appear more then once, and order does matter.) – Interfere 2/7, 2018 at 9:47

@Interfere Not an issue. I need a generator so that I can break out early. True I need product. – Fletcher 2/7, 2018 at 9:51

@Fletcher see my edited answer, it does what you want I think – Dalliance 2/7, 2018 at 9:55

@Fletcher So does this answer your question ? – Dalliance 2/7, 2018 at 10:12

def distribute(number, bucket):
  if bucket == 1:
    yield [number]
    if number != 0:
      yield [-1 * number]
  elif number == 0:
    yield [0]*bucket
  else:
    for i in range(number+1):
      for j in distribute(number-i, 1):
        for k in distribute(i, bucket-1):
          yield j+k

def generate(possibles, desired, distance):
  for index_distance_tuple in distribute(distance, len(desired)):
    retval = desired[:]
    for i, index in enumerate(index_distance_tuple):
      if index + i < 0 or index + i >= len(possibles):
        break
      retval[i] = possibles[index + i]
    else:
      yield retval

For distance 1:

for i in generate(possibles, desired, 1):
  print(i)

Output :

[30, 30, 40]
[20, 40, 40]
[20, 20, 40]
[20, 30, 50]
[20, 30, 30]

For distance 2 :

for i in generate(possibles, desired, 2):
  print(i)

Output :

[40, 30, 40]
[30, 40, 40]
[30, 20, 40]
[30, 30, 50]
[30, 30, 30]
[20, 50, 40]
[20, 40, 50]
[20, 40, 30]
[20, 20, 50]
[20, 20, 30]
[20, 30, 60]
[20, 30, 20]

Janenejanenna answered 2/7, 2018 at 9:34 Comment(3)

Misunderstanding somewhere. [30, 30, 40] is indeed at a distance of 1 since the value 20 is changed to 30 (1 changed to a value at a distance of 1 in possibles). But the following are at further distance. [50, 30, 40] is at d = 2, [50, 30, 40] at a distance of 3.. – Fletcher 2/7, 2018 at 9:41

I see, I completely misunderstood. I will edit my answer. – Janenejanenna 2/7, 2018 at 9:44

@Mathieu, I edited the answer. Can you check if it is what you want? – Janenejanenna 2/7, 2018 at 10:34

I guess I will show a more long winded approach that can be more easily generalizable.

I first write down the problem.

possible_pts = [20, 30, 40, 50, 60, 70, 80, 100]
starting_pt_in_idx = [0, 1, 2]
distance = 2

There are 3 axes that can "change". I first find the combinations of axis changes.

N = len(starting_pt_in_idx)
axis = list(range(N))

import itertools
axismoves = list(itertools.combinations_with_replacement(axis, distance))
print(axismoves)

Next, we bin it. For example, if I see axis-0 appearing twice, it becomes [2,0,0].

abs_movements = []
for combi in axismoves:
    move_bin = [0] * N
    for i in combi:
        move_bin[i] += 1
    abs_movements.append(move_bin)
print(abs_movements)

The above gave the absolute movements. To find the actual movement, we must take into consideration the change can be positive or negative along that axis.

import copy
actual_movements = []
for movement in abs_movements:
    actual_movements.append(movement)
    for i, move in enumerate(movement):
        if move != 0:
            _movement = copy.deepcopy(movement)
            _movement[i] = - move
            actual_movements.append(_movement)
print(actual_movements)

The final step is to translate the index into actual positions. So first we write this helper function.

def translate_idx_to_pos(idx_vect, points):
    idx_bounds = [0, len(points) - 1]
    pos_point = [0] * len(idx_vect)
    for i, idx_pos in enumerate(idx_vect):
        if idx_pos < idx_bounds[0] or idx_pos > idx_bounds[1]:
            return None
        else:
            pos_point[i] = points[idx_pos]
    return pos_point

Using the actual movements to act on the starting point index, then translating it back to the positions.

from operator import add
final_pts = []
for movement in actual_movements:
    final_pt_in_idx = list(map(add, starting_pt_in_idx, movement))
    final_point = translate_idx_to_pos(final_pt_in_idx, possible_pts)
    if final_point is not None:
        final_pts.append(final_point)

print(final_pts)

This gives

[40, 30, 40]
[30, 40, 40]
[30, 20, 40]
[30, 30, 50]
[30, 30, 30]
[20, 50, 40]
[20, 40, 50]
[20, 20, 50]
[20, 40, 30]
[20, 30, 60]
[20, 30, 20]

Porphyrin answered 2/7, 2018 at 11:12 Comment(4)

I like this approch on the ids and movement :) Thanks – Fletcher 2/7, 2018 at 11:14

Actually what I was trying to achieve (c.f. Edit). Works great, and fast! – Fletcher 2/7, 2018 at 12:20

Glad you like it. And indeed it is fast because it enumerates only the needed combinations and builds from there. – Porphyrin 2/7, 2018 at 15:34

Yes :) Exactly what I was trying to achieve. I did some slight changes to turn that into a generator, and it's working perfectly :) – Fletcher 2/7, 2018 at 19:46

You could try a recursive approach: Keep track of the remaining distance and generate combinations of just the relevant elements.

def get_with_distance(poss, des, dist, k=0):
    if k < len(des):
        i = poss.index(des[k])
        for n in range(-dist, dist+1):
            if 0 <= i + n < len(poss):
                for comb in get_with_distance(poss, des, dist - abs(n), k+1):
                    yield [poss[i + n]] + comb
    elif dist == 0:
        yield []

This can still run into a few "dead-ends" if there is still dist remaining, but the des list is empty, but overall, this will check much fewer combinations than generating all the combinations up-front and checking their distance afterwards.

If the list of possible elements is longer, you might want to create a dict mapping elements to their index first, so you don't have to do poss.index(first) each time.

Example:

possibles = [20, 30, 40, 50, 60, 70, 80, 100]
desired = [20, 30, 40]
for x in get_with_distance(possibles, desired, 2):
    print(x)

Output:

[20, 20, 30]
[20, 20, 50]
[20, 30, 20]
[20, 30, 60]
[20, 40, 30]
[20, 40, 50]
[20, 50, 40]
[30, 20, 40]
[30, 30, 30]
[30, 30, 50]
[30, 40, 40]
[40, 30, 40]

Interfere answered 2/7, 2018 at 10:6 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags