How to generates a list which elements are at a fix distance from a desired list
Asked Answered
F

4

7

I have a list of possibilities and a desired input:

possibles = [20, 30, 40, 50, 60, 70, 80, 100]
desired = [20, 30, 40]

I want to generate the close by lists. Example:

# Distance of 1 (i.e. 1 element changes to a close-by)
[30, 30, 40]
[20, 40, 40]
[20, 30, 30]
[20, 30, 50]

# Distance of 2:
[40, 30, 40]
[30, 30, 50]
[30, 40, 40]
...

My current version is only varying one element at a time, thus, as soon as the distance is above 1, I'm missing a lot of combination.

def generate_close_by(possibles, desired):
    for k in range(1, 4):
        for i, elt in enumerate(desired):
            id = possibles.index(elt)

            new = desired[:]
            if id < len(possibles)-k-1:
                new[i] = possibles[id+k]
                yield (new)

            if id > k:
                new[i] = possibles[id-k]
                yield (new)

# Output
[30, 30, 40]
[20, 40, 40]
[20, 30, 50]
[20, 30, 30]
[40, 30, 40]
[20, 50, 40]
[20, 30, 60]
[50, 30, 40]
[20, 60, 40]
[20, 30, 70]

I'm quite sure a module should already exist to do this kind of iteration (itertools?), could you point me to the write function?

Thanks.

EDIT:

Update on the tries...

I'm trying to generate a list the same size of desired in which each element correspond to how much I have to move the element of desired.

desired = [20, 30, 40]
# Distance of 1:
distance = [1, 0, 0]
distance = [0, 1, 0]
distance = [0, 0, 1]
distance = [-1, 0, 0]
distance = [0, -1, 0]
distance = [0, 0, -1]

And then plan was to try to create the new list, and if it can't (out of bounds), it just continues. Not working yet, but might be a good approach.

Fletcher answered 2/7, 2018 at 9:1 Comment(8)
I don't fully understand the logic behind the desired outputKwei
@U9-Forward Order doesn't matter, and I just wrote 3 examples hand made... Basicly I want all combinations at a Hammnig distance of 1, then all combinations at a Hammnig distance of 2, and then of 3. The distance corresponds to: sum of how much the elements were "moved". That's my variable kFletcher
is desired always subset of possibles ?Janenejanenna
@SeljukGülcan Yes desired is always a subset of possibles. Size 2 to 6. And possibles is always ordered.Fletcher
is distance always up to 2, or can it change?Janenejanenna
@SeljukGülcan You can set a max if you want, I don't expect that I will need combinations with a distance above 3.Fletcher
What about [20, 20, 40]?Interfere
@Interfere Good remark, didn't even noticed it. Distanc of 1, should have been outputted.Fletcher
P
1

I guess I will show a more long winded approach that can be more easily generalizable.

I first write down the problem.

possible_pts = [20, 30, 40, 50, 60, 70, 80, 100]
starting_pt_in_idx = [0, 1, 2]
distance = 2

There are 3 axes that can "change". I first find the combinations of axis changes.

N = len(starting_pt_in_idx)
axis = list(range(N))

import itertools
axismoves = list(itertools.combinations_with_replacement(axis, distance))
print(axismoves)

Next, we bin it. For example, if I see axis-0 appearing twice, it becomes [2,0,0].

abs_movements = []
for combi in axismoves:
    move_bin = [0] * N
    for i in combi:
        move_bin[i] += 1
    abs_movements.append(move_bin)
print(abs_movements)

The above gave the absolute movements. To find the actual movement, we must take into consideration the change can be positive or negative along that axis.

import copy
actual_movements = []
for movement in abs_movements:
    actual_movements.append(movement)
    for i, move in enumerate(movement):
        if move != 0:
            _movement = copy.deepcopy(movement)
            _movement[i] = - move
            actual_movements.append(_movement)
print(actual_movements)

The final step is to translate the index into actual positions. So first we write this helper function.

def translate_idx_to_pos(idx_vect, points):
    idx_bounds = [0, len(points) - 1]
    pos_point = [0] * len(idx_vect)
    for i, idx_pos in enumerate(idx_vect):
        if idx_pos < idx_bounds[0] or idx_pos > idx_bounds[1]:
            return None
        else:
            pos_point[i] = points[idx_pos]
    return pos_point

Using the actual movements to act on the starting point index, then translating it back to the positions.

from operator import add
final_pts = []
for movement in actual_movements:
    final_pt_in_idx = list(map(add, starting_pt_in_idx, movement))
    final_point = translate_idx_to_pos(final_pt_in_idx, possible_pts)
    if final_point is not None:
        final_pts.append(final_point)

print(final_pts)

This gives

[40, 30, 40]
[30, 40, 40]
[30, 20, 40]
[30, 30, 50]
[30, 30, 30]
[20, 50, 40]
[20, 40, 50]
[20, 20, 50]
[20, 40, 30]
[20, 30, 60]
[20, 30, 20]
Porphyrin answered 2/7, 2018 at 11:12 Comment(4)
I like this approch on the ids and movement :) ThanksFletcher
Actually what I was trying to achieve (c.f. Edit). Works great, and fast!Fletcher
Glad you like it. And indeed it is fast because it enumerates only the needed combinations and builds from there.Porphyrin
Yes :) Exactly what I was trying to achieve. I did some slight changes to turn that into a generator, and it's working perfectly :)Fletcher
D
1

Yes you are right, itertools is gonna be really useful here. What you want is find all subsets of desired length of the possibles list WITH duplicates, and the function that does that is itertools.product

from itertools import product

possibles = [20, 30, 40, 50, 60, 70, 80, 100]
desired = [20, 30, 40]

def fake_hamming(cur, desired, possibles):
    assert len(cur) == len(desired)

    hamm = 0
    for i in range(len(cur)):
        assert cur[i] in possibles
        assert desired[i] in possibles
        hamm += abs(possibles.index(cur[i]) - possibles.index(desired[i]))

    return hamm

def generate_close_by(desired, possibles, dist):
    all_possible_lists = product(possibles, repeat=len(desired))
    return [l for l in all_possible_lists if fake_hamming(l, desired, possibles) == dist]

print(generate_close_by(desired, possibles,1))
>>> [(20, 20, 40), (20, 30, 30), (20, 30, 50), (20, 40, 40), (30, 30, 40)]

Edit There you go, changed combinations for product (see @tobias_k comment below), and also here is the fake_hamming function xD Also it is true that it will be slow for big lists, but this is the most generic way to do it

Dalliance answered 2/7, 2018 at 9:29 Comment(12)
That is the part I've got, I've got some issues with the hamming distance sadly.Fletcher
For small lists for combinations, this is probably the simplest, but for longer lists (say 100 possibilities and 5 elements in desired) this quickly becomes infeasible.Interfere
Oh wait I just re read your question, that's not a real hamming distance that you wantDalliance
@Dalliance maybe... I'm not really sure about what it is ^^'Fletcher
@Interfere possibles is quite small, < 20 elts, and desired up to 6 elts. Issue is that I'm going to perform this computation of close-by on thousands of elements.Fletcher
Hamming distance counts the number of different elements in two sequences ;)Dalliance
@Dalliance So it's a bit more complex indeed.Fletcher
And your outputs can contain several times the same element ?Dalliance
@Fletcher Even then it's quite a list to filter: 20**6 is 64,000,000. (Also, it should not be combinations but product, as elements can appear more then once, and order does matter.)Interfere
@Interfere Not an issue. I need a generator so that I can break out early. True I need product.Fletcher
@Fletcher see my edited answer, it does what you want I thinkDalliance
@Fletcher So does this answer your question ?Dalliance
J
1
def distribute(number, bucket):
  if bucket == 1:
    yield [number]
    if number != 0:
      yield [-1 * number]
  elif number == 0:
    yield [0]*bucket
  else:
    for i in range(number+1):
      for j in distribute(number-i, 1):
        for k in distribute(i, bucket-1):
          yield j+k

def generate(possibles, desired, distance):
  for index_distance_tuple in distribute(distance, len(desired)):
    retval = desired[:]
    for i, index in enumerate(index_distance_tuple):
      if index + i < 0 or index + i >= len(possibles):
        break
      retval[i] = possibles[index + i]
    else:
      yield retval

For distance 1:

for i in generate(possibles, desired, 1):
  print(i)

Output :

[30, 30, 40]
[20, 40, 40]
[20, 20, 40]
[20, 30, 50]
[20, 30, 30]

For distance 2 :

for i in generate(possibles, desired, 2):
  print(i)

Output :

[40, 30, 40]
[30, 40, 40]
[30, 20, 40]
[30, 30, 50]
[30, 30, 30]
[20, 50, 40]
[20, 40, 50]
[20, 40, 30]
[20, 20, 50]
[20, 20, 30]
[20, 30, 60]
[20, 30, 20]
Janenejanenna answered 2/7, 2018 at 9:34 Comment(3)
Misunderstanding somewhere. [30, 30, 40] is indeed at a distance of 1 since the value 20 is changed to 30 (1 changed to a value at a distance of 1 in possibles). But the following are at further distance. [50, 30, 40] is at d = 2, [50, 30, 40] at a distance of 3..Fletcher
I see, I completely misunderstood. I will edit my answer.Janenejanenna
@Mathieu, I edited the answer. Can you check if it is what you want?Janenejanenna
P
1

I guess I will show a more long winded approach that can be more easily generalizable.

I first write down the problem.

possible_pts = [20, 30, 40, 50, 60, 70, 80, 100]
starting_pt_in_idx = [0, 1, 2]
distance = 2

There are 3 axes that can "change". I first find the combinations of axis changes.

N = len(starting_pt_in_idx)
axis = list(range(N))

import itertools
axismoves = list(itertools.combinations_with_replacement(axis, distance))
print(axismoves)

Next, we bin it. For example, if I see axis-0 appearing twice, it becomes [2,0,0].

abs_movements = []
for combi in axismoves:
    move_bin = [0] * N
    for i in combi:
        move_bin[i] += 1
    abs_movements.append(move_bin)
print(abs_movements)

The above gave the absolute movements. To find the actual movement, we must take into consideration the change can be positive or negative along that axis.

import copy
actual_movements = []
for movement in abs_movements:
    actual_movements.append(movement)
    for i, move in enumerate(movement):
        if move != 0:
            _movement = copy.deepcopy(movement)
            _movement[i] = - move
            actual_movements.append(_movement)
print(actual_movements)

The final step is to translate the index into actual positions. So first we write this helper function.

def translate_idx_to_pos(idx_vect, points):
    idx_bounds = [0, len(points) - 1]
    pos_point = [0] * len(idx_vect)
    for i, idx_pos in enumerate(idx_vect):
        if idx_pos < idx_bounds[0] or idx_pos > idx_bounds[1]:
            return None
        else:
            pos_point[i] = points[idx_pos]
    return pos_point

Using the actual movements to act on the starting point index, then translating it back to the positions.

from operator import add
final_pts = []
for movement in actual_movements:
    final_pt_in_idx = list(map(add, starting_pt_in_idx, movement))
    final_point = translate_idx_to_pos(final_pt_in_idx, possible_pts)
    if final_point is not None:
        final_pts.append(final_point)

print(final_pts)

This gives

[40, 30, 40]
[30, 40, 40]
[30, 20, 40]
[30, 30, 50]
[30, 30, 30]
[20, 50, 40]
[20, 40, 50]
[20, 20, 50]
[20, 40, 30]
[20, 30, 60]
[20, 30, 20]
Porphyrin answered 2/7, 2018 at 11:12 Comment(4)
I like this approch on the ids and movement :) ThanksFletcher
Actually what I was trying to achieve (c.f. Edit). Works great, and fast!Fletcher
Glad you like it. And indeed it is fast because it enumerates only the needed combinations and builds from there.Porphyrin
Yes :) Exactly what I was trying to achieve. I did some slight changes to turn that into a generator, and it's working perfectly :)Fletcher
I
0

You could try a recursive approach: Keep track of the remaining distance and generate combinations of just the relevant elements.

def get_with_distance(poss, des, dist, k=0):
    if k < len(des):
        i = poss.index(des[k])
        for n in range(-dist, dist+1):
            if 0 <= i + n < len(poss):
                for comb in get_with_distance(poss, des, dist - abs(n), k+1):
                    yield [poss[i + n]] + comb
    elif dist == 0:
        yield []

This can still run into a few "dead-ends" if there is still dist remaining, but the des list is empty, but overall, this will check much fewer combinations than generating all the combinations up-front and checking their distance afterwards.

If the list of possible elements is longer, you might want to create a dict mapping elements to their index first, so you don't have to do poss.index(first) each time.

Example:

possibles = [20, 30, 40, 50, 60, 70, 80, 100]
desired = [20, 30, 40]
for x in get_with_distance(possibles, desired, 2):
    print(x)

Output:

[20, 20, 30]
[20, 20, 50]
[20, 30, 20]
[20, 30, 60]
[20, 40, 30]
[20, 40, 50]
[20, 50, 40]
[30, 20, 40]
[30, 30, 30]
[30, 30, 50]
[30, 40, 40]
[40, 30, 40]
Interfere answered 2/7, 2018 at 10:6 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.