How to fill the gaps in a list of tuples

Asked 25/3, 2020 at 14:9 Answered 25/3, 2020 at 14:53

I have a list of tuples like the following:

[(1, 'Red'), (2, 'Yellow'), (6, 'Pink'), (7, 'Blue'), (8, 'Green')]

The number in each tuple represents the index. However, since some of the indexes are missing from my input file, I need to insert tuples into the list so that it looks like the following:

[(1, 'Red'), (2, 'Yellow'), (3, None), (4, None), (5, None), (6, 'Pink'), (7, 'Blue'), (8, 'Green')]

If you have any ideas, I would really appreciate it if you took the time to comment.

Burkle answered 25/3, 2020 at 14:9 Comment(10)

that doesn't look like valid python – Biblicist 25/3, 2020 at 14:9

Can you share a valid list? – Benadryl 25/3, 2020 at 14:10

Updating tuples is by definition not possible. Inserting values—which may be tuples—into a list on the other hand is pretty trivial. What problem do you have doing so? – Cucurbit 25/3, 2020 at 14:11

You can't update the value of a tuple, but you can replace them if they're stored in a list or other mutable container. – Alysaalyse 25/3, 2020 at 14:15

[(1, 'Red'), (2, 'Yellow'), (6, 'Pink'), (7, 'Blue'), (8, 'Green')] that is the original list, and this is what I expect: [(1, 'Red'), (2, 'Yellow'), (3, None), (4, None), (5, None), (6, 'Pink'), (7, 'Blue'), (8, 'Green')] – Burkle 25/3, 2020 at 14:17

I've updated the question with valid Python :) – Kordofan 25/3, 2020 at 14:21

@Alysaalyse my problem is that I need to insert the tuples, which should contain the index numbers missing from the original list, in this case 3, 4, 5 – Burkle 25/3, 2020 at 14:21

Yes... Voted to reopen. – Illaudable 25/3, 2020 at 14:25

What have you tried so far? Is your list always sorted by index? Do you need to modify the list in place, or is it fine to create a new list from the old one? – Ledeen 25/3, 2020 at 14:27

It's fine to create a new one as well. What I tried was to use the index() function of the list and then assign it to each tuple, but the problem was that it was returning index = 3 for all the missing tuples – Burkle 25/3, 2020 at 14:30

Here is the simplest implementation I can propose, although it is not very efficient for large lists:

test = [(1, 'color: Red'), (2, 'color: Yellow'), (6, 'color: Pink'), (7, 'color: Blue'), (8, 'color: Green')]


# Largest index present in the list
max_index = max(index for index, _ in test)

missing_values = []
for i in range(1, max_index + 1):
    # i is missing when no tuple in the list carries it as its index
    if all(i != index for index, _ in test):
        missing_values.append((i, 'color: None'))

new_test = test + missing_values
new_test_sorted = sorted(new_test, key=lambda x:x[0])
print(new_test_sorted)

That gives:

[(1, 'color: Red'), (2, 'color: Yellow'), (3, 'color: None'), (4, 'color: None'), (5, 'color: None'), (6, 'color: Pink'), (7, 'color: Blue'), (8, 'color: Green')]
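If the nested scan becomes a bottleneck on large lists, one possible variation is to collect the existing indexes in a set first, so that each membership check is constant time. This is only a sketch: the function name `fill_gaps` and its `placeholder` parameter are hypothetical, and it assumes, like the code above, that indexes start at 1.

```python
def fill_gaps(pairs, placeholder='color: None'):
    """Return a new sorted list with a placeholder tuple for every index missing between 1 and the maximum."""
    present = {index for index, _ in pairs}  # set lookups are O(1) on average
    if not present:
        return []
    missing = [(i, placeholder)
               for i in range(1, max(present) + 1)
               if i not in present]
    return sorted(list(pairs) + missing, key=lambda item: item[0])


test = [(1, 'color: Red'), (2, 'color: Yellow'), (6, 'color: Pink'),
        (7, 'color: Blue'), (8, 'color: Green')]
filled = fill_gaps(test)
print(filled)
```

The original list is left untouched; `fill_gaps` builds and returns a new one.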

Nitid answered 25/3, 2020 at 14:39 Comment(2)

Thank you for your answer. Can you please tell me what happens in the line list_items = lambda test : test[0]? – Burkle 25/3, 2020 at 15:9

Absolutely nothing! It is a typo! I will remove it now ^_^ – Nitid 25/3, 2020 at 15:19

Here's a simple approach you can try out. It first gets the minimum and maximum of the number range, then finds the missing numbers using the set difference set(A) - set(B), then concatenates the missing numbers with the original list and sorts the result with sorted(). I've added comments to explain the approach as well :)

lst = [(1, 'Red'), (2, 'Yellow'), (6, 'Pink'), (7, 'Blue'), (8, 'Green')]

# Get only numbers
active_numbers = [x for x, _ in lst]

# Get min and max ranges
min_number, max_number = min(active_numbers), max(active_numbers)

# Get all possible numbers in range
all_numbers = set(range(min_number, max_number + 1))

# Find missing numbers using set difference set(A) - set(B)
difference = all_numbers - set(active_numbers)

# Add missing numbers and original numbers and sort result
result = sorted(lst + [(x, None) for x in difference])

print(result)

Output:

[(1, 'Red'), (2, 'Yellow'), (3, None), (4, None), (5, None), (6, 'Pink'), (7, 'Blue'), (8, 'Green')]

Kordofan answered 25/3, 2020 at 14:29 Comment(0)

Assuming that either the list is sorted or that the result doesn't need to preserve the list order, you can use a dict created from the original list.

z = [(1, 'Red'), (2, 'Yellow'), (6, 'Pink'), (7, 'Blue'), (8, 'Green')]
d = dict(z)
low, high = min(d), max(d)
result = [(i, d.get(i)) for i in range(low, high + 1)]

Perionychium answered 25/3, 2020 at 14:47 Comment(0)

The following code worked for me. It's very naive and not particularly efficient. min_key and max_key give you the interval bounds for your keys, so you don't always have to start at 0. For every index in that range, setdefault() stores None if the index is missing; if a value is already present, nothing is changed. The range stops just before max_key, which is fine because max_key is always in the dictionary already.

Then, the items in the dictionary will be sorted based on the key value.

    data = [(1, 'Red'), (2, 'Yellow'), (6, 'Pink'), (7, 'Blue'), (8, 'Green')]
    data_as_dict = dict(data)
    max_key = max(data_as_dict.keys())
    min_key = min(data_as_dict.keys())

    for i in range(min_key, max_key):
        data_as_dict.setdefault(i, None)
    data_as_dict = sorted(data_as_dict.items(), key=lambda item: item[0])
    print(data_as_dict)

[(1, 'Red'), (2, 'Yellow'), (3, None), (4, None), (5, None), (6, 'Pink'), (7, 'Blue'), (8, 'Green')]

However, if you don't mind starting at index 0, you might want to have a look at using a list containing just your colours, where the first value of your tuple is the index inside the list, to optimise the memory footprint.
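As a rough illustration of that list-based layout, here is a minimal sketch. The variable names `colours` and `pairs` are made up for the example, and it assumes the indexes are non-negative integers and that the input is not empty.

```python
data = [(1, 'Red'), (2, 'Yellow'), (6, 'Pink'), (7, 'Blue'), (8, 'Green')]

# One slot per index from 0 up to the largest index; unused slots stay None.
colours = [None] * (max(i for i, _ in data) + 1)
for i, colour in data:
    colours[i] = colour

print(colours)

# The (index, colour) tuples can be rebuilt on demand with enumerate().
pairs = list(enumerate(colours))
print(pairs)
```

Note that slot 0 is also None here, because the example data starts at index 1.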

Hope it helps!

Regal answered 25/3, 2020 at 14:29 Comment(0)

Here is a simple, one-pass method that keeps the original order:

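data = [(1, 'Red'), (2, 'Yellow'), (6, 'Pink'), (7, 'Blue'), (8, 'Green')]
out = []
lasti = 0  # assumes the indexes start at 1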
for i, v in data:
    if i - lasti > 1:
        # if not continued, fix the gap
        for j in range(lasti + 1, i):
            out.append((j, None))
    out.append((i, v)) # add the value
    lasti = i
print(out)

Output:

[(1, 'Red'), (2, 'Yellow'), (3, None), (4, None), (5, None), (6, 'Pink'), (7, 'Blue'), (8, 'Green')]
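The same one-pass idea could also be written as a generator that does not assume the first index is 1. This is only a sketch: `iter_filled` and its `fill` parameter are hypothetical names, and the input is assumed to be sorted by index with no duplicates.

```python
def iter_filled(pairs, fill=None):
    """Yield the pairs in order, inserting (index, fill) for every skipped index.

    Assumes pairs is sorted by index and contains no duplicate indexes.
    """
    previous = None
    for index, value in pairs:
        if previous is not None:
            # Emit a placeholder for each index skipped since the last pair
            for gap in range(previous + 1, index):
                yield (gap, fill)
        yield (index, value)
        previous = index


data = [(3, 'Red'), (4, 'Yellow'), (7, 'Pink')]
filled = list(iter_filled(data))
print(filled)
```

Because nothing is emitted before the first pair, a list that starts at 3 stays starting at 3 instead of being padded from 1.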

Earlap answered 25/3, 2020 at 14:53 Comment(0)
