Splitting list based on missing numbers in a sequence
L

6

28

I am looking for the most pythonic way of splitting a list of numbers into smaller lists based on a number missing in the sequence. For example, if the initial list was:

seq1 = [1, 2, 3, 4, 6, 7, 8, 9, 10]

the function would yield:

[[1, 2, 3, 4], [6, 7, 8, 9, 10]]

or

seq2 = [1, 2, 4, 5, 6, 8, 9, 10]

would result in:

[[1, 2], [4, 5, 6], [8, 9, 10]]
Lynd answered 30/6, 2010 at 12:59 Comment(3)
How do you know which number is "missing"? Are you requiring that the sequences be simple ascending integers without duplicates? Please state the rules for finding a missing value. - Darees
Do you have some working code that you feel isn't pythonic that you could post? - Fermium
What do you want it for? I've got a code snippet for reducing number sequences where contiguous numbers can be represented as ranges, e.g. [[1,4],[6,10]]. - Melonymelos
P
54

Python 3 version of the code from the old Python documentation:

>>> # Find runs of consecutive numbers using groupby.  The key to the solution
>>> # is differencing with a range so that consecutive numbers all appear in
>>> # same group.
>>> from itertools import groupby
>>> from operator import itemgetter
>>> data = [ 1,  4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
>>> for k, g in groupby(enumerate(data), lambda i_x: i_x[0] - i_x[1]):
...     print(list(map(itemgetter(1), g)))
...
[1]
[4, 5, 6]
[10]
[15, 16, 17, 18]
[22]
[25, 26, 27, 28]

The groupby function from the itertools module starts a new group every time the key function's return value changes. The trick is that the key is the element's position in the list minus its value. For consecutive numbers this difference stays constant, and it changes whenever there is a gap in the numbers.

The itemgetter function comes from the operator module; you have to import it along with groupby from itertools for this example to work.

Alternatively, as a list comprehension:

>>> [list(map(itemgetter(1), g)) for k, g in groupby(enumerate(seq2), lambda i_x: i_x[0] - i_x[1])]
[[1, 2], [4, 5, 6], [8, 9, 10]]
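
To see why the key separates the runs, it can help to print the key next to each element. This is only a small illustration, assuming the input is sorted ascending integers without duplicates; `run_keys` is a name made up for this sketch, and `seq2` is the list from the question.

```python
from itertools import groupby

seq2 = [1, 2, 4, 5, 6, 8, 9, 10]

# Pair each value with (index - value); run_keys is a hypothetical name.
run_keys = [(x, i - x) for i, x in enumerate(seq2)]
print(run_keys)
# [(1, -1), (2, -1), (4, -2), (5, -2), (6, -2), (8, -3), (9, -3), (10, -3)]

# groupby starts a new group whenever that key changes.
runs = [[x for _, x in g]
        for _, g in groupby(enumerate(seq2), lambda i_x: i_x[0] - i_x[1])]
print(runs)
# [[1, 2], [4, 5, 6], [8, 9, 10]]
```

Each gap in the values shifts the key by the size of the gap, which is what makes groupby cut there.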
Proctoscope answered 30/6, 2010 at 13:6 Comment(1)
Should also post a solution for when the initial sequence is not an integer! Float? Datetime? - Roselinerosella
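
One possible way to handle the non-integer case raised in the comment is to compare neighbours against an explicit step instead of relying on the index trick. This is a rough sketch, not part of the answer above: `split_by_step` and its `step` parameter are hypothetical, the input is assumed to be sorted, and for floats an exact comparison like this may need a tolerance.

```python
from datetime import date, timedelta

def split_by_step(values, step):
    """Split sorted values wherever neighbours differ by more than step.

    Hypothetical helper; works for any type supporting subtraction
    and comparison (int, float, date with timedelta, ...).
    """
    groups = []
    for v in values:
        if groups and v - groups[-1][-1] <= step:
            groups[-1].append(v)   # still within one step of the previous value
        else:
            groups.append([v])     # gap found (or first element): start a new run
    return groups

print(split_by_step([0.5, 1.0, 1.5, 3.0], 0.5))
# [[0.5, 1.0, 1.5], [3.0]]

days = [date(2010, 6, 28), date(2010, 6, 29), date(2010, 7, 1)]
print(split_by_step(days, timedelta(days=1)))
```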
S
9

This is a solution that works in Python 3 (based on previous answers that work in Python 2 only).

>>> from itertools import groupby
>>> from operator import itemgetter
>>> groups = []
>>> for k, g in groupby(enumerate(seq2), lambda x: x[0]-x[1]):
...     groups.append(list(map(itemgetter(1), g)))
...
>>> print(groups)
[[1, 2], [4, 5, 6], [8, 9, 10]]

Or, as a list comprehension:

>>> [list(map(itemgetter(1), g)) for k, g in groupby(enumerate(seq2), lambda x: x[0]-x[1])]
[[1, 2], [4, 5, 6], [8, 9, 10]]

Changes were needed because

  • Removal of tuple parameter unpacking (PEP 3113)
  • map returning an iterator instead of a list
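
The two points above can be seen in a short sketch. The variable names (`key`, `lazy`, `first_run`) are chosen only for this illustration, and `seq2` is the list from the question.

```python
from itertools import groupby
from operator import itemgetter

seq2 = [1, 2, 4, 5, 6, 8, 9, 10]

# Python 2 accepted `lambda (i, x): i - x`; in Python 3 that is a SyntaxError,
# so the (index, value) pair is taken as one argument and indexed instead.
key = lambda pair: pair[0] - pair[1]

for _, g in groupby(enumerate(seq2), key):
    lazy = map(itemgetter(1), g)   # an iterator in Python 3, not a list
    first_run = list(lazy)         # materialise it before groupby moves on
    break

print(first_run)
# [1, 2]
```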
Subduct answered 2/2, 2017 at 13:5 Comment(0)
L
6

Another option which doesn't need itertools etc.:

>>> data = [1, 4, 5, 6, 10, 15, 16, 17, 18, 22, 25, 26, 27, 28]
>>> spl = [0]+[i for i in range(1,len(data)) if data[i]-data[i-1]>1]+[None]
>>> [data[b:e] for (b, e) in [(spl[i-1],spl[i]) for i in range(1,len(spl))]]
[[1], [4, 5, 6], [10], [15, 16, 17, 18], [22], [25, 26, 27, 28]]
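
The same slicing idea might read more easily as a small function that pairs start and end indices with zip. This is a sketch under the same assumptions (ascending integers, a gap being any difference greater than 1); `split_with_slices` is a hypothetical name, and the empty-list guard is an addition not present in the one-liner.

```python
def split_with_slices(data):
    """Split data into runs of consecutive integers using slice boundaries."""
    if not data:
        return []
    # Index of the first element of every run.
    starts = [0] + [i for i in range(1, len(data)) if data[i] - data[i - 1] > 1]
    # Each run ends where the next one starts; the last run ends at len(data).
    ends = starts[1:] + [len(data)]
    return [data[b:e] for b, e in zip(starts, ends)]

print(split_with_slices([1, 4, 5, 6, 10, 15, 16, 17, 18, 22, 25, 26, 27, 28]))
# [[1], [4, 5, 6], [10], [15, 16, 17, 18], [22], [25, 26, 27, 28]]
```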
Lucan answered 30/6, 2010 at 14:5 Comment(0)
F
1

My way

alist = [1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 15, 16, 17, 18, 20, 21, 22]
newlist = []
start = 0
end = 0
for index,value in enumerate(alist):
    if index < len(alist)-1:
        if alist[index+1]> value+1:
            end = index +1
            newlist.append(alist[start:end])
            start = end
    else:
        newlist.append(alist[start:len(alist)])
print(newlist)

Result

[[1, 2, 3, 4, 5, 6, 7, 8], [10, 11, 12], [15, 16, 17, 18], [20, 21, 22]]
Forgot answered 16/3, 2018 at 5:52 Comment(0)
A
1

I like this one better because it doesn't require any extra libraries or special treatment for the first case:

a = [1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 15, 16, 17, 18, 20, 21, 22]
b = []
subList = []
prev_n = -1

for n in a:
    if prev_n+1 != n:            # end of previous subList and beginning of next
        if subList:              # if subList already has elements
            b.append(subList)
            subList = []
    subList.append(n)
    prev_n = n

if subList:
    b.append(subList)

print(a)
print(b)

Output:

[1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 15, 16, 17, 18, 20, 21, 22]

[[1, 2, 3, 4, 5, 6, 7, 8], [10, 11, 12], [15, 16, 17, 18], [20, 21, 22]]
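
Since the question asks for a function that yields the sub-lists, the same loop could also be wrapped in a generator. This is a minimal sketch of that variation, assuming ascending integers; `iter_runs` is a hypothetical name, and comparing against the last element of the current run replaces the `prev_n` sentinel.

```python
def iter_runs(numbers):
    """Yield lists of consecutive integers from an ascending sequence."""
    run = []
    for n in numbers:
        if run and n != run[-1] + 1:   # gap: finish the current run
            yield run
            run = []
        run.append(n)
    if run:                            # emit the final run, if any
        yield run

print(list(iter_runs([1, 2, 3, 4, 6, 7, 8, 9, 10])))
# [[1, 2, 3, 4], [6, 7, 8, 9, 10]]
```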

Amalberga answered 18/1, 2019 at 23:4 Comment(0)
T
0

Short one-liner using numpy:

import numpy as np

seq2 = [1, 2, 4, 5, 6, 8, 9, 10]
np.split(seq2, np.where(np.diff(seq2) > 1)[0] + 1)

Result:

[array([1, 2]), array([4, 5, 6]), array([ 8,  9, 10])]
Tautonym answered 23/9, 2023 at 17:57 Comment(0)
