No imports needed. This works on a list of objects or a string, or anything else that supports var[i:] slicing. Tested on Python 3.6.
# Creates windows in which consecutive windows overlap in all but 1 element
def ngrams_list(a_list, window_size=5, skip_step=1):
    return list(zip(*[a_list[i:] for i in range(0, window_size, skip_step)]))
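As a quick sanity check, here is a minimal sketch of calling the function on a short string and on a list. The function is repeated so the snippet runs on its own; the inputs are made up for illustration.

```python
def ngrams_list(a_list, window_size=5, skip_step=1):
    # Same function as above, repeated so this snippet is self-contained.
    return list(zip(*[a_list[i:] for i in range(0, window_size, skip_step)]))

# window_size=2 gives adjacent pairs.
pairs = ngrams_list("ABCDE", window_size=2)
print(pairs)    # [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E')]

# It behaves the same way on a list of arbitrary objects.
triples = ngrams_list([1, 2, 3, 4, 5], window_size=3)
print(triples)  # [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
```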
The list comprehension by itself creates the following, with a_list being the alphabet (shown with window_size=5; OP would want window_size=2):
['ABCDEFGHIJKLMNOPQRSTUVWXYZ',
'BCDEFGHIJKLMNOPQRSTUVWXYZ',
'CDEFGHIJKLMNOPQRSTUVWXYZ',
'DEFGHIJKLMNOPQRSTUVWXYZ',
'EFGHIJKLMNOPQRSTUVWXYZ']
zip(*result_of_for_loop) then collects every full vertical column as a tuple. Since zip stops at the shortest input, incomplete columns at the end are dropped. And if you want less than all-but-one overlap:
# You can sample that output to get less overlap:
def sliding_windows_with_overlap(a_list, window_size=5, overlap=2):
    zip_output_as_list = ngrams_list(a_list, window_size)
    # Keep every (overlap + 1)-th window
    return zip_output_as_list[::overlap + 1]
With overlap=2 it skips the windows starting with B and C and picks the one starting with D:
[('A', 'B', 'C', 'D', 'E'),
('D', 'E', 'F', 'G', 'H'),
('G', 'H', 'I', 'J', 'K'),
('J', 'K', 'L', 'M', 'N'),
('M', 'N', 'O', 'P', 'Q'),
('P', 'Q', 'R', 'S', 'T'),
('S', 'T', 'U', 'V', 'W'),
('V', 'W', 'X', 'Y', 'Z')]
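Note that sliding_windows_with_overlap steps by overlap + 1, so overlap equals the number of shared elements only when window_size == 2 * overlap + 1 (as with the defaults 5 and 2). If you want overlap to always mean "elements shared by consecutive windows", a rough sketch could look like the following. The function name and the ValueError check are my own assumptions, not part of the code above.

```python
def windows_with_shared_elements(seq, window_size=5, overlap=2):
    # Hypothetical helper: `overlap` is the number of elements that
    # consecutive windows share, so the stride is window_size - overlap.
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must satisfy 0 <= overlap < window_size")
    step = window_size - overlap
    last_start = len(seq) - window_size  # last index where a full window fits
    return [tuple(seq[i:i + window_size])
            for i in range(0, last_start + 1, step)]

alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = windows_with_shared_elements(alphabet, window_size=5, overlap=2)
print(result[0], result[1])
# ('A', 'B', 'C', 'D', 'E') ('D', 'E', 'F', 'G', 'H')
```

With window_size=5 and overlap=2 this gives the same windows as the output above; with other values the two functions differ.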
EDIT: looks like this is similar to what @chmullig provided, with options
None at position 0. – Sunup