Using the pandas Rolling object to create a sliding window of lists

This outstanding post shows quite clearly how to use the pandas cumsum() method to build a DataFrame column of nested lists (lists of lists) whose dimensions make them suitable as time series input to an LSTM. I would like to do something very similar, but with a rolling window of lists instead of a cumulative aggregation of lists.

For example, say you had a DataFrame with 3 time series:

 A   B   C
 1   2   3
 4   5   6
 7   8   9
10  11  12
13  14  15

The article I linked to above shows how to use pandas cumsum() to build a DataFrame column of nested lists that looks like this:

[[1, 2, 3]]
[[1, 2, 3], [4, 5, 6]]
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

The key lines of Python code that accomplish this are as follows:

input_cols = list(df.columns)
df['single_list'] = df[input_cols].apply(tuple, axis=1).apply(list)
df['double_encapsulated'] = df.single_list.apply(lambda x: [list(x)])
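
For context, here is a minimal sketch of how those lines lead to the cumulative lists, using the first four rows of the example. The final accumulation step is my assumption about what the linked post does with cumsum(); I use itertools.accumulate as a stand-in because it concatenates lists the same way and does not depend on how a given pandas version handles cumsum() on a column of lists.

```python
from itertools import accumulate

import pandas as pd

df = pd.DataFrame({"A": [1, 4, 7, 10], "B": [2, 5, 8, 11], "C": [3, 6, 9, 12]})

input_cols = list(df.columns)
# One list per row: [1, 2, 3], [4, 5, 6], ...
single_list = df[input_cols].apply(tuple, axis=1).apply(list)
# Wrap each row list in an outer list: [[1, 2, 3]], [[4, 5, 6]], ...
double_encapsulated = single_list.apply(lambda x: [list(x)])
# Running concatenation of the wrapped rows (stand-in for the cumsum() step).
cumulative = list(accumulate(double_encapsulated))

for item in cumulative:
    print(item)
```

Each entry holds every row seen so far, which is exactly what I do not want: the entries keep growing instead of staying at a fixed window length.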

But I want a rolling window of lists, not a cumulative sum of lists. It should look like this:

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
[[4, 5, 6], [7, 8, 9], [10, 11, 12]]
[[7, 8, 9], [10, 11, 12], [13, 14, 15]]

Can this be done with a Rolling object?

Beriberi answered 30/1, 2019 at 3:32 Comment(0)

Here are a couple of tricks to achieve your desired results:

import pandas as pd
dd = {'A': {0: 1, 1: 4, 2: 7, 3: 10, 4: 13},
 'B': {0: 2, 1: 5, 2: 8, 3: 11, 4: 14},
 'C': {0: 3, 1: 6, 2: 9, 3: 12, 4: 15}}
df = pd.DataFrame(dd)

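# Collect the index labels of each complete window as a side effect of apply().
# append() returns None, so "or 0" gives apply() the number it expects back.
list_of_indexes = []
df.index.to_series().rolling(3).apply((lambda x: list_of_indexes.append(x.tolist()) or 0), raw=False)
list_of_indexes

# Turn each row into a list, then swap every index label for its row.
d1 = df.apply(tuple, axis=1).apply(list)
[[d1[ix] for ix in x] for x in list_of_indexes]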

Output:

[[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
 [[4, 5, 6], [7, 8, 9], [10, 11, 12]],
 [[7, 8, 9], [10, 11, 12], [13, 14, 15]]]

Details:

Create an empty list. Then use rolling and apply with a function whose real purpose is its side effect: "append" returns None, so the "or" operator with zero makes the function return 0 (a number), which rolling apply requires. Because the dataframe index is the input to the rolling function, "list_of_indexes" ends up as a rolling list of indexes of the original dataframe, df. Next, convert each row of the dataframe into a list using "apply tuple" and "apply list"; the result is d1.

Lastly, use a list comprehension over d1 to replace each index in list_of_indexes with the corresponding row list from the original dataframe.
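
If the append-inside-apply trick feels fragile, the same two steps (work out which rows belong to each window, then look the rows up) can be sketched with plain slicing. This is an alternative, not what the code above does, and the helper name rolling_rows is made up for illustration.

```python
import pandas as pd


def rolling_rows(frame, window):
    """Return a list of windows, each a list of `window` consecutive rows."""
    rows = frame.values.tolist()
    # One slice per complete window; incomplete leading windows are never built.
    return [rows[i:i + window] for i in range(len(rows) - window + 1)]


df = pd.DataFrame({"A": [1, 4, 7, 10, 13],
                   "B": [2, 5, 8, 11, 14],
                   "C": [3, 6, 9, 12, 15]})

windows = rolling_rows(df, 3)
for w in windows:
    print(w)
```

It avoids mutating a list from inside apply, at the cost of not using the Rolling object at all.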

Hafnium answered 30/1, 2019 at 4:15 Comment(2)
Which version of python are you using? I get the following: TypeError: apply() got an unexpected keyword argument 'raw'. - Beriberi
I am using pandas 0.24.0. - Hafnium

Since pandas 1.1, Rolling objects are iterable, so you can do:

[win.values.tolist() for win in df.rolling(3) if win.shape[0] == 3]

The if makes sure we only get complete windows; the first two windows yielded contain fewer than 3 rows.
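
Since the goal in the question is LSTM input, here is a rough sketch of taking the iterated windows one step further and stacking them into a 3D NumPy array. It assumes pandas 1.1 or later and rolls over rows (the default axis); the variable names are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 4, 7, 10, 13],
                   "B": [2, 5, 8, 11, 14],
                   "C": [3, 6, 9, 12, 15]})

window = 3
# Iterating a Rolling object yields one sub-DataFrame per row of df;
# the leading ones are shorter than `window`, so filter them out.
windows = [win.values.tolist() for win in df.rolling(window) if len(win) == window]

# Stack into an array of shape (n_windows, window, n_columns).
tensor = np.array(windows)
print(tensor.shape)
```

The three axes then correspond to samples, time steps, and features, the layout the question describes for time series input.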

Asparagus answered 9/12, 2020 at 12:1 Comment(0)
