Say I have the following dataframe
import pandas as pd
df = pd.DataFrame({ 'distance':[2.0, 3.0, 1.0, 4.0],
'velocity':[10.0, 20.0, 5.0, 40.0] })
gives the dataframe
distance velocity
0 2.0 10.0
1 3.0 20.0
2 1.0 5.0
3 4.0 40.0
How can I calculate the average of the velocity column over the rolling sum of the distance column? With the example above, create a rolling sum over the last N rows in order to get a minimum cumulative distance of 5, and then calculate the average velocity over those rows.
My target output would then be like this:
distance velocity rv
0 2.0 10.0 NaN
1 3.0 20.0 15.0
2 1.0 5.0 11.7
3 4.0 40.0 22.5
where
15.0 = (10+20)/2 (2 because 3 + 2 >= 5)
11.7 = (10 + 20 + 5)/3 (3 because 1 + 3 + 2 >= 5)
22.5 = (5 + 40)/2 (2 because 4 + 1 >= 5)
Update: in Pandas-speak, my code should find the index of the reverse cumulative distance sum back from my current record (such that it is 5 or greater), and then use that index to calculate the start of the moving average.