Cumsum Reset based on a condition in Pandas
Asked Answered
A

1

11

My question is very similar to Cumsum within group and reset on condition in pandas and Pandas: cumsum per category based on additional condition but they don't quite get me there due to my conditional requirements. I have a data frame that looks like this:

  TransactionId     Delta
          14          2
          14          3
          14          1
          14          2
          15          4
          15          2
          15          3

I want to create another column "Cumulative" that calculates the cumsum of Delta for each TransactionId. So the result would look like this:

  TransactionId     Delta    Cumulative
          14          2          2
          14          3          5
          14          1          6
          14          2          8
          15          4          4
          15          2          6
          15          3          9

I have the condition for checking TransactionId equality setup:

c1 = df.TransactionId.eq(df.TransactionId.shift())

But I can't figure out how to add the Delta value to the previous Cumulative row.

Arnelle answered 31/10, 2018 at 8:39 Comment(0)
F
15

Use groupby.cumsum:

df['Cumulative'] = df.groupby('TransactionId')['Delta'].cumsum()

print (df)

  TransactionId  Delta  Cumulative
0       14         2       2
1       14         3       5
2       14         1       6
3       14         2       8
4       15         4       4 
5       15         2       6 
6       15         3       9
Femur answered 31/10, 2018 at 8:42 Comment(2)
This is the most clear explanation of this problem that I have seen.Wordless
What if TransactionId would be a bool and you want to restart count if bool = True ?Voidance

© 2022 - 2024 — McMap. All rights reserved.