Split a geometric progression efficiently in Python (Pythonic way)

I am trying to compute a column based on a split geometric progression. Is there an efficient way of doing it? The data set has millions of rows. The column I need is "Traded_quantity".

                      Marker  Action  Traded_quantity
    2019-11-05 09:25       0                        0
               09:35       2  BUY                   3
               09:45       0                        0
               09:55       1  BUY                   4
               10:05       0                        0
               10:15       3  BUY                  56
               10:24       6  BUY                8128

turtle = 2 (User defined)

base_quantity = 1 (User defined)

    def turtle_split(row):
        if row['Action'] == 'BUY':
            return base_quantity * (turtle ** row['Marker'] - 1) // (turtle - 1)
        else:
            return 0
    df['Traded_quantity'] = df.apply(turtle_split, axis=1).round(0).astype(int)

Calculation

For row 0, Traded_quantity should be zero (because the Marker is zero).

For row 1, Traded_quantity should be (1x1) + (1x2) = 3. Marker 2 is split into 1 and 1: the first 1 is multiplied by base_quantity (1x1), and the second 1 is multiplied by the result of the first times turtle (1x2). The two results are then summed.

For row 2, Traded_quantity should be zero (because the Marker is zero).

For row 3, Traded_quantity should be (2x2) = 4. Marker 1 is multiplied by the last split from row 1 times turtle, i.e. 2x2.

For row 4, Traded_quantity should be zero (because the Marker is zero).

For row 5, Traded_quantity should be (4x2)+(4x2x2)+(4x2x2x2) = 56. Marker 3 is split into 1, 1 and 1: the first 1 is multiplied by the last split from row 3 times turtle (4x2), the second by the result of the first times turtle (8x2), and the third by the result of the second times turtle (16x2). The three results are then summed.

For 6th Row, Traded_quantity should be (32x2)+(32x2x2)+(32x2x2x2)+(32x2x2x2x2)+(32x2x2x2x2x2) = 8128

Whenever there is a BUY, the traded quantity is calculated starting from the last split of the previous BUY times turtle.
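To make the rule above concrete, here is a minimal sketch that lists every individual split per row instead of using a formula. The function name `split_batches` and its parameters are made up for illustration, and it only uses the first six Marker values from the sample, assuming every non-zero Marker is a BUY.

```python
def split_batches(markers, base_quantity=1, turtle=2):
    """List the individual splits for each row (hypothetical helper)."""
    next_batch = base_quantity  # quantity of the next split to be traded
    result = []
    for marker in markers:
        batches = []
        for _ in range(marker):
            batches.append(next_batch)
            next_batch *= turtle  # each split is turtle times the previous one
        result.append(batches)
    return result


batches = split_batches([0, 2, 0, 1, 0, 3])
print(batches)                    # [[], [1, 2], [], [4], [], [8, 16, 32]]
print([sum(b) for b in batches])  # [0, 3, 0, 4, 0, 56]
```

The inner loop is slow for large Marker values, so this is only meant to show where the numbers come from, not to be run on millions of rows.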

It turns out the code generates the correct Traded_quantity when there are no zeros in Marker. Once there is a gap of a couple of zeros, the plain geometric progression formula no longer helps: I would need the previous figure (from a cache) to recalculate Traded_quantity. I tried lru_cache with recursion, but it didn't work.
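A small sketch of the mismatch, assuming the first six Marker values from the sample and a hypothetical helper `stateless_quantity` that mirrors the closed-form sum in `turtle_split`. Because the formula always starts from `base_quantity`, every BUY after the first one comes out too small.

```python
def stateless_quantity(marker, base_quantity=1, turtle=2):
    # Closed-form geometric sum that always starts from base_quantity
    return base_quantity * (turtle ** marker - 1) // (turtle - 1)


markers = [0, 2, 0, 1, 0, 3]
expected = [0, 3, 0, 4, 0, 56]  # values from the Calculation section

stateless = [stateless_quantity(m) for m in markers]
mismatches = [i for i, (s, e) in enumerate(zip(stateless, expected)) if s != e]

print(stateless)   # [0, 3, 0, 1, 0, 7]
print(mismatches)  # [3, 5]
```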

Moen answered 22/1, 2022 at 7:31 Comment(3)
"Calculation" I don't understand any of this. Where do these numbers come from? How do you know they are the right numbers? What is the underlying logic? - Eyebright
It appears that you really have a math question, not a programming question. Hint: can you think of a rule that quickly gives you the sum of 2 + (2*2) + (2*2*2) + ...? If you try writing out the result of that for a few terms, do you notice a pattern? (If you add 2 to each of the results, do you see a pattern?) - Eyebright
Edited the calculation and tried adding more details about it. - Moen

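This should work. It keeps the running state in base_quantity, so each BUY continues from where the previous one ended: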

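    def turtle_split(row):
        # base_quantity is module-level state: the first split of the next BUY
        global base_quantity
        if row['Action'] == 'BUY':
            # Sum of this row's splits (geometric series starting at base_quantity)
            summation = base_quantity * (turtle ** row['Marker'] - 1) // (turtle - 1)
            # Advance the starting quantity past the splits just used
            base_quantity = base_quantity * turtle ** row['Marker']
            return summation
        else:
            return 0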
Lambart answered 22/1, 2022 at 10:9 Comment(0)
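For millions of rows, one possible alternative to a row-wise apply with a global is a single pass over plain Python lists. This is only a sketch under stated assumptions, not the method from the answer above: the function name `traded_quantities` is hypothetical, `turtle` is assumed to be an integer greater than 1, and only the first six sample rows are used. It relies on the fact that the first split of a row is `base_quantity * turtle ** k`, where `k` is the number of splits already traded by earlier BUY rows.

```python
from itertools import accumulate


def traded_quantities(markers, actions, base_quantity=1, turtle=2):
    """Traded quantity per row, carrying the progression across gaps."""
    # Only BUY rows consume splits; every other row consumes none.
    splits = [m if a == "BUY" else 0 for m, a in zip(markers, actions)]
    # Running total of splits up to and including each row.
    done = list(accumulate(splits))
    quantities = []
    for m, total in zip(splits, done):
        # First split of this row: base_quantity scaled by all earlier splits.
        first = base_quantity * turtle ** (total - m)
        # Geometric sum of m splits starting at `first` (0 when m == 0).
        quantities.append(first * (turtle ** m - 1) // (turtle - 1))
    return quantities


markers = [0, 2, 0, 1, 0, 3]
actions = ["", "BUY", "", "BUY", "", "BUY"]
print(traded_quantities(markers, actions))  # [0, 3, 0, 4, 0, 56]
```

With a DataFrame, the inputs could be taken from `df['Marker'].tolist()` and `df['Action'].tolist()` and the result assigned back to `df['Traded_quantity']`. Plain Python integers are used on purpose: the quantities double with every split, so a fixed-width 64-bit column would overflow after a few dozen splits.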
