I am trying to use a custom function with groupby in pandas. I find that apply lets me do that in the following way (an example that calculates a new combined mean for each group):
import pandas as pd
def newAvg(x):
    # Weight each row's mean by its count
    x['cm'] = x['count']*x['mean']
    sCount = x['count'].sum()
    sMean = x['cm'].sum()
    # Combined mean = sum of weighted means / total count
    return sMean/sCount
data = [['A', 4, 2.5], ['A', 3, 6], ['B', 4, 9.5], ['B', 3, 13]]
df = pd.DataFrame(data, columns=['pool', 'count', 'mean'])
df_gb = df.groupby(['pool']).apply(newAvg)
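For reference, the per-pool value that newAvg produces is a count-weighted mean, which can be checked by hand: pool A gives (4*2.5 + 3*6)/7 and pool B gives (4*9.5 + 3*13)/7. A minimal sketch of the same arithmetic on the same data follows; the helper name weighted_mean is illustrative only, and selecting the two columns before apply is an assumption made here so the grouping column is not passed to the function:

```python
import pandas as pd

data = [['A', 4, 2.5], ['A', 3, 6], ['B', 4, 9.5], ['B', 3, 13]]
df = pd.DataFrame(data, columns=['pool', 'count', 'mean'])

def weighted_mean(group):
    # Hypothetical helper: same arithmetic as newAvg,
    # but without adding a 'cm' column to the group.
    return (group['count'] * group['mean']).sum() / group['count'].sum()

# Select only the columns the function needs, then apply per pool.
result = df.groupby('pool')[['count', 'mean']].apply(weighted_mean)
print(result)
```

The result is a Series indexed by pool, with one combined mean per group.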
Is it possible to integrate this into an agg call? Something along these lines:
df.groupby(['pool']).agg({'count': sum, ['count', 'mean']: apply(newAvg)})