Passing arguments in groupby.agg with multiple functions

Does anyone know how to pass arguments to groupby.agg() when aggregating with multiple functions?

Ultimately, I would like to use it with a custom function, but I will ask my question using a built-in function that needs an argument.

Assuming:

import pandas as pd
import numpy as np
import datetime
np.random.seed(15)
day = datetime.date.today()
day_1 = datetime.date.today() - datetime.timedelta(1)
day_2 = datetime.date.today() - datetime.timedelta(2)
day_3 = datetime.date.today() - datetime.timedelta(3)
ticker_date = [('fi', day), ('fi', day_1), ('fi', day_2), ('fi', day_3),
               ('di', day), ('di', day_1), ('di', day_2), ('di', day_3)]
index_df = pd.MultiIndex.from_tuples(ticker_date, names=['lvl_1', 'lvl_2'])
df = pd.DataFrame(np.random.rand(8), index_df, ['value'])

How would I do this:

df.groupby('lvl_1').agg(['min','max','quantile'])

with, as argument for 'quantile':

q = 0.22 
Doubleganger answered 17/2, 2018 at 17:20 Comment(0)

Use a lambda function:

q = 0.22
df1 = df.groupby('lvl_1')['value'].agg(['min', 'max', lambda x: x.quantile(q)])
print(df1)
            min       max  <lambda>
lvl_1                              
di     0.275401  0.530000  0.294589
fi     0.054363  0.848818  0.136555

Alternatively, create a function f and set its __name__ to get a custom column name:

q = 0.22
f = lambda x: x.quantile(q)
f.__name__ = 'custom_quantile'
df1 = df.groupby('lvl_1')['value'].agg(['min', 'max', f])
print(df1)
            min       max  custom_quantile
lvl_1                                     
di     0.275401  0.530000         0.294589
fi     0.054363  0.848818         0.136555
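
Since the goal is a custom function that takes an argument, the same idea can be written with def instead of a lambda. The following is only a sketch: make_quantile and the demo frame are hypothetical names and data introduced for illustration, not part of the question.

```python
import pandas as pd


def make_quantile(q, name=None):
    """Build a one-argument aggregator that computes the q-th quantile.

    pandas takes the output column label from the function's __name__,
    so the label is set here (hypothetical helper, for illustration only).
    """
    def _quantile(x):
        return x.quantile(q)

    _quantile.__name__ = name or f"quantile_{q}"
    return _quantile


# Small hand-made frame (an assumption, not the question's random data)
demo = pd.DataFrame({
    "lvl_1": ["di", "di", "di", "fi", "fi", "fi"],
    "value": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})

result = demo.groupby("lvl_1")["value"].agg(
    ["min", "max", make_quantile(0.5, "custom_quantile")]
)
print(result)
```

A factory like this keeps q out of the global namespace and lets several quantiles be requested in one agg call, as long as each one gets a distinct name.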
Herwig answered 17/2, 2018 at 17:28 Comment(3)
Awesome, that's the second time you've helped me out! I like the second option because, ultimately, I want to use custom functions. Thanks, mate! - Doubleganger
Is there a practical difference between creating a named lambda and just using a def statement? - Revenge
@Revenge - in my opinion def is more common, but the two are the same, with some exceptions. - Herwig
Pass a (name, function) tuple in the list to name the column inline:

df1 = df.groupby('lvl_1')['value'].agg(['min', 'max', ('custom_quantile', lambda x: x.quantile(q))])

For q = 0.22, the output is:

            min       max  custom_quantile
lvl_1                                     
di     0.275401  0.530000         0.294589
fi     0.054363  0.848818         0.136555
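
The tuple form above pairs a column name with a function. A related spelling, assuming pandas 0.25 or later, is named aggregation, where each keyword becomes a column name. This is a sketch on a small hand-made frame; the demo data and the value of q are assumptions chosen for illustration.

```python
import pandas as pd

q = 0.25  # illustrative value, not the question's 0.22

# Small hand-made frame (an assumption, not the question's random data)
demo = pd.DataFrame({
    "lvl_1": ["di", "di", "di", "fi", "fi", "fi"],
    "value": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})

# Each keyword is the output column name; each value is the aggregation
named = demo.groupby("lvl_1")["value"].agg(
    min="min",
    max="max",
    custom_quantile=lambda x: x.quantile(q),
)
print(named)
```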
Kropp answered 27/5, 2020 at 8:23 Comment(3)
I doubt that this helps, or even works at all. To convince me otherwise please explain how this works and why it is supposed to help. - Saltwort
It works (edited output), and is similar to the first option in jerzael's answer, although it's a bit better (naming the column inline). - Gash
Providing output which makes the "it works, honest" more plausible is appreciated. However, an explanation of how it works and why it is supposed to help would be better. StackOverflow is about sharing knowledge and helping people to understand. Not for providing code to solve problems (tested or not). Please help to fight the misunderstanding that StackOverflow is a platform for finding unpaid programmers to do work for others. - Saltwort
