Python Pandas: Passing arguments to a function in agg()
W

3

19

I am trying to reduce data in a pandas DataFrame using different kinds of functions and argument values. However, I have not managed to change the default arguments of the aggregation functions. Here is an example:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'x': [1,np.nan,2,1],
...                    'y': ['a','a','b','b']})
>>> df
     x  y
0  1.0  a
1  NaN  a
2  2.0  b
3  1.0  b

Here is an aggregation function, for which I would like to test different values of b:

>>> def translate_mean(x, b=10):
...   y = [elem + b for elem in x]
...   return np.mean(y)

In the following code, I can use this function with the default b value, but I would like to pass other values:

>>> df.groupby('y').agg(translate_mean)
      x
y
a   NaN
b  11.5

Any ideas?

Wabash answered 15/6, 2017 at 21:5 Comment(2)
Related: https://mcmap.net/q/666034/-passing-argument-in-groupby-agg-with-multiple-functions/4050261 - Glynas
I believe this might help: pandas-docs.github.io/pandas-docs-travis/user_guide/… - Request
S
12

If you have multiple columns and want to apply different functions or different parameters to each column, you can pass lambda functions to agg. For example:

>>> df = pd.DataFrame({'x': [1,np.nan,2,1],
...                    'y': ['a','a','b','b'],
...                    'z': [0.1,0.2,0.3,0.4]})
>>> df
     x  y    z
0  1.0  a  0.1
1  NaN  a  0.2
2  2.0  b  0.3
3  1.0  b  0.4

>>> def translate_mean(x, b=10):
...   y = [elem + b for elem in x]
...   return np.mean(y)

To group by column 'y' and apply translate_mean with b=10 to column 'x' and b=25 to column 'z', you can try this:

# Group by 'y'; each lambda fixes a different b for its column.
df_res = df.groupby(by='y').agg({
    'x': lambda x: translate_mean(x, 10),
    'z': lambda x: translate_mean(x, 25)})
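
As an alternative to the lambdas, functools.partial can bind b ahead of time. This is only a sketch: it assumes column 'z' holds numbers rather than strings (translate_mean adds a number to each element), and the name df_partial is made up for illustration.

```python
from functools import partial

import numpy as np
import pandas as pd


def translate_mean(x, b=10):
    # Shift every element by b, then average (a NaN in the group propagates).
    y = [elem + b for elem in x]
    return np.mean(y)


df = pd.DataFrame({'x': [1, np.nan, 2, 1],
                   'y': ['a', 'a', 'b', 'b'],
                   'z': [0.1, 0.2, 0.3, 0.4]})

# One pre-bound function per column: b=10 for 'x', b=25 for 'z'.
df_partial = df.groupby('y').agg({
    'x': partial(translate_mean, b=10),
    'z': partial(translate_mean, b=25)})
print(df_partial)
```

Group 'a' still gives NaN for 'x', because np.mean does not skip missing values.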

Hopefully, it helps.

Schulz answered 9/10, 2019 at 19:42 Comment(0)
G
21

Just pass them as extra arguments to agg; they are forwarded to your function (this works with apply, too).

df.groupby('y').agg(translate_mean, b=4)
Out: 
     x
y     
a  NaN
b  5.5
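
To compare several values of b side by side, one option is to run the same keyword-argument call once per value and collect the results. This is a rough sketch, not the only way to do it; the column labels 'b=4' and 'b=8' and the name comparison are arbitrary choices for this example.

```python
import numpy as np
import pandas as pd


def translate_mean(x, b=10):
    # Shift every element by b, then average.
    y = [elem + b for elem in x]
    return np.mean(y)


df = pd.DataFrame({'x': [1, np.nan, 2, 1],
                   'y': ['a', 'a', 'b', 'b']})

# Select the column once so every aggregation reuses the same grouping.
grouped = df.groupby('y')['x']

# One aggregation per b; each resulting Series becomes a column.
comparison = pd.DataFrame(
    {f'b={b}': grouped.agg(translate_mean, b=b) for b in (4, 8)})
print(comparison)
```

Each column holds the result of one agg call, so the keyword argument never has to be shared between functions.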
Garlen answered 15/6, 2017 at 21:9 Comment(1)
Short of using lambdas, is this possible when aggregating with multiple functions? E.g., df.groupby('y').agg([translate_mean, translate_mean]), where I want to translate first by 4, then by 8? - Sequestrate
P
7

Maybe you can try using apply in this case:

df.groupby('y').apply(lambda x: translate_mean(x['x'], 20))

Now the result is:

y
a     NaN
b    21.5
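
A variant of the same idea, as a sketch: selecting column 'x' before calling apply lets b be passed by keyword, so no lambda is needed. It also keeps the function away from the grouping column, which recent pandas versions may warn about when apply is called on the whole grouped frame. The name result is just for illustration.

```python
import numpy as np
import pandas as pd


def translate_mean(x, b=10):
    # Shift every element by b, then average.
    y = [elem + b for elem in x]
    return np.mean(y)


df = pd.DataFrame({'x': [1, np.nan, 2, 1],
                   'y': ['a', 'a', 'b', 'b']})

# Extra keyword arguments to apply are forwarded to translate_mean.
result = df.groupby('y')['x'].apply(translate_mean, b=20)
print(result)
```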
Pyroconductivity answered 15/6, 2017 at 21:9 Comment(0)
