How to DataFrame.groupby along axis=1
L

4

8

I have:

df = pd.DataFrame({'A':[1, 2, -3],'B':[1,2,6]})
df
    A   B
0   1   1
1   2   2
2   -3  6

Q: How do I get:

    A
0   1
1   2
2   1.5

using groupby() and aggregate()?

Something like,

df.groupby([0,1], axis=1).aggregate('mean')

So basically, group along axis=1 and use row indexes 0 and 1 for grouping, without transposing.

Luculent answered 24/12, 2017 at 20:53 Comment(1)
Are you, by any chance, looking for just df.apply(pd.Series.mean, 1)? You can also get a dataframe out of this with df.apply(pd.Series.mean, 1).to_frame('A'). - Combo
P
3

Are you looking for this?

df.mean(1)
Out[71]: 
0    1.0
1    2.0
2    1.5
dtype: float64
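
To get the single-column frame shown in the question rather than a Series, the row-wise mean can be wrapped with to_frame, as the comment on the question suggests. A minimal sketch, assuming the two-column df from the question:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, -3], 'B': [1, 2, 6]})

# Row-wise mean across all columns, returned as a one-column DataFrame named 'A'
out = df.mean(axis=1).to_frame('A')
print(out)
```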

If you do want to use groupby:

df.groupby(['key']*df.shape[1],axis=1).mean()
Out[72]: 
   key
0  1.0
1  2.0
2  1.5
Pitfall answered 24/12, 2017 at 23:19 Comment(1)
But I'd like to be able to specify row indexes as the first argument to groupby (analogous to specifying column indexes as the first argument when doing groupby with axis=0). Do you see what I mean? - Luculent
S
3

Grouping keys can come in four forms; I will only mention the first and third, which are relevant to your question. The following is from "Data Analysis Using Pandas":

Each grouping key can take many forms, and the keys do not have to be all of the same type:

• A list or array of values that is the same length as the axis being grouped

• A dict or Series giving a correspondence between the values on the axis being grouped and the group names

So you can pass an array the same length as the columns axis (the axis being grouped), or a dict like the following:

df1.groupby({x:'mean' for x in df1.columns}, axis=1).mean()

    mean
0   1.0
1   2.0
2   1.5
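
A dict key also lets different columns land in different groups. The sketch below is an illustration under assumptions: the third column C and the group names 'ab' and 'c' are hypothetical, not from the question. It also goes through a transpose, because pandas 2.1 and later deprecate the axis argument of DataFrame.groupby; on older versions, groupby(col_to_group, axis=1) should give the same result without transposing.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, -3], 'B': [1, 2, 6], 'C': [2, 3, 1]})

# Hypothetical mapping from column label to group name
col_to_group = {'A': 'ab', 'B': 'ab', 'C': 'c'}

# After transposing, the column labels form the index, so the dict is applied
# to them; transposing back restores the original orientation
grouped = df1.T.groupby(col_to_group).mean().T
print(grouped)
```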
Sahib answered 27/3, 2018 at 19:20 Comment(1)
The code can be shortened to df1.groupby([1,1], axis=1).mean() or df1.groupby(['SS','SS'], axis=1).mean(), but @Sahib's code is more readable because the dict states the mapping explicitly. - Pinnatisect
M
3

Given the original dataframe df as follows:

   A  B  C
0  1  1  2
1  2  2  3
2 -3  6  1

Use the command

df.groupby(by=lambda x : df[x].loc[0],axis=1).mean()

to get the desired output:

     1    2
0  1.0  2.0
1  2.0  3.0
2  1.5  1.0

Here, the function lambda x : df[x].loc[0] is used to map columns A and B to 1 and column C to 2. This mapping is then used to decide the grouping.

You can also use any complex function defined outside the groupby statement instead of the lambda function.
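
As an illustration of that last point, here is a sketch with a named function in place of the lambda. The name first_row_value is hypothetical. The example transposes first, on the assumption that the pandas version in use may no longer accept axis=1 in groupby; the function is then called on each column label, which has become an index label.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, -3], 'B': [1, 2, 6], 'C': [2, 3, 1]})

def first_row_value(column_label):
    """Group key for a column: the value that column holds in row 0."""
    return df.loc[0, column_label]

# Columns A and B share the key 1, column C gets the key 2
result = df.T.groupby(first_row_value).mean().T
print(result)
```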

Multidisciplinary answered 14/9, 2021 at 12:15 Comment(0)
Z
-1

Try this:

import numpy as np
# Row-wise mean of columns A and B, stored back in column A
df["A"] = np.mean(df.loc[:, ["A", "B"]], axis=1)
df.drop(columns=["B"], inplace=True)
      A
 0   1.0
 1   2.0
 2   1.5
Zero answered 30/7, 2019 at 2:26 Comment(0)
