Keep columns after a groupby in an empty dataframe

My DataFrame is empty after a query. When I call groupby on it, I get a RuntimeWarning and end up with another empty DataFrame that has no columns. How can I keep the columns?

import pandas as pd

df = pd.DataFrame(columns=["PlatformCategory","Platform","ResClassName","Amount"])
print(df)

Result:

Empty DataFrame
Columns: [PlatformCategory, Platform, ResClassName, Amount]
Index: []

Then I group by the key columns:

df = df.groupby(["PlatformCategory","Platform","ResClassName"]).sum()
df = df.reset_index(drop=False,inplace=True)
print(df)

Result: sometimes None, sometimes an empty DataFrame:

Empty DataFrame
Columns: []
Index: []

Why does the empty DataFrame have no columns?

The RuntimeWarning:

/data/pyrun/lib/python2.7/site-packages/pandas/core/groupby.py:3672: RuntimeWarning: divide by zero encountered in log

if alpha + beta * ngroups < count * np.log(count):

/data/pyrun/lib/python2.7/site-packages/pandas/core/groupby.py:3672: RuntimeWarning: invalid value encountered in double_scalars
  if alpha + beta * ngroups < count * np.log(count):
Kermanshah answered 7/9, 2017 at 7:28 Comment(0)

You need as_index=False (group_keys=False is not required here; it only affects apply):

df = df.groupby(["PlatformCategory","Platform","ResClassName"], as_index=False).count()
df

Empty DataFrame
Columns: [PlatformCategory, Platform, ResClassName, Amount]
Index: []

No need to reset your index afterwards.
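As a side note on the "sometimes None" result in the question: calling reset_index with inplace=True modifies the frame and returns None, so assigning that return value back to df discards the frame. A minimal sketch of the difference follows; the variable names are made up for illustration, and drop=True is used only to keep the column list unchanged.

```python
import pandas as pd

# Same column layout as in the question; the variable names are illustrative only.
cols = ["PlatformCategory", "Platform", "ResClassName", "Amount"]
original = pd.DataFrame(columns=cols)

# With inplace=True the frame is changed in place and the call returns None,
# so writing `df = df.reset_index(..., inplace=True)` leaves df bound to None.
in_place = original.copy()
returned = in_place.reset_index(drop=True, inplace=True)
print(returned)

# Without inplace, reset_index returns a new DataFrame that is safe to assign.
reassigned = original.reset_index(drop=True)
print(list(reassigned.columns))
```

Either use inplace=True without assignment, or assign the result without inplace, but not both together.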

Cinque answered 7/9, 2017 at 7:32 Comment(6)
This was exactly what I was also looking for. Thanks a lot! - Kermanshah
This only works for an empty DataFrame. In the non-empty case it doesn't work, and when I change count() to sum() it doesn't work either. I want a sum that handles both cases. Do you have any advice? - Kermanshah
@Kermanshah Share some data... in your question? - Cinque
@Kermanshah If you are trying to find the sum of some particular column, then call sum() on that column. - Cinque
When I change it to sum, I get an empty DataFrame without columns. - Kermanshah
@Kermanshah Interestingly, I don't think it's possible to do this with sum, because sum condenses all rows in an aggregation attempt. Try it with actual data and you'll understand. - Cinque

Some code that works the same for .sum() whether or not the dataframe is empty:

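def groupby_sum(df, groupby_cols):
    # as_index=False keeps the group keys as ordinary columns in the result
    groupby = df.groupby(groupby_cols, as_index=False)
    summed = groupby.sum()
    # If the sum is empty, fall back to count(), which keeps the columns;
    # either way the keys are moved back into the index at the end
    return (groupby.count() if summed.empty else summed).set_index(groupby_cols)

df = groupby_sum(df, ["PlatformCategory", "Platform", "ResClassName"])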
Recrimination answered 25/8, 2021 at 13:52 Comment(0)
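Another possible approach, sketched here as an assumption rather than something taken from the answers above, is to skip the groupby entirely when the frame has no rows and return an empty frame with the original columns. This sidesteps the version-dependent behaviour of sum() on empty input. The helper name groupby_sum_keep_columns is hypothetical, and unlike groupby_sum above it leaves the keys as regular columns.

```python
import pandas as pd

cols = ["PlatformCategory", "Platform", "ResClassName", "Amount"]
keys = ["PlatformCategory", "Platform", "ResClassName"]


def groupby_sum_keep_columns(df, groupby_cols):
    """Hypothetical helper: grouped sum that keeps the columns for empty input."""
    if len(df) == 0:
        # Nothing to aggregate: return an empty copy with the same columns.
        return df.iloc[0:0].copy()
    # as_index=False keeps the group keys as columns, so no reset_index is needed.
    return df.groupby(groupby_cols, as_index=False).sum()


# Empty input: the columns survive.
empty_result = groupby_sum_keep_columns(pd.DataFrame(columns=cols), keys)
print(list(empty_result.columns))

# Non-empty input (made-up sample data): ordinary grouped sum.
sample = pd.DataFrame({
    "PlatformCategory": ["a", "a", "b"],
    "Platform": ["x", "x", "y"],
    "ResClassName": ["r", "r", "s"],
    "Amount": [1, 2, 3],
})
summed_result = groupby_sum_keep_columns(sample, keys)
print(summed_result)
```

Note that for empty input this returns every original column, including any non-numeric ones that sum() might otherwise drop.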
