Binning Pandas value_counts

I have a Pandas Series produced by df.column.value_counts().sort_index().

| N Months | Count |
|------|------|
|    0 |   15 |
|    1 |    9 |
|    2 |   78 |
|    3 |  151 |
|    4 |  412 |
|    5 |  181 |
|    6 |  543 |
|    7 |  175 |
|    8 |  409 |
|    9 |  594 |
|   10 |  137 |
|   11 |  202 |
|   12 |  170 |
|   13 |  446 |
|   14 |   29 |
|   15 |   39 |
|   16 |   44 |
|   17 |  253 |
|   18 |   17 |
|   19 |   34 |
|   20 |   18 |
|   21 |   37 |
|   22 |  147 |
|   23 |   12 |
|   24 |   31 |
|   25 |   15 |
|   26 |  117 |
|   27 |    8 |
|   28 |   38 |
|   29 |   23 |
|   30 |  198 |
|   31 |   29 |
|   32 |  122 |
|   33 |   50 |
|   34 |   60 |
|   35 |  357 |
|   36 |  329 |
|   37 |  457 |
|   38 |  609 |
|   39 | 4744 |
|   40 | 1120 |
|   41 |  591 |
|   42 |  328 |
|   43 |  148 |
|   44 |   46 |
|   45 |   10 |
|   46 |    1 |
|   47 |    1 |
|   48 |    7 |
|   50 |    2 |

My desired output is:

| bin   | Total  |
|-------|--------|
| 0-13  |   3522 |
| 14-26 |    793 |
| 27-50 |   9278 |

I tried df.column.value_counts(bins=3).sort_index() but got

|               bin               | Total |
|---------------------------------|-------|
| (-0.051000000000000004, 16.667] |  3634 |
| (16.667, 33.333]                |  1149 |
| (33.333, 50.0]                  |  8810 |

I can get the correct result with

a = df.column.value_counts().sort_index()[:14].sum()
b = df.column.value_counts().sort_index()[14:27].sum()
c = df.column.value_counts().sort_index()[27:].sum()

print(a, b, c)

Output: 3522 793 9278

But I am wondering if there is a pandas method that can do what I want. Any advice is very welcome. :-)

Corrales asked 3/12, 2020 at 2:26

You can use pd.cut:

pd.cut(df['N Months'], [0, 13, 26, 50], include_lowest=True).value_counts().sort_index()
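If the interval notation is not what you want in the result, `pd.cut` also accepts a `labels` argument. The following is a minimal sketch, assuming a small made-up Series called `months` in place of the real column; the label strings and the variable names `binned` and `totals` are illustrative, not part of the original data.

```python
import pandas as pd

# Hypothetical stand-in for the real column: a few values per range.
months = pd.Series([0, 5, 13, 14, 20, 26, 27, 39, 39, 50], name="N Months")

# Bin edges are right-inclusive, so they cover 0-13, 14-26 and 27-50.
# include_lowest=True keeps the value 0 inside the first bin.
binned = pd.cut(
    months,
    bins=[0, 13, 26, 50],
    labels=["0-13", "14-26", "27-50"],
    include_lowest=True,
)

# value_counts() sorts by frequency, so sort_index() restores bin order.
totals = binned.value_counts().sort_index()
print(totals)
```

Because `value_counts()` orders by frequency by default, the `sort_index()` call is what keeps the bins in their natural order.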

Update: you should also be able to pass custom bins directly to value_counts:

df['N Months'].value_counts(bins=[0, 13, 26, 50]).sort_index()

Output:

N Months
(-0.001, 13.0]    3522
(13.0, 26.0]       793
(26.0, 50.0]      9278
Name: Count, dtype: int64
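If you only have the already-aggregated counts (the Series shown in the question) rather than the raw column, one possible approach is to bin the index and sum the counts per bin. This is a sketch under that assumption: the names `counts`, `bins` and `totals` are hypothetical, and the numbers are copied from the question's table.

```python
import pandas as pd

# Counts copied from the table in the question (month 49 is absent).
index = list(range(49)) + [50]
values = [
    15, 9, 78, 151, 412, 181, 543, 175, 409, 594, 137, 202, 170, 446,
    29, 39, 44, 253, 17, 34, 18, 37, 147, 12, 31, 15, 117,
    8, 38, 23, 198, 29, 122, 50, 60, 357, 329, 457, 609, 4744, 1120,
    591, 328, 148, 46, 10, 1, 1, 7, 2,
]
counts = pd.Series(values, index=index)

# Assign each month (the index) to a labelled bin.
bins = pd.cut(
    counts.index,
    bins=[0, 13, 26, 50],
    labels=["0-13", "14-26", "27-50"],
    include_lowest=True,
)

# Sum the per-month counts within each bin; groups come out in bin order.
totals = counts.groupby(bins, observed=False).sum()
print(totals)
```

The three sums should match the totals in the desired output (3522, 793 and 9278).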
Stripling answered 3/12, 2020 at 2:40

Comment: Passing in the list is a great answer! (Corrales)