Pandas NaN introduced by pivot_table

Asked 22/9, 2016 at 6:58 Answered 21/1, 2021 at 20:34

Solved python pandas pivot pivot-table nan

I have a table containing some countries and their KPI from the world-banks API. this looks like . As you can see no nan values are present.

However, I need to pivot this table to bring int into the right shape for analysis. A pd.pivot_table(countryKPI, index=['germanCName'], columns=['indicator.id']) For some e.g. TUERKEI this works just fine:

But for most of the countries strange nan values are introduced. How can I prevent this?

Hepplewhite answered 22/9, 2016 at 6:58 Comment(0)

I think the best way to understand pivoting is to apply it to a small sample:

import pandas as pd
import numpy as np

countryKPI = pd.DataFrame({'germanCName':['a','a','b','c','c'],
                           'indicator.id':['z','x','z','y','m'],
                           'value':[7,8,9,7,8]})

print (countryKPI)
  germanCName indicator.id  value
0           a            z      7
1           a            x      8
2           b            z      9
3           c            y      7
4           c            m      8

print (pd.pivot_table(countryKPI, index=['germanCName'], columns=['indicator.id']))
             value               
indicator.id     m    x    y    z
germanCName                      
a              NaN  8.0  NaN  7.0
b              NaN  NaN  NaN  9.0
c              8.0  NaN  7.0  NaN

If need replace NaN to 0 add parameter fill_value:

print (countryKPI.pivot_table(index='germanCName', 
                              columns='indicator.id', 
                              values='value', 
                              fill_value=0))
indicator.id  m  x  y  z
germanCName             
a             0  8  0  7
b             0  0  0  9
c             8  0  7  0

Stylographic answered 22/9, 2016 at 7:9 Comment(7)

Indeed this example is good. But how can I prevent the NaN values? – Hepplewhite 22/9, 2016 at 7:11

Ok, what do you need? NaN replace to 0? – Stylographic 22/9, 2016 at 7:11

I see so the problem is that in my data for some countries not all indicators were reported ... :( – Hepplewhite 22/9, 2016 at 7:15

Yes, exactly. This is problem why you get NaN. – Stylographic 22/9, 2016 at 7:16

But a call to api.worldbank.org/countries/eg/indicators/… indeed returns some data -> which strangely did not result in my download pd.Dataframe – Hepplewhite 22/9, 2016 at 7:22

Let us continue this discussion in chat. – Hepplewhite 22/9, 2016 at 7:28

any idea why when I add fill_values=0, it doesn't do anything? I want it to count the 0's (missing values in my average) in the pivot table but it doesn't take the missing values into account. – Colleencollege 9/8, 2018 at 16:30

I would do this:

piv_out = pd.pivot_table(countryKPI, index=['germanCName'], columns=['indicator.id'])

print(piv_out.to_string(na_rep=""))

Swihart answered 21/1, 2021 at 20:34 Comment(0)

-2

As per documentations:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pivot.html

pivot method returns: reshaped DataFrame.

Now you can replace the na values with any desired values, using fillna method.

FOR EXAMPLE:

MY PIVOT RETURNS THE BELOW dataFrame:

Now I want to replace the Nan with 0, I will apply the fillna() method on the returned data frame from pivot method

Galbanum answered 11/3, 2019 at 2:46 Comment(0)

Recommended topics

Hot tags