Pandas: nan->None

Asked 25/1, 2018 at 22:50 Answered 19/6, 2024 at 10:39

pandas.DataFrame.to_dict keeps NaN as nan and converts null to None. As explained in "Python comparison ignoring nan", this is sometimes suboptimal.

Is there a way to convert all nans to None? (either in pandas or later on in Python)

E.g.,

>>> df = pd.DataFrame({"a":[1,None],"b":[None,"foo"]})
>>> df
     a     b
0  1.0  None
1  NaN   foo
>>> df.to_dict()
{'a': {0: 1.0, 1: nan}, 'b': {0: None, 1: 'foo'}}

I want

{'a': {0: 1.0, 1: None}, 'b': {0: None, 1: 'foo'}}

instead.

Aeriel asked 25/1, 2018 at 22:50 Comment(0)

import pandas as pd

df = pd.DataFrame({"a":[1,None],"b":[None,"foo"]})
df.where((pd.notnull(df)), None)
Out[850]: 
      a     b
0     1  None
1  None   foo
df.where((pd.notnull(df)), None).to_dict()
Out[851]: {'a': {0: 1.0, 1: None}, 'b': {0: None, 1: 'foo'}}
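A later answer in this thread reports that this did not work for them. As far as I know, some newer pandas releases leave NaN in place when where() is given None on a float column. A hedged sketch of a commonly suggested workaround is to cast to object first, so that None can actually be stored. The names cleaned and result are only illustrative, and behaviour may vary by pandas version:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, None], "b": [None, "foo"]})

# Cast every column to object first so a cell can hold a real None,
# then keep the non-missing cells and fill the rest with None.
cleaned = df.astype(object).where(df.notna(), None)

result = cleaned.to_dict()
print(result)
```

The cost is the same as in the original version: every column ends up as object dtype, so this is best done right before to_dict() and not earlier in a pipeline.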

Decontaminate answered 25/1, 2018 at 22:55 Comment(5)

I'll note that this does the same thing, converts every column to an object type, just that it does it in two steps. – Misti 25/1, 2018 at 22:58

@cᴏʟᴅsᴘᴇᴇᴅ yep, you are right, almost the same :-) – Decontaminate 25/1, 2018 at 22:59

Just mentioning that since OP seems to think this is converting the data to string (which isn't the case!). – Misti 25/1, 2018 at 23:0

@cᴏʟᴅsᴘᴇᴇᴅ: this is different from what you suggested because it works on the externally generated DataFrame, as opposed to creating a generic DF from scratch. – Aeriel 26/1, 2018 at 2:39

@Aeriel I am aware of what it does. My point in my previous comment was that the end result is the same (a generic dataframe), not a dataframe of strings like you initially surmised. I was only addressing your misconception, nothing more. – Misti 26/1, 2018 at 2:43

Initialise as an object DataFrame (at your peril...):

df = pd.DataFrame({"a":[1,None],"b":[None,"foo"]}, dtype=object)
df

      a     b
0     1  None
1  None   foo

In the first column, pandas attempts to infer the dtype and guesses float, which turns the None into NaN. You can prevent that by forcing the column to remain object, thereby suppressing any type conversion at all.
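To make the difference visible, here is a small sketch comparing the inferred dtypes with the forced object dtype. The variable names data, inferred and as_object are just for illustration:

```python
import pandas as pd

data = {"a": [1, None], "b": [None, "foo"]}

# Default construction: column "a" is inferred as float, so None becomes NaN.
inferred = pd.DataFrame(data)
print(inferred.dtypes)

# Forced object dtype: no inference, so None stays None.
as_object = pd.DataFrame(data, dtype=object)
print(as_object.dtypes)

as_object_dict = as_object.to_dict()
print(as_object_dict)
```

Keep in mind that object columns fall back to slow, generic Python operations, so this only makes sense when the frame is about to be exported anyway.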

Misti answered 25/1, 2018 at 22:53 Comment(9)

This is cheating. I have numeric columns in the DataFrame, and converting it to string loses information. – Aeriel 25/1, 2018 at 22:55

@Aeriel No, there is no string conversion taking place. – Misti 25/1, 2018 at 22:56

Each column is initialised as column of python objects. Pandas no longer makes assumptions about what its content is, and falls back to slow methods of operating on it. – Misti 25/1, 2018 at 22:57

I had a feeling though that df = pd.DataFrame({"a":[1,None],"b":[None,"foo"]}) was an MCVE to give a starting DF to play with. In reality, if you're at the end of a chain of processes, does it make sense to convert your whole resulting DF to object before to_dict()? – Article 25/1, 2018 at 22:59

@Aeriel object != str – Ornithorhynchus 25/1, 2018 at 22:59

@cᴏʟᴅsᴘᴇᴇᴅ I've just seen your comment on the other answer so I'm probably wrong here. – Article 25/1, 2018 at 23:1

@Article It usually doesn't make sense converting any dataframe to object except in the rarest of cases. OP seems to have a good reason for wanting to do so, so I'm not getting in their way here... – Misti 25/1, 2018 at 23:1

@cᴏʟᴅsᴘᴇᴇᴅ No, what I meant by my very last comment is I missed something. df.where((pd.notnull(df)), None).to_dict() looks the business, but you stated it's converting to object type in two steps. So your answer, on the surface, does look like a cheat to me because you alter the DF at creation but ultimately it doesn't matter. +1 for reshaping my thinking :) – Article 25/1, 2018 at 23:6

@Article Cheers, as long as you call pd.DataFrame somewhere, this works :D – Misti 25/1, 2018 at 23:17

I found that the accepted answer did not work, but this did:

import numpy as np
df.replace([np.nan], [None]).to_dict('records')

I don't know why. I can at least say that every field of the df that appeared to hold an NA value was confirmed as such by df.isna().

I got the solution from here.
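For completeness, a self-contained sketch of the same call on the example frame from the question (the name records is only illustrative, and I have not checked this against every pandas version):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, None], "b": [None, "foo"]})

# Replace every NaN with None; the float column is converted to object
# so that it can hold None.
records = df.replace([np.nan], [None]).to_dict("records")
print(records)
```

With 'records', each row becomes its own dict, which is often the more convenient shape for JSON or database inserts.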

Fortnightly answered 19/6, 2024 at 10:39 Comment(0)
