How do I convert numpy NaN objects to SQL nulls?
Asked Answered
M

4

15

I have a Pandas dataframe that I'm inserting into an SQL database. I'm using Psycopg2 directly to talk to the database, not SQLAlchemy, so I can't use Pandas built in to_sql functions. Almost everything works as expected except for the fact that numpy np.NaN values get converted to text as NaN and inserted into the database. They really should be treated as SQL null values.

So, I'm trying to make a custom adapter to convert np.NaN to SQL null but everything I've tried results in the same NaN strings being inserted in the database.

The code I'm currently trying is:

def adapt_nans(null):
    a = adapt(None).getquoted()
    return AsIs(a)

register_adapter(np.NaN, adapt_nans)

I've tried a number of variations along this theme but haven't had any luck.

Myers answered 20/8, 2015 at 0:3 Comment(2)
Personally I'd say that NaN should not be converted to NULL since they're not the same thing at all, but I can imagine contexts where I guess it could make sense. I'd use a BEFORE INSERT OR UPDATE ... FOR EACH ROW ... trigger to transform them.Stigmasterol
I do understand the general differences between NaN and NULL, however, in this particular case they really are the same thing. The data is read into the dataframe from a flat file and where there is missing data Pandas inserts a NaN.Myers
M
16

The code I was trying previously fails because it assumes that np.Nan is its own type when it is actually a float. The following code, courtesy of Daniele Varrazzo on the psycopg2 mailing list, does the job correctly.

def nan_to_null(f,
        _NULL=psycopg2.extensions.AsIs('NULL'),
        _Float=psycopg2.extensions.Float):
    if not np.isnan(f):
        return _Float(f)
    return _NULL

 psycopg2.extensions.register_adapter(float, nan_to_null)
Myers answered 27/8, 2015 at 19:26 Comment(5)
The function didn't seem to work with np.float64. No clue why. Changing if f is not _NaN: to if not np.isnan(f): fixed it. Otherwise perfect!Doorframe
@JensdeBruijn I wonder if the NaN type for float64 is different than the NaN type for a regular float.Myers
I am surprised too. If not that, could it be pandas? Using the above statement my error was fixed. I just want to show it as an option is someone runs into the same problemDoorframe
I had to use the small fix propsed by @JensdeBruijn, now works greatWatchtower
Had to use the small fix as well. I proposed an edit.Avie
E
5

If you are trying to insert Pandas dataframe data into PostgreSQL and getting the error for NaN, all you have to do is:

import psycopg2

output_df = output_df.fillna(psycopg2.extensions.AsIs('NULL'))

#Now insert output_df data in the table
Ejaculate answered 27/4, 2022 at 17:28 Comment(2)
what is the equivalent with psycopg 3?Motorcycle
TypeError: Field 'zzz' expected a number but got <psycopg2.extensions.AsIs object at 0x7f6bd5dbdb70>.Romney
C
4

This answer is an alternate version of Gregory Arenius's Answer. I have replaced the conditional statement to work on any Nan value by simply checking if the value is equal to itself.

def nan_to_null(f,
         _NULL=psycopg2.extensions.AsIs('NULL')
         _Float=psycopg2.extensions.Float)):
    if f != f:
        return _NULL
    else:
         return _Float(f)

 psycopg2.extensions.register_adapter(float, nan_to_null)

If you check if a nan value is equal to itself you will get False. The rational behind why this works is explained in detail in Stephen Canon's answer.

Chloras answered 28/7, 2020 at 9:36 Comment(0)
I
0

I believe the easiest way is:

df.where(pd.notnull(df), None)

Then None is "translated": to NULL when imported to Postgres.

Ingunna answered 11/10, 2020 at 7:59 Comment(1)
This only works if the column is not a numeric / int type.Voice

© 2022 - 2024 — McMap. All rights reserved.