pandas FutureWarning: Downcasting object dtype arrays on .fillna, .ffill, .bfill is deprecated and will change in a future version
In order to print dataframes nicely using tabulate, so that NaN and NaT are printed as empty cells, I've been using this successfully:

print(tabulate(df.astype(object).fillna("")))

Now, this causes the following warning:

FutureWarning: Downcasting object dtype arrays on .fillna, .ffill, .bfill is deprecated and will change in a future version. Call result.infer_objects(copy=False) instead.

I don't know what I should do instead now. I certainly don't see how infer_objects(copy=False) would help as the whole point here is indeed to force converting everything to a string representation and filling in missing values with empty strings.

Unwilled answered 29/1 at 15:54 Comment(3)
Can you provide a minimal reproducible example and your pandas version? I cannot reproduce your issue. - Fergus
@mrgou, I am also having this issue with fillna. In my case, I am replacing NaN with the boolean False, and I get: "<ipython-input-50-aed301f4f635>:2: FutureWarning: Downcasting object dtype arrays on .fillna, .ffill, .bfill is deprecated and will change in a future version. Call result.infer_objects(copy=False) instead. To opt-in to the future behavior, set pd.set_option('future.no_silent_downcasting', True)". I also feel that suppressing the warning is not the correct approach. Were you able to resolve the problem? Is this an issue we need to raise with the pandas developers? - Kristikristian
I wrote a small article explaining the FutureWarning and what to do about it, check it out. - Dancy
D
10

Consider the following two distinct examples:

ser1 = pd.Series([False, True, float("nan")])
ser1 = ser1.fillna(False)

and

ser2 = pd.Series(["Zero", 5, 2.3])
ser2 = ser2.replace("Zero", 0)

the use of an option context combined with infer_objects at the end seems to be the most generic solution to get rid of the FutureWarning:

with pd.option_context("future.no_silent_downcasting", True):
    ser1 = ser1.fillna(False).infer_objects(copy=False)

and

with pd.option_context("future.no_silent_downcasting", True):
    ser2 = ser2.replace("Zero", 0).infer_objects(copy=False)

It is probably better to be more specific and use astype(bool) and astype(float), respectively, instead of infer_objects(copy=False) in the above.
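As a rough sketch of that more specific variant, the snippet below fills first and then casts to an explicit dtype. The guard around the option lookup is my own assumption for portability: the "future.no_silent_downcasting" option only exists in recent pandas releases (it appears in the 2.2 warning text), so older versions fall back to a no-op context.

```python
import contextlib

import pandas as pd

ser1 = pd.Series([False, True, float("nan")])  # object dtype
ser2 = pd.Series(["Zero", 5, 2.3])  # object dtype

# Assumption: use the opt-in option when this pandas version knows it,
# otherwise do nothing special (older versions have no such option).
try:
    pd.get_option("future.no_silent_downcasting")
    ctx = pd.option_context("future.no_silent_downcasting", True)
except KeyError:
    ctx = contextlib.nullcontext()

with ctx:
    # Fill or replace first, then state the target dtype explicitly
    filled = ser1.fillna(False).astype(bool)
    replaced = ser2.replace("Zero", 0).astype(float)

print(filled.tolist(), filled.dtype)
print(replaced.tolist(), replaced.dtype)
```

The explicit astype at the end means the result dtype does not depend on whether pandas infers it for you.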

Note that the other proposed solutions don't work in this case:

  1. Using infer_objects(copy=False) before fillna or replace:

ser1.infer_objects(copy=False).fillna(False)
ser2.infer_objects(copy=False).replace("Zero", 0)

doesn't get rid of the FutureWarning.

  2. Using astype before fillna or replace is even more dangerous, as it returns the wrong result for the first example:

ser1.astype(bool).fillna(False)

and raises a ValueError for the second example:

ser2.astype(float).replace("Zero", 0)

  3. I would not recommend setting pandas.set_option("future.no_silent_downcasting", True) globally, as this may hide issues elsewhere.
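To make the astype-before-fillna pitfall described above concrete, here is a small sketch using the same two series. It only shows why casting first goes wrong; the variable names cast_first and error are mine.

```python
import pandas as pd

ser1 = pd.Series([False, True, float("nan")])
ser2 = pd.Series(["Zero", 5, 2.3])

# NaN is truthy, so casting to bool first turns the missing value into True
# and leaves nothing for fillna(False) to fill.
cast_first = ser1.astype(bool)
print(cast_first.tolist())

# The string "Zero" cannot be parsed as a float, so the cast fails
# before replace ever runs.
try:
    ser2.astype(float)
    error = None
except ValueError as exc:
    error = exc
print(type(error).__name__, error)
```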
Distinguishing answered 27/2 at 8:58 Comment(3)
This solved a similar problem I was having. However, the syntax in the context manager is pd.option_context instead of pd.option.context. - Insightful
For a DataFrame containing columns (Series) of strings of numbers, and with the intention of filling NaN with 0, executing: series.astype(float).fillna(0) avoids the FutureWarning that would otherwise result. Making the dtype explicit before filling values is a correct approach for the context outlined in this comment. - Intuitionism
I can see how this would be helpful in the case where .replace (or the various fill commands) are used to produce a new series. But I don't see how anything plays nicely when using the inplace=True versions of .replace (or the various fill commands). - Phonetist
E
6

Convert the DataFrame/Series type first

Example:

df.astype(float).fillna(value)

Infer the objects' type with infer_objects

df.infer_objects(copy=False).fillna(value)

where value is of a type compatible with the inferred dtype.


Setting pandas.set_option("future.no_silent_downcasting", True) seems to work to remove the warning too, but I don't know if this is the correct behavior (as also pointed out by @mrgou in the comments).


Explanation

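The arrays are of dtype object and, when you call fillna, pandas tries to infer a more specific dtype for the result (the "downcasting" named in the message). That implicit inference is what is deprecated and what triggers the warning.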
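A small sketch of what that inference means in practice, using infer_objects directly (the method the warning itself points to). The series names are made up for illustration.

```python
import pandas as pd

# Floats stored in an object array: a more specific dtype can be inferred
numbers_as_object = pd.Series([1.0, 2.0], dtype=object)

# Genuinely mixed content: there is nothing better than object to infer
mixed = pd.Series([1.0, "two"], dtype=object)

inferred_numbers = numbers_as_object.infer_objects()
inferred_mixed = mixed.infer_objects()

print(numbers_as_object.dtype, "->", inferred_numbers.dtype)
print(mixed.dtype, "->", inferred_mixed.dtype)
```

The first series comes back as float64 while the second stays object, which is why the warning only shows up for some inputs.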

Enceladus answered 5/2 at 14:20 Comment(7)
How do you use ffill, bfill, etc. with this? They just raise the warning, and what the warning says to do doesn't work either. - Tangency
Converting a Series or DataFrame dtype from object to a specific type such as float should work, as in the first example I gave. What is the dtype of the Series or DataFrame? What kind of data is contained in it? - Enceladus
As @Fergus mentioned, which version of pandas issues the warning with fillna, and on what minimal reprex? I get a similar warning using replace, as in the other answers, but I don't get any warning using fillna for any example I have tried. - Madura
Version 2.2.1. I think it has more to do with data conversion. The series was initially of int type; I then replaced some values with NaNs and used fillna, which caused the warning. The dtype was int, then object, then fillna complained about the implicit conversion. Hope you get to reproduce it with this info. - Enceladus
@MaiconMauricio, if you did that all in one cell, or in a script, it was probably the replace that issued the warning. - Madura
Neither of these solutions works for the following two examples: pd.Series(["Zero", 5, 2.3]).replace("Zero", 0) and pd.Series([False, True, float("nan")]).fillna(False). The first solution fails with a ValueError for the first example (when applying astype(float)) and gives the wrong result for the second example (when applying astype(bool)). The second solution, infer_objects(copy=False), keeps the FutureWarning for both examples. The third solution (setting the option "future.no_silent_downcasting") returns objects instead of floats or bools. - Distinguishing
Thank you, @Distinguishing and joe. I'll be looking more in depth into the correctness of my answer as soon as I have time to. - Enceladus
L
5

I was just in a similar situation. This line of code has worked for years, and I have only now found out that it's being deprecated.

df = pd.read_excel(excel,dtype='str')
df = df.replace({r'\n': ' '}, regex=True)

I had to update it to this:

df = pd.read_excel(excel,dtype='str')
df.astype(str).replace(to_replace=['\n', ' '], value=(''))

As I had received this warning:

FutureWarning: Downcasting behavior in `replace` is deprecated and will be removed 
in a future version. To retain the old behavior, explicitly call 
`result.infer_objects(copy=False)`. To opt-in to the future behavior, set 
`pd.set_option('future.no_silent_downcasting', True)`
  df = df.replace({r'\n': ' '}, regex=True)
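For reference, a minimal sketch of the original regex replacement on string data. The small frame is a hypothetical stand-in for what pd.read_excel(excel, dtype='str') returns, and the column name note is made up.

```python
import pandas as pd

# Hypothetical stand-in for the frame read from Excel with dtype='str'
df = pd.DataFrame({"note": ["line one\nline two", "single line"]})

# With regex=True the dict key is treated as a pattern, so every newline
# inside a cell is replaced by a space. replace returns a new frame,
# so the result is assigned back.
df = df.replace({r"\n": " "}, regex=True)

print(df["note"].tolist())
```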
Lomond answered 6/2 at 14:58 Comment(0)
C
3

I encountered the same issue today and I'm still looking for a fix myself. In your case, I must admit I don't know what calling astype(object) twice does. However, as the warning message suggests, you can try avoiding the implicit conversion and cast your data to strings instead (if that's feasible for the data you're handling). Then, you can use pd.DataFrame.replace like so:

df.astype(str).replace(to_replace=["nan", "None"], value=(""))

Of course, you probably want to adapt to_replace to include the string forms of the NaN and NaT values you're searching for. On my system, astype(str) converts NaN to the string "nan", so it might behave similarly for you and for NaT.
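Rather than depending on which strings astype(str) produces for missing values, one alternative sketch (not part of this answer's approach) is to format each cell explicitly. The helper to_cell and the sample frame are hypothetical names used only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "n": [1.5, float("nan")],
        "s": ["a", None],
        "t": [pd.Timestamp("2024-01-01"), pd.NaT],
    }
)


def to_cell(value):
    # Missing values (NaN, None, NaT) become an empty string;
    # everything else uses its normal str() representation.
    return "" if pd.isna(value) else str(value)


# Apply the formatter column by column, cell by cell
printable = df.apply(lambda col: col.map(to_cell))
print(printable)
```

The resulting frame contains only strings, so it can be handed to a table printer such as tabulate without a separate fillna step.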

Cressler answered 30/1 at 13:15 Comment(1)
The duplicate astype was a typo. I fixed it. Eventually, I think the warning is a false alarm: I added pd.set_option('future.no_silent_downcasting', True) expecting that forcing the new behavior would cause an exception to be raised... But it didn't, and the warning was not shown again. - Unwilled
