Adding a Series as a new row to a DataFrame triggers FutureWarning
O

6

7

I'm trying to add a Series as a new row to a DataFrame. The index of the Series matches the columns of the DataFrame:

df.loc[df.shape[0]] = r

Getting:

FutureWarning: In a future version, object-dtype columns with all-bool values will not be included in reductions with bool_only=True. Explicitly cast to bool dtype instead.

The warning comes from the inference module.

Onomastics answered 21/9, 2022 at 12:36 Comment(3)
Can you create a minimal reproducible example? Dearly
Cannot reproduce. I tried with:

import pandas as pd
d = {'col1': [True, False, True], 'col2': [True, False, True], 'col3': [False, True, True]}
df = pd.DataFrame(data=d)
df.loc[df.shape[0]] = [True, False, True]

As @Dearly suggests, please provide a reproducible example. Reifel
Indeed, I just made a dummy example, and it works fine:

import pandas as pd
# DataFrame
d = {'c1': [1, 2], 'c2': [3, 4], 'c3': [True, False], 'c4': ['abc', 'def']}
df = pd.DataFrame(data=d)
df
# Series
d = {'c1': 3, 'c2': 5, 'c3': True, 'c4': 'ghi'}
s = pd.Series(d)
s
# insert new row
df.loc[df.shape[0]] = s

The real code involves some proprietary data... Onomastics
B
17

I got the same warning. It appears as of pandas 1.5.0, which may be why some of the answers here do not solve the issue. The deprecation note reads:

Deprecated treating all-bool object-dtype columns as bool-like in DataFrame.any() and DataFrame.all() with bool_only=True, explicitly cast to bool instead (GH46188)

I don't fully understand it, but I did find a solution. The cause is that columns holding boolean values are not properly cast to bool dtype. I was using concat, and in my case the affected columns were in the existing DataFrame.

Because I didn't want to specify a dtype for every column of the DataFrame (which should also be possible), I changed it only for the columns that needed it:

df["var1"] = df["var1"].astype(bool)

Or for multiple columns:

df = df.astype({"var1": bool, "var2": bool})

Then the concat worked for me without the FutureWarning.
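
If it is not obvious which columns need the cast, one possible approach is to look for object-dtype columns whose values are all booleans and cast only those. The following is a minimal sketch under assumptions: the column names and data are made up for illustration, and it assumes the affected columns contain no missing values.

```python
import pandas as pd

# Hypothetical data: 'flag' holds only booleans but is stored as object dtype.
df = pd.DataFrame({
    "name": ["a", "b"],
    "flag": pd.Series([True, False], dtype=object),
})
new_row = pd.DataFrame({"name": ["c"], "flag": [True]})

# Collect object-dtype columns in which every value is a boolean.
bool_like = [
    col for col in df.columns
    if df[col].dtype == object and df[col].map(pd.api.types.is_bool).all()
]

# Cast only those columns, then concatenate.
df = df.astype({col: bool for col in bool_like})
result = pd.concat([df, new_row], ignore_index=True)
print(result.dtypes)
```

Casting only the detected columns leaves genuinely mixed object columns (such as strings) untouched.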

Bork answered 28/9, 2022 at 15:43 Comment(0)
L
3

I eventually figured out that this FutureWarning can be triggered by calling pd.concat() on a DataFrame with a boolean column together with an empty DataFrame.

Skipping the concat when the DataFrame is empty silenced the warning.
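
A minimal sketch of that guard, assuming a small helper that drops empty frames before concatenating. The helper name `concat_non_empty` and the sample data are hypothetical, not part of pandas.

```python
import pandas as pd

def concat_non_empty(frames):
    """Concatenate only the non-empty frames (hypothetical helper)."""
    non_empty = [f for f in frames if not f.empty]
    if not non_empty:
        # Nothing to concatenate: return an empty DataFrame.
        return pd.DataFrame()
    return pd.concat(non_empty, ignore_index=True)

existing = pd.DataFrame({"foo": [1, 2], "flag": [True, False]})
empty = pd.DataFrame()

# The empty frame is skipped, so 'flag' keeps its bool dtype.
combined = concat_non_empty([existing, empty])
print(combined.dtypes)
```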

Lenora answered 5/6, 2023 at 19:59 Comment(0)
M
0

Try this:

df
    c1  c2  c3      c4
0   1   3   True    abc
1   2   4   False   def

d = {'c1': 3, 'c2': 5, 'c3': True, 'c4': 'ghi'} 
s = pd.Series(d) 

s
c1       3
c2       5
c3    True
c4     ghi
dtype: object

df.loc[df.shape[0]] = s.to_numpy() 

df
    c1  c2  c3      c4
0   1   3   True    abc
1   2   4   False   def
2   3   5   True    ghi

Misinform answered 21/9, 2022 at 13:27 Comment(3)
Won't solve the issue. Onomastics
Why? What is the required output? This is based on the example df and s you provided. Misinform
The example was meant to show that it does not replicate the issue. In addition, I've tried .values, similar to what you suggested. Onomastics
D
0

base:

import pandas as pd

data = pd.DataFrame.from_dict({
    'Name': ['Nik', 'Kate', 'Evan', 'Kyra'],
    'Age': [31, 30, 40, 33],
    'Location': ['Toronto', 'London', 'Kingston', 'Hamilton']
})

df = pd.DataFrame(data)
df
   Name  Age  Location
0   Nik   31   Toronto
1  Kate   30    London
2  Evan   40  Kingston
3  Kyra   33  Hamilton

solution:

import pandas as pd

data = pd.DataFrame.from_dict({
    'Name': ['Nik', 'Kate', 'Evan', 'Kyra'],
    'Age': [31, 30, 40, 33],
    'Location': ['Toronto', 'London', 'Kingston', 'Hamilton']
})

df = pd.DataFrame(data)

# Using pandas.concat() to add a row
r = pd.DataFrame({'Name':'Creuza', 'Age':69, 'Location':'São Gonçalo'}, index=[0])
df2 = pd.concat([r,df.loc[:]]).reset_index(drop=True)
df2
     Name  Age     Location
0  Creuza   69  São Gonçalo
1     Nik   31      Toronto
2    Kate   30       London
3    Evan   40     Kingston
4    Kyra   33     Hamilton
Dehiscence answered 21/9, 2022 at 14:54 Comment(2)
Using pd.concat won't solve the issue. Onomastics
Seriously? And why not? Try to explain in more detail what exactly you need. Maybe sharing the code (or the part of it that makes sense) will help more... Dehiscence
I
0

This happened to me as well, and I got here by searching for the message on Google. The reason in my case: when a dict is converted to a DataFrame/Series, a boolean value is not converted to <class 'pandas.core.arrays.boolean.BooleanArray'> but to <class 'numpy.ndarray'>. So you need to convert it "manually" and then concat it. The command that worked for me was:

_item = pd.DataFrame([dictionary])
_item["column"] = _item["column"].astype("boolean")
data_frame = pd.concat([data_frame, _item], ignore_index=True)

see also: https://github.com/pandas-dev/pandas/issues/46662
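
For illustration, here is a self-contained sketch of the same idea. It assumes the existing frame already stores the column with the nullable "boolean" dtype; the column names and the `record` dict are made up.

```python
import pandas as pd

# Existing frame whose 'active' column uses the nullable "boolean" dtype.
data_frame = pd.DataFrame({
    "name": ["a", "b"],
    "active": pd.array([True, None], dtype="boolean"),
})

# A new record arriving as a plain dict (hypothetical data).
record = {"name": "c", "active": False}
_item = pd.DataFrame([record])

# By default the new column is a NumPy-backed bool column.
dtype_before = _item["active"].dtype

# Cast it to the nullable dtype so both frames agree, then concatenate.
_item["active"] = _item["active"].astype("boolean")
data_frame = pd.concat([data_frame, _item], ignore_index=True)
print(data_frame.dtypes)
```

Matching the dtypes on both sides before concatenating keeps the result column in the nullable "boolean" dtype, missing value included.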

Inquisitive answered 1/2, 2023 at 15:54 Comment(0)
H
0

An example that raises this warning in pandas 1.5.2:

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'foo': [0, 1, 2], 'bar': [True, True, np.nan]})
df2 = pd.DataFrame({'foo': [3, 4], 'bar': [True, False]})

Gives warning:

pd.concat([df1.loc[df1['bar'] == True], df2], ignore_index=True)

Gives no warning:

pd.concat([df1, df2], ignore_index=True)

pd.concat([df1.loc[df1['bar'] == True].infer_objects(), df2], ignore_index=True)

The warning is triggered because the dtype of column 'bar' is object, but the data could be cast to bool. In the first of the no-warning options, the column still contains NaN, so its dtype cannot be bool. In the second of the no-warning options, the column dtype is set to bool before concatenating.
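
The dtype at each step can be inspected directly. A short sketch reusing df1 from the example; the names `filtered` and `fixed` are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'foo': [0, 1, 2], 'bar': [True, True, np.nan]})

# Mixing booleans with NaN makes 'bar' an object-dtype column.
print(df1['bar'].dtype)

# Filtering removes the NaN row, but the column stays object dtype
# even though every remaining value is a boolean.
filtered = df1.loc[df1['bar'] == True]
print(filtered['bar'].dtype)

# infer_objects() re-infers the dtype and yields a proper bool column.
fixed = filtered.infer_objects()
print(fixed['bar'].dtype)
```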

Hinch answered 16/5, 2024 at 7:56 Comment(0)
