I have a small DataFrame (2 rows x 4 columns) and a function that adds an extra column, depending on some logic, once the apply is performed. With pandas 0.24.2 I've been doing this as df.apply(func, axis=1) and I would get my extra column. So far, so good.
Now, with pandas 1.1.0, something weird happens: when I apply the function, the first row is processed twice and the second row is not considered at all.
Below are the original DataFrame, the function, and the resulting output. I added a print(row) so you can see how the first row of the DataFrame is repeated in the process.
In [82]: df_attr_list
Out[82]:
      name attrName string_value dict_value
0  FW12611  HW type         None       ALU1
1  FW12612  HW type         None       ALU1
Now, the function, and its output ...
def setFinalValue(row):
    rtrName = row['name']
    attrName = row['attrName'].replace(" ", "")
    dict_value = row['dict_value']
    string_value = row['string_value']
    finalValue = 'N/A'
    if attrName in ['Val1', 'Val2', 'Val3']:
        finalValue = dict_value
    elif attrName in ['Val4', 'Val5']:
        finalValue = string_value
    else:
        finalValue = "N/A"
    row['finalValue'] = finalValue
    print(row)
    return row
Now, the output after the apply ...
In [83]: df_attr_list.apply(setFinalValue, axis=1)
name            FW12611
attrName        HW type
string_value       None
dict_value         ALU1
finalValue         ALU1
Name: 0, dtype: object
name            FW12611
attrName        HW type
string_value       None
dict_value         ALU1
finalValue         ALU1
Name: 1, dtype: object
Out[83]:
      name attrName string_value dict_value finalValue
0  FW12611  HW type         None       ALU1       ALU1
1  FW12611  HW type         None       ALU1       ALU1
As you can see, the extra column is added, but the first row of the original DF is processed twice, as if the second didn't exist ...
Why is this happening?
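For comparison, here is a variant that never modifies row inside the function: it returns only the new value, and the column is attached afterwards. This is just a minimal sketch, assuming the same column names as above. final_value is a hypothetical helper name, and the 'Val1' ... 'Val5' lists are the same placeholders as in setFinalValue. It does not explain the behaviour above, but it does not depend on how apply treats changes made to row.

```python
import pandas as pd

# Same data as df_attr_list above.
df_attr_list = pd.DataFrame({
    "name": ["FW12611", "FW12612"],
    "attrName": ["HW type", "HW type"],
    "string_value": [None, None],
    "dict_value": ["ALU1", "ALU1"],
})


def final_value(row):
    """Return only the new value; the row itself is never modified."""
    attr_name = row["attrName"].replace(" ", "")
    if attr_name in ["Val1", "Val2", "Val3"]:
        return row["dict_value"]
    if attr_name in ["Val4", "Val5"]:
        return row["string_value"]
    return "N/A"


# assign() builds a new DataFrame and leaves df_attr_list unchanged.
result = df_attr_list.assign(
    finalValue=df_attr_list.apply(final_value, axis=1)
)
print(result)
```

With this version each row keeps its own name, since nothing is written back into the object that apply passes to the function.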
I'm already trying this out with pandas 1.1.0...
In [86]: print(pd.__version__)
1.1.0
thanks!
1.1.0 and I'm already using it. Actually, as per your second link, I would expect at least the first row to be processed twice, but the second to be processed as well: that's not happening ... – Counsellor