AttributeError: 'float' object has no attribute 'lower'

6

20

I'm getting this AttributeError and I'm stuck on how to handle float values when they appear in a tweet. Each streamed tweet has to be lowercased and tokenized, so I used the split function.

Can somebody please help me deal with it? Any workaround or solution would be appreciated.

Here's the error I'm getting:

AttributeError                            Traceback (most recent call last)
<ipython-input-28-fa278f6c3171> in <module>()
      1 stop_words = []
----> 2 negfeats = [(word_feats(x for x in p_test.SentimentText[f].lower().split() if x not in stop_words), 'neg') for f in l]
      3 posfeats = [(word_feats(x for x in p_test.SentimentText[f].lower().split() if x not in stop_words), 'pos') for f in p]
      4 
      5 trainfeats = negfeats+ posfeats

AttributeError: 'float' object has no attribute 'lower'

Here is my code

import random

import pandas as pd

p_test = pd.read_csv('TrainSA.csv')

stop_words = []

def word_feats(words):
    return dict([(word, True) for word in words])

# Row indices of the negative tweets (Sentiment == 0)
l = []
for f in range(len(p_test)):
    if p_test.Sentiment[f] == 0:
        l.append(f)

# Row indices of the positive tweets (Sentiment == 1)
p = []
for f in range(len(p_test)):
    if p_test.Sentiment[f] == 1:
        p.append(f)

# These two lines raise the AttributeError shown above
negfeats = [(word_feats(x for x in p_test.SentimentText[f].lower().split() if x not in stop_words), 'neg') for f in l]
posfeats = [(word_feats(x for x in p_test.SentimentText[f].lower().split() if x not in stop_words), 'pos') for f in p]

trainfeats = negfeats + posfeats
print(len(trainfeats))

random.shuffle(trainfeats)
print(len(trainfeats))

p_train = pd.read_csv('TrainSA.csv')

l_t = []
for f in range(len(p_train)):
    if p_train.Sentiment[f] == 0:
        l_t.append(f)

p_t = []
for f in range(len(p_train)):
    if p_train.Sentiment[f] == 1:
        p_t.append(f)

print(len(l_t))
print(len(p_t))

I have tried many approaches, but I still can't get lower and split to work on these values.

Cosentino answered 11/1, 2016 at 14:45 Comment(2)
Apparently p_test.SentimentText[f] is a floating point number, rather than a string. You can't call lower() on a float. - Stoat
It usually helps to include actual error text with traceback instead of just mentioning it - otherwise people have to guess where that error could have originated. - Chuckchuckfull
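
To check the comment above against your own data, one option is to test each value's type and list the rows that are not strings. This is only a minimal sketch: the small DataFrame is made-up sample data standing in for TrainSA.csv, and it assumes the floats are NaN values, which is how pandas typically represents an empty CSV field.

```python
import numpy as np
import pandas as pd

# Made-up sample standing in for TrainSA.csv (assumption: same column names).
p_test = pd.DataFrame({
    "Sentiment": [0, 1, 0],
    "SentimentText": pd.Series(["Sad day", np.nan, "Not good"], dtype=object),
})

# True where the value is a real string, False for NaN or any other non-string.
is_text = p_test["SentimentText"].map(lambda v: isinstance(v, str))

# Rows whose text would raise AttributeError on .lower()
bad_rows = p_test[~is_text]
print(bad_rows)
```

If bad_rows is empty for the real file, the floats are coming from somewhere else and the answers below may not apply as written.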
C
44

Thank you @Dick Kniep. Yes, it is the pandas CSV reader. Your suggestion worked. Below is the Python code that worked for me; it casts the column to the expected datatype (in this case, string):

p_test = pd.read_csv('TrainSA.csv')
p_test.SentimentText = p_test.SentimentText.astype(str)
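
One caveat worth hedging: depending on the pandas version, casting with astype(str) may turn a missing value into literal text such as 'nan', which would then be tokenized like a real word. A possible alternative, sketched here on made-up sample data, is to replace missing text with an empty string and use the vectorized .str methods.

```python
import numpy as np
import pandas as pd

# Made-up sample; the second tweet is missing, as an empty CSV field would be.
p_test = pd.DataFrame({
    "SentimentText": pd.Series(["Hello World", np.nan], dtype=object),
})

# Replace missing text with an empty string, then lowercase and split
# using the vectorized .str accessor instead of a Python-level loop.
tokens = p_test["SentimentText"].fillna("").str.lower().str.split()
print(tokens.tolist())
```

With this approach a missing tweet yields an empty token list rather than a spurious token.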
Cosentino answered 11/1, 2016 at 16:5 Comment(0)
A
20

I get the feeling that your problem has its root in the pd.read_csv('TrainSA.csv') call. Although you did not post the import, I assume it is pandas' read_csv. This function intelligently infers a Python datatype for each column. However, this means that in your case some values could be read as floats. You can prevent this intelligent (?) behaviour by specifying the datatype you expect for each column.
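
As a rough sketch of what specifying the datatype could look like: the example below reads an in-memory stand-in for TrainSA.csv (the contents are made up). It assumes the floats come from empty fields, in which case dtype alone may not be enough, because empty fields are still read as NaN by default; keep_default_na=False is added here to leave them as empty strings.

```python
import io

import pandas as pd

# In-memory stand-in for TrainSA.csv; the second row has an empty text field.
csv_data = io.StringIO("Sentiment,SentimentText\n0,Bad day\n1,\n")

p_test = pd.read_csv(
    csv_data,
    dtype={"SentimentText": str},  # parse this column as text
    keep_default_na=False,         # keep empty fields as "" instead of NaN
)
print(p_test["SentimentText"].tolist())
```

Note that keep_default_na=False applies to every column, so it is only a reasonable choice when the other columns have no empty fields.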

Ahern answered 11/1, 2016 at 15:10 Comment(0)
R
5

If you are using a DataFrame, drop the rows with missing values using:

df = df.dropna()
Radiochemical answered 1/12, 2021 at 18:41 Comment(3)
This is not a good solution when we need all the data. With this answer, we lose the rows that contain NaN. - Carrageen
@Carrageen you can always set a default value instead. dropna is one solution if it fits your use case; replacing missing values with a default is another. - Radiochemical
You can use df.fillna("0") instead. - Cleromancy
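
The two options discussed in the comments above can be compared side by side. This is a small sketch on made-up data; the column names are borrowed from the question and the empty-string default is just one possible choice.

```python
import numpy as np
import pandas as pd

# Made-up sample with one missing tweet.
df = pd.DataFrame({
    "Sentiment": [0, 1, 1],
    "SentimentText": pd.Series(["bad", np.nan, "great"], dtype=object),
})

# Option 1: drop only the rows whose text is missing
# (subset avoids dropping rows that have NaN in unrelated columns).
dropped = df.dropna(subset=["SentimentText"])

# Option 2: keep every row and substitute a default value for the text.
filled = df.fillna({"SentimentText": ""})

print(len(dropped), len(filled))
```

Which one fits depends on whether a tweet with no text should still count as a training example.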
C
4

I got a similar error with my dataset. Setting the dtype parameter didn't help me; I had to prepare my dataset instead. The problem was a NaN column value. Part of the dataset:

Id,Category,Text
1,contract,"Some text with commas, and other "
2,contract,

So my solution: before calling read_csv, I added dummy text in place of the empty field:

Id,Category,Text
1,contract,"Some text with commas, and other "
2,contract,"NaN"

Now my app works fine.

Clarify answered 19/5, 2017 at 8:23 Comment(0)
Y
1
df = pd.read_excel(r"location\file.xlsx")  # raw string, so "\f" is not read as a form-feed escape
df.characters = df.characters.astype(str)

I tried this and it solved the problem for me.

Yuk answered 13/7, 2022 at 9:32 Comment(0)
F
0

Make sure the DataFrame column contains no null or missing values.

You can apply the step below before performing any other operations:

df = df[df['ColumnName'].notna()]
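
Applied to the question's feature extraction, the filter above might be used as follows. This is a sketch under assumptions: the data is made up, word_feats mirrors the helper from the question, and feats_for is a hypothetical helper introduced only for this example.

```python
import numpy as np
import pandas as pd

def word_feats(words):
    # Same idea as the helper in the question: mark every word as present.
    return {word: True for word in words}

# Made-up sample standing in for TrainSA.csv; the last tweet is missing.
df = pd.DataFrame({
    "Sentiment": [0, 1, 0],
    "SentimentText": pd.Series(["So SAD", "Great day", np.nan], dtype=object),
})
stop_words = []

# Keep only the rows that actually have text.
df = df[df["SentimentText"].notna()]

def feats_for(label_value, label_name):
    # Hypothetical helper: build (features, label) pairs for one class,
    # selecting rows with a boolean mask instead of collecting indices in a loop.
    texts = df.loc[df["Sentiment"] == label_value, "SentimentText"]
    return [
        (word_feats(w for w in text.lower().split() if w not in stop_words), label_name)
        for text in texts
    ]

negfeats = feats_for(0, "neg")
posfeats = feats_for(1, "pos")
trainfeats = negfeats + posfeats
print(len(trainfeats))
```

Iterating over the filtered column directly also avoids label lookups such as SentimentText[f], which can fail once rows have been removed and the index has gaps.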

Flurry answered 6/11, 2022 at 20:1 Comment(0)
