Cannot convert string to float in pandas (ValueError)
Asked Answered
E

4

18

I have a dataframe created form a JSON output that looks like this:

        Total Revenue    Average Revenue    Purchase count    Rate
Date    
Monday  1,304.40 CA$     20.07 CA$          2,345             1.54 %

The value stored are received as string from the JSON. I am trying to:

1) Remove all characters in the entry (ex: CA$ or %) 2) convert rate and revenue columns to float 3) Convert count columns as int

I tried to do the following:

df[column] = (df[column].str.split()).apply(lambda x: float(x[0]))

It works fine except when I have a value with a coma (ex: 1,465 won't work whereas 143 would).

I tried to use several function to replace the "," by "", etc. Nothing worked so far. I always receive the following error:

ValueError: could not convert string to float: '1,304.40'

Epstein answered 24/8, 2016 at 14:9 Comment(0)
T
19

These strings have commas as thousands separators so you will have to remove them before the call to float:

df[column] = (df[column].str.split()).apply(lambda x: float(x[0].replace(',', '')))

This can be simplified a bit by moving split inside the lambda:

df[column] = df[column].apply(lambda x: float(x.split()[0].replace(',', '')))
Tameka answered 24/8, 2016 at 14:11 Comment(0)
C
4

Another solution with list comprehension, if need apply string functions working only with Series (columns of DataFrame) like str.split and str.replace:

df = pd.concat([df[col].str.split()
                       .str[0]
                       .str.replace(',','').astype(float) for col in df], axis=1)

#if need convert column Purchase count to int
df['Purchase count'] = df['Purchase count'].astype(int)
print (df)
         Total Revenue  Average Revenue  Purchase count  Rate
Date                                                        
Monday         1304.4            20.07            2345  1.54
Costello answered 24/8, 2016 at 14:15 Comment(0)
R
1

I have also faced that problem and my problem was resolved using the code below:

import pandas as pd
df['Purchase count'] = pd.to_numeric(df['Purchase count'], errors='coerce')
print(df.dtypes)
Reconstitute answered 19/8, 2022 at 14:42 Comment(0)
K
1

the below solution worked for me..!!

import pandas as pd


df['Purchase count'] = df['Purchase count'].replace(',', '', regex=True).astype(float)

print('type:    ', type(df['Purchase count']))
Kao answered 3/1, 2024 at 9:20 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.