How to use pandas to_csv float_format?


Asked 23/7, 2018 at 10:31 Answered 23/7, 2018 at 11:53

I am reading from a data file whose values have 8 decimal places. After interpolating some values I save them as shown below, but the float_format option does not seem to work:

df.to_csv('data.dat',sep=' ', index=False, header=False, float_format="%.8f")

and the result file looks like

0.02506602 0.05754493 0.36854688
0.02461631 0.0599653 0.43078098
0.02502534 0.06209149 0.44955311
0.4267356675182389 0.1718682822340447 0.5391386354945895
0.426701667727433 0.17191008887193007 0.5391897818631616
0.4266676661681287 0.17195189807522643 0.5392409104354972

The first 3 lines were in the data file and the next 3 are the new interpolated values. I want all the values to have the same length. What's going wrong here, and how do I fix it?

Also: it would be nice if I could control the float precision separately for each column.

Peugia answered 23/7, 2018 at 10:31 Comment(0)

Your code looks fine. Most likely, there is an issue with your input data. Use pd.DataFrame.dtypes to check that all your input series are of type float. If they aren't, convert them to float via:

df[col_list] = df[col_list].apply(pd.to_numeric, downcast='float').fillna(0)

Here's a working example:

from io import StringIO
import pandas as pd

mystr = StringIO("""0.02506602 0.05754493 0.36854688
0.02461631 0.0599653 0.43078098
0.02502534 0.06209149 0.44955311
0.4267356675182389 0.1718682822340447 0.5391386354945895
0.426701667727433 0.17191008887193007 0.5391897818631616
0.4266676661681287 0.17195189807522643 0.5392409104354972""")

df = pd.read_csv(mystr, sep=r'\s+', header=None)

print(df.dtypes)

# 0    float64
# 1    float64
# 2    float64
# dtype: object

file_loc = r'C:\temp\test.dat'
df.to_csv(file_loc, sep=' ', index=False, header=False, float_format="%.8f")

df = pd.read_csv(file_loc, sep=r'\s+', header=None)

print(df[0].iloc[-1])

# 0.42666767
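For illustration, here is one way the symptom in the question can arise. This is a minimal sketch that assumes the columns have ended up with object dtype (for example because a string such as '' was assigned into them); the frame and the names broken and fixed are made up for the demonstration. to_csv applies float_format only to float columns, so values in object columns are written at full precision.

```python
import pandas as pd

# Hypothetical data: object-dtype columns, as can happen after
# assigning '' into otherwise numeric data.
broken = pd.DataFrame(
    {
        0: [0.02506602, "", 0.4267356675182389],
        1: [0.05754493, "", 0.1718682822340447],
    },
    dtype=object,
)
print(broken.dtypes)  # both columns report object, not float64

# float_format is not applied to object columns, so the long values survive.
broken_out = broken.to_csv(sep=" ", index=False, header=False, float_format="%.8f")
print(broken_out)

# Coerce back to numbers; anything unparseable (here '') becomes NaN.
fixed = broken.apply(pd.to_numeric, errors="coerce")
fixed_out = fixed.to_csv(sep=" ", index=False, header=False, float_format="%.8f")
print(fixed_out)
```

After the conversion every column is float64 and each value is written with 8 decimal places. NaN is written as an empty field unless na_rep is set.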

Limoges answered 23/7, 2018 at 11:53 Comment(7)

Well, somewhere along the way I used df.loc[df[col1]==some_value]='' and that messed everything up. – Peugia 23/7, 2018 at 12:6

@Eular, Yes, that could be it. Not sure why you'd add empty strings to numeric data. Use np.nan instead and you might have better luck. – Limoges 23/7, 2018 at 12:6

Printing an empty line in some places is non-trivial (and inefficient); I strongly advise against it. I think you probably need to provide a minimal reproducible example, because (as you can see from my example) it's not straightforward to reproduce your problem. – Limoges 23/7, 2018 at 12:52

OK, float_format is working now. Thanks. Can you set the precision to 2 decimal places for the 1st column and 8 for the other 2? – Peugia 23/7, 2018 at 12:53

@Eular, I'm not sure this is possible with to_csv. You may wish to start a new question. – Limoges 23/7, 2018 at 12:54
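One possible workaround for per-column precision, sketched here under assumptions rather than taken from the thread: format each column to strings before calling to_csv, so float_format is not needed at all. The sample frame and the formats mapping are hypothetical names introduced for this example.

```python
import pandas as pd

# Hypothetical sample data with three float columns.
df = pd.DataFrame(
    {
        0: [0.02506602, 0.4267356675182389],
        1: [0.05754493, 0.1718682822340447],
        2: [0.36854688, 0.5391386354945895],
    }
)

# Hypothetical per-column format spec: 2 decimals for the first column,
# 8 decimals for the other two. The 'f' presentation type never switches
# to scientific notation.
formats = {0: "{:.2f}", 1: "{:.8f}", 2: "{:.8f}"}

# Build a frame of pre-formatted strings, one column at a time.
formatted = pd.DataFrame({col: df[col].map(fmt.format) for col, fmt in formats.items()})

text = formatted.to_csv(sep=" ", index=False, header=False)
print(text)
# 0.03 0.05754493 0.36854688
# 0.43 0.17186828 0.53913864
```

The trade-off is that the formatted frame holds strings, so it is only suitable for writing out. Missing values would be rendered as the text nan by this approach.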

Well, I had one integer column that I wanted to write as integers, but when I use np.nan I can't keep the column as integer. That's why I am trying a different precision format for different columns. Also, if I use round(), some values print in scientific e notation, which I don't want either. So my best shot would be using round() without introducing e notation. – Peugia 23/7, 2018 at 15:52

It's a common problem. nan is considered a float, but there isn't a substitute for int type. Best, in my opinion, to leave as float, or some integer (e.g. -1) which you know is invalid data. Definitely not a good idea to start rounding and playing around with object type. – Limoges 23/7, 2018 at 15:55
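Since this exchange, pandas (0.24 and later) has added a nullable integer dtype, "Int64" with a capital I, which can hold missing values without becoming float. The sketch below assumes such a version is installed; the column names id and x are invented for the example.

```python
import pandas as pd

# Hypothetical frame: an integer column with a missing value, plus a float column.
df = pd.DataFrame(
    {
        "id": pd.array([1, None, 3], dtype="Int64"),  # nullable integer dtype
        "x": [0.5, 0.25, 0.125],
    }
)
print(df.dtypes)

# float_format affects only the float column; na_rep controls how the
# missing integer is written.
text = df.to_csv(sep=" ", index=False, header=False, float_format="%.8f", na_rep="nan")
print(text)
```

The integer column is written without a decimal point, and the missing entry appears as whatever na_rep specifies.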
