Remove leading comma in header when using pandas to_csv

By default to_csv writes a CSV like

,a,b,c
0,0.0,0.0,0.0
1,0.0,0.0,0.0
2,0.0,0.0,0.0

But I want it to write like this:

a,b,c
0,0.0,0.0,0.0
1,0.0,0.0,0.0
2,0.0,0.0,0.0

How do I achieve this? I can't set index=False because I want to preserve the index. I just want to remove the leading comma.

import numpy as np
import pandas as pd
df = pd.DataFrame(np.zeros((3, 3)), columns=['a', 'b', 'c'])
df.to_csv("test.csv")  # this results in the first example above
Trotman answered 24/1, 2020 at 13:51 Comment(6)
You do not want to remove that leading comma, or all other columns are shifted to the left, since CSV is a comma-separated values text format. - Attract
@Attract The second dataframe in my example above works perfectly well. Give it a try: pd.read_csv("test.csv") (where test.csv is the second example). - Trotman
Look carefully at that desired result, which is exactly what I mentioned. Column a is no longer aligned with its original values but shifted to the left, and the last column is left without a header! - Attract
It's implied that the index is unnamed, and pd.read_csv interprets that implication correctly. I know this is certainly not best practice, and I don't recommend anyone do it this way, but I needed to do it this way for some legacy reasons. @Attract - Trotman
Understood, but do note that, other than in pandas, reading this CSV (per the accepted answer below) in other applications/languages will result in shifted columns. I had a feeling this was an XY problem. Your real question should have been about handling the legacy reasons! I have yet to meet a use case that justifies breaking best practices. Good luck and happy coding! - Attract
@Attract 100% agree - Trotman
S
5

This is possible by first writing only the column names, without the index, and then appending the data without a header:

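import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((3, 3)), columns=['a', 'b', 'c'], index=list('XYZ'))

# write only the header row (column names, no index column)
pd.DataFrame(columns=df.columns).to_csv("test.csv", index=False)
# alternative for the empty df
# df.iloc[:0].to_csv("test.csv", index=False)
# append the data rows, including the index, without a header
df.to_csv("test.csv", header=False, mode='a')

df = pd.read_csv("test.csv")
print(df)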
     a    b    c
X  0.0  0.0  0.0
Y  0.0  0.0  0.0
Z  0.0  0.0  0.0
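
The same idea can also be sketched with a single open file handle instead of two to_csv calls in append mode. This is only an illustration, not part of the answer above: the temporary path is hypothetical, and the plain join on the column names assumes they contain no commas or quotes.

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((3, 3)), columns=["a", "b", "c"], index=list("XYZ"))

# Hypothetical output location, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "test.csv")

with open(path, "w", newline="") as f:
    # Header row with the column names only (no leading comma).
    # Assumes the names need no CSV quoting.
    f.write(",".join(df.columns) + os.linesep)
    # Data rows, including the index, without a header row.
    df.to_csv(f, header=False)

with open(path) as f:
    lines = f.read().splitlines()

# pandas uses the first column as the index here because each data row
# has one more field than the header row.
restored = pd.read_csv(path)
print(lines[0])
print(restored)
```

Opening the file once avoids a stale file being appended to if the header step is ever skipped.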
Selves answered 24/1, 2020 at 14:4 Comment(0)
C
3

Simply set a name for your index: df.index.name = 'blah'. This name will appear as the first name in the headers.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((3,3)), columns = ['a','b','c'])
df.index.name = 'my_index'
print(df.to_csv())

yields

my_index,a,b,c
0,0.0,0.0,0.0
1,0.0,0.0,0.0
2,0.0,0.0,0.0

However, if (as per your comment) you wish to have 3 comma-separated names in the header while there are 4 comma-separated values in each row of the CSV, you'll have to handcraft it. It will NOT be compliant with any standard CSV format, though.
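
To see why such a file is a problem outside pandas, here is a small sketch using Python's standard csv module on a handwritten sample of the desired output. The sample text and variable names are made up for illustration. RFC 4180 says each line should contain the same number of fields, which this layout does not.

```python
import csv
import io

# Handwritten sample of the desired layout: 3 header fields, 4 data fields.
text = "a,b,c\n0,0.0,0.0,0.0\n1,0.0,0.0,0.0\n"

rows = list(csv.reader(io.StringIO(text)))
header, first = rows[0], rows[1]

# A naive reader that pairs header names with values positionally:
# the index value ends up under "a" and the last value is dropped.
paired = dict(zip(header, first))

print(len(header), len(first))
print(paired)
```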

Cordelier answered 24/1, 2020 at 13:53 Comment(1)
Well, I want the header to be "a,b,c", and not "my_index,a,b,c". @Attract - Trotman
A
3

Alternatively, try resetting the index so that it becomes a column in the data frame, named index by default. This works with multi-level indexes as well.

df = df.reset_index()
df.to_csv('output.csv', index = False)
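
For reference, a minimal sketch of what this produces for the frame from the question (written to a string rather than to output.csv). Note that the header gains an index column, so it is not the "a,b,c" header asked for.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((3, 3)), columns=["a", "b", "c"])

# reset_index() turns the unnamed index into a regular column called "index".
csv_text = df.reset_index().to_csv(index=False)
header_line = csv_text.splitlines()[0]
print(csv_text)
```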
Attract answered 24/1, 2020 at 14:1 Comment(0)
