How can I format the index column(s) with xlsxwriter?

Asked 6/10, 2016 at 9:44 Answered 20/7, 2021 at 3:22

I'm using xlsxwriter and the set_column function that format the columns in my excel outputs.

However, formatting seems to be ignored when applied to the index column (or index columns in case of multi index).

I've found a workaround, so far is to introduce a fake index with reset_index then pass index=False to the to_excel function but then the nice merging feature of the multi index will be gone too.

Any ideas?

import pandas as pd
import numpy as np

from Config import TEMP_XL_FILE

def temp():
    ' temp'
    pdf = pd.DataFrame(np.random.randn(6,4), columns=list('ABCD'))
    pdf.set_index('A', drop=True, inplace=True)
    writer = pd.ExcelWriter(TEMP_XL_FILE, engine='xlsxwriter')
    pdf.to_excel(writer, 'temp')
    workbook = writer.book
    worksheet = writer.sheets['temp']
    tempformat = workbook.add_format({'num_format': '0%', 'align': 'center'})
    worksheet.set_column(-1, 3, None, tempformat)
    writer.save()

if __name__ == '__main__':
    temp()

Esquire answered 6/10, 2016 at 9:44 Comment(2)

Can you provide us your code? – Joyous 6/10, 2016 at 9:50

In XlsxWriter, and in Excel, a Cell format overrides a Row format overrides a Column format. In this case Pandas is applying a cell format (with merge) to the index cells so the set_column() format doesn't have an effect. I don't think it is possible to override or set the Panda's index format via the API (apart from the datetime_format and date_format), – Sarpedon 6/10, 2016 at 12:46

The pandas ExcelWriter overwrites the XlsxWriter formats in the index columns. To prevent that, change the pandas header_style to None

header_style = {"font": {"bold": True},
                "borders": {"top": "thin",
                            "right": "thin",
                            "bottom": "thin",
                            "left": "thin"},
                "alignment": {"horizontal": "center",
                              "vertical": "top"}}

To do that:

import pandas.io.formats.excel

pandas.io.formats.excel.header_style = None

See also

Whitelaw answered 10/10, 2019 at 22:56 Comment(1)

In 1.1.5 at least, this is broken and that symbol header_style does not exist in the excel module. – Tanjatanjore 16/7, 2021 at 22:1

I couldn't get @Max's answer to work, but the following worked for me with 1.1.5 (should work for pandas 1.*):

import pandas.io.formats.excel

pandas.io.formats.excel.ExcelFormatter.header_style = None

Vanderbilt answered 16/2, 2021 at 17:57 Comment(3)

This seems to overwrite an instance property with a static, which is a bad idea. – Tanjatanjore 16/7, 2021 at 22:0

Can you recommend a better way? – Vanderbilt 17/7, 2021 at 9:45

Derive from ExcelFormatter and use it to override the header_style implementation; don't do any monkey-patching; and use your new formatter class where needed. – Tanjatanjore 20/7, 2021 at 3:4

As far as I've understood, Pandas sets the format of the index row. There are ways to reset it, but those solutions weren't very reliable. It was also quite hard to actually format it.

Writing out the index columns with the desired format worked best for me:

import pandas as pd

# The data that we're feeding to ExcelWriter
df = pd.DataFrame(
    {
        "Col A": ["a", "a", "b", "b"],
        "Col B": ["a", "b", "c", "d"],
        "Col C": [1, 2, 3, 4],
    }
)

# The Excel file we're creating
writer = pd.ExcelWriter("pandas_out.xlsx", engine="xlsxwriter")
df.to_excel(writer, sheet_name="Sheet1", index=False) # Prevents Pandas from outputting an index

# The variables we'll use to do our modifications
workbook = writer.book
worksheet = writer.sheets["Sheet1"]

worksheet.set_row(0, 30) # Set index row height to 30

# Find more info here: https://xlsxwriter.readthedocs.io/format.html#format-methods-and-format-properties
header_format = workbook.add_format(
    {
        "bold": True,
        "valign": "vcenter",
        "align": "center",
        "bg_color": "#d6d6d6",
        "border": True,
    }
)

# Write the column headers with the defined format.
for col_num, value in enumerate(df.columns.values):
    worksheet.write(0, col_num, value, header_format)

# Set format of data
format1 = workbook.add_format({"align": "center"})
worksheet.set_column('A:Z', 10, format1) # Width of cell

writer.save()

Pineapple answered 16/12, 2020 at 22:46 Comment(1)

You're not just writing the columns, you're overwriting them from what Pandas has already set. This is not ideal. You can instead define your own formatter and keep Pandas' own default column loop. – Tanjatanjore 20/7, 2021 at 3:25

Don't monkey-patch the library, derive from the formatter class
Don't bother rewriting all of the code to create the cells yourself
Use the intermediate "CSS-like" formatting language Pandas defines internally for cell content; this is all written out in pandas/io/excel/_xlsxwriter.py

This works with 1.1.5:

import numpy as np
import pandas as pd
from pandas.io.formats.excel import ExcelFormatter
from typing import Dict, Any

# from Config import TEMP_XL_FILE
TEMP_XL_FILE = 'headers.xlsx'

class CenteredFormatter(ExcelFormatter):
    @property
    def header_style(self) -> Dict[str, Any]:
        d = dict(super().header_style)
        d.setdefault('alignment', {})['horizontal'] = 'center'
        d.setdefault('number_format', {})['format_code'] = '0%'
        return d


def temp() -> None:
    with pd.ExcelWriter(TEMP_XL_FILE, engine='xlsxwriter') as writer:
        pdf = pd.DataFrame(np.random.randn(6, 4), columns=list('ABCD'))
        pdf.set_index('A', drop=True, inplace=True)
        formatter = CenteredFormatter(pdf)
        formatter.write(writer, sheet_name='temp')

if __name__ == '__main__':
    temp()

Tanjatanjore answered 20/7, 2021 at 3:22 Comment(0)

Recommended topics

Hot tags