How can I format the index column(s) with xlsxwriter?
Asked Answered
E

4

10

I'm using xlsxwriter and the set_column function that format the columns in my excel outputs.

However, formatting seems to be ignored when applied to the index column (or index columns in case of multi index).

I've found a workaround, so far is to introduce a fake index with reset_index then pass index=False to the to_excel function but then the nice merging feature of the multi index will be gone too.

Any ideas?

import pandas as pd
import numpy as np

from Config import TEMP_XL_FILE

def temp():
    ' temp'
    pdf = pd.DataFrame(np.random.randn(6,4), columns=list('ABCD'))
    pdf.set_index('A', drop=True, inplace=True)
    writer = pd.ExcelWriter(TEMP_XL_FILE, engine='xlsxwriter')
    pdf.to_excel(writer, 'temp')
    workbook = writer.book
    worksheet = writer.sheets['temp']
    tempformat = workbook.add_format({'num_format': '0%', 'align': 'center'})
    worksheet.set_column(-1, 3, None, tempformat)
    writer.save()

if __name__ == '__main__':
    temp()
Esquire answered 6/10, 2016 at 9:44 Comment(2)
Can you provide us your code?Joyous
In XlsxWriter, and in Excel, a Cell format overrides a Row format overrides a Column format. In this case Pandas is applying a cell format (with merge) to the index cells so the set_column() format doesn't have an effect. I don't think it is possible to override or set the Panda's index format via the API (apart from the datetime_format and date_format),Sarpedon
W
3

The pandas ExcelWriter overwrites the XlsxWriter formats in the index columns. To prevent that, change the pandas header_style to None

header_style = {"font": {"bold": True},
                "borders": {"top": "thin",
                            "right": "thin",
                            "bottom": "thin",
                            "left": "thin"},
                "alignment": {"horizontal": "center",
                              "vertical": "top"}} 

To do that:

import pandas.io.formats.excel

pandas.io.formats.excel.header_style = None

See also

Whitelaw answered 10/10, 2019 at 22:56 Comment(1)
In 1.1.5 at least, this is broken and that symbol header_style does not exist in the excel module.Tanjatanjore
V
3

I couldn't get @Max's answer to work, but the following worked for me with 1.1.5 (should work for pandas 1.*):

import pandas.io.formats.excel

pandas.io.formats.excel.ExcelFormatter.header_style = None
Vanderbilt answered 16/2, 2021 at 17:57 Comment(3)
This seems to overwrite an instance property with a static, which is a bad idea.Tanjatanjore
Can you recommend a better way?Vanderbilt
Derive from ExcelFormatter and use it to override the header_style implementation; don't do any monkey-patching; and use your new formatter class where needed.Tanjatanjore
P
1

As far as I've understood, Pandas sets the format of the index row. There are ways to reset it, but those solutions weren't very reliable. It was also quite hard to actually format it.

Writing out the index columns with the desired format worked best for me:

import pandas as pd

# The data that we're feeding to ExcelWriter
df = pd.DataFrame(
    {
        "Col A": ["a", "a", "b", "b"],
        "Col B": ["a", "b", "c", "d"],
        "Col C": [1, 2, 3, 4],
    }
)

# The Excel file we're creating
writer = pd.ExcelWriter("pandas_out.xlsx", engine="xlsxwriter")
df.to_excel(writer, sheet_name="Sheet1", index=False) # Prevents Pandas from outputting an index

# The variables we'll use to do our modifications
workbook = writer.book
worksheet = writer.sheets["Sheet1"]

worksheet.set_row(0, 30) # Set index row height to 30

# Find more info here: https://xlsxwriter.readthedocs.io/format.html#format-methods-and-format-properties
header_format = workbook.add_format(
    {
        "bold": True,
        "valign": "vcenter",
        "align": "center",
        "bg_color": "#d6d6d6",
        "border": True,
    }
)

# Write the column headers with the defined format.
for col_num, value in enumerate(df.columns.values):
    worksheet.write(0, col_num, value, header_format)

# Set format of data
format1 = workbook.add_format({"align": "center"})
worksheet.set_column('A:Z', 10, format1) # Width of cell

writer.save()
Pineapple answered 16/12, 2020 at 22:46 Comment(1)
You're not just writing the columns, you're overwriting them from what Pandas has already set. This is not ideal. You can instead define your own formatter and keep Pandas' own default column loop.Tanjatanjore
T
0
  • Don't monkey-patch the library, derive from the formatter class
  • Don't bother rewriting all of the code to create the cells yourself
  • Use the intermediate "CSS-like" formatting language Pandas defines internally for cell content; this is all written out in pandas/io/excel/_xlsxwriter.py

This works with 1.1.5:

import numpy as np
import pandas as pd
from pandas.io.formats.excel import ExcelFormatter
from typing import Dict, Any

# from Config import TEMP_XL_FILE
TEMP_XL_FILE = 'headers.xlsx'

class CenteredFormatter(ExcelFormatter):
    @property
    def header_style(self) -> Dict[str, Any]:
        d = dict(super().header_style)
        d.setdefault('alignment', {})['horizontal'] = 'center'
        d.setdefault('number_format', {})['format_code'] = '0%'
        return d


def temp() -> None:
    with pd.ExcelWriter(TEMP_XL_FILE, engine='xlsxwriter') as writer:
        pdf = pd.DataFrame(np.random.randn(6, 4), columns=list('ABCD'))
        pdf.set_index('A', drop=True, inplace=True)
        formatter = CenteredFormatter(pdf)
        formatter.write(writer, sheet_name='temp')

if __name__ == '__main__':
    temp()
Tanjatanjore answered 20/7, 2021 at 3:22 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.