Save Pandas DataFrames with formulas to xlsx files
Asked Answered
M

3

12

In a Pandas DataFrame i have some "cells" with values and some that need to contain excel formulas. I have read that i can get formulas with

link = 'HYPERLINK("#Groups!A' + str(someInt) + '"; "LINKTEXT")'
xlwt.Formula(link)

and store them in the dataframe.

When i try to save my dataframe as an xlsx file with

writer = pd.ExcelWriter("pandas" + str(fileCounter) + ".xlsx", engine = "xlsxwriter")
df.to_excel(writer, sheet_name = "Paths", index = False)
# insert more sheets here
writer.save()

i get the error:

TypeError: Unsupported type <class 'xlwt.ExcelFormula.Formula'> in write()

So i tried to write my formula as a string to my dataframe but Excel wants to restore the file content and then fills all formula cells with 0's.

Edit: I managed to get it work with regular strings but nevertheless would be interested in a solution for xlwt formulas.

So my question is: How do i save dataframes with formulas to xlsx files?

Mcgrody answered 15/7, 2018 at 13:31 Comment(2)
@Parfait i want to add hyperlinks to other cells to the df.Mcgrody
added formula content example.Mcgrody
E
3

After writing the df using table.to_excel(writer, sheet_name=...), I use write_formula() as in this example (edited to add the full loop). To write all the formulas in your dataframe, read each formula in your dataframe.

 # replace the right side below with reading the formula from your dataframe
 # e.g., formula_to_write = df.loc(...)`

 rows = table.shape[0]
 for row_num in range(1 + startrow, rows + startrow + 1):
    formula_to_write = '=I{} * (1 - AM{})'.format(row_num+1, row_num+1) 
    worksheet.write_formula(row_num, col, formula_to_write)`

Later in the code (I seem to recall one of these might be redundant, but I haven't looked it up): writer.save() workbook.close()

Documentation is here.

Evelyne answered 15/7, 2018 at 16:52 Comment(0)
C
8

Since you are using xlsxwriter, strings are parsed as formulas by default ("strings_to_formulas: Enable the worksheet.write() method to convert strings to formulas. The default is True"), so you can simply specify formulas as strings in your dataframe.

Example of a formula column which references other columns in your dataframe:

d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)
writer = pd.ExcelWriter("foo.xlsx", engine="xlsxwriter")
df["product"] = None
df["product"] = (
    '=INDIRECT("R[0]C[%s]", 0)+INDIRECT("R[0]C[%s]", 0)'
    % (
        df.columns.get_loc("col1") - df.columns.get_loc("product"),
        df.columns.get_loc("col2") - df.columns.get_loc("product"),
    )
)
df.to_excel(writer, index=False)
writer.save()

Produces the following output:

Example output in LibreOffice

Commemorate answered 23/9, 2019 at 12:36 Comment(2)
Is there any side effects for this solution? Why is there a write_formula() What is the difference?Pacien
It is worth noting that for relatively new formulas the _xlfn. prefix should be added. I encountered this problem: https://mcmap.net/q/1010623/-formula-and-encoding-issues-when-saving-df-to-excel/11485896.Phenyl
E
3

After writing the df using table.to_excel(writer, sheet_name=...), I use write_formula() as in this example (edited to add the full loop). To write all the formulas in your dataframe, read each formula in your dataframe.

 # replace the right side below with reading the formula from your dataframe
 # e.g., formula_to_write = df.loc(...)`

 rows = table.shape[0]
 for row_num in range(1 + startrow, rows + startrow + 1):
    formula_to_write = '=I{} * (1 - AM{})'.format(row_num+1, row_num+1) 
    worksheet.write_formula(row_num, col, formula_to_write)`

Later in the code (I seem to recall one of these might be redundant, but I haven't looked it up): writer.save() workbook.close()

Documentation is here.

Evelyne answered 15/7, 2018 at 16:52 Comment(0)
G
0
  • you need to save in as usual just keep in mind to write the formula as string.
  • you can use also f strings with vars.
    writer = pd.ExcelWriter(FILE_PATH ,mode='a', if_sheet_exists='overlay')
    
    col_Q_index = 3
    best_formula = f'=max(L1,N98,Q{col_Q_index})'
    formula_df = pd.DataFrame([[best_formula]])
    formula_df.to_excel(writer, sheet_name=SHEET_NAME, startrow=i, startcol=17, index=False, header=False)
    
    writer.save()
Gilmagilman answered 8/2, 2022 at 14:29 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.