How to save a new sheet in an existing excel file, using Pandas?
P

14

172

I want to use Excel files to store data processed with Python. My problem is that I can't add sheets to an existing Excel file. Here is sample code that reproduces the issue:

import pandas as pd
import numpy as np

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)

x2 = np.random.randn(100, 2)
df2 = pd.DataFrame(x2)

writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
df1.to_excel(writer, sheet_name = 'x1')
df2.to_excel(writer, sheet_name = 'x2')
writer.close()  # close() also saves the file; writer.save() was removed in pandas 2.0

This code saves two DataFrames to two sheets, named "x1" and "x2" respectively. If I create two new DataFrames and try to use the same code to add two new sheets, 'x3' and 'x4', the original data is lost.

import pandas as pd
import numpy as np

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

x3 = np.random.randn(100, 2)
df3 = pd.DataFrame(x3)

x4 = np.random.randn(100, 2)
df4 = pd.DataFrame(x4)

writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
df3.to_excel(writer, sheet_name = 'x3')
df4.to_excel(writer, sheet_name = 'x4')
writer.close()  # close() also saves the file; writer.save() was removed in pandas 2.0

I want an Excel file with four sheets: 'x1', 'x2', 'x3', 'x4'. I know that 'xlsxwriter' is not the only engine; there is also 'openpyxl'. I have seen that other people have already written about this issue, but I still can't understand how to do it.

Here is some code taken from this link:

import pandas
from openpyxl import load_workbook

book = load_workbook('Masterfile.xlsx')
writer = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') 
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])

writer.save()

They say that it works, but it is hard to figure out how. I don't understand what "ws.title", "ws", and "dict" are in this context.

What is the best way to save "x1" and "x2", then close the file, open it again and add "x3" and "x4"?

Paganini answered 21/2, 2017 at 15:7 Comment(0)
P
214

Thank you. I believe that a complete example could be useful for anyone else who has the same issue:

import pandas as pd
import numpy as np

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)

x2 = np.random.randn(100, 2)
df2 = pd.DataFrame(x2)

writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
df1.to_excel(writer, sheet_name = 'x1')
df2.to_excel(writer, sheet_name = 'x2')
writer.close()

Here I generate an Excel file; as far as I understand, it does not really matter whether it is generated via the "xlsxwriter" or the "openpyxl" engine.

When I want to write without losing the original data, I use:

import pandas as pd
import numpy as np
from openpyxl import load_workbook

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

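book = load_workbook(path)
writer = pd.ExcelWriter(path, engine = 'openpyxl')
# Note: pandas 2.0+ rejects this assignment (book has no setter there);
# on those versions use pd.ExcelWriter(path, engine='openpyxl', mode='a') instead.
writer.book = book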

x3 = np.random.randn(100, 2)
df3 = pd.DataFrame(x3)

x4 = np.random.randn(100, 2)
df4 = pd.DataFrame(x4)

df3.to_excel(writer, sheet_name = 'x3')
df4.to_excel(writer, sheet_name = 'x4')
writer.close()

This code does the job!

Paganini answered 21/2, 2017 at 18:34 Comment(14)
Any idea why, when I try this, I get: ValueError: No Excel writer 'Sales Leads Calculations.xlsx'? - Lizalizabeth
Yes, this adds the sheet to the Excel file without wiping out the pre-existing sheets. Thanks! - Sentiment
How would we handle things if the file doesn't exist? I'm getting a FileNotFoundError exception at the book =.. line. - Sentiment
When saving the Excel file, how do I keep the existing sheet formats? - Benzoate
If anyone reads this and wonders how to overwrite an existing sheet with the same name instead of renaming the new one: add the line writer.sheets = dict((ws.title, ws) for ws in book.worksheets) after writer.book = book - Rampart
@Stefano Fedele can you do the same update of an existing Excel file using 'xlsxwriter' instead of 'openpyxl'? - Yarndyed
Hi @amanb, I am reading a workbook which has some data in one sheet. When I try the above code I am able to write the df to a new sheet, but the existing sheets get deleted. Code: from openpyxl import load_workbook; book = load_workbook('P:\TestResultSheet.xlsx'); writer = pandas.ExcelWriter('P:\TestResultSheet.xlsx', engine = 'openpyxl'); result_df1.to_excel(writer,sheet_name ='Sheet2'); writer.save(); writer.close(). This code deleted my existing sheet Sheet1 and its data and wrote the dataframe to Sheet2. Please help me understand what I am missing here. - Harmonica
I have tried it and it does not work. I got an error on the line book = load_workbook(path) saying: KeyError: "There is no item named '[Content_Types].xml' in the archive" - Daniels
Thank you for your answer. I am new to Python and have to finish some tasks on the go while learning. Thank you very much. - Lordan
I was facing this issue and was also wondering whether we could append a sheet using xlsxwriter instead of openpyxl, and apparently the answer is no. It is written in the documentation of xlsxwriter: xlsxwriter.readthedocs.io/tutorial01.html#tutorial1 - Marsden
Hi @Stefano Fedele, I tried your solution on Google Colab. Instead of giving the full path I gave ExcelFile.xlsx, and I tried both with and without the r prefix. The code worked, but then the file was corrupted. I don't know why. - Cristalcristate
Sorry for the late reply, Andrea. That "r" only matters for Windows-style paths: it marks a raw string, so the backslashes are not treated as escape characters. - Paganini
@Rampart, your addition is helpful, but this just writes the new sheet on top of the old one instead of replacing it. This means that remnants of the old sheet can remain (if it has a different shape, for example). Better to remove the old sheet entirely if it exists. After book = load_workbook(path), do try: book.remove(book[sheet_name]); except: pass - Amazement
Is there a way to do this purely with ExcelWriter? I am struggling to mix and match the two, having already set up table creation with ExcelWriter, and the API for this is completely different in openpyxl. - Hermie
L
53

For creating a new file

import numpy as np
import pandas as pd

x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)
with pd.ExcelWriter('sample.xlsx') as writer:
    df1.to_excel(writer, sheet_name='x1')

For appending to the file, use the argument mode='a' in pd.ExcelWriter.

x2 = np.random.randn(100, 2)
df2 = pd.DataFrame(x2)
with pd.ExcelWriter('sample.xlsx', engine='openpyxl', mode='a') as writer:  
    df2.to_excel(writer, sheet_name='x2')

The default is mode='w'. See the documentation.
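
A minimal end-to-end sketch of the two steps above, assuming pandas with openpyxl installed. The temporary directory and the file name demo.xlsx are placeholders chosen for illustration, not part of the original answer; the last lines reopen the workbook to list its sheets.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Placeholder location for the demo; any writable .xlsx path would do.
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")

df1 = pd.DataFrame(np.random.randn(5, 2))
df2 = pd.DataFrame(np.random.randn(5, 2))

# mode='w' (the default) creates the file, replacing any existing one.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    df1.to_excel(writer, sheet_name="x1")

# mode='a' reopens the existing workbook and adds a sheet to it.
with pd.ExcelWriter(path, engine="openpyxl", mode="a") as writer:
    df2.to_excel(writer, sheet_name="x2")

# Reopen the file to confirm that both sheets are present.
with pd.ExcelFile(path) as xlsx:
    sheet_names = list(xlsx.sheet_names)
print(sheet_names)
```

Running the second block with mode='w' instead would leave only the x2 sheet, which is the data loss described in the question.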

Lemke answered 1/9, 2020 at 17:3 Comment(3)
This solution did not overwrite previous sheets for me. - Burgage
I think this is better than the accepted answer, because .book is not public API. - Landwaiter
Make sure to install 'openpyxl' using pip; see the following link to install it. - Forceful
B
19

In the example you shared, you are loading the existing file into book and setting writer.book to book. In the line writer.sheets = dict((ws.title, ws) for ws in book.worksheets) you iterate over each sheet in the workbook as ws. The sheet title is ws.title, so you are creating a dictionary of {sheet_title: sheet} key-value pairs. This dictionary is then assigned to writer.sheets. Essentially, these steps just load the existing data from 'Masterfile.xlsx' and populate your writer with it.
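
To make the dictionary step concrete, here is a small sketch that builds the same mapping on an in-memory openpyxl workbook. The sheet names x1 and x2 are stand-ins for whatever 'Masterfile.xlsx' contains; no file is read or written.

```python
from openpyxl import Workbook

# Build a small in-memory workbook; a new Workbook starts with one sheet.
book = Workbook()
book.active.title = "x1"
book.create_sheet("x2")

# book.worksheets is a list of worksheet objects; ws.title is each sheet's name.
sheets = dict((ws.title, ws) for ws in book.worksheets)

print(list(sheets))                # the sheet titles, used as dictionary keys
print(sheets["x2"] is book["x2"])  # each value is the worksheet object itself
```

The same mapping can be written as a dict comprehension: {ws.title: ws for ws in book.worksheets}.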

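Now let's say you already have a file with x1 and x2 as sheets. The writer has to be opened on the existing workbook rather than on a fresh one; one way is append mode (mode='a', available with the openpyxl engine), after which you can add x3 and x4 like this:

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"
# mode='a' keeps the existing sheets; without it the file is recreated and x1/x2 are lost
with pd.ExcelWriter(path, engine='openpyxl', mode='a') as writer:
    df3.to_excel(writer, sheet_name='x3', index=False)
    df4.to_excel(writer, sheet_name='x4', index=False)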

That should do what you are looking for.

Bottommost answered 21/2, 2017 at 15:19 Comment(2)
Any idea why, when I try this, I get: ValueError: No Excel writer 'Sales Leads Calculations.xlsx'? - Lizalizabeth
This is erasing the pre-existing sheets. - Sentiment
C
15

A simple example of writing several DataFrames to an Excel file at once, and of adding a sheet to an Excel file that has already been written (and closed).

The first time you write to the Excel file (writing "df1" and "df2" to "1st_sheet" and "2nd_sheet"):

import pandas as pd 
from openpyxl import load_workbook

df1 = pd.DataFrame([[1],[1]], columns=['a'])
df2 = pd.DataFrame([[2],[2]], columns=['b'])
df3 = pd.DataFrame([[3],[3]], columns=['c'])

excel_dir = "my/excel/dir"

with pd.ExcelWriter(excel_dir, engine='xlsxwriter') as writer:
    df1.to_excel(writer, sheet_name='1st_sheet')
    df2.to_excel(writer, sheet_name='2nd_sheet')
    # no explicit writer.save() needed: leaving the with block saves and closes the file

After the file has been closed, you may want to "append" data to the same Excel file on another sheet, say "df3" to a sheet named "3rd_sheet":

book = load_workbook(excel_dir)
with pd.ExcelWriter(excel_dir, engine='openpyxl') as writer:
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)    

    ## Your dataframe to append. 
    df3.to_excel(writer, sheet_name='3rd_sheet')
    # leaving the with block saves the file, so no writer.save() call is needed

Note that the Excel format must not be xls; use xlsx instead.

Chaunce answered 6/10, 2017 at 6:19 Comment(1)
I don't see what this answer adds. In fact, repeated use of a context manager like this will involve a lot more I/O. - Eleanor
E
12

Every time you want to save a Pandas DataFrame to an Excel file, you can call this function:

import os
import pandas as pd

def save_excel_sheet(df, filepath, sheetname, index=False):
    # Create file if it does not exist
    if not os.path.exists(filepath):
        df.to_excel(filepath, sheet_name=sheetname, index=index)

    # Otherwise, add a sheet. Overwrite if there exists one with the same name.
    else:
        with pd.ExcelWriter(filepath, engine='openpyxl', if_sheet_exists='replace', mode='a') as writer:
            df.to_excel(writer, sheet_name=sheetname, index=index)
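
A short sketch of what if_sheet_exists='replace' does in append mode, assuming pandas 1.3 or later (where this parameter exists) with openpyxl installed. The temporary path and the sheet name data are illustrative only.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")  # placeholder path

old = pd.DataFrame({"a": [1, 2, 3]})
new = pd.DataFrame({"b": [10, 20]})

# First write creates the file with a single sheet named "data".
old.to_excel(path, sheet_name="data", index=False, engine="openpyxl")

# Appending a sheet with the same name: 'replace' drops the old sheet first,
# so no rows of the old, longer table are left behind.
with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                    if_sheet_exists="replace") as writer:
    new.to_excel(writer, sheet_name="data", index=False)

result = pd.read_excel(path, sheet_name=None)  # dict of all sheets
print(list(result))
print(result["data"])
```

Without if_sheet_exists, appending a sheet whose name is already taken raises an error (the default is 'error'); 'new' writes the data under a different, automatically chosen sheet name instead.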
Entrammel answered 12/10, 2021 at 20:31 Comment(0)
E
8

I would strongly recommend you work directly with openpyxl since it now supports Pandas DataFrames.

This allows you to concentrate on the relevant Excel and Pandas code.

Eleanor answered 21/2, 2017 at 16:28 Comment(2)
It would be really helpful if you could add a few more "Pandas" examples similar to this. - Winfredwinfrey
I don't do a lot of work with Pandas myself, so I can't really provide that many examples, but I would welcome improvements to the documentation. - Eleanor
C
5

You can do it without ExcelWriter, using the tools in openpyxl. This can make styling the new sheet (fonts, for example) much easier via openpyxl.styles.

import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

#Location of original excel sheet
fileLocation =r'C:\workspace\data.xlsx'

#Location of new file which can be the same as original file
writeLocation=r'C:\workspace\dataNew.xlsx'

data = {'Name':['Tom','Paul','Jeremy'],'Age':[32,43,34],'Salary':[20000,34000,32000]}

#The dataframe you want to add
df = pd.DataFrame(data)

#Load existing sheet as it is
book = load_workbook(fileLocation)
#create a new sheet
sheet = book.create_sheet("Sheet Name")

#Load dataframe into new sheet
for row in dataframe_to_rows(df, index=False, header=True):
    sheet.append(row)

#Save the modified excel at desired location    
book.save(writeLocation)
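
As a sketch of the styling point, the header row written by dataframe_to_rows can be made bold with openpyxl.styles.Font. The workbook here is created in memory rather than loaded from disk, and the sheet name and data are illustrative.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.styles import Font
from openpyxl.utils.dataframe import dataframe_to_rows

df = pd.DataFrame({"Name": ["Tom", "Paul"], "Age": [32, 43]})

book = Workbook()  # stands in for load_workbook(fileLocation)
sheet = book.create_sheet("Sheet Name")

for row in dataframe_to_rows(df, index=False, header=True):
    sheet.append(row)

# Row 1 holds the column headers; give each header cell a bold font.
for cell in sheet[1]:
    cell.font = Font(bold=True)

print([cell.value for cell in sheet[1]])
```

Calling book.save(...) afterwards, as in the answer, would persist the styled sheet together with the rest of the workbook.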
Carillon answered 6/5, 2020 at 11:7 Comment(2)
This is a nice solution, however I'm not sure if it is an implication as well. Do you mean you can't do it with ExcelWriter or you just don't need to? - Pliam
You can do it with ExcelWriter, but I find it easier just using openpyxl. - Carillon
U
2

You can read the existing sheets you are interested in, for example 'x1' and 'x2', into memory and write them back before adding new sheets (keep in mind that sheets in a file and sheets in memory are two different things; if you don't read them, they will be lost). This approach writes with 'xlsxwriter' only; openpyxl is not used to modify the workbook.

import pandas as pd
import numpy as np

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

# begin <== read selected sheets and write them back
df1 = pd.read_excel(path, sheet_name='x1', index_col=0) # or sheet_name=0
df2 = pd.read_excel(path, sheet_name='x2', index_col=0) # or sheet_name=1
writer = pd.ExcelWriter(path, engine='xlsxwriter')
df1.to_excel(writer, sheet_name='x1')
df2.to_excel(writer, sheet_name='x2')
# end ==>

# now create more new sheets
x3 = np.random.randn(100, 2)
df3 = pd.DataFrame(x3)

x4 = np.random.randn(100, 2)
df4 = pd.DataFrame(x4)

df3.to_excel(writer, sheet_name='x3')
df4.to_excel(writer, sheet_name='x4')
writer.close()  # close() also saves the file

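If you want to preserve all existing sheets, you can replace the above code between begin and end with:

# read all existing sheets into memory first, then write them back
# (the reads happen before the writer is created, since opening the writer may truncate the file)
xlsx = pd.ExcelFile(path)
frames = {sheet: xlsx.parse(sheet_name=sheet, index_col=0) for sheet in xlsx.sheet_names}
xlsx.close()
writer = pd.ExcelWriter(path, engine='xlsxwriter')
for sheet, df in frames.items():
    df.to_excel(writer, sheet_name=sheet)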
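
For the read-everything-back variant, pd.read_excel with sheet_name=None returns all sheets as a dictionary in one call. The sketch below is an illustration under stated assumptions rather than the answer's own code: it uses a temporary placeholder file, leaves the writer engine at the pandas default instead of forcing 'xlsxwriter', and needs openpyxl for reading.

```python
import os
import tempfile

import numpy as np
import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")  # placeholder path

# Set up an existing file with two sheets.
with pd.ExcelWriter(path) as writer:
    pd.DataFrame(np.random.randn(3, 2)).to_excel(writer, sheet_name="x1")
    pd.DataFrame(np.random.randn(3, 2)).to_excel(writer, sheet_name="x2")

# Read every sheet into memory before the file is opened for writing again.
existing = pd.read_excel(path, sheet_name=None, index_col=0)

# Rewrite the file: old sheets first, then the new one.
with pd.ExcelWriter(path) as writer:
    for name, frame in existing.items():
        frame.to_excel(writer, sheet_name=name)
    pd.DataFrame(np.random.randn(3, 2)).to_excel(writer, sheet_name="x3")

sheet_names = list(pd.read_excel(path, sheet_name=None))
print(sheet_names)
```

Only cell values pass through the DataFrames, so formatting, formulas and charts in the original sheets are not preserved by this round trip.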
Unpeopled answered 24/7, 2019 at 21:37 Comment(0)
B
1

Another fairly simple way to go about this is to make a method like this:

import logging

import pandas as pd
from openpyxl import load_workbook

def _write_frame_to_new_sheet(path_to_file=None, sheet_name='sheet', data_frame=None):
    book = None
    try:
        book = load_workbook(path_to_file)
    except Exception:
        logging.debug('Creating new workbook at %s', path_to_file)
    with pd.ExcelWriter(path_to_file, engine='openpyxl') as writer:
        if book is not None:
            writer.book = book
        data_frame.to_excel(writer, sheet_name=sheet_name, index=False)

The idea here is to load the workbook at path_to_file if it exists and then append the data_frame as a new sheet named sheet_name. If the workbook does not exist, it is created. It seems that neither openpyxl nor xlsxwriter appends on its own, so as in the example by @Stefano above, you really have to load and then rewrite to append.

Breeching answered 5/9, 2019 at 11:19 Comment(1)
import logging import pandas as pd import openpyxl def write_frame_to_new_sheet(path_to_file=None, sheet_name='sheet', data_frame=None): book = None try: book = openpyxl.load_workbook(path_to_file) except Exception: logging.debug('Creating new workbook at %s', path_to_file) with pd.ExcelWriter(path_to_file, engine='openpyxl') as writer: if book is not None: writer.book = book data_frame.to_excel(writer, sheet_name, index=False) - Cull
R
0
#This program reads an Excel workbook, extracts only the URL domain names, and writes them to the same workbook in a different sheet.
#Developer - Nilesh K
import pandas as pd
from openpyxl import load_workbook #for writing to the existing workbook

df = pd.read_excel("urlsearch_test.xlsx")

#You can use the below for the relative path.
# r"C:\Users\xyz\Desktop\Python\

l = [] #To make a list in for loop

#begin
#Loop over the sheet and extract the domain part of each URL found in the TEXT column. You can have your own logic here.
for index, row in df.iterrows():
    try:
        text = row['TEXT'] #string to read (named text to avoid shadowing the built-in str)
        str_pos = text.index('http') #index position of http
        str_pos1 = text.index('/', text.index('/')+2) #position of the third / (the first one after the //)
        str_op = text[str_pos:str_pos1] #substring holding the scheme and domain name
        l.append(str_op) #append the domain name to the list

    #Error handling to skip rows without a URL and continue.
    except ValueError:
        print('Error!')
print(l)
l = list(dict.fromkeys(l)) #Keep distinct values, you can comment this line to get all the values
df1 = pd.DataFrame(l,columns=['URL']) #Create dataframe using the list
#end

#Write using openpyxl so it can be written to same workbook
book = load_workbook('urlsearch_test.xlsx')
writer = pd.ExcelWriter('urlsearch_test.xlsx',engine = 'openpyxl')
writer.book = book
df1.to_excel(writer,sheet_name = 'Sheet3')
writer.close() #close() saves the workbook

#The below can be used to write to a different workbook without using openpyxl
#df1.to_excel(r"C:\Users\xyz\Desktop\Python\urlsearch1_test.xlsx",index=False,sheet_name='sheet1')
Ruelu answered 6/8, 2019 at 12:22 Comment(2)
I'm not following how this is related to the question, except that it's about Excel. - Trudeau
I was trying to find a complete solution for reading and writing to an existing workbook but was unable to find one. Here I found a hint on how to write to an existing workbook, so I thought of giving a complete solution for my problem. Hope it's clear. - Ruelu
S
0

If you want to add an empty sheet:

xw = pd.ExcelWriter(file_path, engine='xlsxwriter')
pd.DataFrame().to_excel(xw, sheet_name='sheet11')

To get a handle on the empty sheet afterwards:

sheet = xw.sheets['sheet11']
Sorcery answered 19/4, 2022 at 9:7 Comment(0)
U
0

The following solution worked for me:

    import os
    import pandas as pd

    # dataframe to save
    df = pd.DataFrame({"A":[1,2], "B":[3,4]})
    
    # path where you want to save
    path = "./..../..../.../test.xlsx"
    
    # if an Excel file named test.xlsx already exists, append the data as sheet_2
    # (append mode needs the openpyxl engine; xlsxwriter cannot append)
    if os.path.isfile(path):  
      with pd.ExcelWriter(path, engine='openpyxl', mode='a') as writer:
        df.to_excel(writer, sheet_name= "sheet_2")
    else:
    # otherwise create the Excel file and write to sheet_1
      with pd.ExcelWriter(path) as writer:
        df.to_excel(writer, sheet_name= "sheet_1")

Now, if you want to write multiple dataframes on different sheets, simply add a loop and keep on changing the sheet_name.

Ungulate answered 26/7, 2022 at 18:55 Comment(2)
What is going on here? What's with the if statement checking if a path is a file when you've hard coded it? Please do not post code-only answers. - Dasilva
I have updated the code, you can have a look at it. - Ungulate
A
0

The easiest way to do this is to call Pandas' to_excel function, specifying a new sheet_name for the data you want to store in the existing Excel file that contains other sheets.

path = 'input/existing_file.xlsx'
df_new_data = pd.read_excel('input/new_data.xlsx')

df_new_data.to_excel(path, sheet_name='New Data', index=False)

UPDATE: I've just realized that doing this rewrites your existing Excel file, deleting the other sheets and leaving only the new one. So unfortunately, the only way to solve this is to use pd.ExcelWriter (with the openpyxl engine), as suggested in other answers.

Aftermath answered 24/4, 2023 at 12:10 Comment(0)
S
-1
import pandas as pd
import openpyxl

writer = pd.ExcelWriter('test.xlsx', engine='openpyxl')
data_df.to_excel(writer, sheet_name='sheet_name')
writer.close()  # close() also saves the file
Sharisharia answered 28/2, 2021 at 13:10 Comment(2)
Please don't post only code as an answer, but also provide an explanation of what your code does and how it solves the problem of the question. Answers with an explanation are usually more helpful and of better quality, and are more likely to attract upvotes. - Mitten
@RanMarciano’s feedback is always important. But it’s especially important here, where there are nine other answers, including an accepted answer with well over a hundred upvotes. Why is a new answer necessary? When might your approach be preferable to the existing answers? Do you take advantage of new APIs not available when the original answers were submitted? Without this information, it’s unlikely readers will understand when or why they should try this approach. Can you edit your answer? - Revival
