How to copy worksheet from one workbook to another one using openpyxl?
Asked Answered
M

12

25

I have a large amount of EXCEL files (i.e. 200) I would like to copy one specific worksheet from one workbook to another one. I have done some investigations and I couldn't find a way of doing it with Openpyxl

This is the code I have developed so far

def copy_sheet_to_different_EXCEL(path_EXCEL_read,Sheet_name_to_copy,path_EXCEL_Save,Sheet_new_name):
''' Function used to copy one EXCEL sheet into another file.
    
    def path_EXCEL_read,Sheet_name_to_copy,path_EXCEL_Save,Sheet_new_name
    
Input data:
    1.) path_EXCEL_read: the location of the EXCEL file along with the name where the information is going to be saved
    2.) Sheet_name_to_copy= The name of the EXCEL sheet to copy
    3.) path_EXCEL_Save: The path of the EXCEL file where the sheet is going to be copied
    3.) Sheet_new_name: The name of the new EXCEL sheet
    
Output data:
    1.) Status= If 0, everything went OK. If 1, one error occurred.

Version History:
1.0 (2017-02-20): Initial version.

'''
status=0

if(path_EXCEL_read.endswith('.xls')==1): 
    print('ERROR - EXCEL xls file format is not supported by openpyxl. Please, convert the file to an XLSX format')
    status=1
    return status
    
try:
   wb = openpyxl.load_workbook(path_EXCEL_read,read_only=True)
except:
    print('ERROR - EXCEL file does not exist in the following location:\n  {0}'.format(path_EXCEL_read))
    status=1
    return status

Sheet_names=wb.get_sheet_names()    # We copare against the sheet name we would like to cpy

if ((Sheet_name_to_copy in Sheet_names)==0):
    print('ERROR - EXCEL sheet does not exist'.format(Sheet_name_to_copy))
    status=1
    return status   

# We checking if the destination file exists

if (os.path.exists(path_EXCEL_Save)==1):
    #If true, file exist so we open it
    
    if(path_EXCEL_Save.endswith('.xls')==1): 
        print('ERROR - Destination EXCEL xls file format is not supported by openpyxl. Please, convert the file to an XLSX format')
        status=1
    return status
    
    try:
        wdestiny = openpyxl.load_workbook(path_EXCEL_Save)
    except:
        print('ERROR - Destination EXCEL file does not exist in the following location:\n  {0}'.format(path_EXCEL_read))
        status=1
    return status

    #we check if the destination sheet exists. If so, we will delete it
    
    destination_list_sheets = wdestiny.get_sheet_names()
    
    if((Sheet_new_name in destination_list_sheets) ==True):
        print('WARNING - Sheet "{0}" exists in: {1}. It will be deleted!'.format(Sheet_new_name,path_EXCEL_Save))
        wdestiny.remove_sheet(Sheet_new_name) 

else:
    wdestiny=openpyxl.Workbook()
# We copy the Excel sheet
    
try:
    sheet_to_copy = wb.get_sheet_by_name(Sheet_name_to_copy) 
    target = wdestiny.copy_worksheet(sheet_to_copy)
    target.title=Sheet_new_name
except:
    print('ERROR - Could not copy the EXCEL sheet. Check the file')
    status=1
    return status

try:
    wdestiny.save(path_EXCEL_Save)
except:
    print('ERROR - Could not save the EXCEL sheet. Check the file permissions')
    status=1
    return status

#Program finishes
return status
Mitchell answered 20/2, 2017 at 12:0 Comment(1)
I ended up using this answer using xlwingsTrachyte
C
26

I had the same problem. For me style, format, and layout were very important. Moreover, I did not want to copy formulas but only the value (of the formulas). After a lot of trail, error, and stackoverflow I came up with the following functions. It may look a bit intimidating but the code copies a sheet from one Excel file to another (possibly existing file) while preserving:

  1. font and color of text
  2. filled color of cells
  3. merged cells
  4. comment and hyperlinks
  5. format of the cell value
  6. the width of every row and column
  7. whether or not row and column are hidden
  8. frozen rows

It is useful when you want to gather sheets from many workbooks and bind them into one workbook. I copied most attributes but there might be a few more. In that case you can use this script as a jumping off point to add more.

###############
## Copy a sheet with style, format, layout, ect. from one Excel file to another Excel file
## Please add the ..path\\+\\file..  and  ..sheet_name.. according to your desire.

import openpyxl
from copy import copy

def copy_sheet(source_sheet, target_sheet):
    copy_cells(source_sheet, target_sheet)  # copy all the cel values and styles
    copy_sheet_attributes(source_sheet, target_sheet)


def copy_sheet_attributes(source_sheet, target_sheet):
    target_sheet.sheet_format = copy(source_sheet.sheet_format)
    target_sheet.sheet_properties = copy(source_sheet.sheet_properties)
    target_sheet.merged_cells = copy(source_sheet.merged_cells)
    target_sheet.page_margins = copy(source_sheet.page_margins)
    target_sheet.freeze_panes = copy(source_sheet.freeze_panes)

    # set row dimensions
    # So you cannot copy the row_dimensions attribute. Does not work (because of meta data in the attribute I think). So we copy every row's row_dimensions. That seems to work.
    for rn in range(len(source_sheet.row_dimensions)):
        target_sheet.row_dimensions[rn] = copy(source_sheet.row_dimensions[rn])

    if source_sheet.sheet_format.defaultColWidth is None:
        print('Unable to copy default column wide')
    else:
        target_sheet.sheet_format.defaultColWidth = copy(source_sheet.sheet_format.defaultColWidth)

    # set specific column width and hidden property
    # we cannot copy the entire column_dimensions attribute so we copy selected attributes
    for key, value in source_sheet.column_dimensions.items():
        target_sheet.column_dimensions[key].min = copy(source_sheet.column_dimensions[key].min)   # Excel actually groups multiple columns under 1 key. Use the min max attribute to also group the columns in the targetSheet
        target_sheet.column_dimensions[key].max = copy(source_sheet.column_dimensions[key].max)  # https://mcmap.net/q/538206/-openpyxl-can-not-read-consecutive-hidden-columns discussed the issue. Note that this is also the case for the width, not onl;y the hidden property
        target_sheet.column_dimensions[key].width = copy(source_sheet.column_dimensions[key].width) # set width for every column
        target_sheet.column_dimensions[key].hidden = copy(source_sheet.column_dimensions[key].hidden)


def copy_cells(source_sheet, target_sheet):
    for (row, col), source_cell in source_sheet._cells.items():
        target_cell = target_sheet.cell(column=col, row=row)

        target_cell._value = source_cell._value
        target_cell.data_type = source_cell.data_type

        if source_cell.has_style:
            target_cell.font = copy(source_cell.font)
            target_cell.border = copy(source_cell.border)
            target_cell.fill = copy(source_cell.fill)
            target_cell.number_format = copy(source_cell.number_format)
            target_cell.protection = copy(source_cell.protection)
            target_cell.alignment = copy(source_cell.alignment)

        if source_cell.hyperlink:
            target_cell._hyperlink = copy(source_cell.hyperlink)

        if source_cell.comment:
            target_cell.comment = copy(source_cell.comment)


wb_target = openpyxl.Workbook()
target_sheet = wb_target.create_sheet(..sheet_name..)

wb_source = openpyxl.load_workbook(..path\\+\\file_name.., data_only=True)
source_sheet = wb_source[..sheet_name..]

copy_sheet(source_sheet, target_sheet)

if 'Sheet' in wb_target.sheetnames:  # remove default sheet
    wb_target.remove(wb_target['Sheet'])

wb_target.save('out.xlsx')
Clematis answered 16/8, 2021 at 9:25 Comment(8)
Works. If only it also handled images...Iguanodon
Update: The only thing you have to do for this magic to handle images is: add line target_sheet._images = copy(source_sheet._images) after target_sheet.freeze_panes = copy(source_sheet.freeze_panes) line in copy_sheet_attributes. Splendid! BTW, this is the second case where I wonder on the obscure reasons (see, I restrained myself and hasn't said dumb) openpyxl devs are hiding features before users. For the first case look here.Iguanodon
Thanks! It works well in excel, but when I open the document in LibreOffice I get the error: "The data could not be loaded completely because the maximum number of rows per sheet was exceeded." - but then it proceeds to load the output file just fine so ¯_(ツ)_/¯Gammon
Thank you!!!!!!!Halftone
To copy auto_filter parameters you need to add target_sheet.auto_filter = copy(source_sheet.auto_filter) to copy_sheet_attributesGershon
To optimize memory using better to use source_sheet._cells.iteritems() rather than source_sheet._cells.items().Gershon
Any idea on how to do this with a "read_only=True" workbook as source?Hubbard
You copied sheet_format, but why are you copying sheet_format.defaultColWidth again? Does it not work when copying sheet_format?Pupillary
L
12

i found a way playing around with it

import openpyxl

xl1 = openpyxl.load_workbook('workbook1.xlsx')
# sheet you want to copy
s = openpyxl.load_workbook('workbook2.xlsx').active
s._parent = xl1
xl1._add_sheet(s)
xl1.save('some_path/name.xlsx')
Leandro answered 21/10, 2021 at 13:53 Comment(6)
This works for me as my expectation...thank you so muchCowell
Great. it's working like a charm.Cowart
Didn't work for me. I get write_stylesheet; xf.alignment = wb._alignments[style.alignmentId]; IndexError: list index out of range on save.Iguanodon
not a bad try but my copied sheet comes out pretty wierd, formatting etc seems random. I am instead going with the solution below from OscarDrubbing
Yeah, great idea to abuse a private method… this will break styles in most cases.Darnel
Update: Ckarlie Clark now killed that workaround with raise ValueError("You cannot add worksheets from another workbook."). Maybe working a usable copy function supporting styles, and/or supporting shapes would have been a more constructive attitude. I'm done with OpenPyxl now.Trek
D
8

You cannot use copy_worksheet() to copy between workbooks because it depends on global constants that may vary between workbooks. The only safe and reliable way to proceed is to go row-by-row and cell-by-cell.

You might want to read the discussions about this feature

Darnel answered 20/2, 2017 at 13:10 Comment(0)
B
5

For speed I am using data_only and read_only attributes when opening my workbooks. Also iter_rows() is really fast, too.

@Oscar's excellent answer needs some changes to support ReadOnlyWorksheet and EmptyCell

# Copy a sheet with style, format, layout, ect. from one Excel file to another Excel file
# Please add the ..path\\+\\file..  and  ..sheet_name.. according to your desire.
import openpyxl
from copy import copy


def copy_sheet(source_sheet, target_sheet):
    copy_cells(source_sheet, target_sheet)  # copy all the cel values and styles
    copy_sheet_attributes(source_sheet, target_sheet)


def copy_sheet_attributes(source_sheet, target_sheet):
    if isinstance(source_sheet, openpyxl.worksheet._read_only.ReadOnlyWorksheet):
        return
    target_sheet.sheet_format = copy(source_sheet.sheet_format)
    target_sheet.sheet_properties = copy(source_sheet.sheet_properties)
    target_sheet.merged_cells = copy(source_sheet.merged_cells)
    target_sheet.page_margins = copy(source_sheet.page_margins)
    target_sheet.freeze_panes = copy(source_sheet.freeze_panes)

    # set row dimensions
    # So you cannot copy the row_dimensions attribute. Does not work (because of meta data in the attribute I think). So we copy every row's row_dimensions. That seems to work.
    for rn in range(len(source_sheet.row_dimensions)):
        target_sheet.row_dimensions[rn] = copy(source_sheet.row_dimensions[rn])

    if source_sheet.sheet_format.defaultColWidth is None:
        print('Unable to copy default column wide')
    else:
        target_sheet.sheet_format.defaultColWidth = copy(source_sheet.sheet_format.defaultColWidth)

    # set specific column width and hidden property
    # we cannot copy the entire column_dimensions attribute so we copy selected attributes
    for key, value in source_sheet.column_dimensions.items():
        target_sheet.column_dimensions[key].min = copy(source_sheet.column_dimensions[key].min)   # Excel actually groups multiple columns under 1 key. Use the min max attribute to also group the columns in the targetSheet
        target_sheet.column_dimensions[key].max = copy(source_sheet.column_dimensions[key].max)  # https://mcmap.net/q/538206/-openpyxl-can-not-read-consecutive-hidden-columns discussed the issue. Note that this is also the case for the width, not onl;y the hidden property
        target_sheet.column_dimensions[key].width = copy(source_sheet.column_dimensions[key].width) # set width for every column
        target_sheet.column_dimensions[key].hidden = copy(source_sheet.column_dimensions[key].hidden)


def copy_cells(source_sheet, target_sheet):
    for r, row in enumerate(source_sheet.iter_rows()):
        for c, cell in enumerate(row):
            source_cell = cell
            if isinstance(source_cell, openpyxl.cell.read_only.EmptyCell):
                continue
            target_cell = target_sheet.cell(column=c+1, row=r+1)

            target_cell._value = source_cell._value
            target_cell.data_type = source_cell.data_type

            if source_cell.has_style:
                target_cell.font = copy(source_cell.font)
                target_cell.border = copy(source_cell.border)
                target_cell.fill = copy(source_cell.fill)
                target_cell.number_format = copy(source_cell.number_format)
                target_cell.protection = copy(source_cell.protection)
                target_cell.alignment = copy(source_cell.alignment)

            if not isinstance(source_cell, openpyxl.cell.ReadOnlyCell) and source_cell.hyperlink:
                target_cell._hyperlink = copy(source_cell.hyperlink)

            if not isinstance(source_cell, openpyxl.cell.ReadOnlyCell) and source_cell.comment:
                target_cell.comment = copy(source_cell.comment)

With a usage something like

    wb = Workbook()
    
    wb_source = load_workbook(filename, data_only=True, read_only=True)
    for sheetname in wb_source.sheetnames:
        source_sheet = wb_source[sheetname]
        ws = wb.create_sheet("Orig_" + sheetname)
        copy_sheet(source_sheet, ws)

    wb.save(new_filename)
Bengali answered 24/9, 2021 at 1:25 Comment(0)
S
4

My workaround goes like this:

You have a template file let's say it's "template.xlsx". You open it, make changes to it as needed, save it as a new file, close the file. Repeat as needed. Just make sure to keep a copy of the original template while testing/messing around.

Slump answered 21/11, 2019 at 9:4 Comment(0)
H
3

I had a similar requirement to collate data from multiple workbooks into one workbook. As there are no inbuilt methods available in openpyxl.

I created the below script to do the job for me.

Note: In my usecase all worbooks contain data in same format.

from openpyxl import load_workbook
import os


# The below method is used to read data from an active worksheet and store it in memory.
def reader(file):
    global path
    abs_file = os.path.join(path, file)
    wb_sheet = load_workbook(abs_file).active
    rows = []
    # min_row is set to 2, to ignore the first row which contains the headers
    for row in wb_sheet.iter_rows(min_row=2):
        row_data = []
        for cell in row:
            row_data.append(cell.value)
        # custom column data I am adding, not needed for typical use cases
        row_data.append(file[17:-6])
        # Creating a list of lists, where each list contain a typical row's data
        rows.append(row_data)
    return rows


if __name__ == '__main__':
    # Folder in which my source excel sheets are present
    path = r'C:\Users\tom\Desktop\Qt'
    # To get the list of excel files
    files = os.listdir(path)
    for file in files:
        rows = reader(file)
        # below mentioned file name should be already created
        book = load_workbook('new.xlsx')
        sheet = book.active
        for row in rows:
            sheet.append(row)
        book.save('new.xlsx')
Herald answered 12/7, 2017 at 12:34 Comment(0)
G
3

I've just found this question. A good workaround, as mentioned here, could consists in modifying the original wb in memory and then saving it with another name. For example:

import openpyxl

# your starting wb with 2 Sheets: Sheet1 and Sheet2
wb = openpyxl.load_workbook('old.xlsx')

sheets = wb.sheetnames # ['Sheet1', 'Sheet2']

for s in sheets:

    if s != 'Sheet2':
        sheet_name = wb.get_sheet_by_name(s)
        wb.remove_sheet(sheet_name)

# your final wb with just Sheet1
wb.save('new.xlsx')
Groovy answered 15/9, 2017 at 13:3 Comment(3)
Missing cell formats, if cell have muliple formats for text(Richtext)Jesseniajessey
This answer just helps with removing the old worksheet, not copying over the new oneEnder
This answer just modify the original and creates a copy with another name/path. What if I don't want to update the original?Girl
A
0

A workaround I use is saving the current sheet as a pandas data frame and loading it to the excel workbook you need

Aksel answered 8/4, 2018 at 8:6 Comment(2)
Can you expand this answer with the source code you used?Gammon
you lose formatting and allExamine
H
0

By using openpyxl - not success with border copy. In my case - were successed using xlwings. It open excel in OS, copy tab to other excel, save, rename and close.

import openpyxl, os
import xlwings as xw

def copy_tab(file_old, tab_source, file_new, tab_destination):
    delete_tab = False
    if not os.path.exists(file_new):
        wb_target = openpyxl.Workbook()
        wb_target.save(file_new)
        delete_tab = True

    wb = xw.Book(file_old)
    app = wb.app
    app.visible = False
    sht = wb.sheets[tab_source]
    new_wb = xw.Book(file_new)
    new_app = new_wb.app
    new_app.visible = False
    sht.api.Copy(None, After=new_wb.sheets[-1].api)
    if delete_tab:
        new_wb.sheets['Sheet'].delete()
    wb.close()
    for sheet in new_wb.sheets:
        if tab_destination in sheet.name:
            sheet.delete()
    new_wb.sheets[tab_source].name = tab_destination
    new_wb.save()
    new_wb.close()

if __name__ == "__main__":
    file_old = r"C:\file_old.xlsx"
    file_new = r"C:\file_new.xlsx"

    copy_tab(file_old, "sheet_old", file_new, "sheet_new")
Hetti answered 26/5, 2023 at 7:37 Comment(0)
T
0

Using deepcopy instead of looping over source_sheet.row_dimensions in Oscar's solution will get rid of the LibreOffice error: "The data could not be loaded completely because the maximum number of rows per sheet was exceeded."

Traumatize answered 16/8, 2023 at 15:18 Comment(1)
for anyone wondering: target_sheet.row_dimensions = deepcopy(source_sheet.row_dimensions) instead of for rn in range(len(source_sheet.row_dimensions)):(...)Inspirational
Y
0

I'd add to @Oscar's answer:

target_sheet.print_options = copy(source_sheet.print_options)
target_sheet.protection = copy(source_sheet.protection)
target_sheet.sheet_state = copy(source_sheet.sheet_state)
target_sheet.views = copy(source_sheet.views)
target_sheet.data_validations = copy(source_sheet.data_validations)
Yamashita answered 10/1 at 16:42 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Hyrup
M
-1

It actually can be done in a very simple way ! It just need 3 steps :

  1. Open a file using load_workbook

    wb = load_workbook('File_1.xlsx')

  2. Select a sheet you want to copy

    ws = wb.active

  3. use name of the new file to save the file

    wb.save('New_file.xlsx')

This code will save sheet of first file (File_1.xlsx) to the secound file (New_file.xlsx).

Myrmidon answered 13/1, 2023 at 18:7 Comment(2)
This just overwrites New_file.xlsx. Doesn't actually copy the worksheet to another workbook.Beckford
This opens Excel file 'File_1.xlsx' then saves it as 'New_file.xlsx' so it is exactly the same as the original 'File_1.xlsx', same sheets same data. It's not copying a sheet from one existing workbook to another existing workbook which is what is being asked.Methoxychlor

© 2022 - 2024 — McMap. All rights reserved.