openpyxl cannot read Strict Open XML Spreadsheet format: UserWarning: File contains an invalid specification for Sheet1. This will be removed
Asked Answered
D

5

14

A few of my users (all of whom use Mac) have uploaded an Excel into my application, which then rejected it because the file appeared to be empty. After some debugging, I've determined that the file was saved in Strict Open XML Spreedsheet format, and that openpyxl (2.6.0) doesn't issue an error, but rather prints a warning to stderr.

To reproduce, open a file, add a few rows and save as Strict Open XML Spreedsheet (*.xlsx) format.

import openpyxl
with open('excel_open_strict.xlsx', 'rb') as f:
    workbook = openpyxl.load_workbook(filename=f)

This will print the following warning, but will not throw any exception:

UserWarning: File contains an invalid specification for Sheet1. This will be removed

Furthermore, the workbook appears to have no sheets:

assert workbook.get_sheet_names() == []

I've now had three Mac users experience this issue. It seems like Mac will sometimes default to using this Strict Open XML Spreedsheet format. If this is a normal case, then openpyxl should be able to handle it. Otherwise, it would be great if openpyxl would just throw an exception. As a workaround, it seems I can do the following:

import openpyxl
with open('excel_open_strict.xlsx', 'rb') as f:
    workbook = openpyxl.load_workbook(filename=f)

if not workbook.get_sheet_names():
    raise Exception("The Excel was saved in an incorrect format")
Delisadelisle answered 8/7, 2020 at 17:55 Comment(0)
B
7

I had similar problems with XLSX files created using the R library openxlsx. A sample error message from a simple python program to open the file and retrieve a single value from sheet Crops:

Warning (from warnings module): File "C:\Python38\lib\site-packages\openpyxl\reader\workbook.py", line 88 warn(msg) UserWarning: File contains an invalid specification for Crops. This will be removed

My first, very clumsy solution:

  1. Open with Excel
  2. Save the file as *.xls, which triggered a warning about compatibility.
  3. Re-save as *.xlsx

My second solution works if you only need to read the file:

  1. Impose a read-only restriction: wb = load_workbook(filename = 'CAF_LTAR_crops_out_0.3.xlsx', read_only=True)

The broad lesson seems to be that the XLSX file specification is not uniformly (correctly?) implemented across programming languages.

Barrington answered 17/7, 2020 at 19:2 Comment(0)
B
4

I am working with a Windows PC and I had the same Problem with openpyxl. I got an excel template that was saved as Strict Open XML Spreadsheet (*.xlsx). I tried to fill out the template but I got always a fault message for each work sheet as below and when I tried to print the array with all worksheet names was empty [].

UserWarning: File contains an invalid specification for Sheetname. This will be removed

Solution

I saved the file as Excel Workbook (*.xlsx) and not as Strict Open XML Spreadsheet (*.xlsx). After that I had no fault message, the array included all Worksheets and I could fill out the template with openpyxl.

Braiding answered 24/5, 2022 at 7:23 Comment(0)
P
3

Maybe you can use calamine, A pure Rust Excel/OpenDocument SpreadSheets file reader: rust on metal sheets.

In python, it is python-calamine.

Install

pip install python-calamine
# Install python_calamine will be same.

e.g.

With pandas:

import pandas as pd
from python_calamine.pandas import pandas_monkeypatch; pandas_monkeypatch()

df = pd.read_excel("path/to/file.xlsx", "sheet-name", engine="calamine")

It can load Strict Open XML Spreadsheet format and Excel Workbook format both.

Piggy answered 21/2, 2024 at 4:33 Comment(0)
F
0

I started having this issue due to the windows update that came out a couple days ago. It appears that I had to add some code for the time being to convert the xls file to and xlsx file first instead of opening the file and saving it directly.

Ferric answered 11/10, 2023 at 16:20 Comment(1)
This does not provide an answer to the question. Once you have sufficient reputation you will be able to comment on any post; instead, provide answers that don't require clarification from the asker. - From ReviewAriellearies
D
0

I would agree with the prior answer to use calamine. However, the calamine engine is now included with pandas as of version 2.2.

Install

pip install pandas[excel]

In code add

import pandas as pd

pd.read_excel("path_to_file.xlsb", engine="calamine")
Depositor answered 21/8, 2024 at 20:24 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.