Openpyxl.utils.exceptions.IllegalcharacterError
Asked Answered
N

7

24

I have the following python code to write processed words into excel file. The words are about 7729

From openpyxl import *
book=Workbook ()
sheet=book.active
sheet.title="test"
for x in range (7729):
    sheet.cell (row=1,column=x+1).value=x
book.save ('test.xlsx')

This is the what the code I used looks like, but when I run it, it gives me an error that says

openpyxl.utils.exceptions.IllegalCharacterError

This is my first time using this module, I would appreciate any kind of help.

Nepotism answered 15/4, 2018 at 17:38 Comment(9)
what is your .value= content ?Sobriquet
I am sorry I forget to right the valueNepotism
can you share the outcome of this print (sys.getdefaultencoding()) ?Sobriquet
the encoding is utf-8Nepotism
please add your python version as well it will help people to reproduce the issue.Sobriquet
I am using python 3.6.Nepotism
exception openpyxl.utils.exceptions.IllegalCharacterError Reason : The data submitted which cannot be used directly in Excel files. It must be removed or escaped. so the issue is some how with your data i.e ur x valueSobriquet
The data is a string, specifically a bunch of english letters.Nepotism
that must be the issue then, you need to either escape data or try to forcefully encode it with utf-8 and ignore errors if you can live without unprintable chars in your dataSobriquet
S
1

Try this : This code works for me .

from openpyxl import *
book=Workbook ()
sheet=book.active
sheet.title="test"
x = 0
with open("temp.txt") as myfile :
    text = myfile.readline()
    while text !="":
            sheet.cell (row=1,column=x+1).value=str(text).encode("ascii",errors="ignore")
            x+=1
            text = myfile.readline()

book.save ('test.xlsx')
Sobriquet answered 15/4, 2018 at 18:37 Comment(6)
In the real program x is a string what should I do for that, should I remove the int function and use the aboveNepotism
Thank you so much but it seems this din't work. If this helps when I do like sheet ['A1']=1 or sheet ['A1']='hi' works.Nepotism
can i see your sample data please ? and they way you are reading that data from the source ?Sobriquet
'Kitaabota', 'seenaa', 'yesuus',and yes i am reading it from a source and the source is a txt file.Nepotism
can i see the code you are using to read the text file and exact one line for text file. that will help me to reproduce the issueSobriquet
Let us continue this discussion in chat.Nepotism
S
21

openpyxl comes with an illegal characters regular expression, ready for you to use. Presuming you're happy to simply remove these characters, you can do:

import re
from openpyxl.cell.cell import ILLEGAL_CHARACTERS_RE
from openpyxl import *

book=Workbook ()
sheet=book.active
sheet.title="test"
for x in range (7729):
   sheet.cell (row=1,column=x+1).value = ILLEGAL_CHARACTERS_RE.sub(r'',x)
book.save ('test.xlsx')

To speed it up, you could put the original cell value assignment inside a try/except and only run the re substitution when an openpyxl.utils.exceptions.IllegalCharacterError is caught.

Source: https://www.programmersought.com/article/43315246046/

Selfreproach answered 29/7, 2021 at 15:16 Comment(0)
V
9

I faced similar issue and found out that it is because of \xa1 character which is hex value of ascii 26 (SUB). Openpyxl is not allowing to write such characters (ascii code < 32). I tried xlsxwriter library without any issue it worte this character in xlsx file.

Vezza answered 20/7, 2020 at 15:22 Comment(0)
S
1

Try this : This code works for me .

from openpyxl import *
book=Workbook ()
sheet=book.active
sheet.title="test"
x = 0
with open("temp.txt") as myfile :
    text = myfile.readline()
    while text !="":
            sheet.cell (row=1,column=x+1).value=str(text).encode("ascii",errors="ignore")
            x+=1
            text = myfile.readline()

book.save ('test.xlsx')
Sobriquet answered 15/4, 2018 at 18:37 Comment(6)
In the real program x is a string what should I do for that, should I remove the int function and use the aboveNepotism
Thank you so much but it seems this din't work. If this helps when I do like sheet ['A1']=1 or sheet ['A1']='hi' works.Nepotism
can i see your sample data please ? and they way you are reading that data from the source ?Sobriquet
'Kitaabota', 'seenaa', 'yesuus',and yes i am reading it from a source and the source is a txt file.Nepotism
can i see the code you are using to read the text file and exact one line for text file. that will help me to reproduce the issueSobriquet
Let us continue this discussion in chat.Nepotism
S
1
import xlsxwriter
df.to_excel('abc.xlsx', engine='xlsxwriter')
Sectorial answered 8/5 at 6:57 Comment(1)
Thank you for contributing to the Stack Overflow community. This may be a correct answer, but it’d be really useful to provide additional explanation of your code so developers can understand your reasoning. This is especially useful for new developers who aren’t as familiar with the syntax or struggling to understand the concepts. Would you kindly edit your answer to include additional details for the benefit of the community?Volturno
L
0

You missed to add the value for cell sheet.cell (row=1,column=x+1).value =

Try like this

from openpyxl import *
book = Workbook ()
sheet = book.active
sheet.title = "test"
for x in range (7):
    sheet.cell (row=1,column=x+1).value = "Hello"
book.save ('test.xlsx')
Lickerish answered 15/4, 2018 at 17:42 Comment(0)
F
0

The best easy solution is to install Xlwriter.

pip install XlsxWriter

Feudality answered 6/7, 2023 at 9:21 Comment(0)
A
0

If I understand your question correctly, you have a string x of len(x)=7729 and while filling parts of the string into cells you encounter the error: openpyxl.utils.exceptions.IllegalCharacterError.

Befor running your string through the loop you could replace all problematic substrings with an empty string ('') in the string, using the ILLEGAL_CHARACTERS_RE pattern from https://stackoverflow.com/a/68578582.

import re    
from openpyxl.cell.cell import ILLEGAL_CHARACTERS_RE
    
x = 'string containing substrings like \x0c or \x0b...'
clean_x= re.sub(ILLEGAL_CHARACTERS_RE, '', x)
print(clean_x)
...'string containing substrings like  or ...'

The resulting clean_x you can just use in your loop as before.

Cleaning the string beforehand in one line, is likely much faster, then doing it in every loop step (doing it once vs doing it 7729 times).

Alternatively, if you need to retain a representation of the problematic substrings, you could use the dictionary (escape_xlsx_char) from this post How to handle the exception ‘IllegalCharacterError’ in openpyxl to get an exscape of the problematic substrings.

Andradite answered 2/7 at 14:43 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.