UnicodeDecodeError: 'utf-8' when debugging Python files in PyCharm Community
B

7

7

Current conclusion:

I converted the file's encoding in sequence: utf-8 -> utf-8 big -> ansi -> utf-8, reopening the file after each conversion.

After observing for a while, the error has not recurred.


When I debug .py files in PyCharm, the same file sometimes raises a UnicodeDecodeError and sometimes runs normally. My operating system is Windows 10, and my PyCharm version is 2020.3.3 Community Edition.

The error is as follows:

Traceback (most recent call last):
  File "D:\Program Files\JetBrains\PyCharm Community Edition 2020.3.3\plugins\python-ce\helpers\pydev\_pydevd_bundle\pydevd_comm.py", line 301, in _on_run
    r = r.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 1022-1023: unexpected end of data

I tried adding the following lines to the top of the file, but I still get the error sometimes. How can I solve it?

#!/usr/bin/env Python
# coding=utf-8

I also found a suggestion to save the file as a UTF-8 document with Notepad. I tried it, but the error still occurs sometimes.

Barbi answered 21/4, 2021 at 6:4 Comment(2)
The default encoding of a Python module (or file) is UTF-8, you can verify the PyCharm encoding settings. Meaning it should not be necessary to specify the encoding at the top of the file (the last 2 lines in the question were used historically but have mostly become unnecessary). You should just change the encoding from within the IDE. This question is not about code, but setting the encoding of your files from within the IDE.Carpus
@Carpus Thanks. In the IDE encoding settings, the project encoding and the properties files encoding were not UTF-8; I have now changed them.Barbi
C
2

There isn't one single answer to the problem as it is described in the question. A number of issues can cause the indicated error, so it's best to address the several possible factors in the context of the PyCharm IDE.

  1. Every Python .py file (or any other file, for that matter) has an encoding. The default encoding of a .py source code file is UTF-8. Beginners frequently face this problem, so let's pinpoint the relevant quotes from the official documentation (to save unnecessary reading time):

    Python’s Unicode Support

    The default encoding for Python source code is UTF-8, so you can simply include a Unicode character in a string literal.

    This means in most circumstances you shouldn't need the encoding string, see Python Source Code Encodings - PEP 263. Current practice is having the source files encoded by default in UTF-8 and omitting the encoding string at the top of the module (this is also more concise).

  2. The PyCharm IDE has a number of encoding configurations that can successively be refined, going from global, to project, to file path. By default, everything should be set to UTF-8, especially the source code. See the official PyCharm documentation Configure file encoding settings.

  3. The exception to the above should be if you are processing external data files, in which case your source code should still be kept as UTF-8 and the data file opened with whatever encoding it requires. Most questions about the UnicodeDecodeError are about specifying the right file encoding when using the open() function to open some data file (they are not about the encoding of the source files where you are writing your code).

  4. When your source files cause this error, a frequent cause is copy-pasting from, or opening, a source code file that is not encoded in UTF-8. (The copy-paste case is especially unexpected: you copy from a file that isn't encoded in UTF-8, and the IDE doesn't automatically convert what you paste into the editor.) So you should narrow down which source code file has an encoding other than UTF-8 and convert it.
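Points 3 and 4 can be sketched in a few lines of Python. This is only an illustration, not part of PyCharm: the helper names `find_non_utf8_files` and `read_data_file` are made up for this example. The first helper walks a project folder and reports every .py file that fails to decode as UTF-8; the second reads an external data file with whatever encoding it was actually written in.

```python
from pathlib import Path


def find_non_utf8_files(root, pattern="*.py"):
    """Return (path, byte_offset) pairs for files under root that are not valid UTF-8."""
    problems = []
    for path in sorted(Path(root).rglob(pattern)):
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first byte that could not be decoded
            problems.append((path, exc.start))
    return problems


def read_data_file(path, encoding):
    """Read an external data file with the encoding it was actually written in."""
    with open(path, encoding=encoding) as handle:
        return handle.read()
```

Calling `find_non_utf8_files(".")` from the project root lists the candidates to convert. Note that a file can decode as UTF-8 without error and still have been written in another encoding (pure ASCII content, for example), so an empty result narrows the search rather than proving anything.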

We don't have access to your project files, but to me the error message reads as the debugger trying to open a user source code file that isn't encoded in UTF-8, contrary to the IDE configuration and module encoding.

File "D:\Program Files\JetBrains\PyCharm Community Edition 2020.3.3\plugins\python-ce\helpers\pydev\_pydevd_bundle\pydevd_comm.py"

Carpus answered 21/4, 2021 at 7:38 Comment(4)
This file comes with PyCharm; I will try saving it in UTF-8 encoding.Barbi
@Barbi no, I don't think the problem is a file that came with the PyCharm installation. It's likely one of the files you wrote (or are trying to execute).Carpus
I modified the encoding settings and observed for a while, thank you.Barbi
Don't try to change any files of your PyCharm installation; if you do, you risk breaking the IDE. The exceptions are a few XML configuration files.Carpus
S
2

I faced the same issue and finally fixed it by changing the case of the python executable used in my interpreter settings, as explained here. Long story short, PyCharm sometimes tries to execute the symlink in the venv directory using Python (with an uppercase "P") instead of python. After changing this I could debug again.
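One way to check for this mismatch outside the IDE is to compare the configured interpreter path with the name as it is actually spelled on disk. The sketch below is only an illustration: `on_disk_name` is a hypothetical helper, and the path you pass to it is whatever your interpreter settings show.

```python
import os


def on_disk_name(path):
    """Return the final component of path as it is spelled on disk, or None if absent."""
    directory, name = os.path.split(os.path.abspath(path))
    entries = os.listdir(directory)
    if name in entries:
        return name  # exact match, including case
    for entry in entries:
        if entry.lower() == name.lower():
            return entry  # same name, different case
    return None
```

If the configured path ends in `Python` but `on_disk_name` returns `python`, the interpreter setting differs from the file on disk only by case, which is the situation described above.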

Scoter answered 16/3, 2022 at 18:58 Comment(2)
This is a little different from me, and my OS is windows.Barbi
I don't know how the Windows version works, but you may try to follow the same steps: check if the selected interpreter in your path appears in uppercase and change it if soScoter
A
1

My Answer

Hi, I solved the problem. I followed this blog, but in my case the interpreter was already python (lower case), so I tried changing it

from : {your project path}/venv/bin/python
to : {your project path}/venv/bin/python3.8

I don't know why, but it works for me.

Allotrope answered 29/9, 2022 at 17:36 Comment(0)
A
1

The problem was resolved by removing all breakpoints in the project and setting them again at the necessary locations.

This problem is caused by encoding. When you debug the project, PyCharm tries to decode files.

I don't know the detailed behavior of PyCharm's debug function. When debugging starts, are all files in the project decoded, or only the files that have breakpoints? I don't know.

But because my project has many files, I removed all breakpoints from my project files.

That resolved the problem.

Of course, all files in my project are encoded in UTF-8.

Algar answered 24/1 at 16:20 Comment(2)
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.Malnourished
This does not provide an answer to the question. Once you have sufficient reputation you will be able to comment on any post; instead, provide answers that don't require clarification from the asker. - From ReviewElgon
S
0

Look at Stefan Ukena's answer on this thread:
https://youtrack.jetbrains.com/issue/PY-14497#focus=Comments-27-5243196.0-0

Quote in case the link dies:

You might need to change your Python Interpreter in the Pycharm > settings. In my case (using pipenv), it was pointing to /Library/.../bin/Python with an uppercase P. Opening the folder and checking, I found that the file or symlink was actually python with a lowercase p. Changing it from .../Python to .../python in the Pycharm settings fixed this problem. (I had to restart Pycharm afterwards.)

It helped me too, but only when I switched to pipenv instead of the usual venv. I changed Python to python and the debugger worked, but I'm still getting errors/warnings:

OSError: [Errno 9] Bad file descriptor 

But it works anyway. Without the debugger, it works as expected and without the error above.

Sst answered 11/4, 2022 at 7:59 Comment(0)
B
0

If you are using PyCharm on Windows, the following may be helpful. If you have configured script parameters in your settings, try replacing:

--epochs
5
--batch_size
4
--device
0
--train_path
data/train_novel.pkl
--save_model_path
./model/novel

with

--epochs
5
--batch_size
4
--device
0
--train_path
data\train_novel.pkl
--save_model_path
.\model\novel

to see if it helps!
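If the script builds its paths with pathlib, the separator used in the parameters matters less, because pathlib accepts forward slashes on Windows and renders them with backslashes. A minimal sketch, using `PureWindowsPath` so that it behaves the same on any operating system; the variable names simply mirror the parameters above, and whether separators are related to the debugger error is not established in this thread.

```python
from pathlib import PureWindowsPath

# Forward-slash input is accepted and normalised to Windows separators.
train_path = PureWindowsPath("data/train_novel.pkl")
save_model_path = PureWindowsPath("./model/novel")

print(train_path)       # data\train_novel.pkl
print(save_model_path)  # model\novel
```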

Bloodroot answered 30/5 at 3:3 Comment(0)
G
-2

Add 'ignore'. It worked.

r = r.decode('utf-8', 'ignore')

https://docs.python.org/3/howto/unicode.html
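To see what 'ignore' actually does, here is a minimal sketch that reproduces the same kind of error on a small scale. It assumes the bytes were cut off at a 1024-byte buffer boundary; position 1022-1023 in the traceback is consistent with that, but nothing in this thread confirms how the debugger reads its data. The sample text and chunk size are made up for the illustration.

```python
import codecs

text = "a" * 1022 + "日"      # 1022 ASCII bytes followed by one 3-byte character
data = text.encode("utf-8")   # 1025 bytes in total
chunk = data[:1024]           # hypothetical 1024-byte read that cuts the last character

try:
    chunk.decode("utf-8")
except UnicodeDecodeError as exc:
    error_reason = exc.reason            # 'unexpected end of data'
    error_span = (exc.start, exc.end)    # (1022, 1024), i.e. bytes 1022-1023

# 'ignore' silently drops the incomplete character instead of raising.
ignored = chunk.decode("utf-8", "ignore")

# An incremental decoder keeps the incomplete bytes until the rest arrives.
decoder = codecs.getincrementaldecoder("utf-8")()
restored = decoder.decode(chunk) + decoder.decode(data[1024:], final=True)
```

With 'ignore' the error disappears, but so does the character that straddled the boundary: `ignored` contains only the 1022 ASCII characters, while `restored` equals the original text.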

Gingras answered 21/4, 2021 at 6:17 Comment(1)
The 10th line of my code is commented out with #. Is it related to the \n at the end of the line?Barbi
