Since Python 3.9 (bpo-39380) the default encoding of ftplib.FTP has been set to utf-8, and the FTP constructor offers an encoding parameter.
ftp = ftplib.FTP(..., encoding='utf-8')
However, this change is probably an incompatible and bogus solution: it broke a lot of existing code, and nlst and similar commands have since been failing with UnicodeDecodeError as soon as any file on a server has a name that is not properly utf-8 encoded.
(You can't robustly read / inspect an unknown directory with the new default. And the OPTS UTF8 ON FTP command has not become established and is usually a no-op on Unix servers with 8-bit file names. It effectively only allows 8-bit transfer instead of 7-bit ASCII for commands, which happens / happened anyway without the command.)
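To illustrate the failure mode without a server, here is a minimal sketch that decodes one raw file name the way the new default would. The byte string raw_name is a made-up example (a name stored in latin-1 on the server), not something taken from ftplib itself:

```python
# Hypothetical raw file name as a server might send it:
# "cafe" with an accented e, stored as the single latin-1 byte 0xE9.
raw_name = b"caf\xe9.txt"

try:
    raw_name.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError as exc:
    # 0xE9 announces a multi-byte utf-8 sequence, but "." is not a
    # valid continuation byte, so strict decoding fails.
    decoded_ok = False
    print("utf-8 decoding failed:", exc.reason)
```

A single name like this in a directory listing is enough to make a strict utf-8 decode of the whole response fail.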
The formerly fixed latin-1 encoding attribute of FTP, which is applied to the whole command traffic, not just filenames, was probably never meant to be a default encoding as such. No RFC ever recommended or preferred latin-1. Its purpose was rather to enable an unrestricted 8-bit API for the 8-bit FTP protocol. On Python 3 this takes the form of pseudo 8-bit strings: normal, comfortable (unicode) strings that use only the lowest 256 characters.
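The reason latin-1 works as a transparent 8-bit channel is that it maps each byte value 0..255 to the code point with the same number. A small sketch that checks this for every possible byte (the variable names are only illustrative):

```python
# Every possible byte value, 0x00 .. 0xFF.
all_bytes = bytes(range(256))

# Decoding with latin-1 never fails: byte n becomes code point n.
pseudo = all_bytes.decode("latin-1")
code_points = [ord(ch) for ch in pseudo]

# Encoding back with latin-1 restores the original bytes exactly.
round_tripped = pseudo.encode("latin-1")
```

So a str produced this way carries the raw bytes unchanged, and any interpretation of them can be postponed.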
For robust FTP it's currently necessary (in Py3.9+) to switch the internal encoding of ftplib.FTP back to latin-1 / pseudo 8-bit strings, do any inspections on the pseudo 8-bit strings, and do the (optionally error-tolerant) encoding / decoding outside, like this:
ftp.encoding = 'latin-1' # or ftp = FTP(..., encoding='latin-1')
...
fn_ftp_utf8 = fn.encode('utf-8').decode('latin-1')
...
fn = fn_ftp_utf8.encode('latin-1').decode('utf-8', 'backslashreplace')
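The two conversions above could be wrapped in small helpers. This is only a sketch: to_ftp_name and from_ftp_name are hypothetical names, not part of ftplib, and it assumes the server stores names as utf-8 wherever they are valid:

```python
def to_ftp_name(name: str) -> str:
    """Turn a real unicode name into the pseudo 8-bit string to send
    through an FTP object whose encoding is 'latin-1'."""
    return name.encode("utf-8").decode("latin-1")


def from_ftp_name(pseudo: str, errors: str = "backslashreplace") -> str:
    """Turn a pseudo 8-bit string received from the server back into a
    readable name; bytes that are not valid utf-8 are escaped."""
    return pseudo.encode("latin-1").decode("utf-8", errors)


# A valid utf-8 name survives the round trip unchanged.
good = from_ftp_name(to_ftp_name("na\u00efve.txt"))

# A name with a stray latin-1 byte does not raise; the byte is escaped.
bad = from_ftp_name("caf\xe9.txt")
print(good, bad)
```

For operations on the server (RETR, DELE, ...) it is safer to keep using the original pseudo 8-bit string from the listing, since the escaped form is only meant for display.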
In the future there may be / should be at least an errors parameter for ftplib.FTP in addition to the encoding parameter, in order to allow simple use cases for the automatic encoding mode.