Encode MIMEText as quoted printables

Python ships with a quite functional MIME library called email.mime.

What I want to achieve is to get a MIME part containing plain UTF-8 text encoded as quoted-printable and not as base64. Although all the functionality is available in the library, I did not manage to use it:

Example:

import email.mime.text, email.encoders
m=email.mime.text.MIMEText(u'This is the text containing ünicöde', _charset='utf-8')
m.as_string()
# => Leads to a base64-encoded message, as base64 is the default.

email.encoders.encode_quopri(m)
m.as_string()
# => Leads to a strange message

The last command leads to a strange message:

Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Content-Transfer-Encoding: quoted-printable

VGhpcyBpcyB0aGUgdGV4dCBjb250YWluaW5nIMO8bmljw7ZkZQ=3D=3D

This is obviously not encoded as quoted-printable: the payload is still base64, with only its padding escaped. The duplicate Content-Transfer-Encoding header is strange at least (if not illegal).

How can I get my text encoded as quoted printables in the mime-message?

Thrush answered 18/2, 2013 at 14:50 Comment(3)
See also https://mcmap.net/q/904388/-how-do-i-use-python-3-2-email-module-to-send-unicode-messages-encoded-in-utf-8-with-quoted-printable -- the question is Python 3, but I have used it in Python 2 as well. (Heeled)
For Python 3.6+ see also now #66040215 (Heeled)
Similar to Python send email with "quoted-printable" transfer-encoding and "utf-8" content-encoding (Shannashannah)

Okay, I found one solution. It is quite hacky, but at least it points in a direction: MIMEText assumes base64 and I don't know how to change that. For this reason I use MIMENonMultipart instead:

import email.mime, email.mime.nonmultipart, email.charset
m = email.mime.nonmultipart.MIMENonMultipart('text', 'plain', charset='utf-8')

# Construct a new charset which uses quoted-printable (base64 is the default)
cs = email.charset.Charset('utf-8')
cs.body_encoding = email.charset.QP

# Now set the content using the new charset
m.set_payload(u'This is the text containing ünicöde', charset=cs)

Now the message seems to be encoded correctly:

Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

This is the text containing =C3=BCnic=C3=B6de

One can even construct a new class which hides the complexity:

class MIMEUTF8QPText(email.mime.nonmultipart.MIMENonMultipart):
    def __init__(self, payload):
        email.mime.nonmultipart.MIMENonMultipart.__init__(self, 'text', 'plain',
                                                          charset='utf-8')

        # Charset whose body encoding is quoted-printable instead of base64
        utf8qp = email.charset.Charset('utf-8')
        utf8qp.body_encoding = email.charset.QP

        self.set_payload(payload, charset=utf8qp)

And use it like this:

m = MIMEUTF8QPText(u'This is the text containing ünicöde')
m.as_string()
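
A quick way to confirm that the part really is quoted-printable is to serialize it, parse it back, and decode the body. The following is only a sketch for Python 3, where the same Charset approach appears to work as well; the helper names build_qp_part and roundtrip are made up for illustration and are not part of the email package.

```python
import email
import email.charset
import email.mime.nonmultipart


def build_qp_part(text):
    """Build a text/plain part whose UTF-8 body is quoted-printable encoded."""
    part = email.mime.nonmultipart.MIMENonMultipart('text', 'plain',
                                                    charset='utf-8')
    cs = email.charset.Charset('utf-8')
    cs.body_encoding = email.charset.QP  # the default for utf-8 is base64
    part.set_payload(text, charset=cs)
    return part


def roundtrip(part):
    """Serialize the part, parse it again, and return the decoded body text."""
    parsed = email.message_from_string(part.as_string())
    return parsed.get_payload(decode=True).decode('utf-8')


part = build_qp_part('This is the text containing ünicöde')
print(part['Content-Transfer-Encoding'])
print(roundtrip(part))
```

If the transfer encoding were wrong or declared twice, the decoded text would most likely not match the input, so this makes a cheap sanity check.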
Thrush answered 18/2, 2013 at 15:16 Comment(0)

In Python 3 you do not need your hack:

import email.charset
import email.mime.text

# Construct a new charset which uses Quoted Printables (base64 is default)
cs = email.charset.Charset('utf-8')
cs.body_encoding = email.charset.QP

m = email.mime.text.MIMEText(u'This is the text containing ünicöde', 'plain', _charset=cs)

print(m.as_string())
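
On Python 3.6+ (mentioned in a comment on the question), the newer EmailMessage API may be another option. This is a minimal sketch, assuming the standard library's EmailMessage.set_content with its cte keyword; the variable names are only illustrative, and it is an alternative to the Charset approach above rather than a replacement for it.

```python
from email.message import EmailMessage

text = 'This is the text containing ünicöde'

msg = EmailMessage()
# Request quoted-printable explicitly instead of letting the
# content manager pick a transfer encoding on its own.
msg.set_content(text, subtype='plain', charset='utf-8', cte='quoted-printable')

print(msg.as_string())
```

Note that set_content seems to normalize the body so that it ends with a newline, which means get_content() may return the text with a trailing line break.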
Mangan answered 2/8, 2019 at 8:16 Comment(1)
To be fair, the hack was needed in Python 2. Your answer only works with Python 3. So basically you could say the original issue can be solved by switching to Python 3. (Directional)

Adapted from issue 1525919 and tested on Python 2.7:

from email.Message import Message
from email.Charset import Charset, QP

text = "\xc3\xa1 = \xc3\xa9"
msg = Message()

charset = Charset('utf-8')
charset.header_encoding = QP
charset.body_encoding = QP

msg.set_charset(charset)
msg.set_payload(msg._charset.body_encode(text))

print msg.as_string()

will give you:

MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

=C3=A1 =3D =C3=A9

Also see this response from a Python committer.

Nagpur answered 28/5, 2013 at 12:57 Comment(1)
I missed at first that the input to body_encode must already be utf-8 encoded, and that it doesn't do the utf-8 encoding for you. Noting this here in case it saves others the pain of the same misunderstanding. (Domination)
