Python IMAP: how to parse multipart mail content

3

7

A mail can contain different blocks like:

--0016e68deb06b58acf04897c624e
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
content_1
...

--0016e68deb06b58acf04897c624e
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
content_2
... and so on

How can I get the content of each block with Python?
And how can I get the properties of each block (content type, etc.)?

Victualler answered 4/11, 2010 at 8:34 Comment(0)
12

For parsing emails I have used the Message.walk() method like this:

if msg.is_multipart():
    for part in msg.walk():
        ...

For the content you can use part.get_payload(), which returns a list of sub-messages for a multipart part and the payload string otherwise. For the content type there is part.get_content_type().
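
To make that concrete, here is a minimal sketch of walking a message and collecting the content type and decoded text of each block. The sample message RAW and the helper name list_parts are made up for illustration; only walk(), is_multipart(), get_content_type(), get_content_charset() and get_payload(decode=True) come from the email package.

```python
import email

# Hypothetical sample message, built inline so the sketch runs on its own.
RAW = "\n".join([
    "MIME-Version: 1.0",
    "Subject: demo",
    'Content-Type: multipart/alternative; boundary="XYZ"',
    "",
    "--XYZ",
    "Content-Type: text/plain; charset=UTF-8",
    "Content-Transfer-Encoding: quoted-printable",
    "",
    "caf=C3=A9",
    "--XYZ",
    "Content-Type: text/html; charset=UTF-8",
    "Content-Transfer-Encoding: quoted-printable",
    "",
    "<p>caf=C3=A9</p>",
    "--XYZ--",
    "",
])


def list_parts(msg):
    """Return (content_type, decoded_text) for every non-container part."""
    found = []
    for part in msg.walk():
        if part.is_multipart():
            # A multipart container only holds other parts.
            continue
        raw = part.get_payload(decode=True)  # bytes, transfer encoding undone
        charset = part.get_content_charset() or "utf-8"  # fallback is an assumption
        found.append((part.get_content_type(), raw.decode(charset, errors="replace")))
    return found


parts = list_parts(email.message_from_string(RAW))
for ctype, text in parts:
    print(ctype, repr(text))
```

Falling back to UTF-8 when a part declares no charset is a guess that suits most modern mail; adjust it if your mailbox contains older encodings.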

You will find the documentation here: http://docs.python.org/library/email.message.html

You can also try the iterators that ship with the email module (email.iterators).
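
As a rough illustration of the iterator approach, email.iterators.typed_subpart_iterator() yields only the parts whose type matches. The sample message and the decoded_text helper below are hypothetical, and the UTF-8 fallback is an assumption.

```python
import email
from email.iterators import typed_subpart_iterator

# Hypothetical sample message with one plain-text and one HTML part.
RAW = "\n".join([
    "MIME-Version: 1.0",
    'Content-Type: multipart/alternative; boundary="XYZ"',
    "",
    "--XYZ",
    "Content-Type: text/plain",
    "",
    "hello in plain text",
    "--XYZ",
    "Content-Type: text/html",
    "",
    "<p>hello in html</p>",
    "--XYZ--",
    "",
])

msg = email.message_from_string(RAW)


def decoded_text(part):
    """Decode one leaf part to str (UTF-8 fallback is an assumption)."""
    charset = part.get_content_charset() or "utf-8"
    return part.get_payload(decode=True).decode(charset, errors="replace")


# Select parts by main type and subtype instead of testing each one by hand.
plain_texts = [decoded_text(p) for p in typed_subpart_iterator(msg, "text", "plain")]
html_texts = [decoded_text(p) for p in typed_subpart_iterator(msg, "text", "html")]

print(plain_texts)
print(html_texts)
```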

Syzran answered 4/11, 2010 at 9:40 Comment(3)
Thanks! I hadn't read that get_payload() returns a list of messages! - Victualler
Could you please suggest how to access an Excel attachment? When I use part.get_payload() the data comes back as bytes, and I have no idea how to save the Excel (.xlsx) file to a variable. - Applique
If you use get_payload(decode=True), the library automatically decodes quoted-printable and base64 content. From the documentation: "Optional decode is a flag indicating whether the payload should be decoded or not, according to the Content-Transfer-Encoding header. When True and the message is not a multipart, the payload will be decoded if this header’s value is quoted-printable or base64" - docs.python.org/3.4/library/email.message.html - Bachman
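
For the attachment case raised in the comments, a sketch along these lines may help. It builds a throwaway message with a fake .xlsx payload (the bytes, subject and file name are invented), parses it back, and keeps each attachment's decoded bytes in a dict keyed by file name.

```python
import email
from email.message import EmailMessage

# Build a hypothetical message so the sketch runs without a mail server.
outgoing = EmailMessage()
outgoing["Subject"] = "report"
outgoing.set_content("See attachment.")
fake_xlsx = b"PK\x03\x04 not a real workbook"  # placeholder bytes, not a valid file
outgoing.add_attachment(
    fake_xlsx,
    maintype="application",
    subtype="vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    filename="report.xlsx",
)

# Parse it the same way you would parse bytes fetched over IMAP.
msg = email.message_from_bytes(outgoing.as_bytes())

attachments = {}
for part in msg.walk():
    filename = part.get_filename()  # None for parts that carry no file name
    if filename:
        # decode=True undoes the base64 transfer encoding and returns bytes.
        attachments[filename] = part.get_payload(decode=True)

print({name: len(data) for name, data in attachments.items()})
```

The bytes can stay in the variable as they are; wrapping them in io.BytesIO gives a file-like object for libraries that expect an open file.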
2

http://docs.python.org/library/email.html

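A very simple example (msg_as_str contains the raw message text you got from the IMAP server; in Python 3 imaplib returns bytes, so use email.message_from_bytes() there instead):

import email
# Parse the raw message text into a Message object
msg = email.message_from_string(msg_as_str)
print(msg["Subject"])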
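
For completeness, here is a hedged sketch of the step before parsing: pulling the raw bytes out of an imaplib fetch response. The helper name fetch_message is made up, and the host, credentials and mailbox in the commented usage are placeholders.

```python
import email


def fetch_message(conn, num):
    """Fetch message `num` over an imaplib-style connection and parse it.

    `conn` is assumed to behave like imaplib.IMAP4: fetch() returns a
    (status, data) pair where data[0] is an (envelope, raw_bytes) tuple.
    """
    status, data = conn.fetch(num, "(RFC822)")
    if status != "OK":
        raise RuntimeError("fetch failed with status %r" % (status,))
    raw_bytes = data[0][1]
    return email.message_from_bytes(raw_bytes)


# Typical usage against a real server (all values are placeholders):
#
#   import imaplib
#   conn = imaplib.IMAP4_SSL("imap.example.com")
#   conn.login("user", "password")
#   conn.select("INBOX")
#   msg = fetch_message(conn, "1")
#   print(msg["Subject"])
```
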
Farmstead answered 4/11, 2010 at 9:37 Comment(0)
1

I wrote this code. You can use it for parsing multipart content:

body = ''
# walk() visits the message itself and every nested part at any depth,
# so no hand-written nested loops are needed
for part in mime_msg.walk():
    if part.is_multipart():
        continue  # container parts have no payload of their own
    payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
    charset = part.get_content_charset() or 'utf-8'
    body = body + payload.decode(charset, errors='replace') + '\n'

And if you want plain text, pass body through html2text.HTML2Text().handle() (from the third-party html2text package).
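
If you also need to know how deeply a part is nested, one option is a small recursive generator that carries the depth along. This is only a sketch: iter_with_depth and the NESTED sample message are invented for illustration and are not part of the email package.

```python
import email

# Hypothetical message: multipart/mixed holding a multipart/alternative
# (plain + html) and one binary attachment.
NESTED = "\n".join([
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="OUTER"',
    "",
    "--OUTER",
    'Content-Type: multipart/alternative; boundary="INNER"',
    "",
    "--INNER",
    "Content-Type: text/plain",
    "",
    "plain text",
    "--INNER",
    "Content-Type: text/html",
    "",
    "<p>html</p>",
    "--INNER--",
    "--OUTER",
    "Content-Type: application/octet-stream",
    "",
    "data",
    "--OUTER--",
    "",
])


def iter_with_depth(msg, depth=0):
    """Yield (depth, part) for msg and all nested parts, depth-first."""
    yield depth, msg
    if msg.is_multipart():
        # For a multipart message get_payload() is the list of sub-parts.
        for sub in msg.get_payload():
            yield from iter_with_depth(sub, depth + 1)


structure = [
    (depth, part.get_content_type())
    for depth, part in iter_with_depth(email.message_from_string(NESTED))
]
for depth, ctype in structure:
    print("    " * depth + ctype)
```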

Trunkfish answered 29/11, 2018 at 9:21 Comment(1)
stackoverflow.com/users/10538706/user10538706 how would you know the depth of the part in the above code? - Barm
