How to retrieve the whole message body using the Gmail API (Python)

5

12

I want to extract the whole message body of an email using the Gmail API. Right now I am using 'snippet', but I need the entire text. I searched and found that it has something to do with the payload, but I didn't understand how. Can someone show me an example? I am using the Gmail API via Python.

Kindergartner answered 31/5, 2018 at 17:54 Comment(0)
14

Same as noogui said, but I found that the snippet won't return the whole body.

The snippet is cut off at roughly 200 characters; for longer messages you can find the whole body under payload.body.data.

The catch is that it comes base64url-encoded, so you need to decode it :)

The following code will do the trick:

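import base64

# format="full" returns the parsed payload, including payload.body.data when present
# (msg_id replaces the original name "id", which shadows the Python builtin)
mail = service.users().messages().get(userId=user_id, id=msg_id, format="full").execute()

def parse_msg(msg):
    # body.data is base64url-encoded; fall back to the snippet when it is absent
    data = msg.get("payload", {}).get("body", {}).get("data")
    if data:
        return base64.urlsafe_b64decode(data.encode("ASCII")).decode("utf-8")
    return msg.get("snippet")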
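
The snippet fallback above only covers messages whose text sits directly in payload.body.data. In multipart messages that field is usually empty and the text lives in payload.parts, which can nest, and this may be why some readers still see a partial message. Below is a minimal sketch, assuming a message fetched with format="full". The helper names `decode_part_data`, `find_plain_text` and `full_body` are made up for illustration and are not part of the Gmail client library. It returns only text/plain content, so HTML-only messages fall back to the snippet.

```python
import base64


def decode_part_data(data):
    """Decode the base64url string found in a part's body.data."""
    raw = base64.urlsafe_b64decode(data.encode("ASCII"))
    # Assumption: the text is UTF-8; undecodable bytes are replaced, not raised
    return raw.decode("utf-8", errors="replace")


def find_plain_text(payload):
    """Return the first text/plain body in a payload tree, or None."""
    data = payload.get("body", {}).get("data")
    if payload.get("mimeType") == "text/plain" and data:
        return decode_part_data(data)
    # multipart/* containers keep their children in "parts"
    for part in payload.get("parts", []):
        text = find_plain_text(part)
        if text is not None:
            return text
    return None


def full_body(msg):
    """Prefer the text/plain part; fall back to the snippet if none exists."""
    text = find_plain_text(msg.get("payload", {}))
    return text if text is not None else msg.get("snippet")
```

Usage would look like `full_body(mail)`, where `mail` is the response of the `messages().get(..., format="full")` call above.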
Lifeguard answered 8/1, 2020 at 8:4 Comment(4)
Is there a way to get the plain text body? Not seeing it in the docs. - Cayes
@Cayes It's not written in the docs. You can check some cases and see that short mails are in snippet, while long ones are in payload.body.data. - Lifeguard
After so much back and forth, this worked for me. Thank you @OmerShacham. - Williemaewillies
This still gives a partial message. - Stammer
3

The Gmail API docs provide sample code that shows how to return the full message body as a message object structure. However, the code they provide doesn't work in Python 3. To do this in Python 3, change their email.message_from_string() to email.message_from_bytes(), because base64.urlsafe_b64decode() returns bytes rather than str there. The code below works fine for me on Python 3.7.4.

import base64
import email

message = gmail_conn.users().messages().get(userId=u_id, id=m_id, format='raw').execute()
msg_str = base64.urlsafe_b64decode(message['raw'].encode('ASCII'))
mime_msg = email.message_from_bytes(msg_str)

print(mime_msg)
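
Printing `mime_msg` shows the whole message, headers included. If only the body text is wanted, the standard library can pick it out. The sketch below assumes the same base64url `raw` field as above. `plain_text_from_raw` is a hypothetical helper name; `policy.default`, `get_body()` and `get_content()` are part of the standard `email` package.

```python
import base64
import email
from email import policy


def plain_text_from_raw(raw):
    """Decode the base64url 'raw' field and return the preferred text body."""
    msg_bytes = base64.urlsafe_b64decode(raw.encode("ASCII"))
    # policy.default yields EmailMessage objects, which provide get_body()
    mime_msg = email.message_from_bytes(msg_bytes, policy=policy.default)
    # Prefer text/plain, then text/html; returns None if neither is present
    body = mime_msg.get_body(preferencelist=("plain", "html"))
    # get_content() undoes the transfer encoding and decodes the charset
    return body.get_content() if body is not None else None
```

With the response from the call above, this would be used as `plain_text_from_raw(message['raw'])`.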
Amesace answered 24/2, 2020 at 21:45 Comment(0)
2

Use Users.messages.get. The docs include a Python 2 snippet for it; here it is adapted to Python 3:

import base64
import email
from googleapiclient import errors

def GetMessage(service, user_id, msg_id):
  """Fetch a message in the default (full) format."""
  try:
    message = service.users().messages().get(userId=user_id, id=msg_id).execute()
    print('Message snippet: %s' % message['snippet'])
    return message
  except errors.HttpError as error:
    print('An error occurred: %s' % error)

def GetMimeMessage(service, user_id, msg_id):
  """Fetch the raw message and parse it into an email.message.Message."""
  try:
    message = service.users().messages().get(userId=user_id, id=msg_id, format='raw').execute()
    print('Message snippet: %s' % message['snippet'])
    # 'raw' is base64url-encoded; decoding yields bytes in Python 3
    msg_bytes = base64.urlsafe_b64decode(message['raw'].encode('ASCII'))
    mime_msg = email.message_from_bytes(msg_bytes)

    return mime_msg
  except errors.HttpError as error:
    print('An error occurred: %s' % error)

What you're trying to access is payload.body or, to go one level further, payload.body.data.
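
To see where the text actually sits in a given message, it can help to print the MIME tree of the payload first. This is a small sketch, assuming a message fetched with format='full'. `outline_payload` is a hypothetical helper, while `mimeType`, `body.size`, `body.data` and `parts` are fields of the message part resource returned by the API.

```python
def outline_payload(payload, depth=0):
    """Return one line per MIME part: indented mimeType, body size, data flag."""
    body = payload.get("body", {})
    line = "{}{} (size={}, has_data={})".format(
        "  " * depth,
        payload.get("mimeType"),
        body.get("size", 0),
        "data" in body,
    )
    lines = [line]
    # Nested parts are listed one indentation level deeper
    for part in payload.get("parts", []):
        lines.extend(outline_payload(part, depth + 1))
    return lines
```

For example, `print("\n".join(outline_payload(message["payload"])))` lists each part, so it is easy to tell whether the body is at the top level or inside `parts`.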

Pathology answered 1/6, 2018 at 11:11 Comment(0)
1

Now, you can definitely do it like so:

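import base64
from googleapiclient.discovery import build

def main():
    service = build('gmail', 'v1', credentials=creds)  # creds: OAuth credentials obtained elsewhere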
    for message in service.users().messages().list(userId=...).execute()['messages']:
        print(parse_msg(
            service.users().messages().get(userId='me', id=message['id'], format='raw').execute()
        ))

def parse_msg(msg):
    return base64.urlsafe_b64decode(msg['raw'].encode('ASCII')).decode('utf-8')

But I felt that this returns a bit too much data, since the raw format is the entire message including all headers, so I started doing it like this:

def main():
    service = build('gmail', 'v1', credentials=creds)
    for message in service.users().messages().list(userId='me').execute()['messages']:
        print(parse_msg(
            service.users().messages().get(userId=..., id=message['id'], format='full').execute()
        ))

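def parse_msg(msg):
    return parse_payload(msg['payload'])

def parse_payload(payload):
    # A part either carries its own data or nests further parts (multipart/*)
    if data := payload.get('body', {}).get('data'):
        return parse_data(data)

    # Parts without inline data contribute nothing instead of raising KeyError
    return ''.join(parse_payload(part) for part in payload.get('parts', []))

def parse_data(data):
    return base64.urlsafe_b64decode(data.encode('ASCII')).decode('utf-8')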

The code above is a modified version of what Omer Shacham posted here. I think this modification gets you all of the payload data consistently.

Lithology answered 13/10, 2023 at 16:14 Comment(0)
-1

I wrote this script, which extracts the full email body as text/plain:

import base64
import copy


class GmailAPI():
    def get_service(self):
        ## You can get the gmail service snippet from here
        # https://developers.google.com/gmail/api/quickstart/python
        pass

class GmailParser():
    def data_encoder(self, text):
        # Gmail returns body data base64url-encoded; decode it to a str
        if text:
            message = base64.urlsafe_b64decode(text.encode('UTF8'))
            return str(message, 'utf-8')
        return None

    def read_message(self, content) -> str:
        parts = content.get('payload').get('parts', None)
        if parts:
            # Follow the first child of each multipart container until text/plain
            sub_part = copy.deepcopy(parts[0])
            while sub_part.get("mimeType", None) != "text/plain":
                try:
                    sub_part = copy.deepcopy(sub_part.get('parts', None)[0])
                except Exception:
                    break
            body = self.data_encoder(sub_part.get('body', {}).get('data', None))
            # Fall back to the snippet when no decodable text part was found
            return body if body is not None else content.get("snippet")
        else:
            return content.get("snippet")

gmail_parser = GmailParser()
# get_service() is a placeholder above and must return a built Gmail service
gmail_service = GmailAPI().get_service()
mail = gmail_service.users().messages().list(userId='me', labelIds=['INBOX']).execute()
messages = mail.get('messages', [])
# The loop variable is not called "email", which would shadow the module name
for msg_ref in messages:
    message = gmail_service.users().messages().get(userId='me', id=msg_ref['id'], format="full").execute()
    data = gmail_parser.read_message(content=message)
Romish answered 9/9, 2022 at 15:5 Comment(1)
Those two might be convinced to retract, or others to do the opposite, if, in addition to your pure code answer, you added some explanation of how your code works and why it helps. Or otherwise work towards How to Answer. - Leeke
