I want to extract the whole message body of an email using the Gmail API. Right now I am using the 'snippet' field, but I need the entire text. I searched and found that it has something to do with the payload, but I didn't understand how. Can someone show me an example? I am using the Gmail API via Python.
Same as noogui said, but I found that the snippet won't return the whole body.
When the body exceeds roughly 200 characters the snippet is cut short, and you can find the whole body under payload.body.data.
The catch is that you get it base64url-encoded, so you need to decode it :)
The following code will do the trick:
import base64

mail = service.users().messages().get(userId=user_id, id=id, format="full").execute()

def parse_msg(msg):
    if msg.get("payload").get("body").get("data"):
        return base64.urlsafe_b64decode(msg.get("payload").get("body").get("data").encode("ASCII")).decode("utf-8")
    return msg.get("snippet")
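To see the decoding step on its own, here is a minimal sketch that runs without calling the API. It assumes a hand-built dictionary shaped like a format="full" response; `fake_msg` and its contents are made up for illustration and are not real API output.

```python
import base64


def parse_msg(msg):
    """Return the decoded payload.body.data if present, otherwise the snippet."""
    data = msg.get("payload", {}).get("body", {}).get("data")
    if data:
        # The data field is base64url-encoded; decode to bytes, then to text.
        return base64.urlsafe_b64decode(data.encode("ASCII")).decode("utf-8")
    return msg.get("snippet")


# Hypothetical stand-in for a messages().get(..., format="full") response.
body_text = "Hello, this is the full body of the message."
fake_msg = {
    "snippet": "Hello, this is",
    "payload": {
        "body": {
            "data": base64.urlsafe_b64encode(body_text.encode("utf-8")).decode("ASCII")
        }
    },
}

print(parse_msg(fake_msg))
```

If the body carries no data field, the function falls back to the snippet, which matches the behaviour of the code above.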
The Gmail API docs provide sample code that shows how to return the full message body as a message object structure. However, the code they provide doesn't work in Python 3. If you want to do this in Python 3, you need to change their email.message_from_string() to email.message_from_bytes(). In Python 3, base64.urlsafe_b64decode() returns bytes rather than a string, which is why the string variant fails. The code below works just fine for me on Python 3.7.4:
import base64
import email
message = gmail_conn.users().messages().get(userId=u_id, id=m_id, format='raw').execute()
msg_str = base64.urlsafe_b64decode(message['raw'].encode('ASCII'))
mime_msg = email.message_from_bytes(msg_str)
print(mime_msg)
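Printing the MIME message shows headers and every part. If only the plain-text body is wanted, one option is to parse with the standard library's `email.policy.default` and ask for the text/plain part. The sketch below is self-contained: it builds a sample message locally instead of calling Gmail, and `plain_text_from_raw` is a hypothetical helper name, not part of any API.

```python
import base64
import email
from email import policy
from email.message import EmailMessage


def plain_text_from_raw(raw):
    """Decode a base64url 'raw' string and return its text/plain body, if any."""
    msg_bytes = base64.urlsafe_b64decode(raw.encode("ASCII"))
    # policy.default makes the parser return an EmailMessage, which has get_body().
    mime_msg = email.message_from_bytes(msg_bytes, policy=policy.default)
    body = mime_msg.get_body(preferencelist=("plain",))
    return body.get_content() if body is not None else None


# Build a sample multipart/alternative message in place of a real API response.
sample = EmailMessage()
sample["Subject"] = "Test"
sample["From"] = "alice@example.com"
sample["To"] = "bob@example.com"
sample.set_content("Plain text body")
sample.add_alternative("<p>HTML body</p>", subtype="html")

# Assumption: message['raw'] from the API has this same base64url shape.
raw = base64.urlsafe_b64encode(sample.as_bytes()).decode("ASCII")

text = plain_text_from_raw(raw)
print(text)
```

With a real response, `message['raw']` would be passed in place of `raw`.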
Use Users.messages.get. The docs include a Python snippet, written for Python 2; here it is adapted to Python 3 syntax:
import base64
import email
from googleapiclient import errors  # older releases exposed this as apiclient

def GetMessage(service, user_id, msg_id):
    """Get a message by ID and print its snippet."""
    try:
        message = service.users().messages().get(userId=user_id, id=msg_id).execute()
        print('Message snippet: %s' % message['snippet'])
        return message
    except errors.HttpError as error:
        print('An error occurred: %s' % error)

def GetMimeMessage(service, user_id, msg_id):
    """Get a message by ID and parse its raw content into a MIME message."""
    try:
        message = service.users().messages().get(userId=user_id, id=msg_id, format='raw').execute()
        print('Message snippet: %s' % message['snippet'])
        msg_bytes = base64.urlsafe_b64decode(message['raw'].encode('ASCII'))
        # urlsafe_b64decode returns bytes in Python 3, so parse with message_from_bytes.
        mime_msg = email.message_from_bytes(msg_bytes)
        return mime_msg
    except errors.HttpError as error:
        print('An error occurred: %s' % error)
What you're trying to access is payload.body or, if you want to go further, payload.body.data.
Now, you can definitely do it like so:
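import base64
from googleapiclient.discovery import build

def main():
    # creds must hold authorized credentials obtained beforehand.
    service = build('gmail', 'v1', credentials=creds)
    for message in service.users().messages().list(userId=...).execute()['messages']:
        print(parse_msg(
            service.users().messages().get(userId='me', id=message['id'], format='raw').execute()
        ))

def parse_msg(msg):
    # The raw field holds the entire RFC 2822 message, headers included.
    return base64.urlsafe_b64decode(msg['raw'].encode('ASCII')).decode('utf-8')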
But I felt that you get a bit too much data this way, so I started doing it like this:
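import base64
from googleapiclient.discovery import build

def main():
    # creds must hold authorized credentials obtained beforehand.
    service = build('gmail', 'v1', credentials=creds)
    for message in service.users().messages().list(userId='me').execute()['messages']:
        print(parse_msg(
            service.users().messages().get(userId=..., id=message['id'], format='full').execute()
        ))

def parse_msg(msg):
    payload = msg['payload']
    # Single-part messages carry the body directly on the payload.
    if data := payload['body'].get('data'):
        return parse_data(data)
    # Multipart messages: join the top-level parts, skipping any without inline data.
    return ''.join(
        parse_data(part['body']['data'])
        for part in payload.get('parts', [])
        if part['body'].get('data')
    )

def parse_data(data):
    return base64.urlsafe_b64decode(data.encode('ASCII')).decode('utf-8')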
Above is a modified version of the code posted here by Omer Shacham. The modification makes it return all the payload data consistently, I think.
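Parts can themselves contain parts (for example a multipart/alternative nested inside a multipart/mixed), and attachment parts usually carry an attachmentId instead of inline data. A recursive walk is one way to handle that. The sketch below is only an illustration: `collect_text` is a hypothetical helper, and `fake_payload` is a hand-built dictionary that mimics the shape of a format='full' payload rather than real API output.

```python
import base64


def b64url(text):
    """Encode text the way body.data is encoded (base64url)."""
    return base64.urlsafe_b64encode(text.encode("utf-8")).decode("ASCII")


def collect_text(payload, mime_type="text/plain"):
    """Recursively gather decoded bodies of all parts with the given MIME type."""
    chunks = []
    data = payload.get("body", {}).get("data")
    if payload.get("mimeType") == mime_type and data:
        chunks.append(base64.urlsafe_b64decode(data.encode("ASCII")).decode("utf-8"))
    # Descend into nested parts, if there are any.
    for part in payload.get("parts", []):
        chunks.extend(collect_text(part, mime_type))
    return chunks


# Hypothetical payload: mixed -> (alternative -> plain + html) + attachment.
fake_payload = {
    "mimeType": "multipart/mixed",
    "body": {"size": 0},
    "parts": [
        {
            "mimeType": "multipart/alternative",
            "body": {"size": 0},
            "parts": [
                {"mimeType": "text/plain", "body": {"data": b64url("Hi there")}},
                {"mimeType": "text/html", "body": {"data": b64url("<p>Hi there</p>")}},
            ],
        },
        # Attachments reference their content by ID and have no inline data here.
        {"mimeType": "application/pdf", "filename": "a.pdf", "body": {"attachmentId": "xyz"}},
    ],
}

print("".join(collect_text(fake_payload)))
```

Passing `mime_type="text/html"` would collect the HTML alternative instead.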
I wrote this script, which extracts the full email body as text/plain:
import base64
import copy


class GmailAPI:
    def get_service(self):
        # Build and return the Gmail service here; see the quickstart:
        # https://developers.google.com/gmail/api/quickstart/python
        pass


class GmailParser:
    def data_encoder(self, text):
        """Decode a base64url-encoded body into a string, or return None."""
        if text:
            return base64.urlsafe_b64decode(text.encode('UTF8')).decode('utf-8')
        return None

    def read_message(self, content) -> str:
        parts = content.get('payload').get('parts')
        if parts:
            sub_part = copy.deepcopy(parts[0])
            # Follow the first child of each part until a text/plain part is found.
            while sub_part.get('mimeType') != 'text/plain':
                try:
                    sub_part = copy.deepcopy(sub_part.get('parts')[0])
                except (TypeError, IndexError):
                    break
            text = self.data_encoder(sub_part.get('body', {}).get('data'))
            if text is not None:
                return text
        # No parts, or no decodable data: fall back to the snippet.
        return content.get('snippet')


gmail_parser = GmailParser()
gmail_service = GmailAPI().get_service()
mail = gmail_service.users().messages().list(userId='me', labelIds=['INBOX']).execute()
messages = mail.get('messages')
# The loop variable is deliberately not called "email", to avoid clashing with the module name.
for msg_ref in messages:
    message = gmail_service.users().messages().get(userId='me', id=msg_ref['id'], format="full").execute()
    data = gmail_parser.read_message(content=message)