How do I download only unread attachments from a specific gmail label?
I have a Python script adapted from "Downloading MMS emails sent to Gmail using Python":

import email, getpass, imaplib, os

detach_dir = '.' # directory where to save attachments (default: current)
user = raw_input("Enter your GMail username:")
pwd = getpass.getpass("Enter your password: ")

# connecting to the gmail imap server
m = imaplib.IMAP4_SSL("imap.gmail.com")
m.login(user,pwd)
m.select("[Gmail]/All Mail") # here you can choose a mailbox (e.g. INBOX or a label name) instead
# use m.list() to get all the mailboxes

resp, items = m.search(None, 'FROM', '"Impact Stats Script"') # you could filter using the IMAP rules here (check http://www.example-code.com/csharp/imap-search-critera.asp)
items = items[0].split() # getting the mails id

for emailid in items:
    resp, data = m.fetch(emailid, "(RFC822)") # fetching the mail, "`(RFC822)`" means "get the whole stuff", but you can ask for headers only, etc
    email_body = data[0][1] # getting the mail content
    mail = email.message_from_string(email_body) # parsing the mail content to get a mail object

    #Check if any attachments at all
    if mail.get_content_maintype() != 'multipart':
        continue

    print "["+mail["From"]+"] :" + mail["Subject"]

    # walk() yields every part of the message, so we can iterate without handling the recursion ourselves
    counter = 1 # numbers unnamed attachments within this message
    for part in mail.walk():
        # multipart parts are just containers, so we skip them
        if part.get_content_maintype() == 'multipart':
            continue

        # is this part an attachment?
        if part.get('Content-Disposition') is None:
            continue

        filename = part.get_filename()

        # if there is no filename, we create one with a counter to avoid duplicates
        if not filename:
            filename = 'part-%03d%s' % (counter, '.bin')
            counter += 1

        att_path = os.path.join(detach_dir, filename)

        # Check if it's already there
        if not os.path.isfile(att_path) :
            # finally write the stuff
            fp = open(att_path, 'wb')
            fp.write(part.get_payload(decode=True))
            fp.close()

I am filtering messages by sender and saving their attachments, but now I need to get attachments only from new (unread) emails. Can I modify the m.search() call somehow so that it returns only unread emails?

Presocratic answered 16/4, 2012 at 22:17 Comment(2)
What does it mean for an attachment to be new? Once an email is sent, the attachments are fixed... - Os
I mean new emails with attachments. I'll edit the question. - Presocratic

Try modifying this line:

resp, items = m.search(None, 'FROM', '"Impact Stats Script"')

to:

resp, items = m.search(None, 'UNSEEN', 'FROM', '"Impact Stats Script"')

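The Python imaplib documentation shows that additional search criteria are simply passed as extra arguments to search(); when several are given, a message must match all of them. The IMAP specification (RFC 3501) defines the UNSEEN search key: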

  UNSEEN
     Messages that do not have the \Seen flag set.
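
To tie this back to the question's title, here is a minimal sketch of the whole lookup, written for Python 3 (the question's script is Python 2). It assumes that the Gmail label is reachable as an IMAP mailbox under the label's name and that conn is an already logged-in imaplib.IMAP4_SSL object. The function name unread_messages and its keep_unread parameter are hypothetical, chosen only for illustration.

```python
import email


def unread_messages(conn, mailbox, sender, keep_unread=False):
    """Yield parsed unread messages from `sender` found in `mailbox`.

    `conn` is assumed to be a logged-in imaplib.IMAP4_SSL connection
    (or any object with the same select/search/fetch methods).
    """
    # Quote the mailbox name so that labels containing spaces are accepted.
    status, _ = conn.select('"%s"' % mailbox)
    if status != "OK":
        raise RuntimeError("could not select mailbox %r" % mailbox)

    # Search keys passed together are ANDed: unread AND from this sender.
    status, data = conn.search(None, "UNSEEN", "FROM", '"%s"' % sender)
    if status != "OK":
        raise RuntimeError("search failed")

    # Fetching with RFC822 marks the message as read on the server;
    # BODY.PEEK[] returns the same content without setting the Seen flag.
    what = "(BODY.PEEK[])" if keep_unread else "(RFC822)"
    for msg_id in data[0].split():
        status, parts = conn.fetch(msg_id, what)
        if status != "OK":
            continue
        # parts[0] is a (header, raw_message_bytes) tuple.
        yield email.message_from_bytes(parts[0][1])
```

Each yielded message can then be walked for attachments exactly as in the question's loop. With the default keep_unread=False, fetched messages become read, so the next run of the script skips them; that is usually what "only download new attachments" calls for.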
Os answered 16/4, 2012 at 23:9 Comment(3)
NEW doesn't work, but just UNSEEN by itself does. Thanks. If you change NEW to UNSEEN in your answer, I'll mark it as accepted. It wasn't clear to me that criteria could just be added to search like that. - Presocratic
Hi, I am new to Gmail scripts. Does this script still work, and if so, where do I execute it? - Spiritless
@Adam, as far as I know, Gmail does not have scripts. This is a Python script that you can run with your Python interpreter. - Os
