What protocol does Google use for Gmail? (not IMAP or POP)
F

4

26

You can access Gmail using the web interface, Google's Android client, or IMAP. As far as I can tell, the web interface and the Android app use a completely different protocol than IMAP -- they are not just interfaces on top of it. The reason I'm sure of that is that the Android app can open a folder with 1 million messages in under 3 seconds without any problem. No plain IMAP client can do that.

So my question is what is known about this secret protocol? Where is the reference documentation for it? Has it been reverse engineered? Does Google sanction its use?

arnt's answer provides an excellent method to test Gmail's raw IMAP speed:

$ openssl s_client -host imap.gmail.com -port 993 -crlf 
...
* OK Gimap ready for requests from 12.34.56.78
$ a LOGIN ***@*** ***
a OK
$ c SELECT "[Gmail]/All mail" !!!!
* FLAGS (\Answered \Flagged \Draft \Deleted \Seen)
* OK [PERMANENTFLAGS (\Answered \Flagged \Draft \Deleted \Seen \*)] Flags permitted.
* OK [UIDVALIDITY 673376278] UIDs valid.
* 1142417 EXISTS
* 0 RECENT
* OK [UIDNEXT 1159771] Predicted next UID.
* OK [HIGHESTMODSEQ 8670601]
c OK [READ-WRITE] [Gmail]/All mail selected. (Success)

The command I've marked, c SELECT "[Gmail]/All mail", takes about 20 seconds to complete. That is longer than it takes the Gmail app on my relatively underpowered Android phone to start up and load the All mail label, which takes less than 6 seconds even after I purged its caches. The web client is even faster.

Unless I'm missing something basic, this proves "beyond reasonable doubt" that Google's Gmail clients do not use IMAP, since in those clients you never have to wait anything like 20 seconds, as you do for this SELECT command.
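The manual openssl test above can also be scripted so the timing is measured rather than estimated. The following is a minimal sketch using Python's standard imaplib module. The `timed` and `time_select` helpers are names made up for this illustration, and the sketch assumes the account still accepts a password or app password over IMAP.

```python
import imaplib
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def time_select(user, password, mailbox='"[Gmail]/All Mail"',
                host="imap.gmail.com", port=993):
    """Log in and measure how long opening `mailbox` takes.

    Assumption: password (or app password) login is enabled for the account.
    The mailbox name is passed already quoted because it contains a space.
    """
    conn = imaplib.IMAP4_SSL(host, port)
    try:
        conn.login(user, password)
        # readonly=True sends EXAMINE instead of SELECT, so the test
        # cannot change any flags in the mailbox.
        (status, data), elapsed = timed(conn.select, mailbox, readonly=True)
        count = int(data[0]) if status == "OK" else None
        return status, count, elapsed
    finally:
        conn.logout()
```

Calling `time_select("someone@example.com", "app-password")` would return the server status, the message count, and the elapsed seconds for that one command.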

Fritter answered 31/8, 2013 at 22:37 Comment(5)
Are you sure it's not IMAP? IMAP doesn't need to download all the email to open a folder. So it could download some info for the top 10 emails and continue downloading the rest in the background.Hackler
Yes. IMAP performance degrades for huge mailboxes. Gmail can show the most recent 50 threads in a mailbox with 1 million mails in < 3 seconds. No other IMAP client can do that. There are more tell-tale signs of non-IMAPness in Gmail, but that's off-topic for this question.Fritter
I believe you have two options to figure it out then - disassemble their client or configure device to go through WiFi on your computer and check what are the destination ports.Hackler
I don't think the Gmail app uses POP or IMAP: even with both disabled, my Android phone can still receive email.Amandy
Way later: a rest api for gmail has been made public: developers.google.com/gmail/apiPianissimo
F
11

After more research, I've found that there exists an API for GMail: https://developers.google.com/gmail/api/ I don't think that API was released when I posted this question back in 2013.

Using that API, I have created a demo program that fetches the 100 last mails of a label: https://gist.github.com/bjourne/37a9d15b03862003008a1b0169190dbe

The relevant part of the program is:

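from time import time

# `service` is an authorized Gmail API client; `to_mail` and
# `headers_params` are helpers defined in the gist linked above.
resource = service.users().messages()
# Fetch one page of message IDs for the label (up to 100 by default).
result = resource.list(userId = 'me', labelIds = [label]).execute()
mail_ids = [m['id'] for m in result.get('messages', [])]

start = time()
mails = []
# Bundle the per-message GET requests into a single batch request.
# new_batch_http_request() replaces a bare BatchHttpRequest(), which
# targeted the now-retired global batch endpoint.
batch = service.new_batch_http_request()
cb = lambda req, res, exc: mails.append(to_mail(res))
for mail_id in mail_ids:
    get_request = resource.get(**headers_params(mail_id))
    batch.add(get_request, callback = cb)
batch.execute()
print('Took %.2f seconds' % (time() - start))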

It lists the last 100 messages sorted by date in a label (folder in IMAP terminology) containing over 570k messages.
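The snippet relies on two helpers, `headers_params` and `to_mail`, whose real definitions live in the linked gist. As a rough guess at what such helpers could look like (not the gist's actual code), the sketch below asks the API for header metadata only and flattens the response into a small dict. The `format` and `metadataHeaders` parameters belong to the Gmail API's `users.messages.get` method; the chosen header names and the output shape are assumptions.

```python
def headers_params(mail_id, wanted=("From", "Subject", "Date")):
    """Build keyword arguments for users().messages().get().

    format='metadata' asks for headers only, so no message bodies
    are transferred. The header selection is an assumption.
    """
    return {
        "userId": "me",
        "id": mail_id,
        "format": "metadata",
        "metadataHeaders": list(wanted),
    }


def to_mail(res):
    """Flatten a messages.get response into a small dict (hypothetical shape)."""
    # The API returns headers as a list of {'name': ..., 'value': ...} dicts.
    headers = {h["name"].lower(): h["value"] for h in res["payload"]["headers"]}
    return {
        "id": res["id"],
        "from": headers.get("from"),
        "subject": headers.get("subject"),
        "date": headers.get("date"),
    }
```

Requesting only metadata is one plausible reason a batch of 100 lookups stays under a second, since each response carries a handful of headers rather than a full message.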

On my machine, this loop takes about 0.5 - 0.8 seconds. I can claim confidently that no pure IMAP client on the planet comes even close. Likely, IMAP won't ever get faster because it is a poor fit for how Google stores mail internally.

So I'll answer my own question. This is the API they are using and it wasn't exposed earlier.

Fritter answered 16/10, 2016 at 19:52 Comment(1)
The work that actually takes time here is MSN maintenance. When you access a message, google's server tells you it's message number 569901 of the ones currently in the mailbox. This is information most IMAP clients discard.Xenophobe
A
11

The Android apps (at least the ones I've used) use IMAP. You can verify this by running Wireshark on the server.

As to why the Android app is so fast - what I know is that it uses the SEARCH command to select the most recent n messages. Desktop clients such as Thunderbird or Outlook are much more heavy-weight and download headers and metadata for every message in the folder, despite recommendations for them not to.

A smartphone does not have the resources to store and process millions of emails (although more modern ones might be getting there) so the SEARCH approach allowed quick mail access for handheld devices.
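To make the SEARCH approach concrete, here is a minimal sketch of how a client could build a date-limited search key. It is an illustration of the idea only, not a capture of what any Android client actually sends. The helper names and the 14-day window are assumptions.

```python
import datetime

# IMAP dates use English month abbreviations regardless of locale,
# so the names are spelled out instead of relying on strftime("%b").
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(day):
    """Format a date as an IMAP date such as 22-Aug-2013."""
    return f"{day.day:02d}-{_MONTHS[day.month - 1]}-{day.year}"


def since_criterion(today, days=14):
    """Return a SEARCH key matching mail from the last `days` days."""
    cutoff = today - datetime.timedelta(days=days)
    return f"SINCE {imap_date(cutoff)}"


print(since_criterion(datetime.date(2013, 9, 5)))  # SINCE 22-Aug-2013
```

With imaplib, the key could be passed as `conn.search(None, since_criterion(datetime.date.today()))` after a mailbox has been selected, so the server returns only the recent message numbers.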

Anyhow, Wireshark can reveal a great deal about the behaviour of IMAP clients and servers. If you're really curious, give it a shot. You can't do this if the server is Gmail, but you can try it out on another server (e.g. hMailServer).

Akerboom answered 5/9, 2013 at 14:50 Comment(6)
The Android app I'm talking about is: play.google.com/store/apps/details?id=com.google.android.gm How can you know it uses IMAP? Also afaik, there is no way in IMAP to limit searches to only n most recent messages. Please provide a reference if you know otherwise. So that can't be the reason why gmail is fast.Fritter
I know because I have worked with IMAP extensively - as I said, if you connect to a server and run Wireshark on that server, you'll see the messages getting through. And as I said, you can get a set of recent messages by working with the SEARCH command. If I remember correctly, Android mail clients use SEARCH in conjunction with a date condition, so they get something like the last two weeks of mail. Once again, the best way to understand this is to see for yourself in Wireshark.Akerboom
If a client pays attention to the EXISTS messages, it can search the n most recent messages. If the server's last EXISTS was 50000, 'x UID SEARCH 49000:* SUBJECT sex' searches for messages about sex among the most recent thousand.Xenophobe
There are more ways to search for recent messages, too. A modseq is perhaps what I'd use with gmail. Or a date clause.Xenophobe
I don't recall the exact commands, but AFAIK Android uses SEARCH which combines both dates (last 2 weeks) and sequence numbers. Smartphones have relatively limited capabilities and so they usually retrieve something like the last 25 or 50 messages, and then get the rest if you explicitly request more.Akerboom
"The Android app ... use IMAP" - although you've never had to enable IMAP within your Gmail account to use the Gmail Android app? Which you would have to do if you were using any other third party IMAP client.Castaway
X
3

You can test gmail's IMAP performance easily (if you have a million-message mailbox). Open an IMAP connection with

openssl s_client -connect imap.gmail.com:993 -crlf

then login and open your inbox.

a login [email protected] yourpassword
b select inbox

Or open your All Mail mailbox if the inbox isn't big enough (the name may vary depending on UI language):

c select "[Gmail]/All Mail"

If SELECT is fast but an IMAP client is slow, that's because the client sends additional, unneeded slow commands. Many clients choose to fill or update a data structure for the entire million messages even if they're going to display only 40. That's a client choice, not IMAP slowness.
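A client that wants only the newest 40 messages can work out the range from the EXISTS count that SELECT returns and fetch just that window. The sketch below shows the arithmetic. The function names and the choice of fetched fields are assumptions made for illustration.

```python
def last_n_sequence_set(exists, n):
    """Return a sequence set covering the newest n messages, or None if empty.

    `exists` is the count from the server's EXISTS response. Ending the
    range with '*' also covers mail that arrives after the SELECT.
    """
    if exists <= 0 or n <= 0:
        return None
    first = max(1, exists - n + 1)
    return f"{first}:*"


def fetch_headers_command(tag, seq_set):
    """Build a FETCH for a few headers without marking messages as seen."""
    fields = "BODY.PEEK[HEADER.FIELDS (FROM SUBJECT DATE)]"
    return f"{tag} FETCH {seq_set} (UID FLAGS {fields})"


window = last_n_sequence_set(1142417, 40)
print(fetch_headers_command("d", window))
```

For the 1142417-message mailbox in the question this asks for messages 1142378 onward, so the server has to describe 40 messages instead of more than a million.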

Xenophobe answered 15/12, 2013 at 10:36 Comment(0)
T
1

"No other IMAP client can do this" is a pretty bold statement, but a million messages is a pretty big number as well. I would encourage you to give Trojitá a try here. Chances are that the initial synchronization will be rather slow (it would transfer the flags for those million messages for various technical reasons related to how the IMAP flags, SELECT, SEARCH and STATUS are specified), but the subsequent resynchronization should be lightning fast thanks to ESEARCH, CONDSTORE and QRESYNC. I would be interested to hear how well Trojitá works with your setup -- the contact information is on the homepage.

To your question -- most webmails nowadays provide a private API for their own use. A typical architecture is to transfer messages about updated state via JSON, but there is no standard for this and the interface is proprietary. Chances are that the GMail "app" uses the same (or a similar) method. You don't have many options to verify this as it is likely using TLS. With a web interface, it is trivial to see the traffic with an appropriate browser plugin, but not so much with a standalone Android application.
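On the fast resynchronization mentioned above: the idea behind CONDSTORE is that a client remembers the mailbox's HIGHESTMODSEQ value and later asks only for messages whose flags changed after it. The sketch below builds such a command following the CHANGEDSINCE fetch modifier from RFC 7162. It is an illustration of the mechanism, not a description of how Trojitá or Gmail's own clients behave, and the function name is made up.

```python
def changed_since_command(tag, modseq):
    """Build a UID FETCH that returns flags changed after `modseq`.

    `modseq` is the HIGHESTMODSEQ value the client saved at its
    previous synchronization.
    """
    if modseq < 0:
        raise ValueError("modseq must be non-negative")
    return f"{tag} UID FETCH 1:* (FLAGS) (CHANGEDSINCE {modseq})"


# Using the HIGHESTMODSEQ value from the SELECT transcript in the question:
print(changed_since_command("d", 8670601))
```

If nothing has changed since the saved value, the server answers with no FETCH responses at all, which is why a resync can be quick even for a very large mailbox.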

Tourneur answered 2/9, 2013 at 11:44 Comment(1)
Thanks for the suggestion. Trojita isn't slow by any means (actually very snappy compared to some other clients) but its speed isn't anywhere close to gmail's native clients.Fritter
