JavaMail IMAP over SSL quite slow - Bulk fetching multiple messages
I am using JavaMail to fetch email from IMAP servers (Gmail and others). The code works: I can read the headers, body contents and so on. The problem is speed. Against a plain IMAP server (no SSL), processing a message takes 1-2 ms. Against an IMAPS server (with SSL, such as Gmail), it takes around 250 ms per message. I measure ONLY the time spent processing the messages; connection setup, the handshake and so on are NOT included.

I know that with SSL the data is encrypted. However, decryption should not take that much time, should it?

I have tried setting a higher ServerCacheSize value and a higher connectionpoolsize, but I am seriously running out of ideas. Has anyone run into this problem, and hopefully solved it?

My fear is that the JavaMail API uses a different connection each time it fetches a mail from the IMAPS server (involving the overhead for handshake...). If so, is there a way to override this behavior?

Here is my code (quite standard), called from the main class:

 public static int connectTest(String SSL, String user, String pwd, String host) throws IOException,
                                                                               ProtocolException,
                                                                               GeneralSecurityException {

    Properties props = System.getProperties();
    props.setProperty("mail.store.protocol", SSL);
    props.setProperty("mail.imaps.ssl.trust", host);
    props.setProperty("mail.imaps.connectionpoolsize", "10");

    try {


        Session session = Session.getDefaultInstance(props, null);

        // session.setDebug(true);

        Store store = session.getStore(SSL);
        store.connect(host, user, pwd);      
        Folder inbox = store.getFolder("INBOX");

        inbox.open(Folder.READ_ONLY);                
        int numMess = inbox.getMessageCount();
        Message[] messages = inbox.getMessages();

        for (Message m : messages) {

            m.getAllHeaders();
            m.getContent();
        }

        inbox.close(false);
        store.close();
        return numMess;
    } catch (MessagingException e) {
        e.printStackTrace();
        System.exit(2);
    }
    return 0;
}

Thanks in advance.

Mcclendon answered 30/11, 2011 at 8:7 Comment(8)
Note: the String SSL is either "imap" or "imaps". Also, I have read the issue https://mcmap.net/q/587620/-javamail-performance but have tried on an IMAPS server that is not Gmail and am still getting the same results.Mcclendon
Does this happen with other clients (Thunderbird, Outlook, what-have-you) on the same IMAP/IMAPS server as well? In that case, it wouldn't be your code's fault, and rather a server problem.Wojak
How can I measure the time Thunderbird takes to import messages? (We are in the ms range...) It loaded all folders in under 20 seconds (but I don't know whether it fetched only some of the information and gets the rest when I click on a message).Mcclendon
Hmm, that is a bit of a bother, yes. I believe it has some sort of operations log (off by default), but the resolution will be in seconds at most. You can configure TB to download the complete messages (it only gets headers by default), and then measure the whole inbox; that should at least show you whether it takes < 1 sec, or multiple seconds.Wojak
What is your underlying operating system?Lenorelenox
Ubuntu 11.04 @Piskvor: still working on those logs to give you an accurate answer.Mcclendon
Ok, sorry for the wait (Thunderbird log does not activate timestamp by default...). For one message, body content being fetched under thunderbird, here are the timestamps: 2011-11-30 08:47:10.360004 UTC (begin fetch) ... 2011-11-30 08:47:10.360922 UTC - -1989445888[7f5b7e04a150]: 7dcba000:imap.googlemail.com:S-INBOX:STREAM:CLOSE: Normal Message End Download Stream 2011-11-30 08:47:10.385466 UTC - -1989445888[7f5b7e04a150]: ReadNextLine [stream=8cdc70e0 nb=56 needmore=0] So either 25ms or 1ms for getting the body of the message...Mcclendon
Hi @Justmaker, How many seconds does it take for you to connect to the Store? " store.connect(host, user, pwd);"Tigon

After a lot of work, and with assistance from the JavaMail people, I found that the source of this "slowness" is the FETCH behavior of the API. Indeed, as pjaol said, JavaMail goes back to the server each time we need information (a header, or the message content) for a message.

While FetchProfile lets us bulk fetch header information or flags for many messages, fetching the contents of multiple messages in one go is NOT directly possible.

Luckily, we can write our own IMAP command to get around this "limitation" (the API was designed this way to avoid out-of-memory errors: fetching every mail into memory in one command can be quite heavy).

Here is my code:

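import java.io.ByteArrayInputStream;
import java.util.Properties;

import javax.mail.MessagingException;
import javax.mail.Session;
import javax.mail.internet.MimeMessage;

import com.sun.mail.iap.Argument;
import com.sun.mail.iap.ProtocolException;
import com.sun.mail.iap.Response;
import com.sun.mail.imap.IMAPFolder;
import com.sun.mail.imap.protocol.BODY;
import com.sun.mail.imap.protocol.FetchResponse;
import com.sun.mail.imap.protocol.IMAPProtocol;
import com.sun.mail.imap.protocol.UID;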

public class CustomProtocolCommand implements IMAPFolder.ProtocolCommand {
    /** Index on server of first mail to fetch **/
    int start;

    /** Index on server of last mail to fetch **/
    int end;

    public CustomProtocolCommand(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public Object doCommand(IMAPProtocol protocol) throws ProtocolException {
        Argument args = new Argument();
        args.writeString(Integer.toString(start) + ":" + Integer.toString(end));
        args.writeString("BODY[]");
        Response[] r = protocol.command("FETCH", args);
        Response response = r[r.length - 1];
        if (response.isOK()) {
            Properties props = new Properties();
            props.setProperty("mail.store.protocol", "imap");
            props.setProperty("mail.mime.base64.ignoreerrors", "true");
            props.setProperty("mail.imap.partialfetch", "false");
            props.setProperty("mail.imaps.partialfetch", "false");
            Session session = Session.getInstance(props, null);

            FetchResponse fetch;
            BODY body;
            MimeMessage mm;
            ByteArrayInputStream is = null;

            // last response is only result summary: not contents
            for (int i = 0; i < r.length - 1; i++) {
                if (r[i] instanceof FetchResponse) {
                    fetch = (FetchResponse) r[i];
                    // look the item up by type: on some servers item 0 is the UID, not the BODY
                    body = (BODY) fetch.getItem(BODY.class);
                    is = body.getByteArrayInputStream();
                    try {
                        mm = new MimeMessage(session, is);
                        Contents.getContents(mm, i);
                    } catch (MessagingException e) {
                        e.printStackTrace();
                    }
                }
            }
        }
        // dispatch remaining untagged responses
        protocol.notifyResponseHandlers(r);
        protocol.handleResult(response);

        return "" + (r.length - 1);
    }
}

The getContents(MimeMessage mm, int i) function is a classic one that recursively writes the contents of the message to a file (many examples are available on the net).

To avoid out-of-memory errors, I simply set a maxDoc and a maxSize limit (chosen arbitrarily; they can probably be improved), used as follows:

public int efficientGetContents(IMAPFolder inbox, Message[] messages)
        throws MessagingException {
    FetchProfile fp = new FetchProfile();
    fp.add(FetchProfile.Item.FLAGS);
    fp.add(FetchProfile.Item.ENVELOPE);
    inbox.fetch(messages, fp);
    int index = 0;
    int nbMessages = messages.length;
    final int maxDoc = 5000;
    final long maxSize = 100000000; // 100 MB

    // Message numbers limit to fetch
    int start;
    int end;

    while (index < nbMessages) {
        start = messages[index].getMessageNumber();
        int docs = 0;
        int totalSize = 0;
        // true while the message numbers seen so far are consecutive (no gaps)
        boolean noskip = true;
        boolean notend = true;
        // Until we reach one of the limits
        while (docs < maxDoc && totalSize < maxSize && noskip && notend) {
            docs++;
            totalSize += messages[index].getSize();
            index++;
            if (notend = (index < nbMessages)) {
                noskip = (messages[index - 1].getMessageNumber() + 1 == messages[index]
                        .getMessageNumber());
            }
        }

        end = messages[index - 1].getMessageNumber();
        inbox.doCommand(new CustomProtocolCommand(start, end));

        System.out.println("Fetching contents for " + start + ":" + end);
        System.out.println("Size fetched = " + (totalSize / 1000000)
                + " MB");

    }

    return nbMessages;
}
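The grouping logic above can be isolated from JavaMail and checked on its own. The following is a minimal sketch under stated assumptions: the class FetchRangePlanner, its Range type and the plan method are hypothetical names, and the message numbers and sizes are passed in as plain arrays rather than read from Message objects. As in the code above, a range is closed when it holds maxDocs messages, when its accumulated size reaches maxBytes (so it can overshoot by one message), or when the next message number is not consecutive.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups message numbers into contiguous FETCH ranges. */
final class FetchRangePlanner {

    /** One contiguous range of message numbers, inclusive on both ends. */
    static final class Range {
        final int start;
        final int end;
        final long bytes; // sum of the reported sizes in this range

        Range(int start, int end, long bytes) {
            this.start = start;
            this.end = end;
            this.bytes = bytes;
        }

        /** IMAP sequence-set form, e.g. "3:7". */
        String toSequenceSet() {
            return start + ":" + end;
        }
    }

    /**
     * numbers[i] is a message number and sizes[i] its size in bytes.
     * The input is assumed to be sorted by message number.
     */
    static List<Range> plan(int[] numbers, long[] sizes, int maxDocs, long maxBytes) {
        if (numbers.length != sizes.length) {
            throw new IllegalArgumentException("numbers and sizes must have the same length");
        }
        if (maxDocs < 1 || maxBytes < 1) {
            throw new IllegalArgumentException("limits must be at least 1");
        }
        List<Range> ranges = new ArrayList<>();
        int index = 0;
        while (index < numbers.length) {
            int start = numbers[index];
            int docs = 0;
            long total = 0;
            boolean contiguous = true;
            // Always takes at least one message, so the outer loop makes progress.
            while (index < numbers.length && docs < maxDocs && total < maxBytes && contiguous) {
                total += sizes[index];
                docs++;
                index++;
                contiguous = index < numbers.length
                        && numbers[index - 1] + 1 == numbers[index];
            }
            ranges.add(new Range(start, numbers[index - 1], total));
        }
        return ranges;
    }
}
```

Each returned range could then be handed to a command such as CustomProtocolCommand(range.start, range.end). Keeping the planning separate makes it easy to tune the two limits without touching the protocol code.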

Do note that I am using message numbers here, which are unstable (they change when messages are deleted from the server). A better method would be to use UIDs; you would then change the command from FETCH to UID FETCH.
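To make the FETCH versus UID FETCH difference concrete, here is a small sketch that only builds the command text; it does not talk to a server and is not part of JavaMail. The class FetchCommandText and its methods are hypothetical names. It relies on two points from the IMAP specification (RFC 3501): UID FETCH takes a range of UIDs instead of message numbers, and BODY.PEEK[...] fetches the same data as BODY[...] without implicitly setting the \Seen flag. A non-empty section such as "1.3" selects a single MIME part instead of the whole message.

```java
/** Hypothetical helper that only builds IMAP command text for illustration. */
final class FetchCommandText {

    /** "FETCH" addresses message numbers, "UID FETCH" addresses UIDs. */
    static String commandName(boolean byUid) {
        return byUid ? "UID FETCH" : "FETCH";
    }

    /** Inclusive range in IMAP sequence-set form, e.g. "16:17". */
    static String range(long first, long last) {
        if (first < 1 || last < first) {
            throw new IllegalArgumentException("need 1 <= first <= last");
        }
        return first + ":" + last;
    }

    /**
     * section "" means the whole message, "1.3" means MIME part 1.3.
     * With peek set, the item is BODY.PEEK[...], which leaves \Seen untouched.
     */
    static String bodyItem(String section, boolean peek) {
        return (peek ? "BODY.PEEK[" : "BODY[") + section + "]";
    }

    /** The command as it would appear on the wire, without the tag. */
    static String fullLine(boolean byUid, long first, long last, String section, boolean peek) {
        return commandName(byUid) + " " + range(first, last) + " " + bodyItem(section, peek);
    }
}
```

In a ProtocolCommand like the one above, the first piece would be the command name passed to protocol.command and the other two would be written as arguments. The UIDs themselves can be obtained from the folder (IMAPFolder implements UIDFolder, which has getUID(Message)). A UID range matches whichever UIDs exist inside it, so gaps in the UID values are not a problem.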

Hope this helps out!

Mcclendon answered 8/3, 2012 at 16:6 Comment(8)
I had some trouble with UID FETCH: it just didn't fetch all the mails.Mechanist
@Justmaker: This great answer is (IMO) more or less exactly the solution I need for my fetching problem: #28166682 . However, I have a question: I see that you use `args.writeString("BODY[]");´ to fetch the whole body part of each message. What would the argument have to look like if I, for example, would like to have, the body part 1.3 of the message with the uid 16 plus the body part 2.1 of the message with the uid 17 ...Abridgment
In your CustomProtocolCommand class, doCommand uses the method Contents.getContents(mm, i); and I can't find which library to import to get this method to work. Please add the necessary imports to your class, thank you.Music
I also noticed that body = (BODY) fetch.getItem(0); may result in cast exception errors, as it seems that in some servers getItem(0) is UID, try to put body = (BODY) fetch.getItem(BODY.class); instead, works for me, and if you want UID then fetch.getItem(UID.class);Music
I've added the necessary imports, which should be in the JavaMail lib. But Contents.getContents is actually a method that you would need to implement. There are many examples on the web; most test the content type to know how to handle the content: #11240868. And yeah, you're right about the getItem, I didn't update my answer, which is a bit old now ;)Mcclendon
@Mcclendon Thanks a ton. This is quite good. However, I need to make a small modification. All the messages are being marked as read; I think it is because of this line: Response[] r = protocol.command("FETCH", args); Response response = r[r.length - 1]; I need to mark a message as read only when I receive email from a certain person. I have that logic in the Contents.getContents(mm, i) method. I'm trying to call message.setFlag(Flags.Flag.SEEN, false); on the message first and then, if the condition is satisfied, mark the message as read. It's not working. Are there any changes I need to make?Linger
Where is the advantage in terms of speed, given that fetching the contents takes a lot of time? I tried it out but do not see any improvement. One could just call getContent without using a CustomProtocolCommand.Cnossus
How much time does this take to load for 25 messages?Tigon

You need to apply a FetchProfile to the messages before you iterate through them. Message is a lazy-loading object: it goes back to the server for each message and for each field that is not covered by the default profile, e.g.

for (Message message: messages) {
  message.getSubject(); //-> goes to the imap server to fetch the subject line
}

If you want to display something like an inbox listing with just From, Subject, Sent, Attachment, etc., you would use something like the following:

    inbox.open(Folder.READ_ONLY);
    Message[] messages = inbox.getMessages(start + 1, total);

    FetchProfile fp = new FetchProfile();
    fp.add(FetchProfile.Item.ENVELOPE);
    fp.add(FetchProfile.Item.FLAGS);
    fp.add(FetchProfile.Item.CONTENT_INFO);

    fp.add("X-mailer");
    inbox.fetch(messages, fp); // Load the profile of the messages in 1 fetch.
    for (Message message: messages) {
       message.getSubject(); //Subject is already local, no additional fetch required
    }
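The effect of prefetching can be illustrated with a toy model that counts round trips. This is not JavaMail code: LazyFetchModel, FakeServer and FakeFolder are hypothetical stand-ins, where prefetch plays the role of inbox.fetch(messages, fp) and subject plays the role of message.getSubject(). It only shows why N lazy lookups cost N trips while one bulk fetch followed by N lookups costs a single trip.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model, not JavaMail: counts simulated server round trips. */
final class LazyFetchModel {

    /** Stand-in for the mail server; every call costs one round trip. */
    static final class FakeServer {
        int roundTrips = 0;

        String fetchSubject(int msgNum) {
            roundTrips++;
            return "subject-" + msgNum;
        }

        Map<Integer, String> fetchSubjects(List<Integer> msgNums) {
            roundTrips++; // one trip, however many messages are requested
            Map<Integer, String> out = new HashMap<>();
            for (int n : msgNums) {
                out.put(n, "subject-" + n);
            }
            return out;
        }
    }

    /** Stand-in for a folder that keeps a local cache of subjects. */
    static final class FakeFolder {
        private final FakeServer server;
        private final Map<Integer, String> cache = new HashMap<>();

        FakeFolder(FakeServer server) {
            this.server = server;
        }

        /** Bulk load: fills the cache for all given messages in one trip. */
        void prefetch(List<Integer> msgNums) {
            cache.putAll(server.fetchSubjects(msgNums));
        }

        /** Lazy lookup: goes to the server only on a cache miss. */
        String subject(int msgNum) {
            return cache.computeIfAbsent(msgNum, server::fetchSubject);
        }
    }
}
```

With 25 messages, reading every subject lazily costs 25 simulated round trips, while calling prefetch first brings that down to 1. On a real connection each trip also pays network latency, which is where the per-message delay comes from.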

Hope that helps.

Hydrosome answered 3/2, 2012 at 7:7 Comment(2)
The FetchProfile helped a bit, thanks! However I am now tackling trying to bulk fetch message contents for more than one message (which is not possible when directly using the JavaMail API). This would bring even greater performance improvement, given a size limit is fixed to avoid out-of-memory errors.Mcclendon
How much time does it take to receive 25 messages using FetchProfile? For me it takes nearly 4 to 5 seconds.Tigon

The total time includes the time required in cryptographic operations. The cryptographic operations need a random seeder. There are different random seeding implementations which provide random bits for use in the cryptography. By default, Java uses /dev/urandom and this is specified in your java.security as below:

securerandom.source=file:/dev/urandom

On Windows, Java uses the Microsoft CryptoAPI seed functionality, which usually causes no problems. However, on Unix and Linux, Java by default uses /dev/random for random seeding, and read operations on /dev/random sometimes block and take a long time to complete. If you are on a *nix platform, the time spent there is counted in the overall time.

Since I don't know what platform you are using, I can't say for sure that this is your problem. But if you are on *nix, it could be one of the reasons why your operations take so long. One solution is to use /dev/urandom, which does not block, instead of /dev/random as your random seeder. This can be specified with the system property "java.security.egd". For example,

  -Djava.security.egd=file:/dev/urandom

Specifying this system property will override the securerandom.source setting in your java.security file. You can give it a try. Hope it helps.
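To see which setting is in effect and whether seeding is actually slow on a given machine, something like the following sketch may help. The class EntropyCheck and its method names are hypothetical; the calls it uses (System.getProperty, Security.getProperty and SecureRandom) are standard JDK APIs. It follows the precedence described above, where the java.security.egd system property wins over securerandom.source.

```java
import java.security.SecureRandom;
import java.security.Security;

/** Hypothetical diagnostic helper for the random seed configuration. */
final class EntropyCheck {

    /** The -Djava.security.egd value if set, else securerandom.source, else "(not set)". */
    static String effectiveSeedSource() {
        String egd = System.getProperty("java.security.egd");
        if (egd != null && !egd.isEmpty()) {
            return egd;
        }
        String configured = Security.getProperty("securerandom.source");
        return configured != null ? configured : "(not set)";
    }

    /** Milliseconds spent creating a SecureRandom and drawing numBytes from it. */
    static long timeFirstBytesMillis(int numBytes) {
        long t0 = System.nanoTime();
        new SecureRandom().nextBytes(new byte[numBytes]);
        return (System.nanoTime() - t0) / 1_000_000;
    }
}
```

If timeFirstBytesMillis(32) comes back in the low milliseconds, blocking on the entropy source is unlikely to explain a 250 ms per-message cost, and the per-message round trips are the more probable cause.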

Lenorelenox answered 30/11, 2011 at 9:32 Comment(3)
I am indeed running on Ubuntu 11.04, I will try what you suggest and keep you posted.Mcclendon
Unfortunately, my java.security file already has the following line: securerandom.source=file:/dev/urandomMcclendon
Try adding the following property (with 3 ///) explicitly to your JVM -Djava.security.egd=file:///dev/urandom. The syntax above (with 1 /) is not recognised.Lenorelenox
