Downloading an email body containing inline images in Java

My issue is as follows:

My code is set up to read emails from a particular account. That part works perfectly.

The issue is with parsing the email message: separating the attachments from the email body (which contains inline images).

My code looks like this:

    void readMessages(Folder folder) throws Exception {
        // Load the message objects from the folder.
        Message[] messages = folder.getMessages();
        for (int messageNumber = 0; messageNumber < messages.length; messageNumber++) {
            final Message currentMessage = messages[messageNumber];
            logger.info("Handling the mail with subject " + currentMessage.getSubject());
            logger.info("Content type for the current message is " + currentMessage.getContentType());
            final String messageFileName = currentMessage.getFileName();
            logger.info("File name for the message " + messageFileName + ". File name is blank "
                    + StringUtils.isBlank(messageFileName));

            Object messageContentObject = currentMessage.getContent();
            if (messageContentObject instanceof Multipart) {
                Multipart multipart = (Multipart) messageContentObject;

                // Download every body part (the body as well as the attachments).
                int attachmentCount = multipart.getCount();
                logger.info("Number of attachments " + attachmentCount);
                for (int i = 0; i < attachmentCount; i++) {
                    Part part = multipart.getBodyPart(i);
                    // folderPath is defined elsewhere in the class.
                    downloadAttachment(part, folderPath.toString());
                }
            }
        }
    }

    private void downloadAttachment(Part part, String folderPath) throws Exception {
        String disPosition = part.getDisposition();
        String fileName = part.getFileName();
        // File names may be MIME-encoded (RFC 2047), so decode them before use on disk.
        // MimeUtility is javax.mail.internet.MimeUtility.
        String decodedText = (fileName != null) ? MimeUtility.decodeText(fileName) : null;
        logger.info("Disposition type :: " + disPosition);
        logger.info("Attached File Name :: " + fileName);

        // Note: parts whose disposition is INLINE match none of the branches below.
        if (disPosition != null && disPosition.equalsIgnoreCase(Part.ATTACHMENT)) {
            logger.info("Disposition is ATTACHMENT type.");
            File file = new File(folderPath + File.separator + decodedText);
            file.getParentFile().mkdirs();
            saveEmailAttachment(file, part);
        } else if (fileName != null && disPosition == null) {
            logger.info("Disposition is null but file name is valid. Possibly inline attachment.");
            File file = new File(folderPath + File.separator + decodedText);
            file.getParentFile().mkdirs();
            saveEmailAttachment(file, part);
        } else if (fileName == null && disPosition == null) {
            logger.info("Disposition and file name are both null. It is the email body.");
            File file = new File(folderPath + File.separator + "mail.html");
            file.getParentFile().mkdirs();
            saveEmailAttachment(file, part);
        }
    }

    protected int saveEmailAttachment(File saveFile, Part part) throws Exception {
        int count = 0;
        // getInputStream() yields the decoded content; writeTo() would also emit the MIME headers.
        try (InputStream is = part.getInputStream();
             BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(saveFile))) {
            byte[] buffer = new byte[8192];
            int ret;
            while ((ret = is.read(buffer)) != -1) {
                bos.write(buffer, 0, ret);
                count += ret;
            }
        }
        // Number of bytes written.
        return count;
    }

The problem is that when I run this code, I get an HTML file, but the inline images are replaced by the broken-image sign, which indicates an image with no source.

Please help me out with this. Let me know if any more info is required.

I also tried saving the body as an .eml file by changing:

 File file = new File(folderPath + File.separator + "mail.html"); 

to

 File file = new File(folderPath + File.separator + "mail.eml");

But I got the same results.

Trellas answered 22/11, 2012 at 7:43 Comment(9)
"..by a sign for error image which indicates the image with no source." Are you sure the source is empty? Have you checked the HTML source? - Candescent
Check this HTML file: dropbox.com/s/5evk4pjq721yo8m/mail.html - Trellas
This HTML will show you that the image source is missing. - Trellas
The source isn't missing. It just isn't a location on the hard drive or on the web. It's a reference to another part in the multipart message: <img src="cid:ii_13b213e3157d833f" alt="Inline image 1"> - Candescent
Is there a way for a browser to render such a thing? Can a browser read the source from within the HTML, if my content stream is present within the HTML? If yes, what tag is needed to identify it? - Trellas
You could save the images somewhere and replace the references in the source with the real (relative) location. - Candescent
That's what I was thinking. But there is another point I want to know: how will we identify the replacement tag for multiple images? For two images abc.jpeg and xyz.jpeg, the HTML content is shown as [Inline: Image 1] and [Inline: Image 2]. How do we identify which part is for which image? - Trellas
Do you mind sharing your solution with us? Your question is almost 1 year old. - Sherfield
I have not got any solution to this yet. I later decided to let it go. I will be trying MrTux's solution to see if I can get it to work. - Trellas
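
On the question of which part belongs to which image: the identifier after `cid:` in the HTML corresponds to the `Content-ID` header of the image part, where the header value is normally wrapped in angle brackets. In JavaMail the raw header can be read with `part.getHeader("Content-ID")`. The following is only a minimal sketch of the matching step using plain strings; `CidMatcher` and its method names are hypothetical, and it assumes the identifiers need no URL-decoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: relates cid: references in HTML to Content-ID header values. */
final class CidMatcher {

    // Matches src="cid:..." or src='cid:...' and captures the identifier in group 2.
    private static final Pattern CID_SRC =
            Pattern.compile("src\\s*=\\s*([\"'])cid:([^\"']+)\\1", Pattern.CASE_INSENSITIVE);

    /** Trims whitespace and removes the surrounding angle brackets of a Content-ID value. */
    static String normalizeContentId(String headerValue) {
        String id = headerValue.trim();
        if (id.length() >= 2 && id.startsWith("<") && id.endsWith(">")) {
            id = id.substring(1, id.length() - 1);
        }
        return id;
    }

    /** Returns the identifiers of all cid: image references, in document order. */
    static List<String> findCidReferences(String html) {
        List<String> ids = new ArrayList<>();
        Matcher m = CID_SRC.matcher(html);
        while (m.find()) {
            ids.add(m.group(2));
        }
        return ids;
    }
}
```

A part is the source of a given `<img>` tag when `normalizeContentId` of its header equals one of the identifiers returned by `findCidReferences`, so the order or the file names of the images do not matter.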

I wrote the code below to convert the email body text to a PDF, including inline images. In the code I replace each image reference (e.g. cid:[email protected]) with the path of the downloaded image. I build a HashMap from each image's content ID to its download path while downloading the images.

    HTMLWorker htmlWorker = new HTMLWorker(document);
    if (bodyStr != null) {
        // Find the inline images and download them.
        inlineImages = downloadInLineImage(mostRecentMatch, dynamicOutputDirectory);
        if (inlineImages != null) {
            for (Map.Entry<String, String> entry : inlineImages.entrySet()) {
                // Use a literal replace: content IDs may contain regex metacharacters.
                bodyStr = bodyStr.replace("cid:" + entry.getKey(), entry.getValue());
            }
        }
        htmlWorker.parse(new StringReader(bodyStr));
    }

Download the inline images by passing the Item:

    private HashMap<String, String> downloadInLineImage(Item item, String dynamicOutputDirectory)
            throws Exception, ServiceLocalException {
        // Bind the item to an email message; without binding, getHasAttachments() will fail.
        EmailMessage mostRecentMatch = (EmailMessage) item;
        String from = mostRecentMatch.getFrom().getAddress();
        String user = StringUtils.substringBefore(from, "@");
        AttachmentCollection collection = item.getAttachments();

        // Maps each inline image's content ID to the path it was saved under.
        HashMap<String, String> inlineFiles = new HashMap<String, String>();

        if (collection.getCount() > 0) {
            for (Attachment attachment : collection.getItems()) {
                if (attachment.getIsInline()) {
                    FileAttachment currentFile = (FileAttachment) attachment;
                    String filePath = dynamicOutputDirectory + "/" + user + currentFile.getName();
                    // try-with-resources closes the stream even if load() throws.
                    try (FileOutputStream fio = new FileOutputStream(new File(filePath))) {
                        currentFile.load(fio);
                    }
                    inlineFiles.put(currentFile.getContentId(), filePath);
                }
            }
        }
        return inlineFiles;
    }
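
As a possible variation, the map values could be `data:` URIs instead of file paths, so that the HTML carries the image bytes itself and no separate image files are needed. Browsers render such URIs; whether a particular HTML-to-PDF converter does is not something this sketch verifies. It assumes the MIME type and the raw bytes of each inline attachment are already available, and `DataUriEmbedder` and `toDataUri` are hypothetical names.

```java
import java.util.Base64;

/** Hypothetical helper: turns image bytes into a data: URI usable as an img src value. */
final class DataUriEmbedder {

    /** Builds "data:<mimeType>;base64,<encoded bytes>" from the raw content. */
    static String toDataUri(String mimeType, byte[] content) {
        return "data:" + mimeType + ";base64," + Base64.getEncoder().encodeToString(content);
    }
}
```

The result would be stored in the map in place of the file path, for example `inlineFiles.put(contentId, DataUriEmbedder.toDataUri("image/png", bytes))`. The trade-off is a larger HTML string, since Base64 grows the data by roughly a third.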
Logography answered 11/3, 2016 at 7:1 Comment(0)

Inline images are referenced by cid: URLs such as <img src="cid:SOMEID">, because an email has no file names that the HTML could point to. SOMEID refers to the Content-ID header of another part of the multipart message.

To get this to work, you have to save those parts to files (e.g., under temporary names) and replace the cid: URLs with the real file names.
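
The replacement step can be sketched roughly as follows. This is not a complete mail parser: it assumes the parts have already been saved and that a map from Content-ID (without angle brackets) to the saved file path is available. `CidRewriter`, `rewrite`, and `savedFiles` are hypothetical names.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: swaps cid: references in HTML for saved file paths. */
final class CidRewriter {

    // Captures the identifier after "cid:" up to a quote, whitespace, or '>'.
    private static final Pattern CID_REF = Pattern.compile("cid:([^\"'\\s>]+)");

    /**
     * Replaces every cid:ID whose ID is a key of savedFiles with the mapped path.
     * References with unknown IDs are left unchanged.
     */
    static String rewrite(String html, Map<String, String> savedFiles) {
        Matcher m = CID_REF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String path = savedFiles.get(m.group(1));
            String replacement = (path != null) ? path : m.group();
            // quoteReplacement keeps '$' and '\' in paths from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With a map entry `SOMEID -> images/logo.png`, the tag `<img src="cid:SOMEID">` becomes `<img src="images/logo.png">`. Relative paths keep the saved HTML portable as long as the images are stored next to it.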

Pulmotor answered 4/8, 2014 at 17:55 Comment(2)
How is this possible? I have the same problem. I have downloaded the inline image successfully, but I am unable to replace the src cid: reference with my original path. Please give me a solution if possible, as soon as possible. - Lugo
Did not work for me either. Still looking for a way out. - Trellas
