Recommendations on parsing .eml files in C#

8

48

I have a directory of .eml files that contain email conversations. Is there a recommended approach in C# of parsing files of this type?

Cancel answered 1/6, 2009 at 19:44 Comment(0)
S
68

Added August 2017: Check out MimeKit: https://github.com/jstedfast/MimeKit. It supports .NET Standard, so it runs cross-platform.
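As a minimal sketch of what the MimeKit route might look like for a directory of .eml files: the class and method names below (`EmlDirectoryReader`, `PrintSummaries`) are hypothetical, and the code assumes the MimeKit NuGet package is installed.

```csharp
// Assumption: the MimeKit NuGet package is referenced (dotnet add package MimeKit).
using System;
using System.IO;
using MimeKit;

static class EmlDirectoryReader
{
    // Hypothetical helper: prints a short summary of every .eml file in a directory.
    public static void PrintSummaries(string directory)
    {
        foreach (string path in Directory.EnumerateFiles(directory, "*.eml"))
        {
            // MimeMessage.Load parses the whole message, including nested MIME parts.
            MimeMessage message = MimeMessage.Load(path);

            Console.WriteLine($"{Path.GetFileName(path)}: {message.Subject}");
            Console.WriteLine($"  From: {message.From}");
            Console.WriteLine($"  To:   {message.To}");

            // TextBody is null when the message has no plain-text body part.
            Console.WriteLine($"  Text length: {message.TextBody?.Length ?? 0}");

            foreach (MimeEntity attachment in message.Attachments)
            {
                // Only MimePart attachments carry a FileName; an attached message
                // (MessagePart) is labelled generically here.
                string name = attachment is MimePart part ? part.FileName : "(attached message)";
                Console.WriteLine($"  Attachment: {name}");
            }
        }
    }
}
```

Calling `EmlDirectoryReader.PrintSummaries(@"C:\mail")` (the path is only an example) would walk the directory once and load each message fully into memory, which is usually fine for ordinary mail but worth reconsidering for very large files.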

Original answer: I posted a sample project on GitHub to illustrate this answer.

The CDO COM DLL is part of Windows/IIS and can be referenced from .NET. It provides accurate parsing and a nice object model. Use it in conjunction with a reference to ADODB.DLL.

public CDO.Message LoadEmlFromFile(String emlFileName)
{
    CDO.Message msg = new CDO.MessageClass();
    ADODB.Stream stream = new ADODB.StreamClass();

    stream.Open(Type.Missing, ADODB.ConnectModeEnum.adModeUnknown, ADODB.StreamOpenOptionsEnum.adOpenStreamUnspecified, String.Empty, String.Empty);
    stream.LoadFromFile(emlFileName);
    stream.Flush();
    msg.DataSource.OpenObject(stream, "_Stream");
    msg.DataSource.Save();

    stream.Close();
    return msg;
}
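The comments on this answer report two adjustments: constructing the objects as `new CDO.Message()` and `new ADODB.Stream()` to avoid the "no constructors defined" compile error, and setting the stream type to binary before calling `Open` so that the first header line is read correctly. A sketch combining both follows. It is untested here, the wrapper class name `CdoEmlLoader` is hypothetical, and it assumes a COM reference to "Microsoft CDO for Windows 2000 Library" on Windows.

```csharp
// Assumption: COM references to CDO and ADODB are present (Windows only).
using System;

static class CdoEmlLoader
{
    public static CDO.Message LoadEmlFromFile(string emlFileName)
    {
        // Interface-style construction, as suggested in the comments.
        CDO.Message msg = new CDO.Message();
        ADODB.Stream stream = new ADODB.Stream();

        // Per the comments: set binary mode before Open so that no extra leading
        // bytes hide the first header line from CDO.
        stream.Type = ADODB.StreamTypeEnum.adTypeBinary;
        stream.Open(Type.Missing,
                    ADODB.ConnectModeEnum.adModeUnknown,
                    ADODB.StreamOpenOptionsEnum.adOpenStreamUnspecified,
                    String.Empty,
                    String.Empty);
        stream.LoadFromFile(emlFileName);
        stream.Flush();

        // Bind the message to the loaded stream, exactly as in the method above.
        msg.DataSource.OpenObject(stream, "_Stream");
        msg.DataSource.Save();

        stream.Close();
        return msg;
    }
}
```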
Shadrach answered 24/7, 2010 at 13:20 Comment(10)
Ries, I searched for a solution the whole day and found many parsers and .NET libraries that only partly work. Your suggested Windows library works 100%. This answer should be in first place, above the others. - Gean
Have you been able to use this from a 64-bit app? Where did you reference the DLL? - Tabber
I was able to use this solution on Windows 2008 Standard R2; unfortunately, it does not work on Windows 2008 Standard (not R2). It also gives many compatibility issues on Windows 2008 or Windows 7. - Gean
The reference to the DLL can be added from the "COM" tab inside the "Add reference" dialog. It is listed under "Microsoft CDO for Windows 2000 Library". As Ries said, it's included with IIS. - Bleeding
I get compilation errors: The type 'CDO.MessageClass' has no constructors defined ... and similarly for 'ADODB.StreamClass'. Any idea? - Godmother
FYI, it looks like the ADODB Stream class adds 2 bytes to the beginning of the stream for encoding, which prevents the CDO Message class from successfully reading the first header line in the file. To prevent this, try setting the Stream.Type property to StreamTypeEnum.adTypeBinary prior to calling Open. - Somniferous
I gave up and used something else, but: can someone who successfully used this edit this answer to fix some of the glaring omissions? There's a return statement but no method, vital bits of info that only appear in these comments, etc... - Noranorah
@MCOwen see the new sample project on GitHub: github.com/riesvriend/CDO-EML-Parsing-Sample - Shadrach
Thanks, it works on VS2017/.NET 4.5/Win10. You can copy the ADODB and CDO from obj into a separate folder and add references (with Copy Local set to true) from there to bypass the compile error cannot create...CDOWrapper. It seems not to parse the "From" address if it's the first line in the mail file. Add a blank line as the first line to work around this issue. - Siana
In addition, change to CDO.Message msg = new CDO.Message(); //from => new CDO.MessageClass(); and ADODB.Stream stream = new ADODB.Stream(); //from => new ADODB.StreamClass(); in LoadMessage() - Siana
S
12

Follow this link for a good solution:

The summary of the article is 4 steps (the second step below is missing from the article but is needed):

  1. Add a reference to "Microsoft CDO for Windows 2000 Library", which can be found on the ‘COM’ tab in the Visual Studio ‘Add reference’ dialog. This will add 2 references to "ADODB" and "CDO" in your project.

  2. Disable embedding of interop types for the 2 references "ADODB" and "CDO". (References -> ADODB -> Properties -> set 'Embed Interop Types' to False, and repeat the same for CDO.)

  3. Add the following method in your code:

    protected CDO.Message ReadMessage(String emlFileName)
    {
        CDO.Message msg = new CDO.MessageClass();
        ADODB.Stream stream = new ADODB.StreamClass();

        // Open an empty stream, then load the raw .eml bytes into it.
        stream.Open(Type.Missing,
                    ADODB.ConnectModeEnum.adModeUnknown,
                    ADODB.StreamOpenOptionsEnum.adOpenStreamUnspecified,
                    String.Empty,
                    String.Empty);
        stream.LoadFromFile(emlFileName);
        stream.Flush();

        // Bind the message to the stream so that CDO parses its contents.
        msg.DataSource.OpenObject(stream, "_Stream");
        msg.DataSource.Save();

        // Release the stream once the message has been populated.
        stream.Close();
        return msg;
    }
    
  4. Call this method, passing the full path of your .eml file. The CDO.Message object it returns has all the parsed info you need, including To, From, Subject, and Body.
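Step 4 might look roughly like the sketch below, which also saves any attachments. It is untested and assumes the same COM references as the steps above. The names `CdoMessageDump` and `PrintAndSaveAttachments` are hypothetical, while `From`, `To`, `Subject`, `TextBody`, `Attachments`, `FileName` and `SaveToFile` are members of the CDO object model.

```csharp
// Assumption: COM references to CDO and ADODB are present (Windows only).
using System;
using System.IO;

static class CdoMessageDump
{
    // msg is assumed to come from a loader such as ReadMessage in step 3.
    public static void PrintAndSaveAttachments(CDO.Message msg, string outputDirectory)
    {
        Console.WriteLine("From:    " + msg.From);
        Console.WriteLine("To:      " + msg.To);
        Console.WriteLine("Subject: " + msg.Subject);
        Console.WriteLine("Body:    " + msg.TextBody);

        foreach (CDO.IBodyPart attachment in msg.Attachments)
        {
            // Keep only the file name part so that a crafted attachment name
            // cannot write outside outputDirectory.
            string safeName = Path.GetFileName(attachment.FileName);
            attachment.SaveToFile(Path.Combine(outputDirectory, safeName));
        }
    }
}
```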

Schear answered 26/9, 2013 at 17:31 Comment(1)
Can this also parse the attachments included in the EML? - Gooseflesh
A
11

LumiSoft includes a MIME parser.

Sasa includes a MIME parser as well.

Antiar answered 15/5, 2010 at 1:30 Comment(2)
Reference usage, in case anyone is looking: github.com/fschwiet/ManyConsole/blob/master/SampleConsole/… - Tabber
It's now at github.com/fschwiet/ManyConsole/blob/master/SampleConsole/… - Rosalbarosalee
S
4

What you probably need is an email/MIME parser. Parsing all the header fields is not very hard, but separating out the various MIME types (images, attachments, text and HTML parts, etc.) can get very complex.
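To illustrate why the header part is the easy part, here is a minimal sketch using only the base class library. It reads header fields up to the first empty line and joins folded continuation lines (lines starting with a space or tab) onto the previous field. The names `HeaderSketch` and `ParseHeaders` are hypothetical. The sketch joins folded lines with a single space, which is an approximation, and it does not decode encoded words or handle the body at all, which is where the real complexity lies.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class HeaderSketch
{
    // Returns header fields in order of appearance. A list is used rather than
    // a dictionary because fields such as Received may occur several times.
    public static List<KeyValuePair<string, string>> ParseHeaders(string rawMessage)
    {
        var headers = new List<KeyValuePair<string, string>>();
        using (var reader = new StringReader(rawMessage))
        {
            string line;
            // The first empty line separates the headers from the body.
            while ((line = reader.ReadLine()) != null && line.Length > 0)
            {
                if ((line[0] == ' ' || line[0] == '\t') && headers.Count > 0)
                {
                    // Folded line: append it to the previous field's value.
                    var last = headers[headers.Count - 1];
                    headers[headers.Count - 1] = new KeyValuePair<string, string>(
                        last.Key, last.Value + " " + line.Trim());
                    continue;
                }

                int colon = line.IndexOf(':');
                if (colon <= 0) continue; // not a "Name: value" line; skip it

                headers.Add(new KeyValuePair<string, string>(
                    line.Substring(0, colon).Trim(),
                    line.Substring(colon + 1).Trim()));
            }
        }
        return headers;
    }
}
```

For a single file, `HeaderSketch.ParseHeaders(File.ReadAllText(path))` would be enough to list subjects and senders; anything involving attachments or multipart bodies is better left to a full MIME library.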

We use a third-party tool, but there are many C# tools/libraries out there. Search Google for "free C# email MIME parser". For example, I found these:

http://www.codeproject.com/Articles/11882/Advanced-MIME-Parser-Creator-Editor
http://www.lumisoft.ee/lswww/download/downloads/Net/info.txt

Springe answered 1/6, 2009 at 19:53 Comment(0)
T
3

Getting a decent MIME parser would probably be the way to go. You may try a free MIME parser (such as this one from codeproject), but comments from the code's author like this one

I worked on this at about the same time that I worked on a wrapper class for MSG files. Big difference in difficulty. Where the EML wrapper class maybe took a day to read the spec and get right, the MSG wrapper class took a week.

made me curious about the code quality. I'm sure that you can hack together a MIME parser that parses 95% of email correctly in a few hours or days. I'm also sure that getting the remaining 5% right will take months. Consider handling S/MIME (encrypted and signed email), Unicode, malformed emails produced by misbehaving mail clients and servers, several encoding schemes, internationalization issues, making sure that intentionally malformed emails will not crash your app, etc...

If the emails you need to parse come from a single source, a quick & dirty parser may be enough. If you need to parse emails from the wild, a better solution may be needed.

I would recommend our Rebex Secure Mail component, but I'm sure that you would get decent results with components from other vendors as well.

Make sure that the parser of your choice works correctly on the infamous "Mime Torture Sample message" prepared by Mark Crispin (co-author of MIME and IMAP RFCs). The testing message is displayed in the MIME Explorer sample and can be downloaded in the installation package.

The following code shows how to read and parse an EML file:

using Rebex.Mail;

MailMessage message = new MailMessage();
message.Load("file.eml");
Tutuila answered 27/1, 2010 at 18:33 Comment(1)
+1 for "I'm also sure that getting right the remaining 5% will take months" - Hassock
R
3

I just started using the Mime-part of Papercut for this. It seems decent and simple at first sight.

    public void ProcessRawContents(string raw)
    {
        // NB: empty lines may be relevant for interpretation and for content !!
        var lRawLines = raw.Split(new []{"\r\n"}, StringSplitOptions.None);
        var lMailReader = new MimeReader(lRawLines);
        var lMimeEntity = lMailReader.CreateMimeEntity();
        MailMessageEx Email = lMimeEntity.ToMailMessageEx();
        // ...
    }

(MailMessageEx is, of course, derived from MailMessage.)

Redound answered 31/5, 2013 at 12:53 Comment(0)
S
2

Try:

  • febootimail
  • SmtpExpress
  • LinkWS Newsletter Turbo
  • emlBridge - importing eml files into Outlook and virtually any other email client
  • Newsletter 2.1 Turbo
  • ThunderStor (emlResender)
  • Ruby (using eml2mbox). See jimbob method.
  • Evolution - create new message, attach the eml file,

Write a program:

Workarounds:

  • $ cat mail.eml | mail -s -c (but the headers won't be parsed, and neither will the attachments)
  • drop them into your GMail (Firefox will save them as attachments)
Spousal answered 13/12, 2010 at 13:20 Comment(1)
The question specifically stated C# .NET, yet the entries in your Try section point to non-C# solutions. - Balkin
R
2

Aspose.Email for .NET

Aspose.Email for .NET is a collection of components for working with emails from within your .NET applications. It makes it easy to work with a number of email message formats and message storage files (PST/OST) along with message sending and receiving capabilities.

Aspose.Email makes it easy to create, read and manipulate a number of message formats such as MSG, EML, EMLX and MHT files without the need to install Microsoft Outlook. You can not only change the message contents, but also manipulate (add, extract and remove) attachments from a message object. You can customize message headers by adding or removing recipients, changing the subject or other properties. It also gives you complete control over an email message by providing access to its MAPI properties.

C# Outlook MSG file reader without the need for Outlook

MSGReader is a C# .NET 4.0 library to read Outlook MSG and EML (MIME 1.0) files. Almost all common objects in Outlook are supported.

Rizika answered 9/4, 2015 at 13:23 Comment(0)
