Know if email was a reply using IMAP in PHP

I am not sure whether this is in the headers or not, but I am looking for a way to tell if an email I receive is a response to an email I sent, and if so, to grab only the new text and not the quoted text.

A little background: I am creating a script that will send out emails automatically. I am creating a cron job to run at periodic intervals to check to see if there were any replies. If there were replies, I only want to grab the new stuff, and not the old stuff.

In the past, I would send out emails with the ID in the subject (You have a new response [1234]), and would then check the subject for whatever sits between the [ and ]. Then I would grab the whole message and store it, since every webmail service and email client uses a different character or style for quoted text. Some use ">", some use a horizontal rule, and some do absolutely nothing.
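The subject-based lookup described above could be sketched roughly as follows. The function name `extractTicketId` is made up for illustration, and the sketch assumes the subject has already been decoded to plain text (subjects fetched over IMAP may arrive MIME-encoded).

```php
<?php
// Hypothetical helper: pull the numeric ID out of a subject such as
// "Re: You have a new response [1234]". Returns null when no ID is found.
function extractTicketId(string $subject): ?int
{
    // Take the first run of digits wrapped in square brackets.
    if (preg_match('/\[(\d+)\]/', $subject, $matches) === 1) {
        return (int) $matches[1];
    }
    return null;
}
```

This only works while the recipient leaves the bracketed ID in the subject line.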

Anyways, I am just looking for something in the email header that would indicate they are replying and what the new text might be. If it's not possible, I will just keep on doing what I am doing.

Sensitize asked 28/11, 2011 at 3:54

Unfortunately, e-mail clients can essentially do whatever they want with your message, and there is no reliable standard for determining how a received message originated at the client. In addition, IMAP doesn't really have anything to do with it. E-mails can be sent a number of different ways, including webmail.

The best you can do is look for an ID number in the subject line (assuming folks don't change it, which they rarely do). You can also do what Google does: fuzzy match the reply text against the e-mail you sent to that address. Whatever matches is treated as quoted text rather than new content. This takes great effort though.
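As a very rough sketch of the matching idea, the function below drops every line of the reply that also appears in the message that was sent. It uses exact line comparison (after stripping ">" markers) in place of real fuzzy matching, so it is a simplification, and `extractNewText` is a hypothetical name.

```php
<?php
// Hypothetical helper: keep only the lines of a reply that do not appear
// in the body we originally sent. A heuristic, not a standard.
function extractNewText(string $replyBody, string $sentBody): string
{
    // Normalise a line: drop leading ">" quote markers and outer whitespace.
    $normalise = static function (string $line): string {
        return trim((string) preg_replace('/^(\s*>)+/', '', $line));
    };

    // Index the non-empty lines of the sent message for quick lookup.
    $sentLines = [];
    foreach (preg_split('/\R/', $sentBody) as $line) {
        $key = $normalise($line);
        if ($key !== '') {
            $sentLines[$key] = true;
        }
    }

    $kept = [];
    foreach (preg_split('/\R/', $replyBody) as $line) {
        $key = $normalise($line);
        // Skip lines that are copies of something we sent.
        if ($key !== '' && isset($sentLines[$key])) {
            continue;
        }
        $kept[] = $line;
    }

    return trim(implode("\n", $kept));
}
```

Attribution lines such as "On ... wrote:" and quoted text that the client re-wrapped will still slip through, which is part of why this takes real effort to do well.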

Tiebout answered 28/11, 2011 at 3:58 Comment(4)

Yeah, this is what I was thinking, especially when it comes to checking for a reply. It will be hard to do since every webmail/email client is different and sends data differently. Thanks! – Sensitize 28/11, 2011 at 3:59

There's also sometimes a header. The header will read in-reply-to or something similar and will contain the message ID of the original message. – Cohn 28/11, 2011 at 12:23

@SteveSmith, It isn't required that the client send it. You will often see it, but not always. – Tiebout 28/11, 2011 at 14:8

@Tiebout absolutely, that's why I said sometimes but if you add all these together you can hopefully make something a little more robust, just in case the client edits the subject line. – Cohn 28/11, 2011 at 21:53

You can tell whether an email is a reply to another email by using the In-Reply-To and References headers together.

Every email has a unique ID in its header called Message-ID. According to the mail format RFC (currently RFC 5322), these headers let you track the ancestors of any email.

I have checked this, and it works in the clients I tested (Outlook, Thunderbird). Here is an example.

1- In the header of the first email you send, you (your mail server or your code) set an ID (Message-ID). If you open the source of the email, you will see it near the top, like this:

... // You (your code) send:
Message-ID: <[email protected]>    // ID of mail(1)
...

You just need to keep this Message-ID in your program. Any subsequent reply will refer to this ID.
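One way to produce such an ID in your own code is sketched below. The helper name `generateMessageId` and the domain `example.com` are assumptions for illustration; how the header is handed to your mailer and where the ID is stored depend on your setup.

```php
<?php
// Hypothetical helper: build a unique Message-ID for an outgoing mail.
// $domain is an assumption; use a domain you control.
function generateMessageId(string $domain): string
{
    // 16 random bytes, hex-encoded, form the unique left-hand side.
    return sprintf('<%s@%s>', bin2hex(random_bytes(16)), $domain);
}

$messageId = generateMessageId('example.com');

// Keep $messageId next to the ticket or record it belongs to (storage is
// not shown), then add this header to the outgoing mail.
$headers = ['Message-ID' => $messageId];
```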

2- The client replies to mail(1). In addition to its own Message-ID, the reply carries a crucial header, In-Reply-To, which tells you which email it is replying to.

... // Client (Thunderbird) sends:
Message-ID: <[email protected]>    // ID of mail(2)
In-Reply-To: <[email protected]>    // ID of mail(1)
...

When you receive the second email, it is easy to find the email you sent earlier, because the ID of mail(1) is in the In-Reply-To header of mail(2).
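The lookup could be sketched like this, working on the raw header block of the received mail (for instance the string returned by `imap_fetchheader`, assuming the IMAP extension is in use). The function names and the `$sentIds` list of stored Message-IDs are hypothetical. As a fallback, the sketch also checks the References header.

```php
<?php
// Hypothetical helper: read one header field from a raw header block.
function getHeaderValue(string $rawHeaders, string $name): ?string
{
    // Unfold continuation lines (a line break followed by a space or tab).
    $unfolded = (string) preg_replace('/\r?\n[ \t]+/', ' ', $rawHeaders);
    $pattern = '/^' . preg_quote($name, '/') . ':[ \t]*(.*)$/mi';
    if (preg_match($pattern, $unfolded, $matches) === 1) {
        return trim($matches[1]);
    }
    return null;
}

// Hypothetical helper: return the stored Message-ID this mail replies to,
// or null when none of the referenced IDs is one we sent.
function findRepliedToId(string $rawHeaders, array $sentIds): ?string
{
    foreach (['In-Reply-To', 'References'] as $field) {
        $value = getHeaderValue($rawHeaders, $field);
        if ($value === null) {
            continue;
        }
        // Collect every <id> token in the header value.
        preg_match_all('/<[^<>\s]+>/', $value, $matches);
        foreach ($matches[0] as $id) {
            if (in_array($id, $sentIds, true)) {
                return $id;
            }
        }
    }
    return null;
}
```

A null result does not prove the mail is not a reply, since clients are not required to send these headers.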

3- If you want to reply to this email from your code, put the Message-ID of mail(2) in the In-Reply-To header, and the Message-IDs of mail(1) and mail(2) in the References header, so the client will understand the chain correctly.

... // You (your code) send:
Message-ID: <[email protected]>    // ID of mail(3)
In-Reply-To: <[email protected]>    // ID of mail(2)
References: <[email protected]> <[email protected]>    // IDs of mail(1) and mail(2)
...

With these headers, you are telling the client that this email is a reply to mail(2) and that its ancestors are mail(1) and mail(2).
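Building the headers for step 3 might look like the sketch below. The function name `buildReplyHeaders` and the IDs in the usage example are hypothetical; `$ancestorIds` is assumed to be the stored chain of earlier Message-IDs, oldest first.

```php
<?php
// Hypothetical helper: build threading headers for a reply sent from code.
// $newId       Message-ID of the mail being sent now, e.g. mail(3)
// $parentId    Message-ID of the mail being replied to, e.g. mail(2)
// $ancestorIds Message-IDs that came before the parent, oldest first
function buildReplyHeaders(string $newId, string $parentId, array $ancestorIds): array
{
    // References lists the whole chain: the ancestors, then the parent.
    $references = implode(' ', array_merge($ancestorIds, [$parentId]));

    return [
        'Message-ID'  => $newId,
        'In-Reply-To' => $parentId,
        'References'  => $references,
    ];
}

// Usage with made-up IDs standing in for mail(1), mail(2) and mail(3).
$replyHeaders = buildReplyHeaders(
    '<mail-3@example.com>',
    '<mail-2@client.example>',
    ['<mail-1@example.com>']
);
```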

I have worked with these headers and read about them, and this approach works. My remaining problem is getting only the text of the latest email and not the quoted text from earlier replies. (We run our own ticketing system and create a comment for each email.)

Lentissimo answered 29/1, 2017 at 12:45 Comment(7)

The length of the 'In-Reply-To' header doesn't have a definitive max-length, so when I store it I just store a sha256 hash of it (32 bytes). I don't really care to parse it out - I just need to do a lookup to see if it matches an email I've sent. – Eurypterid 15/6, 2017 at 23:57

I did not follow. What is the problem? Whether you store it in plain text or as a hash, you can still compare it. If you store every Message-ID you send, then when you receive an email you just compare the two plain-text values, or compare the hashes of both with an appropriate method (maybe you are asking how to compare two hashes?). – Lentissimo 17/6, 2017 at 5:51

I just did that because there doesn't seem to be an official limit on the length of the token. In theory it could be about 995 characters long. I'm sure on average it is much less but I decided to just use a hash right away – Eurypterid 17/6, 2017 at 5:53

If it worked for you, it is ok. In our case, we just use the plain mode and it was good for us. by the way it was interesting to know your way. – Lentissimo 17/6, 2017 at 6:30

i didn't see any upvotes to this answer, but the answer seems pretty legit – Blesbok 25/7, 2017 at 17:29

I tried to help the community; this website has helped me a lot so far. I do not need upvotes. I just hope not to mislead anyone with a partial answer. – Lentissimo 13/1, 2020 at 23:31

I tested this approach in gmail as well and it worked. But is it possible this could break depending on the mail client? For example if there are some clients other than outlook, gmail etc. that do not follow this convention, or if they set their own message-id for some reason (ex. gmail does this if it determines the existing message id doesn't follow RFC standards)? – Vassal 1/6, 2022 at 4:18
