How to extract only HTML from imap_body result

3

7

I want to extract only the HTML content from an imap_body result. imap_body returns a verbatim copy of the mail body.

Genovera answered 25/8, 2014 at 17:18 Comment(2)
Possible duplicate of Extract body text from Email PHP - Lh
imap_fetchbody($inbox, $number, 2); worked for me - Castrate
20

I found a solution. The functions below are methods of a class, hence the $this-> calls:

function getBody($uid, $imap)
{
    $body = $this->get_part($imap, $uid, "TEXT/HTML");
    // if HTML body is empty, try getting text body
    if ($body == "") {
        $body = $this->get_part($imap, $uid, "TEXT/PLAIN");
    }
    return $body;
}

function get_part($imap, $uid, $mimetype, $structure = false, $partNumber = false)
{
    if (!$structure) {
        $structure = imap_fetchstructure($imap, $uid, FT_UID);
    }
    if ($structure) {
        if ($mimetype == $this->get_mime_type($structure)) {
            if (!$partNumber) {
                $partNumber = 1;
            }
            $text = imap_fetchbody($imap, $uid, $partNumber, FT_UID);
            switch ($structure->encoding) {
                case 3:
                    return imap_base64($text);
                case 4:
                    return imap_qprint($text);
                default:
                    return $text;
            }
        }

        // multipart
        if ($structure->type == 1) {
            foreach ($structure->parts as $index => $subStruct) {
                $prefix = "";
                if ($partNumber) {
                    $prefix = $partNumber . ".";
                }
                $data = $this->get_part($imap, $uid, $mimetype, $subStruct, $prefix . ($index + 1));
                if ($data) {
                    return $data;
                }
            }
        }
    }
    return false;
}

function get_mime_type($structure)
{
    $primaryMimetype = ["TEXT", "MULTIPART", "MESSAGE", "APPLICATION", "AUDIO", "IMAGE", "VIDEO", "OTHER"];

    if ($structure->subtype) {
        return $primaryMimetype[(int)$structure->type] . "/" . $structure->subtype;
    }
    return "TEXT/PLAIN";
}
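
The encoding switch in get_part can be tried without a mail server. The sketch below is an illustration, not part of the original answer: decodePart is a hypothetical helper that uses PHP's core base64_decode and quoted_printable_decode in place of imap_base64 and imap_qprint, assuming the encoding values 3 (base64) and 4 (quoted-printable) that imap_fetchstructure reports.

```php
<?php
// Hypothetical helper: decode a raw body part according to the numeric
// `encoding` field of the part's structure object.
function decodePart(string $raw, int $encoding): string
{
    switch ($encoding) {
        case 3: // base64
            $decoded = base64_decode($raw, false);
            // base64_decode returns false on failure; fall back to the raw text.
            return $decoded === false ? $raw : $decoded;
        case 4: // quoted-printable
            return quoted_printable_decode($raw);
        default: // 7bit, 8bit, binary, other: returned unchanged
            return $raw;
    }
}

// Sample inputs standing in for what imap_fetchbody would return.
$fromBase64 = decodePart(base64_encode('<p>Hi</p>'), 3);
$fromQuotedPrintable = decodePart('caf=C3=A9', 4);
$untouched = decodePart('plain text', 0);
```

This only undoes the transfer encoding. Converting the result from the part's charset to UTF-8 is a separate step that neither this sketch nor the answer's code performs.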
Genovera answered 26/8, 2014 at 14:1 Comment(2)
great solution! - Icehouse
gave the HTML output perfectly - Nason
8

http://php.net/manual/en/function.imap-fetchbody.php

Parameter 3, "the section" is as follows:

The part number. It is a string of integers delimited by period which index into a body part list as per the IMAP4 specification

(empty) - Entire message
0 - Message header
1 - MULTIPART/ALTERNATIVE
1.1 - TEXT/PLAIN
1.2 - TEXT/HTML
2 - file.ext

Therefore, for a message with this structure, you grab the HTML part by passing 1.2 as the third parameter, like so:

$message = imap_fetchbody($inbox, $number, 1.2);
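
Because the section numbers depend on how each message is built, it can help to print the section map before hard-coding a value such as 1.2. The following is a rough sketch, not the manual's method: listSections is a hypothetical helper, and the structure is a hand-built stand-in for what imap_fetchstructure($inbox, $number) would return. It assumes plain nested multiparts and does not handle the special numbering of embedded MESSAGE/RFC822 parts.

```php
<?php
// Primary MIME types, indexed by the numeric `type` field of a structure object.
const PRIMARY_TYPES = ['TEXT', 'MULTIPART', 'MESSAGE', 'APPLICATION', 'AUDIO', 'IMAGE', 'VIDEO', 'OTHER'];

// Hypothetical helper: map each section number to its MIME type.
function listSections(object $structure, string $prefix = ''): array
{
    $mime = PRIMARY_TYPES[(int) $structure->type] . '/' . strtoupper($structure->subtype ?? 'PLAIN');
    $isMultipart = (int) $structure->type === 1;
    $sections = [];

    // A multipart root is the whole message and has no section number of its own;
    // a single-part message is addressed as section "1".
    if ($prefix !== '' || !$isMultipart) {
        $sections[$prefix === '' ? '1' : $prefix] = $mime;
    }
    if ($isMultipart) {
        foreach ($structure->parts ?? [] as $index => $part) {
            $child = ($prefix === '' ? '' : $prefix . '.') . ($index + 1);
            $sections += listSections($part, $child);
        }
    }
    return $sections;
}

// Hand-built stand-in for an imap_fetchstructure() result (assumption for illustration).
$structure = (object) ['type' => 1, 'subtype' => 'MIXED', 'parts' => [
    (object) ['type' => 1, 'subtype' => 'ALTERNATIVE', 'parts' => [
        (object) ['type' => 0, 'subtype' => 'PLAIN'],
        (object) ['type' => 0, 'subtype' => 'HTML'],
    ]],
    (object) ['type' => 3, 'subtype' => 'PDF'],
]];

$sections = listSections($structure);
// Look up the section that holds the HTML body, if there is one.
$htmlSection = array_search('TEXT/HTML', $sections, true);
```

PHP stores keys such as "1" and "2" as integers, so cast a key with (string) before passing it to imap_fetchbody.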
Belomancy answered 25/8, 2014 at 17:25 Comment(4)
This only applies /if/ the message follows this structure. Many emails do not, and they will not follow this structure if they include attachments. The best method is to parse the bodystructure to find the HTML section you want. - Mcvay
Thank you for your answer. Unfortunately, I tried this method but it didn't work. - Genovera
@Mcvay can you help me with a method for parsing the body structure? - Genovera
imap_fetchbody($inbox, $number, 2); worked for me - Castrate
4

I don't have enough reputation to add a comment, but I just wanted to clarify in @GunniH's answer that your call to the function should look like this:

$message = imap_fetchbody($inbox, $number, '1.2');

instead of this

$message = imap_fetchbody($inbox, $number, 1.2);

That final argument should be a string, not an int.
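
One concrete reason to prefer the string form, sketched here as an illustration rather than taken from the answer: a numeric literal is converted to a string before it is used as the section, and that conversion drops trailing zeros. The variable names are made up for the example.

```php
<?php
// What a float literal turns into when PHP converts it to a string.
$secondSubpart = (string) 1.2;   // "1.2", same as the string literal
$tenthSubpart  = (string) 1.10;  // "1.1", so the tenth subpart cannot be written as a float

// A string literal keeps the section exactly as written.
$tenthAsString = '1.10';
```

In a file that declares strict_types=1, passing a float where the function expects a string should raise a TypeError instead of converting, which is another reason to quote the section.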

Slushy answered 3/3, 2016 at 4:21 Comment(2)
1.2 is not an int but a float. But both inputs should work. - Lh
imap_fetchbody($inbox, $number, 2); worked for me - Castrate
