Getting CDATA content while parsing an XML file

I have an XML file:

<?xml version="1.0" encoding="utf-8"?>
<xml>
    <events date="01-10-2009" color="0x99CC00" selected="true"> 
       <event>
            <title>You can use HTML and CSS</title>
            <description><![CDATA[This is the description ]]></description>
        </event>
    </events>
</xml>

I used XPath and XQuery to parse the XML:

$xml_str = file_get_contents('xmlfile');
$xml = simplexml_load_string($xml_str);
if(!empty($xml))
{
    $nodes = $xml->xpath('//xml/events');
}

I am getting the title properly, but I am not getting the description. How can I get the data inside the CDATA section?

Yarndyed answered 6/9, 2010 at 9:5 Comment(0)

SimpleXML has a bit of a problem with CDATA, so use:

$xml = simplexml_load_file('xmlfile', 'SimpleXMLElement', LIBXML_NOCDATA);
if(!empty($xml))
{
    $nodes = $xml->xpath('//xml/events');
}
print_r( $nodes );

This will give you:

Array
(
    [0] => SimpleXMLElement Object
        (
            [@attributes] => Array
                (
                    [date] => 01-10-2009
                    [color] => 0x99CC00
                    [selected] => true
                )

            [event] => SimpleXMLElement Object
                (
                    [title] => You can use HTML and CSS
                    [description] => This is the description 
                )

        )

)
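As a rough check of what the flag actually changes, the sketch below loads the same kind of document with and without LIBXML_NOCDATA and compares the description once it is cast to a string. The inline sample and the variable names are illustrative only, not part of the original code.

```php
<?php
// Trimmed-down version of the question's document (illustrative sample).
$xml_str = '<xml><events><event>'
    . '<title>You can use HTML and CSS</title>'
    . '<description><![CDATA[This is the description ]]></description>'
    . '</event></events></xml>';

// Default load: the CDATA section is kept as a CDATA node.
$plain = simplexml_load_string($xml_str);

// LIBXML_NOCDATA: CDATA sections are merged into ordinary text nodes.
$merged = simplexml_load_string($xml_str, 'SimpleXMLElement', LIBXML_NOCDATA);

$a = (string) $plain->events->event->description;
$b = (string) $merged->events->event->description;

// The string value is the same either way.
var_dump($a === $b);
```

In other words, the flag mainly affects how the object looks when dumped, not what a string cast returns.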
Gourami answered 6/9, 2010 at 10:31 Comment(1)
Wrong! SimpleXML has no problem with CDATA, and this is a persistent myth which should not be perpetuated. It is only print_r which cannot see the CDATA, because SimpleXML does not actually store its data as a "real" PHP object; it just coughs it up on demand. - Greaseball

You are probably being misled into thinking that the CDATA is missing by using print_r or one of the other "normal" PHP debugging functions. These cannot see the full content of a SimpleXML object, as it is not a "real" PHP object.

If you run echo $nodes[0]->event->description, you'll find your CDATA comes out fine. (Element names are case-sensitive, and description is a child of event in the XML above.) What's happening is that PHP knows that echo expects a string, so it asks SimpleXML for one; SimpleXML responds with all the string content, including CDATA.

To get at the full string content reliably, simply tell PHP that what you want is a string using the (string) cast operator, e.g. $description = (string)$nodes[0]->event->description.
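A minimal sketch of the cast in action, using the document from the question as an inline string (the nowdoc and the variable names are only there to make the example self-contained):

```php
<?php
// Sample document copied from the question.
$xml_str = <<<'EOT'
<?xml version="1.0" encoding="utf-8"?>
<xml>
    <events date="01-10-2009" color="0x99CC00" selected="true">
        <event>
            <title>You can use HTML and CSS</title>
            <description><![CDATA[This is the description ]]></description>
        </event>
    </events>
</xml>
EOT;

// No LIBXML_NOCDATA here: the default load is enough.
$xml = simplexml_load_string($xml_str);
$nodes = $xml->xpath('//xml/events');

// <title> and <description> are children of <event>, not of <events>.
$title = (string) $nodes[0]->event->title;
$description = (string) $nodes[0]->event->description;

echo $title, "\n";
echo $description, "\n";
```

Both values come back as plain PHP strings, so they can be compared, stored, or passed on without any SimpleXML object being involved.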

To debug SimpleXML objects and not be fooled by quirks like this, use a dedicated debugging function such as one of these: https://github.com/IMSoP/simplexml_debug

Greaseball answered 11/12, 2012 at 23:58 Comment(0)

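Another option is to strip the CDATA markers from the raw XML string (here $xml) before parsing it. Note that this only works if the CDATA content contains no characters such as < or & that would otherwise need escaping.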

$xml = str_replace("<![CDATA[", "", $xml);
$xml = str_replace("]]>", "", $xml);
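One caveat worth checking before relying on this: CDATA is often used precisely because the content holds characters such as < or &. A small sketch, with made-up sample content, of what can happen once the markers are stripped:

```php
<?php
// Hypothetical description containing markup characters, wrapped in CDATA.
$xml_str = '<xml><description><![CDATA[Fish & chips <b>today</b>]]></description></xml>';

// Parsed as-is, the CDATA content comes back verbatim.
$ok = simplexml_load_string($xml_str);
$text = (string) $ok->description;

// Strip the markers as in the snippet above, then parse again.
$stripped = str_replace(['<![CDATA[', ']]>'], '', $xml_str);

// Collect parse errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$broken = simplexml_load_string($stripped);

var_dump($text);
var_dump($broken === false);
```

Here the bare & is no longer protected, so the stripped string is not well-formed XML and the second load fails.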
Elation answered 24/1, 2017 at 13:32 Comment(0)
