If all you need to do is produce a copy of the XML with comment nodes removed, note that the first parameter of toStringC14N
is a flag that says whether you want comments in the output. Omitting all parameters implicitly sets that flag to a false value, so
$doc->toStringC14N
will reproduce the XML with the comments stripped. Note that the Canonical XML form produced by C14N doesn't include an XML declaration header; it is always XML 1.0 encoded in UTF-8.
If you need to remove the comments from the in-memory structure of the document before processing it further, then
findnodes
with the XPath expression //comment()
will locate them for you, and unbindNode
will remove them from the XML.
This program demonstrates both approaches.
use strict;
use warnings;
use XML::LibXML;
my $doc = XML::LibXML->load_xml(string => <<END_XML);
<TT>
<A>xyz</A>
<!-- my comment -->
</TT>
END_XML
# Print everything
print $doc->toString, "\n";
# Print without comments
print $doc->toStringC14N, "\n\n";
# Remove comments and print everything
$_->unbindNode for $doc->findnodes('//comment()');
print $doc->toString;
Output:
<?xml version="1.0"?>
<TT>
<A>xyz</A>
<!-- my comment -->
</TT>
<TT>
<A>xyz</A>
</TT>
<?xml version="1.0"?>
<TT>
<A>xyz</A>
</TT>
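As a small companion to the program above, here is a minimal sketch of the comment flag itself: a true first argument to toStringC14N keeps the comments, while omitting it drops them. The variable names are only illustrative, and the sample document is the same one used above.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml(string => <<'END_XML');
<TT>
<A>xyz</A>
<!-- my comment -->
</TT>
END_XML

# A true first argument asks for comments to be kept in the canonical form
my $with_comments = $doc->toStringC14N(1);

# No arguments (or a false first argument) leaves the comments out
my $without_comments = $doc->toStringC14N;

print "With comments:\n$with_comments\n\n";
print "Without comments:\n$without_comments\n";
```

Neither string should start with an XML declaration, since canonical XML never includes one.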
Update
To select a specific comment, you can add a predicate expression to the XPath selector. To find the specific comment in your example data you could write
$doc->findnodes('//comment()[. = " my comment "]')
Note that the text of the comment includes everything between the opening <!-- and the closing -->, so spaces are significant, as shown in that call.
If you want to make things a bit more lax, you could use normalize-space
, which removes leading and trailing whitespace, and contracts every sequence of whitespace within the string to a single space. Now you can write
$doc->findnodes('//comment()[normalize-space(.) = "my comment"]')
And the same call would find your comment even if it looked like this.
<!--
my
comment
-->
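To see the difference, here is a minimal sketch that runs both predicates against that multi-line comment. It assumes the same TT/A sample document, with only the comment reformatted; the variable names are illustrative.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml(string => <<'END_XML');
<TT>
<A>xyz</A>
<!--
my
comment
-->
</TT>
END_XML

# In scalar context findnodes returns an XML::LibXML::NodeList object

# Exact comparison: the comment text now contains newlines, not single spaces
my $exact = $doc->findnodes('//comment()[. = " my comment "]');

# normalize-space trims the ends and collapses each run of whitespace
my $normalized = $doc->findnodes('//comment()[normalize-space(.) = "my comment"]');

printf "exact match found:      %d\n", $exact->size;
printf "normalized match found: %d\n", $normalized->size;
```

The exact comparison is expected to find nothing here, while the normalized one still finds the comment.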
Finally, you can make use of contains
, which, as you would expect, simply checks whether one string contains another. Using that you could write
$doc->findnodes('//comment()[contains(., "comm")]')
The one to choose depends on your requirement and your situation.
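As a rough way to compare the three predicates, the sketch below runs each of them against a document with several differently formatted comments and then removes only the ones picked by the normalize-space version. The extra comments, the B element, and the hash names are made up for illustration and are not part of your data.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample data: three comments with different spacing and wording
my $doc = XML::LibXML->load_xml(string => <<'END_XML');
<TT>
<!-- my comment -->
<A>xyz</A>
<!--   my   comment   -->
<B>abc</B>
<!-- another comment here -->
</TT>
END_XML

my %xpath = (
    exact      => '//comment()[. = " my comment "]',
    normalized => '//comment()[normalize-space(.) = "my comment"]',
    contains   => '//comment()[contains(., "comm")]',
);

# Count how many comments each predicate selects
my %count;
for my $name (sort keys %xpath) {
    my @nodes = $doc->findnodes($xpath{$name});
    $count{$name} = scalar @nodes;
    printf "%-10s selects %d comment(s)\n", $name, $count{$name};
}

# Remove only the comments selected by the normalize-space predicate
$_->unbindNode for $doc->findnodes($xpath{normalized});

my @left = $doc->findnodes('//comment()');
printf "%d comment(s) left after removal\n", scalar @left;
```

The strictest predicate selects the fewest nodes and contains selects the most, so it is worth checking what a predicate matches before unbinding anything.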
, use XML_READER_TYPE_COMMENT (from use XML::LibXML::Reader qw( XML_READER_TYPE_COMMENT );) – Mikkimiko