What function do you use to get the innerHTML of a given DOMNode in the PHP DOM implementation? Can someone give a reliable solution?
Of course outerHTML will do too.
Compare this updated variant with PHP Manual User Note #89718:
<?php
function DOMinnerHTML(DOMNode $element)
{
$innerHTML = "";
$children = $element->childNodes;
foreach ($children as $child)
{
$innerHTML .= $element->ownerDocument->saveHTML($child);
}
return $innerHTML;
}
?>
Example:
<?php
$dom= new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->loadHTML($html_string); // loadHTML() parses an HTML string; load() expects the path of an XML file
$domTables = $dom->getElementsByTagName("table");
// Iterate over DOMNodeList (Implements Traversable)
foreach ($domTables as $table)
{
echo DOMinnerHTML($table);
}
?>
DOMDocument. Also one might want to replace the trim with an ltrim (or even remove it completely) to preserve a bit of the whitespace like line-breaks. – Ictus
DOMElement instead of a DOMNode as I was passing the return from DOMDocument::getElementById(). Just in case it trips someone else up. – Backboard
saveHTML() on the $table? Look: PHP outerHTML S/O – Giltedged
11<br>22<br/>33 you will not get exact version. – Evans
Here is a version in a functional programming style:
function innerHTML($node) {
return implode(array_map([$node->ownerDocument,"saveHTML"],
iterator_to_array($node->childNodes)));
}
To return the HTML of an element, you can use C14N():
$dom = new DOMDocument();
$dom->loadHtml($html);
$x = new DOMXpath($dom);
foreach($x->query('//table') as $table){
echo $table->C14N();
}
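Note that C14N() produces canonical XML rather than HTML, so the result can differ from what saveHTML() gives for the same node. The following is a minimal sketch of that difference, assuming a small made-up input string; the variable names are illustrative only. Canonical XML writes every empty element as a start/end pair, so a void element such as `<br>` is expected to come back as `<br></br>`.

```php
<?php
// Sketch: serialize the same node with saveHTML() and with C14N().
$dom = new DOMDocument();
$dom->loadHTML('<div id="box"><p>one<br>two</p></div>');

$xpath = new DOMXPath($dom);
$div = $xpath->query('//div')->item(0);

$asHtml = $dom->saveHTML($div); // HTML serialization of the node
$asC14n = $div->C14N();         // canonical XML serialization of the node

echo $asHtml, "\n";
echo $asC14n, "\n";
```

If the markup has to be fed back to an HTML consumer, the saveHTML() form is usually the safer choice.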
A simplified version of Haim Evgi's answer:
<?php
function innerHTML(\DOMElement $element)
{
$doc = $element->ownerDocument;
$html = '';
foreach ($element->childNodes as $node) {
$html .= $doc->saveHTML($node);
}
return $html;
}
Example usage:
<?php
$doc = new \DOMDocument();
$doc->loadHTML("<body><div id='foo'><p>This is <b>an <i>example</i></b> paragraph<br>\n\ncontaining newlines.</p><p>This is another paragraph.</p></div></body>");
print innerHTML($doc->getElementById('foo'));
/*
<p>This is <b>an <i>example</i></b> paragraph<br>
containing newlines.</p>
<p>This is another paragraph.</p>
*/
There's no need to set preserveWhiteSpace or formatOutput.
Similar to trincot's nice version with array_map and implode, but this time with array_reduce:
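function innerHTML(\DOMNode $node) {
    return array_reduce(
        iterator_to_array($node->childNodes),
        function ($carry, \DOMNode $child) {
            return $carry . $child->ownerDocument->saveHTML($child);
        },
        '' // initial value, so a node without children yields '' instead of null
    );
}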
I still don't understand why there's no reduce() method that accepts arrays and iterators alike.
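Such a function is easy to approximate in userland. Here is a minimal sketch, where `reduce_iterable` and `innerHTMLReduced` are hypothetical names chosen for this example and not part of PHP; it relies only on the fact that a DOMNodeList can be iterated with foreach.

```php
<?php
// Hypothetical helper: like array_reduce(), but for any iterable.
function reduce_iterable(iterable $items, callable $fn, $initial = null)
{
    $carry = $initial;
    foreach ($items as $item) {
        $carry = $fn($carry, $item);
    }
    return $carry;
}

// innerHTML built on the helper, without iterator_to_array().
function innerHTMLReduced(\DOMNode $node): string
{
    return reduce_iterable(
        $node->childNodes,
        function (string $carry, \DOMNode $child): string {
            return $carry . $child->ownerDocument->saveHTML($child);
        },
        ''
    );
}

$doc = new \DOMDocument();
$doc->loadHTML('<div id="foo"><p>Hello</p><p>World</p></div>');
echo innerHTMLReduced($doc->getElementById('foo'));
```

Passing the empty string as the initial value keeps the return type a string even when the node has no children.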
function setnodevalue($doc, $node, $newvalue){
    // Remove all existing children of the node.
    while($node->childNodes->length > 0){
        $node->removeChild($node->firstChild);
    }
    if(!empty($newvalue)){
        // appendXML() expects well-formed XML, so void tags must be written like <br/>.
        $fragment = $doc->createDocumentFragment();
        $fragment->appendXML(trim($newvalue));
        // The fragment already belongs to $doc, so it can be appended directly.
        $node->appendChild($fragment);
    }
}
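For illustration, here is a self-contained sketch of the same fragment-based idea together with a usage example. The function name `setInnerXml` and the sample markup are assumptions made for this example; the sketch takes the document from the node itself and throws when the markup is not well-formed XML, which is a design choice of the sketch and not something the code above does.

```php
<?php
// Sketch: replace a node's children with parsed XML markup.
function setInnerXml(\DOMNode $node, string $markup): void
{
    // Remove the existing children.
    while ($node->firstChild !== null) {
        $node->removeChild($node->firstChild);
    }
    $markup = trim($markup);
    if ($markup === '') {
        return;
    }
    $fragment = $node->ownerDocument->createDocumentFragment();
    // appendXML() returns false (and emits a warning) on malformed XML.
    if (!@$fragment->appendXML($markup)) {
        throw new \InvalidArgumentException('Markup is not well-formed XML');
    }
    $node->appendChild($fragment);
}

$doc = new \DOMDocument();
$doc->loadHTML('<div id="target"><p>old</p></div>');
$target = $doc->getElementById('target');

setInnerXml($target, '<p>new <b>content</b></p>');
echo $doc->saveHTML($target);
```

Because the fragment is parsed as XML, HTML-only constructs such as an unclosed `<br>` are rejected; that is the main limitation of this approach.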
$node->ownerDocument. – Fetterlock
Here's another approach based on this comment by Drupella on php.net, that worked well for my project. It defines the innerHTML() by creating a new DOMDocument, importing and appending to it the target node, instead of explicitly iterating over child nodes.
Let's define this helper function:
function innerHTML( \DOMNode $n, $include_target_tag = true ) {
$doc = new \DOMDocument();
$doc->appendChild( $doc->importNode( $n, true ) );
$html = trim( $doc->saveHTML() );
if ( $include_target_tag ) {
return $html;
}
return preg_replace( '@^<' . $n->nodeName .'[^>]*>|</'. $n->nodeName .'>$@', '', $html );
}
where we can include/exclude the outer target tag through the second input argument.
Here we extract the inner HTML for a target tag given by the "first" id attribute:
$html = '<div id="first"><h1>Hello</h1></div><div id="second"><p>World!</p></div>';
$doc = new \DOMDocument();
$doc->loadHTML( $html );
$node = $doc->getElementById( 'first' );
if ( $node instanceof \DOMNode ) {
echo innerHTML( $node, true );
// Output: <div id="first"><h1>Hello</h1></div>
echo innerHTML( $node, false );
// Output: <h1>Hello</h1>
}
Live example:
http://sandbox.onlinephpfunctions.com/code/2714ea116aad9957c3c437d46134a1688e9133b8
Old query, but there is a built-in method to do that. Just pass the target node to DomDocument->saveHtml().
Full example:
$html = '<div><p>ciao questa è una <b>prova</b>.</p></div>';
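$dom = new DOMDocument(); // the constructor takes a version and an encoding, not the markup
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// query() returns a DOMNodeList, while saveHTML() needs a single DOMNode, so take item(0).
// With * you get the inner html without the surrounding div tag; without * you get it with the surrounding div tag.
$node = $xpath->query('.//div/*')->item(0);
$innerHtml = $dom->saveHTML($node);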
var_dump($innerHtml);
Output: <p>ciao questa è una <b>prova</b>.</p>
For people who want to get the HTML from XPath query, here is my version:
$xpath = new DOMXpath( $my_dom_object );
$DOMNodeList = $xpath->query('//div[contains(@class, "some_custom_class_in_html")]');
if( $DOMNodeList->count() > 0 ) {
$page_html = $my_dom_object->saveHTML( $DOMNodeList->item(0) );
}
innerHTML using C14N() and xpath query:
$node->C14N(
    true, // exclusive canonicalization
    false, // without comments
    ["query" => ".//node()|.//*//@*"] // restrict the output to the inner nodes & attributes
);
mb_convert_encoding with HTML-ENTITIES is deprecated as of PHP 8.2.
function setInnerHTML($element, $content) {
$DOMInnerHTML = new DOMDocument();
$DOMInnerHTML->loadHTML(
<<<HTML
<html>
<head>
<meta charset="utf-8">
</head>
<body>
$content
</body>
</html>
HTML,
);
foreach (
$DOMInnerHTML->getElementsByTagName('body')->item(0)->childNodes
as $contentNode
) {
$contentNode = $element->ownerDocument->importNode($contentNode, true);
$element->appendChild($contentNode);
}
}
Including an HTML boilerplate is probably the best way to achieve clean, UTF-8 encoded text within the added DOM nodes. I've tried creating a drop-in replacement for mb_convert_encoding with HTML-ENTITIES, but I always ended up with mojibake.
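To see what the charset declaration changes, here is a minimal sketch that parses the same UTF-8 snippet twice, once bare and once inside a boilerplate with a meta charset tag. The sample text and variable names are made up for this example. How the bare variant is decoded depends on the libxml2 version in use, so its result is printed only for comparison.

```php
<?php
$content = '<p>è una prova</p>';

// Without any charset hint the parser has to guess the encoding.
$plain = new DOMDocument();
$plain->loadHTML("<div>$content</div>");

// With a meta charset declaration the input is read as UTF-8.
$withMeta = new DOMDocument();
$withMeta->loadHTML(
    '<html><head><meta charset="utf-8"></head><body>' . $content . '</body></html>'
);

echo $plain->getElementsByTagName('p')->item(0)->textContent, "\n";
echo $withMeta->getElementsByTagName('p')->item(0)->textContent, "\n";
```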
After experimenting with some implementations I found here, I engineered the perfect solution that you can use to set inner HTML:
function setInnerHTML($element, $content) {
$DOMInnerHTML = new DOMDocument();
$DOMInnerHTML->loadHTML(
mb_convert_encoding("<div>$content</div>", 'HTML-ENTITIES', 'UTF-8')
);
foreach (
$DOMInnerHTML->getElementsByTagName('div')->item(0)->childNodes
as $contentNode
) {
$contentNode = $element->ownerDocument->importNode($contentNode, true);
$element->appendChild($contentNode);
}
}
Notes:
mb_convert_encoding function, this also requires the mbstring extension. If you omit the call here, this might cause mojibake.
<div> element to prevent creating an implicit <p> if there is no root element. This prevents problems when embedding into an element like <title>.
DocumentFragment, this fetches a DOMNodeList of the nodes, iterates through it, and appends each node to the element.
I created this to implement a basic templating system into a school project of mine.