PHP DOMDocument: read element inner text

I'm new to PHP, so please forgive the simple question:

How do I extract the text from an element?

     <span id="myElement">Some text I want to read</span>

I have this for a start:

<?php
// $dom is a DOMDocument that already has the markup loaded
$data = $dom->getElementById("myElement");
$html = $dom->saveHTML($data);

But then? What is the correct instruction?

Brython answered 11/6, 2013 at 22:19

To get the text that an element contains, you need the textContent property:

$text = $data->textContent;
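
For completeness, here is a minimal end-to-end sketch of the same idea. It assumes the markup from the question is loaded with `loadHTML()`; the `$markup` variable and the null check are illustrative additions, not part of the original question.

```php
<?php
// Assumed sample markup, copied from the question.
$markup = '<span id="myElement">Some text I want to read</span>';

$dom = new DOMDocument();
$dom->loadHTML($markup);

// getElementById() returns null when no element has that id.
$data = $dom->getElementById('myElement');

// textContent is the concatenated text of the element and its descendants.
$text = $data !== null ? $data->textContent : '';

echo $text, PHP_EOL;
```

Note that `saveHTML($data)` from the question returns the element's markup, tags included, whereas `textContent` returns only the text.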
Pickle answered 11/6, 2013 at 22:24 Comment(3)
Any way to emulate innerText specifically? So that, for example, <div>some text</div><div>more text</div> doesn't become "some textmore text" but keeps the words separated? - Preachy
@Preachy, given how much time has passed between this post and now, my hunch is that you will have to iterate through each child element and get its textContent. - Thao
The answer to this is that XML/DOM isn't intended to know when whitespace should or shouldn't be implicitly inserted, as that's a detail of a higher level. If you take the simple approach of always assuming implied whitespace at element boundaries, there are cases where you'll get undesired whitespace, such as around certain inline elements. So yes, you do need a more sophisticated application-level conversion from, e.g., HTML to formatted text. That's basically what innerText is: it's not standard DOM because it's application-aware and formats with knowledge of HTML. - Preachy
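
As a rough sketch of the application-level conversion described in these comments, the code below walks the tree and adds whitespace only around elements treated as block-level, so inline elements such as <b> do not introduce extra spaces. The helper name `textWithBoundaries` and the list of block tags are assumptions made for illustration; this is not a faithful implementation of innerText.

```php
<?php
// Hypothetical helper: collects text and pads the text of block-level
// elements with spaces. The default tag list is an assumption, not a standard.
function textWithBoundaries(
    DOMNode $node,
    array $blockTags = ['div', 'p', 'li', 'tr', 'h1', 'h2', 'h3']
): string {
    // Text nodes (including CDATA sections) contribute their value as-is.
    if ($node instanceof DOMText) {
        return $node->nodeValue;
    }

    // Recurse into children; nodes without children (e.g. comments) yield ''.
    $out = '';
    for ($child = $node->firstChild; $child !== null; $child = $child->nextSibling) {
        $out .= textWithBoundaries($child, $blockTags);
    }

    // Only block-level elements get implied whitespace at their boundaries.
    if ($node instanceof DOMElement
        && in_array(strtolower($node->tagName), $blockTags, true)) {
        $out = ' ' . $out . ' ';
    }

    return $out;
}

$dom = new DOMDocument();
$dom->loadHTML(
    '<div id="wrap"><div>some text</div><div>more <b>bold</b> text</div></div>'
);

$raw = textWithBoundaries($dom->getElementById('wrap'));

// Collapse the runs of whitespace introduced at the boundaries.
$text = trim(preg_replace('/\s+/', ' ', $raw));

echo $text, PHP_EOL;
```

For the same markup, `textContent` on the outer element gives "some textmore bold text", which is the behaviour the first comment asks about.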
