JavaScript DOMParser access innerHTML and other properties
J

3

33

I am using the following code to parse a string into DOM:

var doc = new DOMParser().parseFromString(string, 'text/xml');

Where string is something like <!DOCTYPE html><html><head></head><body>content</body></html>.

typeof doc gives me object. If I do something like doc.querySelector('body') I get a DOM object back. But if I try to access any properties, like you normally can, it gives me undefined:

doc.querySelector('body').innerHTML; // undefined

The same goes for other properties, e.g. id. Attribute retrieval, on the other hand, works fine: doc.querySelector('body').getAttribute('id');.

Is there a magic function to have access to those properties?

Jamikajamil answered 12/2, 2012 at 16:43 Comment(0)
S
59

Your current method fails because HTML properties are not defined for the given XML document. If you supply the text/html MIME type instead, the method should work.

var string = '<!DOCTYPE html><html><head></head><body>content</body></html>';
var doc = new DOMParser().parseFromString(string, 'text/html');
doc.body.innerHTML; // or doc.querySelector('body').innerHTML
// ^ Returns "content"

The code below enables the text/html MIME type for browsers which do not natively support it yet. It was retrieved from the Mozilla Developer Network:

/* 
 * DOMParser HTML extension 
 * 2012-02-02 
 * 
 * By Eli Grey, http://eligrey.com 
 * Public domain. 
 * NO WARRANTY EXPRESSED OR IMPLIED. USE AT YOUR OWN RISK. 
 */  

/*! @source https://gist.github.com/1129031 */  
/*global document, DOMParser*/  

(function(DOMParser) {  
    "use strict";  
    var DOMParser_proto = DOMParser.prototype  
      , real_parseFromString = DOMParser_proto.parseFromString;

    // Firefox/Opera/IE throw errors on unsupported types  
    try {  
        // WebKit returns null on unsupported types  
        if ((new DOMParser).parseFromString("", "text/html")) {  
            // text/html parsing is natively supported  
            return;  
        }  
    } catch (ex) {}  

    DOMParser_proto.parseFromString = function(markup, type) {  
        if (/^\s*text\/html\s*(?:;|$)/i.test(type)) {  
            var doc = document.implementation.createHTMLDocument("")
              , doc_elt = doc.documentElement
              , first_elt;

            doc_elt.innerHTML = markup;
            first_elt = doc_elt.firstElementChild;

            if (doc_elt.childElementCount === 1
                && first_elt.localName.toLowerCase() === "html") {  
                doc.replaceChild(first_elt, doc_elt);  
            }  

            return doc;  
        } else {  
            return real_parseFromString.apply(this, arguments);  
        }  
    };  
}(DOMParser));
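
If you would rather check for native support yourself before deciding whether to load the polyfill, the detection step at the top of the snippet can be pulled out into a small function. This is only a sketch: supportsTextHtml is a hypothetical name, and passing the parser constructor in as an argument is an assumption made so the function can also be exercised outside a browser.

```javascript
// Hypothetical helper: reports whether a DOMParser-like constructor
// can parse 'text/html'. It mirrors the detection used in the polyfill above.
function supportsTextHtml(ParserCtor) {
    if (typeof ParserCtor !== 'function') {
        return false; // no DOMParser available at all
    }
    try {
        // Per the polyfill's comments: WebKit returned null for
        // unsupported types, while Firefox/Opera/IE threw an error.
        return Boolean(new ParserCtor().parseFromString('', 'text/html'));
    } catch (ex) {
        return false;
    }
}

// In a browser, pass the global constructor; the typeof guard keeps
// this line from throwing where DOMParser does not exist.
var hasNativeHtmlParsing = supportsTextHtml(
    typeof DOMParser !== 'undefined' ? DOMParser : undefined
);
```

If hasNativeHtmlParsing is false, the polyfill above (or another fallback) would be needed before calling parseFromString with text/html.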
Shiller answered 12/2, 2012 at 17:50 Comment(14)
PS. For clarification, when you're using text/xml, doc is an instance of XMDocument. Using text/html, it's an instance of HTMLDocument. - Shiller
Waaw, quite a useful answer! Couldn't have found that one myself. Just the mime type and enabling that mime type :) - Jamikajamil
@RobW I assume you mean XMLDocument. - Syndesis
Thanks @RobW. This was useful for the reverse process, where one was able to use regex to edit a text string to add HTML and then build a replacement node, avoiding innerHTML. Your solution worked perfectly! - Cletacleti
The solution is not bad, but why are you using the comma operator and not just 3 instructions? This option is more "obscure" and does not add any advantage. Furthermore, the use of first_elt creates a global variable in the window scope (which is so bad). - Champollion
@AdrianMaire All variables are local; the commas are part of the var statement, not the comma operator. Note that I didn't write the code, and the code is not perfect. For instance, in Internet Explorer 9-, the code will fail because document.documentElement.innerHTML is read-only. - Shiller
Very interesting, thank you for the clarification. As far as I know, document.implementation.createHTMLDocument is only available since IE9, so this code does not support any IE? Do you know of any equivalent for IE6-8? Anyway, it is still useful for "compliant" browsers like Firefox. - Champollion
@AdrianMaire It really depends on your purposes. A few months ago, I wrote a toDOM method for the 5 major browsers, which should not load external resources. Initially, I tried to use a hidden iframe. The DOM is parsed well, but external resources are obviously loaded. In the end, I ended up using document.createElement('html') plus additional expando methods. And unless a significant part of your clients uses IE6/7, I strongly recommend dropping support for these browsers. Practically no one uses them any more. Some argue that even IE8 may be ignored... - Shiller
@RobW: Do you know where I can find a list of browsers which support DOMParser().parseFromString() but not with the text/html MIME type? - Forbidding
@Forbidding All modern desktop browsers support this feature, including Chrome 30+, Firefox 12+, IE 10+, Opera 17+ and Safari 7.1+. This information is also available at MDN: developer.mozilla.org/en-US/docs/Web/API/… - Shiller
@RobW: Currently, I’m concerned with Opera Presto (it seems to work with XML documents; Blink versions can be considered downgrades) and IE8/IE6, since many companies force their users to use the preinstalled browser of XP (the same way they were still using 2000 in 2011). - Forbidding
@Forbidding DOMParser+text/html is not supported by Presto. The polyfill works almost flawlessly though (one notable difference: with the polyfill, if you have <img src> in the HTML, then the image will be loaded). IE6-8 do not support createHTMLDocument, and I don't see why you want to support them, since Windows XP is already end-of-life (so no reason to use IE8 and definitely not IE 6). There is no solid alternative to DOMParser+text/html. You could use document.createElement and assign HTML to it (any external content (<img src>, styles, ...) will also be parsed and loaded). - Shiller
@RobW: "Windows XP is already end-of-life": yes, but some companies have even chosen not to upgrade. It may represent 8% of users (I can’t check, since logging with a proxy can be considered an indirect legal requirement for computer services here). - Forbidding
@Syndesis It used to return a Document instance though. - Protomorphic
H
3

Try something like this:

const fragment = document.createRange().createContextualFragment(html);

where html is the string you want to convert.
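
As a rough sketch, the call can be wrapped in a small helper. toFragment is a hypothetical name, and taking the document as a parameter is an assumption made so the function does not depend on a global; in a page you would pass window.document. Note that the result is a DocumentFragment rather than a Document.

```javascript
// Hypothetical helper around Range.createContextualFragment.
// `doc` is injected (an assumption) so nothing here relies on a global.
function toFragment(html, doc) {
    var range = doc.createRange();
    return range.createContextualFragment(html);
}

// Browser usage, guarded so the snippet is harmless elsewhere.
if (typeof document !== 'undefined') {
    var fragment = toFragment('<p id="greeting">content</p>', document);
    // The nodes are ordinary HTML elements, so HTML properties
    // such as innerHTML are available on them.
    var greeting = fragment.querySelector('#greeting');
    var inner = greeting.innerHTML;
}
```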

Holloway answered 17/7, 2019 at 13:20 Comment(1)
Yes, this is the best solution if you also want to execute scripts, like: https://mcmap.net/q/41868/-domparser-appending-lt-script-gt-tags-to-lt-head-gt-lt-body-gt-but-not-executing - Facient
K
0

Use element.getAttribute(attributeName) for XML/HTML elements
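
A minimal sketch of that approach, assuming you want several attributes at once. readAttributes is a hypothetical helper name, and the sample markup is made up for illustration:

```javascript
// Hypothetical helper: collect several attributes from one element.
// Works on anything exposing getAttribute(name), XML or HTML.
function readAttributes(element, names) {
    var result = {};
    names.forEach(function (name) {
        // getAttribute returns null when the attribute is not set
        result[name] = element.getAttribute(name);
    });
    return result;
}

// Browser usage with an XML-parsed document, guarded for other environments.
if (typeof DOMParser !== 'undefined') {
    var xml = new DOMParser().parseFromString(
        '<body id="main">content</body>', 'text/xml');
    var attrs = readAttributes(xml.documentElement, ['id', 'class']);
    // attrs.id holds "main"; attrs.class is null because it is not set
}
```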

Kelleher answered 5/3, 2017 at 17:44 Comment(0)
