Converting Range or DocumentFragment to string

T

6

37

Is there a way to get the html string of a JavaScript Range Object in W3C compliant browsers?

For example, let us say the user selects the following: Hello <b>World</b>
It is possible to get "Hello World" as a string using the Range.toString() method. (In Firefox, it is also possible using the document's getSelection method.)

But I can't seem to find a way to get the inner HTML.

After some searching, I've found that the range can be converted to a DocumentFragment Object.

But DocumentFragments have no innerHTML property (at least in Firefox; have not tried Webkit or Opera).
Which seems odd to me: it would seem obvious that there should be some way to access the selected items.

I realize that I can create a documentFragment, append the document fragment to another element, and then get the innerHTML of that element.
But that method will auto close any open tags within the area I select.
Besides, surely there is a better way than attaching it to the DOM just to get it as a string.

So, how to get the string of the html of a Range or DocFrag?

Tapes answered 4/2, 2011 at 10:28 Comment(1)

Same here. Looking for a way to traverse the Range. – Recurrence 4/11, 2018 at 11:25

A

5

No, that is the only way of doing it. The DOM Level 2 specs from around 10 years ago had almost nothing in terms of serializing and deserializing nodes to and from HTML text, so you're forced to rely on extensions like innerHTML.

Regarding your comment that

But that method will auto close any open tags within the area I select.

... how else could it work? The DOM is made up of nodes arranged in a tree. Copying content from the DOM can only create another tree of nodes. Element nodes are delimited in HTML by a start and sometimes an end tag. An HTML representation of an element that requires an end tag must have an end tag, otherwise it is not valid HTML.
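
To make that concrete, here is a minimal sketch that uses plain objects as stand-ins for DOM nodes. The `tree` shape and the `serialize` helper are made up for illustration and are not part of any DOM API. The tree only records that a `b` element contains some text; the end tag appears only when the tree is written out as a string.

```javascript
// Hypothetical stand-ins for DOM nodes: plain objects, not the real DOM API.
const tree = {
  tag: "p",
  children: [
    "Hello ",
    { tag: "b", children: ["World"] },
  ],
};

// Serializing walks the tree. The end tag is emitted here; it is never stored in a node.
function serialize(node) {
  if (typeof node === "string") {
    return node; // text node stand-in (escaping omitted for brevity)
  }
  const inner = node.children.map(serialize).join("");
  return `<${node.tag}>${inner}</${node.tag}>`;
}

console.log(serialize(tree)); // <p>Hello <b>World</b></p>
```

A range that starts or ends partway through an element is copied as a smaller tree of the same kind, so serializing it necessarily produces balanced tags.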

Astern answered 4/2, 2011 at 11:2 Comment(2)

The end tags are created when the range is converted to a document fragment (isn't that correct?). However, it should be possible to find out what is contained in the range before it is converted into nodes - even if the range contains invalid markup. – Tapes 6/2, 2011 at 8:14

I disagree. When invalid markup is parsed, the browser handles it however it sees fit and creates the appropriate nodes in the DOM, which is the browser's own representation of the document. That invalid markup is essentially then thrown away, at least as far as the DOM (which is what JavaScript can access) is concerned. You need to stop thinking of the DOM in terms of a string and start thinking of it as a tree. End tags are a product of serializing this tree to an HTML string (such as via innerHTML). They do not exist as entities within the tree. – Astern 6/2, 2011 at 12:46

S

20

So, how to get the string of the html of a Range or DocFrag?

Contrary to the other responses, it is possible to directly turn a DocumentFragment object into a DOMString using the XMLSerializer.prototype.serializeToString method described at https://w3c.github.io/DOM-Parsing/#the-xmlserializer-interface.

To get the DOMString of a Range object, simply convert it to a DocumentFragment using either of the Range.prototype.cloneContents or Range.prototype.extractContents methods and then follow the procedure for a DocumentFragment object.

I've attached a demo, but the gist of it is in these two lines:

const serializer = new XMLSerializer();
const document_fragment_string = serializer.serializeToString(document_fragment);

(() => {
	"use strict";
	const HTML_namespace = "http://www.w3.org/1999/xhtml";
	document.addEventListener("DOMContentLoaded", () => {
		/* Create Hypothetical User Range: */
		const selection = document.defaultView.getSelection();
		const user_range_paragraph = document.getElementById("paragraph");
		const user_range = document.createRange();
		user_range.setStart(user_range_paragraph.firstChild, 0);
		user_range.setEnd(user_range_paragraph.lastChild, user_range_paragraph.lastChild.length || user_range_paragraph.lastChild.childNodes.length);
		selection.addRange(user_range);

		/* Clone Hypothetical User Range: */
		user_range.setStart(selection.anchorNode, selection.anchorOffset);
		user_range.setEnd(selection.focusNode, selection.focusOffset);
		const document_fragment = user_range.cloneContents();

		/* Serialize the User Range to a String: */
		const serializer = new XMLSerializer();
		const document_fragment_string = serializer.serializeToString(document_fragment);

		/* Output the Serialized User Range: */
		const output_paragraph = document.createElementNS(HTML_namespace, "p");
		const output_paragraph_code = document.createElementNS(HTML_namespace, "code");
		output_paragraph_code.append(document_fragment_string);
		output_paragraph.append(output_paragraph_code);
		document.body.append(output_paragraph);
	}, { "once": true });
})();

<p id="paragraph">Hello <b>World</b></p>

Syntax answered 18/4, 2018 at 8:37 Comment(2)

This works, but it inserts an 'xmlns' attribute on the node, which is weird. – Weigel 25/6, 2018 at 9:12

You could use a regular expression to remove that:

const xmlnAttribute = ' xmlns="http://www.w3.org/1999/xhtml"';
const regEx = new RegExp(xmlnAttribute, 'g');
const newstr = document_fragment_string.replace(regEx, '');
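
A regex-free variant, as a rough sketch: the dots in the namespace URL are regex metacharacters, so a literal split/join avoids having to escape them. `stripXhtmlNamespace` is a hypothetical helper name, and it assumes the serializer writes the declaration exactly as ` xmlns="http://www.w3.org/1999/xhtml"`. It would also strip that exact text if it happened to appear inside text content.

```javascript
// Assumption: this is the exact form in which the serializer writes the declaration.
const XHTML_NS_ATTRIBUTE = ' xmlns="http://www.w3.org/1999/xhtml"';

// Hypothetical helper: removes every literal occurrence of the XHTML namespace attribute.
function stripXhtmlNamespace(markup) {
  return markup.split(XHTML_NS_ATTRIBUTE).join("");
}

const serialized = 'Hello <b xmlns="http://www.w3.org/1999/xhtml">World</b>';
console.log(stripXhtmlNamespace(serialized)); // Hello <b>World</b>
```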

– Delight 24/2, 2022 at 21:16

U

18

FWIW, the jQuery way:

$('<div>').append(fragment).html()

Utopian answered 17/7, 2014 at 18:18 Comment(0)

P

16

To spell out an example from here:

//Example setup of a fragment 
var frag = document.createDocumentFragment(); //make your fragment 
var p = document.createElement('p'); //create <p>test</p> DOM node
p.textContent = 'test';
frag.appendChild( p  ); 

//Outputting the fragment content using a throwaway intermediary DOM element (div):
var div = document.createElement('div');
div.appendChild( frag.cloneNode(true) );
console.log(div.innerHTML); //output should be '<p>test</p>'

Precious answered 10/7, 2014 at 19:20 Comment(2)

That is the approach described in the question. The OP wants a better way (which sadly doesn't exist). – Astern 11/7, 2014 at 8:34

Well, yes. But the question does not contain any code example, so this answer does have its worth. Also, the question makes no mention of cloneNode(). – Antler 23/1 at 9:50

L

7

Another way to do it would be to iterate over childNodes:

Array.prototype.reduce.call(
    documentFragment.childNodes, 
    (result, node) => result + (node.outerHTML || node.nodeValue),
    ''
);

This wouldn't work for inline SVG, but it could be adapted to handle it. It also helps if you need to do some chained manipulation on the nodes and get an HTML string as a result.
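
One caveat with the snippet above: nodeValue is the raw text, so a text node containing < or & is not escaped, and a comment node loses its <!-- --> delimiters. A slightly more defensive sketch could look like the following. The helper names `escapeText` and `fragmentToHtml` are my own, and it only handles element, text and comment children; anything else is skipped.

```javascript
// nodeType values as defined by the DOM standard.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;
const COMMENT_NODE = 8;

// Escape the characters that HTML serialization escapes inside text content.
function escapeText(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/\u00a0/g, "&nbsp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: concatenates the serialized form of each direct child.
function fragmentToHtml(fragment) {
  return Array.from(fragment.childNodes, (node) => {
    switch (node.nodeType) {
      case ELEMENT_NODE:
        return node.outerHTML;
      case TEXT_NODE:
        return escapeText(node.nodeValue);
      case COMMENT_NODE:
        return `<!--${node.nodeValue}-->`;
      default:
        return ""; // other node types are ignored in this sketch
    }
  }).join("");
}
```

In a browser you would call it as `fragmentToHtml(range.cloneContents())`. It only reads `childNodes`, `nodeType`, `outerHTML` and `nodeValue`, so it can also be exercised with simple mock objects.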

Londonderry answered 1/2, 2019 at 6:36 Comment(0)

D

-1

Could DocumentFragment.textContent give you what you need?

var frag = document.createRange().createContextualFragment("Hello <b>World</b>.");

console.log(frag.textContent)

Dentation answered 31/8, 2018 at 13:14 Comment(1)

textContent would not include tags, only text – Previous 1/10, 2021 at 9:17
