Can I escape HTML special chars in JavaScript?

Asked 4/6, 2011 at 4:50 Answered 13/3 at 19:24

368

I want to display text in HTML using a JavaScript function. How can I escape HTML special characters in JavaScript? Is there an API?

Hinckley answered 4/6, 2011 at 4:50 Comment(3)

This is not a duplicate, since this question does not ask about jQuery. I am interested only in this one, since I do not use jQuery... – Pavel 7/8, 2013 at 16:8

possible duplicate of HtmlSpecialChars equivalent in Javascript? – Ganley 7/8, 2013 at 16:19

Note that the browsers are working on a new HTML Sanitizer API. – Minh 26/1, 2022 at 20:5

531

Here's a solution that will work in practically every web browser:

function escapeHtml(unsafe)
{
    return unsafe
         .replace(/&/g, "&amp;")
         .replace(/</g, "&lt;")
         .replace(/>/g, "&gt;")
         .replace(/"/g, "&quot;")
         .replace(/'/g, "&#039;");
 }

If you only support modern web browsers (2020+), then you can use the new replaceAll function:

const escapeHtml = (unsafe) => {
    return unsafe
        .replaceAll('&', '&amp;')
        .replaceAll('<', '&lt;')
        .replaceAll('>', '&gt;')
        .replaceAll('"', '&quot;')
        .replaceAll("'", '&#039;');
}
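UI code often passes numbers or missing values rather than strings, and both functions above throw on those. Below is a minimal sketch of a single-pass variant that coerces its input first. The names HTML_ESCAPES and escapeHtmlSafe are illustrative only, and treating null/undefined as an empty string is an assumption you may not want.

```javascript
// Lookup table for the same five characters escaped above.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#039;',
};

// Hypothetical helper: coerces any value to a string, then escapes in one pass.
function escapeHtmlSafe(value) {
  // Assumption: null and undefined should render as an empty string.
  if (value === null || value === undefined) return '';
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtmlSafe('<a href="x">Tom & Jerry\'s</a>'));
console.log(escapeHtmlSafe(42));
```

Because the regex visits each character once, the order of the table entries does not matter here, unlike the chained version where & has to be replaced first.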

Patella answered 4/6, 2011 at 5:0 Comment(15)

Why "&#039;" and not "&apos;" ? – Hattie 9/11, 2011 at 13:32

because: #2084254 – Ragsdale 27/3, 2013 at 21:33

I think regular expressions in replace() calls are unnecessary. Plain old single-character strings would do just as well. – Dorthadorthea 30/5, 2014 at 14:47

don't forget about .replace(/ /g, '&nbsp;'), if you convert text with space indentations, spaces may be lost. – Chronological 4/7, 2015 at 7:32

Possibly it's useful to test whether 'unsafe' is a string. It's not unlikely for UI code to deal with numbers. – Rehabilitate 23/11, 2016 at 16:26

I had an issue with ".replace(/'/g, "&#039;");", browser was converting it back to apostrophe in the inline javascript. Workaround was to add escape character: .replace(/'/g, "\\'"); – Wier 28/9, 2018 at 18:52

@StepanYakovenko That's better handled with CSS. As it is, replacing every space with &nbsp; will prevent text breaks on spaces (&nbsp; means "non-breaking space"). – Polychromy 11/8, 2019 at 3:42

is there any standard API or this is the only way? – Saltwort 6/1, 2020 at 7:2

&apos; is valid in HTML5 but not in HTML4 – Susceptible 17/9, 2020 at 16:13

That's the only characters you need to escape? I'd thought it would be larger, but whatever... – Reba 11/4, 2021 at 20:12

@Reba OWASP recommends you escape more than just those, I'm not sure why, but I think it's for futureproofing and some old browsers. This will likely work in most cases, but you can see my answer shows the full character range to escape. – Slowly 30/4, 2021 at 21:46

In case anyone wants to use in javascript strings and not in html then better to use unicode characters for example less than and greater than symbols can use var lt = '\u003c' and gt = '\u003e', here is another reference: #13093626 – Bumboat 7/1, 2022 at 23:52

@Dorthadorthea No, if you do that it only replaces the first occurrence of &, first occurrence of <, etc. You need to use regexes with the /g flag. – Meshach 2/3, 2022 at 12:5

add .replace(String.fromCharCode(92),String.fromCharCode(92,92)) if you also need to mask backslashes like in my case... – Penchant 20/9, 2023 at 12:54

@Dorthadorthea yes, you can use Strings and they are even faster, see benchmark jsbench.me/tfltptr4hv/1 – Pacer 13/3 at 13:17
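The comments above disagree on whether plain strings can stand in for the regexes. Both sides appear to be right about different methods: with replace() a string pattern only touches the first match, while replaceAll() with a string touches all of them. A small sketch (the sample input is made up):

```javascript
const input = 'a & b & c';

// A string pattern with replace() only replaces the first occurrence.
const firstOnly = input.replace('&', '&amp;');

// A regex with the g flag, or replaceAll() with a string, replaces every occurrence.
const viaRegex = input.replace(/&/g, '&amp;');
const viaReplaceAll = input.replaceAll('&', '&amp;');

console.log(firstOnly);     // a &amp; b & c
console.log(viaRegex);      // a &amp; b &amp; c
console.log(viaReplaceAll); // a &amp; b &amp; c
```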

function escapeHtml(html){
  var text = document.createTextNode(html);
  var p = document.createElement('p');
  p.appendChild(text);
  return p.innerHTML;
}

// Escape while typing & print result
document.querySelector('input').addEventListener('input', e => {
  console.clear();
  console.log( escapeHtml(e.target.value) );
});

<input style='width:90%; padding:6px;' placeholder='&lt;b&gt;cool&lt;/b&gt;'>

Allomorphism answered 20/8, 2014 at 2:50 Comment(2)

Working Here but Not working for me offline in browser – Singlebreasted 15/7, 2018 at 12:30

Note that this doesn't escape quotes (" or ') so strings from this function can still do damage if they are used in HTML tag attributes. – Nettle 30/4, 2021 at 19:32

Using Lodash:

_.escape('fred, barney, & pebbles');
// => 'fred, barney, &amp; pebbles'

Source code
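If you need the reverse direction without pulling in a library (Lodash itself ships _.unescape), something like the following might do. This is a rough sketch, not Lodash's implementation; HTML_UNESCAPES and unescapeHtml are made-up names, and it only knows the five entities discussed in this thread.

```javascript
// Hypothetical inverse of the five-entity escape; not Lodash's implementation.
const HTML_UNESCAPES = {
  '&amp;': '&',
  '&lt;': '<',
  '&gt;': '>',
  '&quot;': '"',
  '&#039;': "'",
  '&#39;': "'",
};

function unescapeHtml(escaped) {
  // Single pass, so '&amp;lt;' becomes '&lt;' rather than being decoded twice into '<'.
  return String(escaped).replace(
    /&(?:amp|lt|gt|quot|#0?39);/g,
    (entity) => HTML_UNESCAPES[entity]
  );
}

console.log(unescapeHtml('fred, barney, &amp; pebbles'));
```

Doing it in one pass matters: chaining five replace() calls in the wrong order would decode already-decoded text a second time.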

Entire answered 30/10, 2016 at 19:41 Comment(3)

what is the opposite of this? name of the function that does the opposite of this? – Saltwort 6/1, 2020 at 7:4

Same functions in underscore: underscorejs.org/#escape & underscorejs.org/#unescape – Agripina 21/5, 2020 at 10:38

Doesn't seem to work for IP addresses when you try _.escape(192.168.1.1), but if I add quotes, then it works: _.escape('52.60.62.147') even though I'm referencing a variable where the value is not a string. LoDash is so great! – Lorin 7/9, 2022 at 2:23

You can use jQuery's .text() function.

For example:

http://jsfiddle.net/9H6Ch/

From the jQuery documentation regarding the .text() function:

We need to be aware that this method escapes the string provided as necessary so that it will render correctly in HTML. To do so, it calls the DOM method .createTextNode(), does not interpret the string as HTML.

Previous Versions of the jQuery Documentation worded it this way (emphasis added):

We need to be aware that this method escapes the string provided as necessary so that it will render correctly in HTML. To do so, it calls the DOM method .createTextNode(), which replaces special characters with their HTML entity equivalents (such as &lt; for <).

Domenech answered 4/6, 2011 at 5:1 Comment(2)

You can even use it on a fresh element if you just want to convert like this: const str = "foo<>'\"&"; $('<div>').text(str).html() yields foo&lt;&gt;'"&amp; – Mariselamarish 14/11, 2017 at 21:46

Note that this leaves quotes ' and " unescaped, which may trip you up – Novitiate 4/9, 2021 at 8:23

This is, by far, the fastest way I have seen it done. Plus, it does it all without adding, removing, or changing elements on the page.

function escapeHTML(unsafeText) {
    let div = document.createElement('div');
    div.innerText = unsafeText;
    return div.innerHTML;
}

Bayle answered 2/1, 2018 at 0:11 Comment(4)

Warning: it does not escape quotes so you can't use the output inside attribute values in HTML code. E.g. var divCode = '<div data-title="' + escapeHTML('Jerry "Bull" Winston') + '">Div content</div>' will yield invalid HTML! – Auk 17/7, 2019 at 11:59

Using div.textContent instead of div.innerText would probably be more idiomatic. – Brantbrantford 27/11, 2021 at 20:16

Just wondering, would repeatedly calling this eventually leave document full of extra div elements? Or does it get garbage collected? – Rugen 1/2, 2022 at 13:16

@Rugen The div isn't attached to the DOM, so it will eventually be garbage collected. So no, this will not fill the document with useless elements. – Mandamus 14/7, 2022 at 11:33
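To make the quote warning concrete, here is a sketch that runs without a DOM. escapeText is a hypothetical stand-in that only approximates what the div trick produces (it handles &, < and > and nothing else), and escapeAttr is an illustrative name for the version that also covers quotes.

```javascript
// Hypothetical stand-in for the DOM-based trick: escapes &, < and > only.
function escapeText(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Additionally escapes both quote characters, for use inside quoted attribute values.
function escapeAttr(s) {
  return escapeText(s).replace(/"/g, '&quot;').replace(/'/g, '&#039;');
}

const title = 'Jerry "Bull" Winston';

// The raw quotes end the attribute early, so the markup is malformed.
const broken = '<div data-title="' + escapeText(title) + '">Div content</div>';

// With quotes escaped the attribute value stays intact.
const fixed = '<div data-title="' + escapeAttr(title) + '">Div content</div>';

console.log(broken);
console.log(fixed);
```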

I think I found the proper way to do it...

// Create a DOM Text node:
var text_node = document.createTextNode(unescaped_text);

// Get the HTML element where you want to insert the text into:
var elem = document.getElementById('msg_span');

// Optional: clear its old contents
//elem.innerHTML = '';

// Append the text node into it:
elem.appendChild(text_node);

Pavel answered 7/8, 2013 at 16:16 Comment(4)

I learnt something new about HTML today. w3schools.com/jsref/met_document_createtextnode.asp. – Prinz 27/6, 2018 at 22:39

Be aware that the content of the text node is not escaped if you try to access it like this: document.createTextNode("<script>alert('Attack!')</script>").textContent – Vetavetch 14/3, 2019 at 15:2

This is the correct way if all you're doing is setting text. That's also textContent but apparently it's not well supported. This won't work however if you're building up a string with some parts text some html, then you need to still escape. – Arvo 16/10, 2019 at 10:29

I really like this, because it's using the DOM properly. It feels less "hacky" than most of the other options. – Groats 18/6, 2021 at 15:14
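For the case raised above, where you are building a string that mixes real markup with untrusted text, one option is a tagged template that escapes only the interpolated values. This is a sketch under assumptions: escapeHtml and html are names chosen for illustration, and it only protects values placed between tags or inside quoted attributes.

```javascript
// Hypothetical escape helper for the five usual characters.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

// Tagged template: literal parts are treated as trusted markup,
// interpolated values are escaped before being joined in.
function html(strings, ...values) {
  return strings.reduce(
    (out, part, i) => out + part + (i < values.length ? escapeHtml(values[i]) : ''),
    ''
  );
}

const userName = '<script>alert("oops")</script>';
const markup = html`<p>Hello, ${userName}!</p>`;

console.log(markup);
```

The upside of this shape is that forgetting to escape becomes harder, since every ${...} goes through the helper by default.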

It was interesting to find a better solution:

var escapeHTML = function(unsafe) {
  return unsafe.replace(/[&<"']/g, function(m) {
    switch (m) {
      case '&':
        return '&amp;';
      case '<':
        return '&lt;';
      case '"':
        return '&quot;';
      default:
        return '&#039;';
    }
  });
};

I do not parse > because it does not break XML/HTML code in the result.

Here are the benchmarks: http://jsperf.com/regexpairs Also, I created a universal escape function: http://jsperf.com/regexpairs2

Deliquescence answered 11/2, 2015 at 15:41 Comment(4)

It's interesting to see that using the switch is significantly faster than the map. I didn't expect this! Thanks for sharing! – Incipit 16/6, 2017 at 20:35

There are many many more unicode characters than you could possibly code & take into account. I wouldn't recommend this manual method at all. – Fleabitten 13/6, 2018 at 13:31

Why would you escape multi-byte characters at all? Just use UTF-8 everywhere. – Cupel 20/4, 2019 at 14:40

Skipping > can potentially break code. You must keep in mind that inside the <> is also html. In that case skipping > will break. If you're only escaping for between tags then you probably only need escape < and &. – Arvo 16/10, 2019 at 10:32

The most concise and performant way to display unencoded text is to use the textContent property.

Faster than using innerHTML. And that's without taking into account escaping overhead.

document.body.textContent = 'a <b> c </b>';

Bioluminescence answered 29/11, 2017 at 2:57 Comment(1)

@ZzZombo, it is completely normal that it doesn't work with style and script tags. When you add content to them, you add code, not text, use innerHTML in this case. Moreover, you don't need to escape it, these are two special tags that are not parsed as HTML. When parsing, their content is treated as text until the closing sequence </ is met. – Bioluminescence 25/12, 2017 at 16:47

By the books

When editing HTML attributes use recommended "HTML Attribute Encoding":

OWASP recommends that "[e]xcept for alphanumeric characters, [you should] escape all characters with ASCII values less than 256 with the &#xHH; format (or a named entity if available) to prevent switching out of [an] attribute."

So here's a function that does that, with a usage example:

function escapeHTML(unsafe) {
  return unsafe.replace(
    /[\u0000-\u002F\u003A-\u0040\u005B-\u0060\u007B-\u00FF]/g,
    c => '&#' + ('000' + c.charCodeAt(0)).slice(-4) + ';'
  )
}

document.querySelector('div').innerHTML =
  '<span class=' +
  escapeHTML('"fakeclass" onclick="alert("test")') +
  '>' +
  escapeHTML('<script>alert("inspect the attributes")\u003C/script>') +
  '</span>'

<div></div>

You should verify the entity ranges I have provided to validate the safety of the function yourself. You could also use this regular expression which has better readability and should cover the same character codes, but is about 10% less performant in my browser:

/(?![0-9A-Za-z])[\u0000-\u00FF]/g
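The function above emits zero-padded decimal references, whereas the OWASP wording quoted earlier mentions the &#xHH; hex format. A sketch of that variant, reusing the more readable regex; escapeAttrHex is a made-up name, and this has not been checked against any official OWASP implementation.

```javascript
// Sketch: encode every non-alphanumeric character below code 256 as &#xHH;.
// escapeAttrHex is an illustrative name, not an OWASP-provided function.
function escapeAttrHex(unsafe) {
  return String(unsafe).replace(
    /(?![0-9A-Za-z])[\u0000-\u00FF]/g,
    (c) => '&#x' + c.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0') + ';'
  );
}

console.log(escapeAttrHex('a b"c'));
console.log(escapeAttrHex('" onclick="alert(1)'));
```

Characters at code 256 and above pass through untouched, same as in the decimal version.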

When editing HTML content between `<tags>`, use "HTML Entity Encoding":

For this, OWASP recommends you to "look at the .textContent attribute as it is a Safe Sink and will automatically HTML Entity Encode."

Slowly answered 4/3, 2021 at 19:40 Comment(0)

DOM Elements support converting text to HTML by assigning to innerText. innerText is not a function but assigning to it works as if the text were escaped.

document.querySelectorAll('#id')[0].innerText = 'unsafe " String >><>';

Inmesh answered 21/8, 2017 at 10:27 Comment(2)

At least in Chrome assigning multiline text adds <br> elements in place of newlines, that can break certain elements, like styles or scripts. The createTextNode is not prone to this problem. – Pavis 25/12, 2017 at 4:30

innerText has some legacy/spec issues. Better to use textContent. – Polychromy 11/8, 2019 at 3:35

You can encode every character in your string:

function encode(e){return e.replace(/[^]/g,function(e){return"&#"+e.charCodeAt(0)+";"})}

Or just target the main characters to worry about (&, linebreaks, <, >, " and ') like:

function encode(r){
return r.replace(/[\x26\x0A\<>'"]/g,function(r){return"&#"+r.charCodeAt(0)+";"})
}

test.value=encode('How to encode\nonly html tags &<>\'" nice & fast!');

/*************
* \x26 is &ampersand (it has to be first),
* \x0A is newline,
*************/

<textarea id=test rows="9" cols="55">&#119;&#119;&#119;&#46;&#87;&#72;&#65;&#75;&#46;&#99;&#111;&#109;</textarea>

Loferski answered 26/7, 2015 at 13:54 Comment(1)

Writing your own escape function is generally a bad idea. Other answers are better in this regard. – Spoor 13/10, 2016 at 12:29

If you already use modules in your application, you can use escape-html module.

import escapeHtml from 'escape-html';
const unsafeString = '<script>alert("XSS");</script>';
const safeString = escapeHtml(unsafeString);

Unconsidered answered 11/3, 2020 at 15:13 Comment(0)

I came across this issue when building a DOM structure. This question helped me solve it. I wanted to use a double chevron as a path separator, but appending a new text node directly resulted in the escaped character code showing, rather than the character itself:

var _div = document.createElement('div');
var _separator = document.createTextNode('&raquo;');
//_div.appendChild(_separator); /* This resulted in '&raquo;' being displayed */
_div.innerHTML = _separator.textContent; /* This was key */

Conal answered 30/7, 2019 at 8:36 Comment(0)

For a quick one-liner, the following works:

const escaped = new Option(unescaped).innerHTML;

For example:

const unescaped = "<h1>Header</h1>";
const escaped = new Option(unescaped).innerHTML; // "&lt;h1&gt;Header&lt;/h1&gt;"

Frequentative answered 13/3 at 19:24 Comment(0)

Just write the code in between <pre><code class="html-escape">....</code></pre>. Make sure you add the class name in the code tag. It will escape all the HTML snippet written in
<pre><code class="html-escape">....</code></pre>.

const escape = {
    '"': '&quot;',
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
}
const codeWrappers = document.querySelectorAll('.html-escape')
if (codeWrappers.length > 0) {
    codeWrappers.forEach(code => {
        const htmlCode = code.innerHTML
        const escapeString = htmlCode.replace(/"|&|<|>/g, function (matched) {
            return escape[matched];
        });
        code.innerHTML = escapeString
    })
}

<pre>
    <code class="language-html html-escape">
        <div class="card">
            <div class="card-header-img" style="background-image: url('/assets/card-sample.png');"></div>
            <div class="card-body">
                <p class="card-title">Card Title</p>
                <p class="card-subtitle">Secondary text</p>
                <p class="card-text">Greyhound divisively hello coldly wonderfully marginally far upon
                    excluding.</p>
                <button class="btn">Go to </button>
                <button class="btn btn-outline">Go to </button>
            </div>
        </div>
    </code>
</pre>

Overact answered 14/4, 2021 at 8:41 Comment(0)

-1

Use this to remove HTML tags from a string in JavaScript:

const strippedString = htmlString.replace(/(<([^>]+)>)/gi, "");

console.log(strippedString);

Indenture answered 10/9, 2020 at 14:40 Comment(1)

Escaping does not mean removing – Allies 20/10, 2021 at 19:57
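A quick side-by-side of the two, using the regex from the answer above. The sample input is invented; note that stripping still leaves the lone < in place, which is one reason it is not a substitute for escaping.

```javascript
const input = 'Use <b>bold</b> when 1 < 2';

// Stripping deletes anything that looks like a tag; the markup is lost
// and a stray "<" that is not part of a tag is left as-is.
const stripped = input.replace(/(<([^>]+)>)/gi, '');

// Escaping keeps every character but makes it render as literal text.
const escaped = input
  .replace(/&/g, '&amp;')
  .replace(/</g, '&lt;')
  .replace(/>/g, '&gt;');

console.log(stripped); // Use bold when 1 < 2
console.log(escaped);  // Use &lt;b&gt;bold&lt;/b&gt; when 1 &lt; 2
```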

-2

Try this, using the prototype.js library:

string.escapeHTML();

Try a demo

Geldens answered 16/4, 2014 at 20:48 Comment(1)

This requires the "prototype.js" library, which wasn't immediately apparent from the demo. :( – Jubilate 1/8, 2014 at 19:56

-7

I came up with this solution.

Let's assume that we want to add some HTML to the element with unsafe data from the user or database.

var unsafe = 'some unsafe data like <script>alert("oops");</script> here';

var html = '';
html += '<div>';
html += '<p>' + unsafe + '</p>';
html += '</div>';

element.html(html);

It's unsafe against XSS attacks. Now add this: $(document.createElement('div')).html(unsafe).text();

So it is

var unsafe = 'some unsafe data like <script>alert("oops");</script> here';

var html = '';
html += '<div>';
html += '<p>' + $(document.createElement('div')).html(unsafe).text() + '</p>';
html += '</div>';

element.html(html);

To me this is much easier than using .replace() and it'll remove!!! all possible HTML tags (I hope).

Delphina answered 30/3, 2016 at 9:53 Comment(2)

This is a dangerous idea: it parses the unsafe HTML string as HTML, and if the element were attached to the DOM it would execute. Use .innerText instead. – Inmesh 21/8, 2017 at 10:21

This is not safe. It converts &lt;script&gt; into <script>. – Thoughtless 24/1, 2018 at 17:31
