JavaScript sanitization: the safest way to insert a string that may contain XSS HTML

3

6

Currently I'm using this jQuery-based method to clean a string of possible XSS attacks.

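sanitize: function (str) {
    // return htmlentities(str, 'ENT_QUOTES');
    // .text() stores str as plain text; .html() reads it back with &, < and > escaped.
    // .html() leaves quotes alone, so they are replaced by hand afterwards.
    return $('<div></div>').text(str).html().replace(/"/g, '&quot;').replace(/'/g, '&apos;');
}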

But I have a feeling it's not safe enough. Am I missing something?

I have tried htmlentities from the phpjs project: http://phpjs.org/functions/htmlentities:425/

But it seems buggy and returns some additional special symbols. Maybe it's an old version?

For example:

htmlentities('test"','ENT_QUOTES');

Produces:

test&amp;quot;

But should be:

test&quot;

How are you handling this in JavaScript?

Andy answered 2/7, 2012 at 10:48 Comment(5)
How do you intend to use the "sanitized" string? - Enumerate
Insert it into the HTML document as text, of course: as href="sanitized" or src="sanitized", or <div>sanitized</div> - Andy
Where is the insert triggered from? Do you want to insert the string into an already opened page dynamically using JavaScript, or into the server-generated HTML document using PHP? - Enumerate
Yes, dynamically using JavaScript. The string comes from an untrusted source. - Andy
Use Caja's html_sanitize.js. #12254186 - Thurston
3

If your string is supposed to be plain text without HTML formatting, just use .createTextNode(text), or assign to the .data property of an existing text node. Whatever you put there will always be interpreted as text and needs no additional escaping.
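
A minimal sketch of that approach, for illustration only. The helper name appendAsText and the sample payload are assumptions, not part of the answer; the DOM calls (createTextNode, appendChild, the .data property) are the standard ones the answer refers to.

```javascript
// Hypothetical helper (the name is not from the answer): adds untrusted
// input to a container element as a text node, never as markup.
function appendAsText(container, untrusted) {
  var doc = container.ownerDocument;
  var node = doc.createTextNode(String(untrusted));
  container.appendChild(node);
  return node;
}

// Browser-only usage; skipped where there is no DOM (for example in Node.js).
if (typeof document !== 'undefined') {
  var box = document.createElement('div');
  var textNode = appendAsText(box, '<img src=x onerror=alert(1)>');
  // Overwriting .data of an existing text node is equally inert.
  textNode.data = '"quotes" & <tags> stay literal';
}
```

Because the string never passes through the HTML parser, quotes, angle brackets and ampersands are shown literally rather than interpreted.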

Mover answered 2/7, 2012 at 11:7 Comment(2)
What about ' or " and other symbols that I may not know about? - Andy
I wrote "whatever" and it is "whatever" indeed. You'd be manipulating a field in the DOM structure that can hold only text. Those operations won't ever cause any automagical invocation of the HTML parser the way the famous innerHTML does (this, BTW, is still considered one of the worst design flaws of that feature). - Mover
3

Yes, dynamically using JavaScript. The string comes from an untrusted source.

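Then you don't need to sanitize it manually. With jQuery you can just write:

var str = '<div>abc"def"ghi</div>';

// Assumes the target element has id="test"; adjust the selector to your markup.
$('#test').text(str);         // inserted as text, never parsed as HTML
$('#test').attr('alt', str);  // set as an attribute value, quotes included

The browser will separate the data from the code for you.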

Example: http://jsfiddle.net/HNQvd/

Enumerate answered 2/7, 2012 at 11:14 Comment(4)
While alt is safe, you will still have problems with attributes that can be interpreted as non-text down the line. Most obviously onclick and other handlers, but also src. - Mover
Everything could be considered harmful. If the OP is going to change arbitrary attributes (with user-supplied names), then they have a much bigger problem than just sanitizing the values. - Enumerate
People, I have a working solution in my post. I just want you to update it if it has any vulnerabilities when used inside <div>HERE</div>, <a href="HERE"></a>, or <img src="HERE"/>. That's all. :) - Andy
@Beck Once again, you should not reinvent the wheel (your version may have some vulnerabilities); you should rather use a well-supported standard solution that separates the data from the code. For example, instead of trying hard to sanitize the string and then writing your sort-of-sanitized result into innerHTML, leave the source string as-is and write it into innerText. - Enumerate
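
On the href and src cases raised in the comments above: entity-escaping does not stop a value such as javascript:alert(1), so the URL scheme needs its own check. A minimal sketch, assuming only http and https targets are wanted; safeUrl is a hypothetical helper and the example.com base is a placeholder. It relies on the standard URL constructor available in current browsers and Node.js.

```javascript
// Hypothetical helper: returns a normalized URL string if the untrusted value
// resolves to http: or https:, otherwise null.
function safeUrl(untrusted, base) {
  var parsed;
  try {
    // The URL parser lowercases the scheme and strips leading whitespace,
    // so disguised schemes are judged the same way a browser would read them.
    parsed = new URL(String(untrusted), base);
  } catch (e) {
    return null; // not parseable as a URL at all
  }
  if (parsed.protocol === 'http:' || parsed.protocol === 'https:') {
    return parsed.href;
  }
  return null; // javascript:, data:, vbscript: and anything else is rejected
}

// Placeholder base used to resolve relative links.
var BASE = 'https://example.com/';
var accepted = safeUrl('/images/a.png', BASE);
var rejected = safeUrl(' JaVaScRiPt:alert(1)', BASE);
```

A non-null result could then be passed to something like .attr('href', value); a null result means the link should be dropped or replaced with a safe default.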
1

You should escape other characters too:

'
"
<
>
(
)
;

All of them can be used in XSS attacks.
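
A DOM-free sketch of an escaper covering the characters listed above, plus the ampersand. escapeHtml and ESCAPES are hypothetical names, not taken from the thread or from any library. The parentheses and semicolon are included only because this answer lists them; for HTML text and quoted attribute values, the characters the HTML parser itself treats specially are &, <, >, " and '.

```javascript
// Hypothetical lookup table: each character maps to an HTML entity.
var ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
  '(': '&#40;',
  ')': '&#41;',
  ';': '&#59;'
};

// Hypothetical helper: escapes untrusted input for use as HTML text or
// inside a quoted attribute value.
function escapeHtml(str) {
  // A single pass over the input: entities produced here are never scanned
  // again, so output such as &quot; cannot be double-encoded to &amp;quot;.
  return String(str).replace(/[&<>"'();]/g, function (ch) {
    return ESCAPES[ch];
  });
}

var escaped = escapeHtml('test"'); // 'test&quot;'
```

Escaping of this kind protects the HTML context only; it does not make a value safe as a URL in href or src, or as code in an event-handler attribute.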

Mure answered 2/7, 2012 at 11:11 Comment(0)
