Is using htmlspecialchars() sufficient in all situations?

My users are allowed to insert anything into my database.

So using a whitelist / blacklist of characters is not an option.

I'm not worried about the database end (SQL injection), which I have covered, but rather about code injection in my pages.

Are there any situations where htmlspecialchars() wouldn't be sufficient to prevent code injection?

Wohlert answered 12/11, 2011 at 18:28 Comment(7)
So, are they allowed to enter anything, or are at least angle brackets disallowed?Fissi
@Col.Shrapnel: Anything is allowed.Wohlert
@downvoter: why the downvote? dupe somewhere? tell me please.Wohlert
If anything is allowed, you don't have to replace angle brackets. Make up your mind.Fissi
@Col.Shrapnel: Now why would you say that??? Anything is allowed as input doesn't mean anything should get 'executed' when displayed... E.g. when a user signs up he can use <script>alert('test')</script> as a username, but I don't want that javascript to get executed when another user visits his profile.Wohlert
So JavaScript is not allowed. You are contradicting yourself :)Fissi
@Col.Shrapnel: since you fail to read: Col. you're right! :)Wohlert
U
4

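Plain htmlspecialchars() is not sufficient when inserting user text into single-quoted attributes, because before PHP 8.1 the default flags (ENT_COMPAT) leave single quotes unescaped. In that case you need to pass ENT_QUOTES, and you should pass the encoding explicitly as well.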

<tag attr='<?php echo htmlspecialchars($usertext); ?>'> <!-- dangerous if ENT_QUOTES is not used -->
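
To make the difference concrete, here is a minimal sketch comparing ENT_COMPAT with ENT_QUOTES. The payload in `$usertext` and the `<span title=...>` markup are made up for illustration; only `htmlspecialchars()` and its flags come from PHP itself.

```php
<?php
// Hypothetical user input that tries to break out of a single-quoted attribute.
$usertext = "x' onmouseover='alert(1)";

// ENT_COMPAT (the default before PHP 8.1) escapes double quotes only,
// so the single quotes in the payload survive.
$unsafe = htmlspecialchars($usertext, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES escapes both quote styles, so the value stays inside the attribute.
$safe = htmlspecialchars($usertext, ENT_QUOTES, 'UTF-8');

echo "<span title='" . $unsafe . "'>unsafe</span>\n";
echo "<span title='" . $safe . "'>safe</span>\n";
```

In the first line of output the injected `onmouseover` becomes a real attribute; in the second it stays part of the `title` value.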

When inserting user text into JavaScript/JSON as a string, you'll need additional escaping.
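
One common way to do that additional escaping is `json_encode()` with the `JSON_HEX_*` flags. The sketch below assumes the value ends up inside an inline `<script>` block; the input string and the variable names are hypothetical.

```php
<?php
// Hypothetical user input containing a quote, a newline and a closing script tag.
$usertext = "O'Reilly\n</script><script>alert(1)</script>";

// json_encode() yields a quoted JavaScript string literal; the JSON_HEX_* flags
// additionally turn <, >, &, ' and " into \uXXXX escapes.
$flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT;
$jsLiteral = json_encode($usertext, $flags);

echo "<script>var name = " . $jsLiteral . ";</script>\n";
```

Because the literal contains no raw `<`, `>` or quote characters from the input, the user text cannot close the `<script>` element or the string. If the same literal goes into an HTML attribute such as `onclick`, it would still need `htmlspecialchars()` on top.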

I think it fails for strange character sets too. But if you use one of the usual charsets (UTF-8, Latin1, ...) it will work as expected.
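
A related pitfall, shown here only as a sketch: input that is not valid in the declared encoding. This is not necessarily the failure meant above, and the malformed byte sequence is a made-up example. Without ENT_SUBSTITUTE or ENT_IGNORE, `htmlspecialchars()` returns an empty string for such input.

```php
<?php
// "\xC3\x28" is not a valid UTF-8 sequence (hypothetical malformed input).
$usertext = "bad byte: \xC3\x28";

// Without ENT_SUBSTITUTE or ENT_IGNORE, invalid input yields an empty string.
$strict = htmlspecialchars($usertext, ENT_QUOTES, 'UTF-8');

// ENT_SUBSTITUTE replaces the invalid sequence with U+FFFD instead.
$lenient = htmlspecialchars($usertext, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

var_dump($strict === '');
var_dump($lenient !== '');
```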

Unmannered answered 12/11, 2011 at 18:37 Comment(1)
I hate single quoted attributes :) Thanks for the heads up on the JS part since it will be the case in some point in the future.Wohlert
L
3

Using htmlspecialchars is sufficient when inserting text into HTML element content, i.e. between tags. The way it encodes the characters makes it impossible for the resulting text to “break out” of the current element, so it can create neither other elements nor script segments.

However, in all other situations htmlspecialchars is not automatically enough. For example, when you use it to fill a JavaScript string, you will need additional methods to make it safe. In that case addslashes could help, although it does not escape newlines or a literal </script>, so json_encode is the more reliable choice.

So depending on where you insert the resulting text, htmlspecialchars gives you either enough security or not. As the function name already suggests, it just promises security for HTML.

Listel answered 12/11, 2011 at 18:36 Comment(0)
C
3

No, it's not sufficient in all situations. It highly depends on your codebase. For example, if you use JavaScript to make certain AJAX requests to a database, htmlspecialchars() will sometimes not be enough (depending on where you use it). If you want to protect cookies from being read by JavaScript in an XSS attack, htmlspecialchars() will also not be good enough.
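
For the cookie case, the usual complement to output escaping is the HttpOnly cookie flag. The sketch below is an assumption about how that could look, not something from this answer: `session_cookie_options()`, the cookie name and `$token` are hypothetical, and the array form of `setcookie()` needs PHP 7.3 or later.

```php
<?php
// Hypothetical helper: options for a session cookie that page scripts cannot read.
function session_cookie_options(): array
{
    return [
        'expires'  => 0,      // 0 = cookie lasts until the browser closes
        'path'     => '/',
        'secure'   => true,   // assumes the site is served over HTTPS
        'httponly' => true,   // hides the cookie from document.cookie
        'samesite' => 'Lax',
    ];
}

// Usage inside a request, before any output is sent:
// setcookie('session', $token, session_cookie_options());
```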

Here are some examples of when htmlspecialchars() may fail: https://www.owasp.org/index.php/Interpreter_Injection#Why_htmlspecialchars_is_not_always_enough. Your question is also highly dependent on what database you're using (not everyone uses MySQL). If you're writing a complex application, I highly suggest using one of the many frameworks out there that abstract these annoying little idiosyncrasies and let you worry about the application code.

Carnelian answered 12/11, 2011 at 18:36 Comment(2)
Why is the database relevant? DB access and encoding output are separate concerns.Unmannered
As far as Postgres is concerned it doesn't matter (they both use SQL) - but some databases don't use SQL.Carnelian
R
2

htmlspecialchars will suffice. With < and > being converted to &lt; and &gt; you cannot include scripts anymore.

Rigel answered 12/11, 2011 at 18:35 Comment(6)
Depends on where the user text is inserted into the html. If it's outside a tag/script it will be enough.Unmannered
True, forgot to mention that. If you don't want the content to be interpreted as a script, you should not include it in script tags.Rigel
Not true for attribute values.Threaten
@Threaten when we output user's input back into the front end, it's because we want to show their profile or their comments on the front end. I don't see a scenario why I would want to insert what my user freely typed into my <script> tag or html attribute.Stability
@DanielWu I was so brief that my point was lost. I would say more fully "In response to the original answer, there are unfortunately edge cases to be aware of where htmlspecialchars is not enough because valid escaped html is treated differently. One example of this is in html attributes, where attributes can get interpreted differently by the browser. As a result, in a handful of scenarios, beware that it is not enough to simply escape everything."Threaten
One simple use case is letting a user specify an image url and then using that data in the <img src= attribute. Even if you escape for html, attributes are treated differently by the browser, so XSS may still be able to occur.Threaten
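
A sketch of the URL case, under the assumption that only absolute http(s) image URLs should be accepted. `safe_image_url()` is a hypothetical helper, not a PHP built-in. The point is that a `javascript:` URL contains nothing for `htmlspecialchars()` to escape, so the scheme has to be checked separately.

```php
<?php
// Hypothetical helper: accept only http(s) URLs for use in an <img src> attribute.
function safe_image_url(string $url): ?string
{
    $url = trim($url);
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if (!is_string($scheme) || !in_array(strtolower($scheme), ['http', 'https'], true)) {
        return null; // rejects javascript:, data:, relative URLs and malformed input
    }
    // The scheme is acceptable; the value still needs escaping for the attribute.
    return htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
}

$evil = "javascript:alert(1)";
// Escaping alone leaves the javascript: URL untouched.
var_dump(htmlspecialchars($evil, ENT_QUOTES, 'UTF-8') === $evil);
var_dump(safe_image_url($evil));
var_dump(safe_image_url('https://example.com/a.png?x=1&y=2'));
```

An allowlist of schemes is used here rather than a blocklist, since it fails closed on anything unexpected.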
