Htmlentities vs addslashes vs mysqli_real_escape_string
Asked Answered
N

7

7

I've been doing some reading on securing PHP applications, and it seems to me that mysqli_real_escape_string is the correct function to use when inserting data into MySQL tables because addslashes can cause some weird things to happen for a smart attacker. Right?

However, there is one thing that is confusing me. I seem to remember being advised addslashes is better than htmlentities when echoing user-entered data back to users to protect their data, but it seems like addslashes is the one with the vulnerability. Is this true, or am I remembering incorrectly?

Nodular answered 6/2, 2010 at 18:43 Comment(0)
L
6

There are different contexts for your data. The context of inserting data into the database needs to be escaped differently than the context of rendering html/xml or even an email message.

Escaping data going into a db should be deprecated in all new code in favor of prepared statements. Anyone who tells you otherwise is doing you a great disservice.

Escaping data going to the browser needs to be escaped in a number of different ways depending on the target. Sometimes htmlspecialchars is enough, sometimes you need to use htmlentities. Sometimes you need numeric entities. It is a topic you should do some research on to know all of the nuances.

The general rule I live by is validate (not filter, reject if incorrect) input & escape output (based on context).

Lincoln answered 6/2, 2010 at 18:55 Comment(4)
Speedwise, prepared statements are only better if you're using the prepared query multiple times in a single script. Otherwise, you're making two round trips to the server, once to prepare, and once to execute. If you are already using code that automatically ensures input validation, I don't know why prepared statements are better. the mysql docs cover this. dev.mysql.com/tech-resources/articles/4.1/… If I'm wrong (it's possible and probable), please tell me why.Craver
Greets Mike. I've seen this counter argument too many times and I think it needs to be put to rest. Please understand, that I don't disagree with what you've said. I do however disagree with this advice in a forum like this. I think safety should be a higher priority than error prone performance in this case. The OP was not complaining about his site, but rather a lack of clear definition between the different escaping techniques. Would you not agree that someone beginning on this journey should take the safer road?Lincoln
I asbolutely agree. I was asking about performance for myself because I'm not sure of the answer. I didn't make that clear.Craver
shiflett.org/blog/2006/jan/… has been fixed, try out the exploit on a fully patched php5 mysql5.Vanettavang
I
12

They are different tools for different purposes.

mysqli_real_escape_string makes data safe for inserting into MySQL (but parametrized queries are better).

Htmlentities makes data safe for outputting into an HTML document

addslashes makes data safe for a few other situations, but is insufficient for MySQL

Inanimate answered 6/2, 2010 at 18:49 Comment(3)
You should post a case where addslashes() fails and mysql_real_escape_string() stops it. I don't think you can, but here is a query where they both fail: mysql_query("select * from user where id=".$_GET[id]);Vanettavang
addslashes assumes everything is 8bit. mysql_real_escape_string takes the character encoding into account when doing its encoding. Proof: shiflett.org/blog/2006/jan/…Lincoln
@Eric, that has been fixed. Try out his exploit on a fully patched system, it will fail.Vanettavang
L
6

There are different contexts for your data. The context of inserting data into the database needs to be escaped differently than the context of rendering html/xml or even an email message.

Escaping data going into a db should be deprecated in all new code in favor of prepared statements. Anyone who tells you otherwise is doing you a great disservice.

Escaping data going to the browser needs to be escaped in a number of different ways depending on the target. Sometimes htmlspecialchars is enough, sometimes you need to use htmlentities. Sometimes you need numeric entities. It is a topic you should do some research on to know all of the nuances.

The general rule I live by is validate (not filter, reject if incorrect) input & escape output (based on context).

Lincoln answered 6/2, 2010 at 18:55 Comment(4)
Speedwise, prepared statements are only better if you're using the prepared query multiple times in a single script. Otherwise, you're making two round trips to the server, once to prepare, and once to execute. If you are already using code that automatically ensures input validation, I don't know why prepared statements are better. the mysql docs cover this. dev.mysql.com/tech-resources/articles/4.1/… If I'm wrong (it's possible and probable), please tell me why.Craver
Greets Mike. I've seen this counter argument too many times and I think it needs to be put to rest. Please understand, that I don't disagree with what you've said. I do however disagree with this advice in a forum like this. I think safety should be a higher priority than error prone performance in this case. The OP was not complaining about his site, but rather a lack of clear definition between the different escaping techniques. Would you not agree that someone beginning on this journey should take the safer road?Lincoln
I asbolutely agree. I was asking about performance for myself because I'm not sure of the answer. I didn't make that clear.Craver
shiflett.org/blog/2006/jan/… has been fixed, try out the exploit on a fully patched php5 mysql5.Vanettavang
L
0

You could also use the PDO libs which does most of the escaping for you, in case you may use PHP5 on the servers.

On echoing back I'd personally prefer htmlspecialchars, but one might correct me

Lagan answered 6/2, 2010 at 18:45 Comment(0)
C
0

yes, use the mysqli_real_escape_string or a library like PDO on all user input. When echoing back, I use htmlentities with ENT_QUOTES as the second parameter, as it escapes all applicable characters to their html entities, including quotes.

Craver answered 6/2, 2010 at 18:49 Comment(0)
D
0

Note: Using htmlentities() in an UTF-8 encoded document should be avoided. See:

Pay attention to (quoted from phpwact.org):

With modern web browsers and widespead support for UTF-8, you don’t need htmlentities because all of these characters can be represented directly in UTF-8. More importantly, in general, only browsers support HTML‘s special characters - a normal text editor, for example, is unaware of HTML entities. Depending on what you’re doing, using htmlentities may reduce the ability of other systems to “consume” your content.

Also (not confirmed but sounds reasonable - from anon comment here), character entities (stuff like » or —) do not work when a document is served as application/xml+xhtml (unless you define them). You can still get away with the numeric form though.

Daryldaryle answered 6/2, 2010 at 19:13 Comment(2)
The common use of htmlentities() is not rendering "weird" characters but rather avoid potentially dangerous text to be parsed as markup, and this has nothing to do with encoding.Becoming
Avoiding potentially dangerous text can be achieved with htmlspecialchars(). htmlentities() has the overhead of converting 'safe' characters and it presents additional problems which I mentioned earlier.Daryldaryle
V
0

Another interesting solution for PHP 5.2 and above is to use the filter extension: http://www.php.net/manual/en/book.filter.php

It allows you to validate and sanitize user inputs. There are many built-in filters available and they can be combined with flags to tweak their behaviour. In addition hese filters can also be used to validate/sanitize ints, floats, emails, specific regular expressions.

I personally have started using them in my projects to validate forms and to output user-entered data, and I am very glad I did. Although, when I insert values in a MySQL database, I use prepared queries for added security. These solutions together can help avoid most SQL injections and XSS-type attacks.

Vanzandt answered 6/2, 2010 at 19:24 Comment(0)
V
0

You can't have one "escape" function and expect it to work all of the time. There are different attacks that require specific sanitation routines. The only way to understand this concept is to write some vulnerable code and then exploit it. Writing exploit code is vital to the understanding of any security system.

For instance this query is vulnerable to Sql injection:

$host=htmlspecialchars($_GET[host],ENT_QUOTES);
$name=htmlspecialchars($_GET[name],ENT_QUOTES);
mysql_query("select * from user where Host='$host' and Name='$name' ");

Exploit: http://localhost/sqli_test.php?host=\&name=%20sleep(20)--%201

The best escape function for mysql is mysqli_real_escape_string() but this can fail:

mysql_query("select * from user where id=".mysqli_real_escape_string($_GET[id]));

exploit: http://localhost/sqli_test.php?id=1%20or%20sleep(20)

In fact the best way to take care of sql injection isn't calling an escape function, Its using ADODB's parametrized quires for sql injection. Use htmlspecialcahrs($var,ENT_QUTOES) for XSS. Read the OWASP top 10 because there is a whole lot more than can go wrong with web application security.

Vanettavang answered 6/2, 2010 at 21:27 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.