Is it better to escape/encode user input before storing it in the database, or to store it as is and escape it when retrieving it?
M

4

7

I am using the htmlspecialchars() function to prevent XSS attacks. I am unsure which of the following is the better way to store the data in the database.

Method 1: Store the user input after applying htmlspecialchars(). With this, the user input "<script>" becomes "&lt;script&gt;".

Method 2: Store the user input as it is and apply htmlspecialchars() when retrieving the data and displaying it on the page.

The reason for my doubt is that I believe method 1 adds overhead to the database, while with method 2 the data needs to be converted again and again each time it is requested through PHP. So I am not sure which one is better.

For more information, I am using htmlspecialchars($val, ENT_QUOTES, "UTF-8") so that will convert ' and " as well.

Please help me clear my doubt. Also provide explanation if possible.

Thanks.

Madeline answered 1/3, 2012 at 8:28 Comment(0)
J
7

An even better reason is that on truncating to fit a certain space you'll get stuck with abominations such as "&quo...". Resist the temptation to fiddle with your data more than the minimum required. If you're worried about reprocessing the data, cache it.
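
The truncation problem can be reproduced with a short sketch. The sample string, the length limit and the `truncate()` helper are assumptions made up for illustration; the helper falls back to `substr()` only when the mbstring extension is missing.

```php
<?php
// Hypothetical helper: cut a string to $length characters.
// Uses mb_substr() when mbstring is loaded, otherwise plain substr().
function truncate(string $value, int $length): string
{
    return function_exists('mb_substr')
        ? mb_substr($value, 0, $length, 'UTF-8')
        : substr($value, 0, $length);
}

// Hypothetical user input; any string containing quotes would do.
$raw   = 'He said "hello" to everyone';
$limit = 12;

// Method 1: the stored value is already escaped, so the cut can land inside an entity.
$storedEscaped = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
$brokenPreview = truncate($storedEscaped, $limit) . '...';

// Method 2: truncate the raw value first, then escape the result for HTML.
$safePreview = htmlspecialchars(truncate($raw, $limit), ENT_QUOTES, 'UTF-8') . '...';

echo $brokenPreview, "\n"; // He said &quo...
echo $safePreview, "\n";   // He said &quot;hel...
```

With the raw value stored, the cut always falls between whole characters and the entity is produced intact afterwards.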

Jello answered 1/3, 2012 at 8:30 Comment(0)
H
13
  1. Why do you expect that you will always use the data in an HTML context? "I <3 you" and "I &lt;3 you" are not the same data. Therefore, store the data in the database in the form it was intended. There's no reason to store it escaped.
  2. HTML-escaping the data when, and only when, it is necessary means you always know exactly what you're doing. This:

    echo htmlspecialchars($data);
    

    is a lot better than:

    echo $data; // The data should already come escaped from the database.
                // I hope.
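
As a rough sketch of both points above, the same raw value can be sent to several output contexts, each with its own encoding applied at output time. The `e()` helper and the sample comment are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: escape a value for an HTML text or attribute context.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Pretend this came back from the database exactly as the user typed it.
$comment = 'I <3 you';

// HTML context: escape at the moment of output.
$html = '<p>' . e($comment) . '</p>';

// JSON context: the same raw value, encoded by json_encode() instead.
$json = json_encode(['comment' => $comment]);

// Plain-text context (for example an email body): no HTML escaping at all.
$text = 'New comment: ' . $comment;

echo $html, "\n"; // <p>I &lt;3 you</p>
echo $json, "\n";
echo $text, "\n"; // New comment: I <3 you
```

Had the value been stored as "I &lt;3 you", the JSON and plain-text outputs would each need to undo the HTML escaping first.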
    
Haarlem answered 1/3, 2012 at 8:33 Comment(3)
If there is a bug in the escaping function, how do you fix the problem? Edit the entire database? - Esurient
@Esurient of course, but that's got nothing to do with this case in particular. If faulty data is written, you'll have to repair the content when you find out. That's true everywhere you write to a database (or to any document). - Riproaring
@Mr Lister: It's very relevant to the question asked. If you store data unescaped and escape on display, you don't have to touch the database when the faulty escaping logic is discovered. You only change the escaping logic. In most cases this is simpler to do than fixing the data in the database. - Esurient
R
4

My recommendation is to store the data in the database in its purest form. The only reason you would want to convert it into &lt;script&gt; is that you'll need to display it in an HTML document later. But the database itself has no need to know what you do with the data after you retrieve it.

Riproaring answered 1/3, 2012 at 8:36 Comment(0)
T
-1

As well as XSS attacks, shouldn't you also be worried about SQL injection attacks if you're putting user input into a database? In which case, you will want to escape the user input BEFORE putting it into the database anyway.

Toulouselautrec answered 1/3, 2012 at 8:32 Comment(1)
htmlspecialchars() won't stop SQL injection attacks. And besides, everyone should be using parametrized queries anyway. - Jello
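
A minimal sketch of the parametrized-query approach mentioned in the comment, assuming PDO with the pdo_sqlite extension and an in-memory database; the table name, column name and sample input are made up for illustration.

```php
<?php
// Assumption: pdo_sqlite is available; an in-memory database keeps the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)');

// Hypothetical user input containing both SQL and HTML metacharacters.
$input = "Robert'); DROP TABLE comments;-- <script>";

// The value is bound as a parameter, never concatenated into the SQL string.
$insert = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
$insert->execute([':body' => $input]);

// The stored value is the raw input, unchanged.
$stored = $pdo->query('SELECT body FROM comments')->fetchColumn();

// HTML escaping happens only when the value is written into a page.
echo htmlspecialchars($stored, ENT_QUOTES, 'UTF-8'), "\n";
```

The two concerns stay separate: parameter binding handles the SQL context on the way in, and htmlspecialchars() handles the HTML context on the way out.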
