Sanitize table/column name in Dynamic SQL in .NET? (Prevent SQL injection attacks)
Asked Answered
H

3

12

I am generating some Dynamic SQL and would like to ensure that my code is safe from SQL injection.

For sake of argument here is a minimal example of how it is generated:

var sql = string.Format("INSERT INTO {0} ({1}) VALUES (@value)",
    tableName, columnName);

In the above, tableName, columnName, and whatever is bound to @value come from an untrusted source. Since placeholders are being used @value is safe from SQL injection attacks, and can be ignored. (The command is executed via SqlCommand.)

However, tableName and columnName cannot be bound as placeholders and are therefor vulnerable to injection attacks. Since this a "truly dynamic" scenario, there is no whitelist of tableName or columnName available.

The question is thus:

Is there a standard, built-in way to check and/or sanitize tableName and columnName? (SqlConnection, or a helper class, etc.) If not, what is a good way to perform this task without using a 3rd party library?

Notes:

  • All SQL identifiers, including the schema, should by accepted: e.g. [schema].[My Table].column is just as "safe" as table1.
  • Can either sanitize the identifiers or detect an invalid identifier. (It does not need to ensure that the table/column is actually valid in context; the resulting SQL can be invalid, but must be "safe".)

Update:

Just found this, and thought it was somewhat interesting: There is a SqlFunctions.QuoteName function in .NET4 (EF4?). Okay, it doesn't really help me here...

Hydrograph answered 11/3, 2012 at 1:9 Comment(0)
S
4

Since you are using an SqlConnection, the assumption is that this is an SQL Server database.

Given that assumption, you could validate the table and field names using a regular expression that follows the SQL Server identifier rules as defined in MSDN. While I am a complete and utter novice at regular expressions, I did find this one that should come close:

[\p{L}{\p{Nd}}$#_][\p{L}{\p{Nd}}@$#_]*

However, a regular expression will not address SQL Server keywords and it does not ensure that the table and/or column actually exists (although you indicated that wasn't much of an issue).

If this were my application, I would first ensure the end user was not trying to perform injection by rejecting any request that contained semi-colons (;).

Next, I would validate the table existence by removing the valid name delimiters (", ', [, ]), splitting the table name by a period to see if a schema was specified, and executing a query against INFORMATION_SCHEMA.TABLES to determine the existence of the table.

For example:

SELECT 1 
FROM   INFORMATION_SCHEMA.TABLES 
WHERE  TABLE_NAME = 'tablename' 
AND    TABLE_SCHEMA = 'tableschema'

If you create this query using parameters, then you should further protect yourself from injection.

Finally, I would validate the existence of each column name by performing a similar set of steps, only using INFORMATION_SCHEMA.COLUMNS to determine the validity of the column(s) once the table has been determined to be valid.

I would probably fetch the list of valid columns for this table from SQL Server, then verify that each request column was in the list within my code. That way you could tell exactly which columns were in error and provide that feedback to the user.

Scatology answered 11/3, 2012 at 1:48 Comment(5)
You are correct, it is indeed SQL Server. I am leaning towards a minimally-accepted regular expression route, but I was hoping that something premade (and tested ;-) existed. I really like the additional idea about checking it against the schema metadata.Hydrograph
If you use a parameterized query to test the table/schema, then check the existence of each column name in code against the complete list of column names for the table, then you actually do not need to perform any validity checks on the incoming values, which would be the ultimate in minimality :).Scatology
Well, it's a little larger implementation :)Hydrograph
Thank you for your input. The approach we are now using (that is not smart enough to look at the schema) is to split into components, remove all but alphanumeric and normal spaces, and then rejoin in the "[]" form. Of course it will behave "unexpectedly" on on things like [a . b] (will turn it into [a ].[ b], but I am not sure if the former, or even the latter, is valid anyway...)Hydrograph
Looking at msdn.microsoft.com/en-us/library/ms175874.aspx, identifiers may not start with $, but may start with @. I'm now using ^[\p{L}@_#][\p{L}\p{Nd}$@_#]*$Kwh
F
28

I'm not sure if you're still looking into this, but the DbCommandBuilder class provides a method QuoteIdentifier for this purpose. The main benefits of this are that it's database-independent and doesn't involve any RegEx mess.

As of .NET 4.5, you have everything you need to sanitize table and column names just using your DbConnection object:

DbConnection connection = GetMyConnection(); // Could be SqlConnection
DbProviderFactory factory = DbProviderFactories.GetFactory(connection);

// Sanitize the table name
DbCommandBuilder commandBuilder = factory.CreateCommandBuilder();

string tableName = "This Table Name Is Long And Bad";
string sanitizedTableName = commandBuilder.QuoteIdentifier(tableName);

IDbCommand command = connection.CreateCommand();
command.CommandText = "SELECT * FROM " + sanitizedTableName;

// Becomes 'SELECT * FROM [This Table Name Is Long And Bad]' in MS-SQL,
// 'SELECT * FROM "This Table Name Is Long And Bad"' in Oracle, etc.

(Pre-4.5, you'll need some other way to get your DbProviderFactory -- maybe from the data provider name in your application configuration or hard-coded somewhere.)

Favian answered 27/9, 2012 at 20:0 Comment(8)
Thanks, I didn't even know that class existed. I am still using .NET3.5 however, so no magic factory helper there :( I am sure it will be of use to other people, though (or perhaps for me in a few years).Hydrograph
Oh, you can still do it in .NET 3.5, you just need some other way to get your hands on a DbProviderFactory object. Or even just create a DbCommandBuilder manually. If you're sure you'll always be using MS-SQL, you could just do DbCommandBuilder commandBuilder = new SqlCommandBuilder(); and skip all the factory mess.Favian
Ahh, yes. That makes sense -- I am using LINQ2SQL so .. yeah, always SQL Server :)Hydrograph
Any reason why you (@user166390) didn't mark this as the best answer? Seems to be the most correct one to me?Mayfair
Any way to do this without having to have a database connection object available?Dyeing
@notfed Yes, if you know what type of database you're connecting to. Instead of line 2 you could just instantiate the correct type of DbProviderFactory (e.g. var factory = new SqlClient.SqlClientFactory() if you're using SQL Server).Favian
SqlCommandBuilder.QuoteIdentifier has been available since .NET 2.0 according to this: msdn.microsoft.com/en-us/library/…. It is what I used to sanitize table/column name input. Just instantiate an instance of SqlCommandBuilder and use the QuoteIdentifier method to sanitize input.Brooke
Does DbCommandBuilder.QuoteIdentifier protect you from an embedded semicolon?Possing
S
4

Since you are using an SqlConnection, the assumption is that this is an SQL Server database.

Given that assumption, you could validate the table and field names using a regular expression that follows the SQL Server identifier rules as defined in MSDN. While I am a complete and utter novice at regular expressions, I did find this one that should come close:

[\p{L}{\p{Nd}}$#_][\p{L}{\p{Nd}}@$#_]*

However, a regular expression will not address SQL Server keywords and it does not ensure that the table and/or column actually exists (although you indicated that wasn't much of an issue).

If this were my application, I would first ensure the end user was not trying to perform injection by rejecting any request that contained semi-colons (;).

Next, I would validate the table existence by removing the valid name delimiters (", ', [, ]), splitting the table name by a period to see if a schema was specified, and executing a query against INFORMATION_SCHEMA.TABLES to determine the existence of the table.

For example:

SELECT 1 
FROM   INFORMATION_SCHEMA.TABLES 
WHERE  TABLE_NAME = 'tablename' 
AND    TABLE_SCHEMA = 'tableschema'

If you create this query using parameters, then you should further protect yourself from injection.

Finally, I would validate the existence of each column name by performing a similar set of steps, only using INFORMATION_SCHEMA.COLUMNS to determine the validity of the column(s) once the table has been determined to be valid.

I would probably fetch the list of valid columns for this table from SQL Server, then verify that each request column was in the list within my code. That way you could tell exactly which columns were in error and provide that feedback to the user.

Scatology answered 11/3, 2012 at 1:48 Comment(5)
You are correct, it is indeed SQL Server. I am leaning towards a minimally-accepted regular expression route, but I was hoping that something premade (and tested ;-) existed. I really like the additional idea about checking it against the schema metadata.Hydrograph
If you use a parameterized query to test the table/schema, then check the existence of each column name in code against the complete list of column names for the table, then you actually do not need to perform any validity checks on the incoming values, which would be the ultimate in minimality :).Scatology
Well, it's a little larger implementation :)Hydrograph
Thank you for your input. The approach we are now using (that is not smart enough to look at the schema) is to split into components, remove all but alphanumeric and normal spaces, and then rejoin in the "[]" form. Of course it will behave "unexpectedly" on on things like [a . b] (will turn it into [a ].[ b], but I am not sure if the former, or even the latter, is valid anyway...)Hydrograph
Looking at msdn.microsoft.com/en-us/library/ms175874.aspx, identifiers may not start with $, but may start with @. I'm now using ^[\p{L}@_#][\p{L}\p{Nd}$@_#]*$Kwh
B
4

For SQL Server, it's pretty simple to sanitize an identifier:

// To make a string safe to use as an SQL identifier :
// 1. Escape single closing bracket with double closing bracket
// 2. Wrap in square brackets
string.Format("[{0}]", identifier.Replace("]", "]]"));

Once wrapped in brackets and escaped, the only thing that won't work as an identifier is an empty/null string.

Balladmonger answered 16/4, 2015 at 7:38 Comment(1)
I don't know why you were downvoted, because this is exactly what DbCommandBuilder.QuoteIdentifier does.Brodie

© 2022 - 2024 — McMap. All rights reserved.