SQL Server's hashing function HASHBYTES has an input limit of 8000 bytes. How do you hash larger strings?
You could hash 8k (or 4k or 2k) chunks of the input and then either concatenate those hashes or hash them into a new hash value. This might get difficult though if you have to create a similar algorithm (in an external .NET app for example) to compare hashes created outside of SQL Server.
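A minimal T-SQL sketch of the chunking idea, hashing 4000-character chunks and then hashing the concatenated chunk hashes (the variable names and chunk size are illustrative, not a fixed convention):

```sql
-- Illustrative: hash an NVARCHAR(MAX) value in 4000-character chunks,
-- then hash the concatenation of the chunk hashes.
DECLARE @data NVARCHAR(MAX) = REPLICATE(CAST(N'x' AS NVARCHAR(MAX)), 100000);
DECLARE @i INT = 1, @hashes VARBINARY(MAX) = 0x;

WHILE @i <= LEN(@data)
BEGIN
    SET @hashes = @hashes + HASHBYTES('MD5', SUBSTRING(@data, @i, 4000));
    SET @i = @i + 4000;
END;

-- The concatenated MD5 hashes (16 bytes each) stay well under 8000 bytes
-- here; for very large inputs another level of chunking would be needed.
SELECT HASHBYTES('MD5', @hashes) AS chunked_hash;
```

Any external code comparing hashes would have to reproduce exactly this chunk size and combining scheme, which is the compatibility problem mentioned above.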
Another option: Lean on SQL Server's CLR integration and perform the hashing in a .NET assembly.
You could write a SQL CLR function:
using System.Data.SqlTypes;
using System.Security.Cryptography;
using System.Text;
using Microsoft.SqlServer.Server;

[SqlFunction]
public static SqlBinary BigHashBytes(SqlString algorithm, [SqlFacet(MaxSize = -1)] SqlString data)
{
    // MaxSize = -1 maps the parameter to NVARCHAR(MAX); without it,
    // input beyond the parameter's default size is silently truncated.
    var algo = HashAlgorithm.Create(algorithm.Value);
    var bytes = Encoding.UTF8.GetBytes(data.Value);
    return new SqlBinary(algo.ComputeHash(bytes));
}
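Before the function can be called, the compiled assembly has to be registered in the database. A sketch of the deployment, where the assembly name, class name, and DLL path are placeholders:

```sql
-- Illustrative deployment; names and path are placeholders.
CREATE ASSEMBLY HashUtilities
    FROM 'C:\clr\HashUtilities.dll'
    WITH PERMISSION_SET = SAFE;
GO

CREATE FUNCTION dbo.BigHashBytes
(
    @algorithm NVARCHAR(50),
    @data NVARCHAR(MAX)
)
RETURNS VARBINARY(8000)
AS EXTERNAL NAME HashUtilities.[UserDefinedFunctions].BigHashBytes;
GO
```

Newer SSDT projects generate this registration script automatically when you publish the project, so you would only write it by hand for a manual deployment.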
And then it can be called in SQL like this:
--these return the same value
select HASHBYTES('md5', 'test stuff')
select dbo.BigHashBytes('md5', 'test stuff')
The BigHashBytes function is only necessary if the input length would exceed 8000 bytes. (From SQL Server 2016 on, HASHBYTES itself accepts inputs larger than 8000 bytes, so this workaround is only needed on earlier versions.)
Decorate the data parameter with [SqlFacet(MaxSize = -1)], otherwise bytes after the 8000th would simply be ignored! Took me a while to get that one! – Flagitious

The default used to be NVARCHAR(4000) for SqlString and NVARCHAR(MAX) for SqlChars, but since around the time of Visual Studio 2013 the default was changed to NVARCHAR(MAX) for both. Still, it is always better to be explicit and use [SqlFacet()] as you mention, but folks using the newer SSDT versions wouldn't run into that. Also, SQLCLR only supports NVARCHAR, so it would actually truncate at 4000 chars :-). – Oldworld
Along the lines of Paul's chunking idea, another option would be to store the hashed string in an XML column, with each chunk as a separate XML element.