It seems like since the field I am doing the hashbytes on is nvarchar(max) the result of the hashbytes is nvarchar(max).
No, that is not possible, especially since the return value of HASHBYTES is a VARBINARY
. Also, since your tests were just SELECT statements and not an INSERT statement, there is no way for the return value to get a truncation error. The truncation error is coming from the input value. As stated in that linked MSDN page for HASHBYTES
(for SQL Server 2012 and 2014):
Allowed input values are limited to 8000 bytes. The output conforms to the algorithm standard: 128 bits (16 bytes) for MD2, MD4, and MD5; 160 bits (20 bytes) for SHA and SHA1; 256 bits (32 bytes) for SHA2_256, and 512 bits (64 bytes) for SHA2_512.
That really says it all: the input is limited to 8000 bytes, and the output is a fixed number of bytes, based on the specified algorithm.
The updated documentation, for SQL Server 2016 (which has removed the 8000 byte limitation), states:
For SQL Server 2014 and earlier, allowed input values are limited to 8000 bytes.
You can run a simple test:
DECLARE @Test NVARCHAR(MAX) = REPLICATE(CONVERT(NVARCHAR(MAX), N't'), 50000);
SELECT LEN(@Test);
SELECT HASHBYTES('MD5', @Test);
Returns:
50000
Msg 8152, Level 16, State 10, Line 3
String or binary data would be truncated.
If you want to pass in more than 8000 bytes to a hash function in a version of SQL Server prior to 2016, then you need to use SQLCLR. You can either write your own function, or you can download and install the Free version of the SQL# SQLCLR library (which I created), and use the Util_Hash and Util_HashBinary functions:
DECLARE @Test NVARCHAR(MAX) = REPLICATE(CONVERT(NVARCHAR(MAX), N't'), 50000);
SELECT LEN(@Test);
SELECT SQL#.Util_Hash('MD5', CONVERT(VARBINARY(MAX), @Test));
SELECT SQL#.Util_HashBinary('MD5', CONVERT(VARBINARY(MAX), @Test));
Returns:
50000
40752EB301B41EEAEB309348CE9711D6
0x40752EB301B41EEAEB309348CE9711D6
UPDATE
In the case of using a VARCHAR(MAX)
column or variable but with 8000 or fewer characters (or an NVARCHAR(MAX)
column or variable with 4000 or fewer characters), there will be no issue and everything will work as expected:
DECLARE @Test VARCHAR(MAX) = REPLICATE('t', 5000);
SELECT LEN(@Test) AS [Characters],
HASHBYTES('MD5', @Test) AS [MD5];
Returns:
5000 0x6ABFBA10B49157F2EF8C85862B6E6313