It appears that SQL Server does not automatically use a CHECKSUM/hash index unless the CHECKSUM column is explicitly included in the search arguments for the query. This is a problem because I do not control the applications that query the table, and I may not break their performance.
Is there any way to get SQL Server to use a new CHECKSUM/hash index without modifying queries to include the new CHECKSUM/hash column?
Repro script
CREATE TABLE big_table
(
id BIGINT IDENTITY CONSTRAINT pk_big_table PRIMARY KEY,
wide_col VARCHAR(50),
wide_col_checksum AS CHECKSUM(wide_col),
other_col INT
)
CREATE INDEX ix_checksum ON big_table (wide_col_checksum)
Insert some test data:
SET NOCOUNT ON
DECLARE @count INT = 0
BEGIN TRANSACTION
WHILE @count < 10000
BEGIN
SET @count = @count + 1
INSERT INTO big_table (wide_col, other_col)
VALUES (SUBSTRING(master.dbo.fn_varbintohexstr(CRYPT_GEN_RANDOM(25)), 3, 50), @count)
IF @count % 1000 = 0
BEGIN
COMMIT TRANSACTION
BEGIN TRANSACTION
END
END
COMMIT TRANSACTION
INSERT INTO big_table (wide_col, other_col)
VALUES ('ABCDEFGHIJKLMNOPQRSTUVWXYZ', 9999999)
Legacy query. Causes Clustered Index Scan (BAD):
SELECT * FROM big_table
WHERE wide_col = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
Updated query. Causes NonClustered Index Seek (good):
SELECT * FROM big_table
WHERE wide_col = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
AND wide_col_checksum = CHECKSUM('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
Background
My table is very large (many hundreds of millions of rows), has several indexes (~ 20), all of which are required. Some of the indexed columns are a little wide (~ 50 bytes) and have few duplicate values. The columns are only searched on equality. The table is inserted into constantly.
Here is a table comparing "normal" indexes and CHECKSUM/hash indexes on the sample table above, both compressed and non-compressed. Data from freshly rebuilt indexes on tables with 1 million rows:
Page compression alone is pretty ineffective on the sample data (real data should compress a bit better). The hash index achieves a 4X index size reduction. Page compression on the hash index achieves a 6X index size reduction.
My aims with using hash indexes are:
- Reduce the size of these indexes in memory, thereby allowing SQL Server to cache a greater fraction in RAM, thereby avoiding physical reads.
- Reduce index storage size.
- Reduce index I/O for INSERT operations.
INSTEAD OF SELECT
trigger and no such thing exists so pretty sure the answer is "no". – BelikeINSTEAD OF SELECT
, see my answer – Persevere