Multiple Indexes vs Multi-Column Indexes

What is the difference between creating one index across multiple columns versus creating multiple indexes, one per column?

Are there reasons why one should be used over the other?

For example:

Create NonClustered Index IX_IndexName On TableName
(Column1 Asc, Column2 Asc, Column3 Asc)

Versus:

Create NonClustered Index IX_IndexName1 On TableName
(Column1 Asc)

Create NonClustered Index IX_IndexName2 On TableName
(Column2 Asc)

Create NonClustered Index IX_IndexName3 On TableName
(Column3 Asc)
Decimal answered 7/10, 2008 at 15:36 Comment(0)
424

I agree with Cade Roux.

This article should get you on the right track:

One thing to note: clustered indexes should have a unique key (I would recommend an identity column) as the first column. Basically it helps your data insert at the end of the index and not cause lots of disk IO and Page splits.

Secondly, if you create other indexes on your data and construct them cleverly, they will be reused by more than one query.

e.g. imagine you search a table on three columns

state, county, zip.

  • you sometimes search by state only.
  • you sometimes search by state and county.
  • you frequently search by state, county, zip.

Then an index with state, county, zip. will be used in all three of these searches.

If you search by zip alone quite a lot, then the above index will not be used for a seek (by SQL Server anyway), because zip is the third part of that index and the query optimiser will not see that index as helpful.

You could then create an index on Zip alone that would be used in this instance.
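The state, county, zip scenario might be sketched in T-SQL roughly as follows. The table `dbo.Addresses`, its columns, and the index names are hypothetical and not from the original answer. Whether the optimizer actually chooses each index depends on statistics and data distribution, so treat this as an illustration to verify against the execution plan.

```sql
-- Hypothetical table for illustration; none of these names come from the answer.
CREATE TABLE dbo.Addresses
(
    AddressId INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    State     CHAR(2)      NOT NULL,
    County    VARCHAR(50)  NOT NULL,
    Zip       CHAR(5)      NOT NULL,
    Street    VARCHAR(100) NOT NULL
);

-- One composite index; the column order matters.
CREATE NONCLUSTERED INDEX IX_Addresses_State_County_Zip
    ON dbo.Addresses (State, County, Zip);

-- Each of these filters on a leading prefix of the index key,
-- so each is a candidate for a seek on the composite index.
SELECT AddressId FROM dbo.Addresses WHERE State = 'CA';
SELECT AddressId FROM dbo.Addresses WHERE State = 'CA' AND County = 'Marin';
SELECT AddressId FROM dbo.Addresses WHERE State = 'CA' AND County = 'Marin' AND Zip = '94901';

-- Zip is not a leading column of the composite index, so a zip-only
-- search cannot seek on it. A separate index gives it something to seek on.
CREATE NONCLUSTERED INDEX IX_Addresses_Zip
    ON dbo.Addresses (Zip);

SELECT AddressId FROM dbo.Addresses WHERE Zip = '94901';
```

The queries select only `AddressId` so that the lookup cost of fetching other columns does not muddy the comparison; with `SELECT *` the optimizer also has to weigh the cost of going back to the base rows.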

By the way, we can take advantage of the fact that with a multi-column index the first index column is always usable for searching. When you search only by 'state' it is efficient, but not quite as efficient as a single-column index on 'state', because the wider index has more pages to read.

I guess the answer you are looking for is that it depends on the WHERE clauses of your frequently used queries, and also on their GROUP BY clauses.

The article will help a lot. :-)

Swastika answered 7/10, 2008 at 16:10 Comment(5)
So would the best thing to do be to define an index for state, county, and zip in addition to an individual index for each column? - Monition
@jball Am I missing something here? It looks like the article is mostly about the differences between SQL Server version limitations. Could the article have been moved? - Ovenware
@Ian It does look like something has been lost in the almost 3 years since I sorted out the original link from what is now over 4 years ago. I can tell you that the blog post has the correct title as was linked to by evilhomer, but it looks like the follow-up posts in the series are no longer easy to find from that first post. You'll have to look around Kimberly's blog archive to see if you can turn up the others in the series. - Bencher
1) "Basically [Clustered Index with IDENTITY column as first] helps your data insert at the end of the index" is correct. "and not cause lots of disk IO and Page splits" is totally false in a multi-user system. The truth is, it guarantees high contention (low concurrency) in a multi-user system. 2) The clustered index should be a Relational Key, i.e. not an IDENTITY, GUID, etc. 3) "Then an index with state, county, zip. will be used in all three of these searches." is false, and contradicts "the first column is usable". The second and subsequent columns in the index are not usable for search. - Attached
It seems the first link is no longer pointing to a live page... :( - Zenas
93

Yes. I recommend you check out Kimberly Tripp's articles on indexing.

If an index is "covering", then there is no need to use anything but the index. In SQL Server 2005 and later, you can also add columns to the index that are not part of the key (included columns), which can eliminate trips to the rest of the row.
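As a rough sketch of non-key columns in an index, the T-SQL below uses the `INCLUDE` clause. The table `dbo.Orders` and all column and index names are hypothetical, chosen only for illustration.

```sql
-- Hypothetical table; names are assumptions, not from the answer.
CREATE TABLE dbo.Orders
(
    OrderId     INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    CustomerId  INT           NOT NULL,
    OrderDate   DATETIME      NOT NULL,
    TotalAmount MONEY         NOT NULL,
    Notes       VARCHAR(1000) NULL
);

-- CustomerId and OrderDate are key columns, used for seeking and ordering.
-- TotalAmount is a non-key (included) column stored only at the leaf level.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId_OrderDate
    ON dbo.Orders (CustomerId, OrderDate)
    INCLUDE (TotalAmount);

-- Every column this query references is in the index, so the index
-- covers the query and the base row does not need to be visited.
SELECT OrderDate, TotalAmount
FROM dbo.Orders
WHERE CustomerId = 42
ORDER BY OrderDate;
```

Adding `Notes` to the select list would make the index non-covering for that query, since `Notes` is neither a key nor an included column.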

Having multiple indexes, each on a single column, may mean that only one index gets used at all. You will have to refer to the execution plan to see what effect each indexing scheme has.
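One way to check which index a query uses is sketched below, reusing `TableName` and the column names from the question. `GO` is a batch separator understood by tools such as SSMS and sqlcmd rather than a T-SQL statement, so this assumes the script runs in one of those tools.

```sql
-- Show the estimated plan as text instead of executing the query.
-- SET SHOWPLAN_TEXT must be the only statement in its batch.
SET SHOWPLAN_TEXT ON;
GO

SELECT *
FROM TableName
WHERE Column1 = 1 AND Column2 = 2 AND Column3 = 3;
GO

SET SHOWPLAN_TEXT OFF;
GO

-- Alternatively, execute the query and compare the logical reads
-- reported in the messages output before and after changing indexes.
SET STATISTICS IO ON;

SELECT *
FROM TableName
WHERE Column1 = 1 AND Column2 = 2 AND Column3 = 3;

SET STATISTICS IO OFF;
```

The plan text names the index chosen for each operator, which makes it easy to see whether the multi-column index, a single-column index, or a scan was picked.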

You can also use the tuning wizard (the Database Engine Tuning Advisor in SQL Server 2005 and later) to help determine which indexes would make a given query or workload perform best.

Decurrent answered 7/10, 2008 at 15:41 Comment(3)
Kimberly Tripp knows what she is talking about. I was at a talk of hers and she knows this stuff inside out. Great advice. - Swastika
@CadeRoux If my WHERE clause usually has two columns combined with AND, is it better to have one multi-column index on them or a single-column index on each of them? - Fregger
@RachitGupta One index with both columns. - Decurrent
62

The multi-column index can be used for queries referencing all the columns:

SELECT *
FROM TableName
WHERE Column1=1 AND Column2=2 AND Column3=3

This can be looked up directly using the multi-column index. On the other hand, typically only one of the single-column indexes will be used (the engine would have to look up all records having Column1=1, and then check Column2 and Column3 in each of those).
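To compare the two approaches on your own data, one option is to force each index in turn with a table hint and compare the plans or I/O. This sketch assumes all four indexes from the question exist on `TableName` at the same time. Index hints are used here purely for experimentation and are not a recommendation for production queries.

```sql
-- Force the multi-column index from the question.
SELECT *
FROM TableName WITH (INDEX(IX_IndexName))
WHERE Column1 = 1 AND Column2 = 2 AND Column3 = 3;

-- Force one of the single-column indexes. The engine seeks on Column1,
-- then has to check Column2 and Column3 against each matching row.
SELECT *
FROM TableName WITH (INDEX(IX_IndexName1))
WHERE Column1 = 1 AND Column2 = 2 AND Column3 = 3;

-- No hint: let the optimizer choose, and see which plan it prefers.
SELECT *
FROM TableName
WHERE Column1 = 1 AND Column2 = 2 AND Column3 = 3;
```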

Dastard answered 7/10, 2008 at 15:46 Comment(3)
This is correct. However, having these columns as a single index each would still speed things up dramatically. Usually one of the values in the columns will reduce the result set so much that it doesn't matter if the rest are checked without an index, and the optimizer is good at picking this value. - Commercialize
Why would at most only one column be used? When it does a lookup for column1, can't it also use an index for column2? - Candancecandela
@Candancecandela I think it's like this: the first index jumbles the order of the rows that are returned, and you can't use a further index on these rows. - Heaveho
19

One item that seems to have been missed is star transformations. Index Intersection operators resolve the predicate by calculating the set of rows hit by each of the predicates before any I/O is done on the fact table. On a star schema you would index each individual dimension key and the query optimiser can resolve which rows to select by the index intersection computation. The indexes on individual columns give the best flexibility for this.
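A minimal sketch of the star-schema case in T-SQL follows. The fact table `dbo.FactSales`, its columns, and the index names are all hypothetical. Whether the optimizer actually intersects the indexes is a cost-based decision, so the plan should be checked rather than assumed.

```sql
-- Hypothetical fact table; every name here is invented for illustration.
CREATE TABLE dbo.FactSales
(
    DateKey     INT   NOT NULL,
    ProductKey  INT   NOT NULL,
    StoreKey    INT   NOT NULL,
    SalesAmount MONEY NOT NULL
);

-- One single-column index per dimension key.
CREATE NONCLUSTERED INDEX IX_FactSales_DateKey    ON dbo.FactSales (DateKey);
CREATE NONCLUSTERED INDEX IX_FactSales_ProductKey ON dbo.FactSales (ProductKey);
CREATE NONCLUSTERED INDEX IX_FactSales_StoreKey   ON dbo.FactSales (StoreKey);

-- With predicates on several dimension keys, the optimizer may seek each
-- index separately and intersect the row locators before reading fact rows.
SELECT SUM(SalesAmount) AS TotalSales
FROM dbo.FactSales
WHERE DateKey = 20080710
  AND ProductKey = 17
  AND StoreKey = 3;
```

The appeal of this layout is that any combination of dimension filters can draw on the same small set of indexes, instead of needing a composite index per combination.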

Oran answered 7/10, 2008 at 19:19 Comment(0)
16

If you have queries that will be frequently using a relatively static set of columns, creating a single covering index that includes them all will improve performance dramatically.

When you put multiple columns in your index, the optimizer only has to access the table directly if a column it needs is not in the index. I use these a lot in data warehousing. The downside is that wide indexes add a lot of maintenance overhead, especially if the data is very volatile.

Creating indexes on single columns is useful for lookup operations frequently found in OLTP systems.

You should ask yourself why you're indexing the columns and how they'll be used. Run some query plans and see when they are being accessed. Index tuning is as much instinct as science.

Unsupportable answered 7/10, 2008 at 15:46 Comment(0)
