MySQL covering vs composite vs column index
S

4

41

In the following query

SELECT  col1,col2
FROM    table1
WHERE   col3='value1'
  AND   col4='value2'

If I have two separate indexes, one on col3 and the other on col4, which one of them will be used in this query?

I read somewhere that only one index is used for each table in a query. Does that mean there is no way for the query to use both indexes?

Secondly, if I created a composite index on col3 and col4 together but used only col3 in the WHERE clause, would that be worse for performance? Example:

SELECT  col1,col2
FROM    table1
WHERE   col3='value1'

Lastly, is it better to just use covering indexes in all cases? And does it differ between the MyISAM and InnoDB storage engines?

Samuels answered 21/11, 2011 at 14:19 Comment(0)
S
47

A covering index is not the same as a composite index.

If I have two separate indexes, one on col3 and the other on col4, which one of them will be used in this query?

The index with the highest cardinality.
MySQL keeps statistics on which index has what properties.
The index that has the most discriminating power (as evident in MySQL's statistics) will be used.
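
The statistics mentioned above can be inspected directly. This is only a minimal sketch: the table name demo_stats, the column types, and the index names idx_col3 and idx_col4 are assumptions made for illustration, not something taken from the question.

```sql
-- Hypothetical scratch table mirroring the question's columns.
-- Column types and index names are assumptions.
CREATE TABLE demo_stats (
  id   INT PRIMARY KEY,
  col1 INT,
  col2 INT,
  col3 VARCHAR(20),
  col4 VARCHAR(20),
  INDEX idx_col3 (col3),
  INDEX idx_col4 (col4)
);

-- Refresh the index statistics the optimizer relies on.
ANALYZE TABLE demo_stats;

-- The Cardinality column is an estimate of the number of
-- distinct values in each index.
SHOW INDEX FROM demo_stats;

-- The "key" column of the plan names the index that was chosen.
EXPLAIN SELECT col1, col2
FROM demo_stats
WHERE col3 = 'value1' AND col4 = 'value2';
```

On an empty or tiny table the estimates are not meaningful, so load representative data before reading anything into the plan.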

I read somewhere that only one index is used for each table in a query. Does that mean there is no way for the query to use both indexes?

You can use a subselect.
Or, even better, use a compound index that includes both col3 and col4.
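
For what it's worth, the "one index per table" rule is not absolute: MySQL documents an index merge optimization that can combine several single-column indexes on the same table. Whether the optimizer actually picks it depends on the server version and on the data, so treat the following as something to try rather than a guaranteed plan. The table name demo_merge, the column types, and the index names are all assumptions.

```sql
-- Hypothetical scratch table; names and types are assumptions.
CREATE TABLE demo_merge (
  id   INT PRIMARY KEY,
  col1 INT,
  col2 INT,
  col3 VARCHAR(20),
  col4 VARCHAR(20),
  INDEX idx_col3 (col3),
  INDEX idx_col4 (col4)
);

-- If index merge is chosen, the "type" column reads index_merge
-- and "Extra" mentions an intersect of the two indexes.
-- Otherwise the plan shows a single index in "key".
EXPLAIN SELECT col1, col2
FROM demo_merge
WHERE col3 = 'value1' AND col4 = 'value2';

-- The alternative recommended above: one index on both columns.
ALTER TABLE demo_merge ADD INDEX idx_col3_col4 (col3, col4);

-- Re-run the EXPLAIN to see whether the new index is preferred.
EXPLAIN SELECT col1, col2
FROM demo_merge
WHERE col3 = 'value1' AND col4 = 'value2';
```

A single two-column index is usually the simpler choice when both columns are always filtered together, since it does not rely on the optimizer deciding to merge.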

Secondly, if I created a composite index on col3 and col4 together but used only col3 in the WHERE clause, would that be worse for performance?

Compound index
The correct term is compound index, not composite.
Only the left-most part of the compound index will be used.
So if the index is defined as

index myindex (col3, col4)  <<-- will work with your example.
index myindex (col4, col3)  <<-- will not work. 

See: http://dev.mysql.com/doc/refman/5.0/en/multiple-column-indexes.html
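
The left-most rule can be checked with EXPLAIN. This is a sketch under assumptions: demo_prefix and the column types are made up for illustration, and the exact plan will vary with the MySQL version and the data in the table.

```sql
-- Hypothetical scratch table; names and types are assumptions.
CREATE TABLE demo_prefix (
  id   INT PRIMARY KEY,
  col1 INT,
  col2 INT,
  col3 VARCHAR(20),
  col4 VARCHAR(20),
  INDEX myindex (col3, col4)
);

-- col3 is the left-most column, so myindex can be used for a lookup.
EXPLAIN SELECT col1, col2 FROM demo_prefix
WHERE col3 = 'value1';

-- Both columns are constrained, so both parts of myindex can be used.
EXPLAIN SELECT col1, col2 FROM demo_prefix
WHERE col3 = 'value1' AND col4 = 'value2';

-- Only col4 is constrained: the left-most column is skipped, so
-- myindex cannot be used for an index lookup on this predicate.
EXPLAIN SELECT col1, col2 FROM demo_prefix
WHERE col4 = 'value2';
```

Compare the "key", "key_len" and "type" columns across the three plans to see how much of the index each query can use.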

Note that if you select only a left-most field, you can get away with not using that part of the index in your WHERE clause.
Imagine we have a compound index

Myindex(col1,col2)

SELECT col1 FROM table1 WHERE col2 = 200  <<-- will use index, but not efficiently
SELECT * FROM table1 where col2 = 200     <<-- will NOT use index.  

The reason this works is that the first query uses the covering index and does a scan on that.
The second query needs to access the table, and for that reason scanning through the index does not make sense.
This only works in InnoDB.

What's a covering index
A covering index refers to the case when all fields selected in a query are covered by an index. In that case InnoDB (not MyISAM) will never read the data in the table, but only use the data in the index, significantly speeding up the SELECT.
Note that in InnoDB the primary key is included in all secondary indexes, so in a way all secondary indexes are compound indexes.
This means that if you run the following query on InnoDB:

SELECT indexed_field FROM table1 WHERE pk = something

MySQL will always use a covering index and will not access the actual table. Although it could use a covering index, it will prefer the PRIMARY KEY because it only needs to hit a single row.
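
Applied to the query from the question, a covering index could be sketched as follows. The table name demo_cover, the column types, the extra column col5, and the index name idx_cover are assumptions added for illustration.

```sql
-- Hypothetical scratch table; names and types are assumptions.
-- col5 exists only so that one query below is NOT covered.
CREATE TABLE demo_cover (
  id   INT PRIMARY KEY,
  col1 INT,
  col2 INT,
  col3 VARCHAR(20),
  col4 VARCHAR(20),
  col5 INT,
  -- Filter columns first, then the selected columns.
  INDEX idx_cover (col3, col4, col1, col2)
) ENGINE=InnoDB;

-- Every referenced column is in idx_cover. When the index is used
-- as a covering index, "Extra" in the plan shows "Using index".
EXPLAIN SELECT col1, col2 FROM demo_cover
WHERE col3 = 'value1' AND col4 = 'value2';

-- col5 is not part of idx_cover, so the row itself has to be read
-- and "Using index" should not appear.
EXPLAIN SELECT col1, col2, col5 FROM demo_cover
WHERE col3 = 'value1' AND col4 = 'value2';
```

The trade-off is a wider index: it takes more space and every write to the covered columns has to maintain it, which is one reason not to make every index covering.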

Supremacist answered 21/11, 2011 at 14:28 Comment(10)
Thanks for the explanation, especially the part about the compound index. I didn't know that only the left-most part of the index is checked first. I thought that if any part of the compound index is used, the other part will just be discarded. - Samuels
Great info! Do you have documentation about "index myindex (col4, col3) <<-- will not work. "? It's not that I don't trust you, I just want to read more about this. - Seigler
@SurasinTancharoen, the official docs are a great source; note though that this is not the whole story. See the updated answer. - Supremacist
SELECT x FROM tbl WHERE pk=1 -- for InnoDB, the PK is stored with the data. So, whether it is 'covering' or not is moot; it does access the data. - Postmark
SELECT col1 FROM table1 WHERE col2 = 200 <<-- will use index(col1, col2). True, but... It has to scan the entire index. This is probably better than scanning the table, but only because the index is probably smaller than the table. - Postmark
You say "works only in InnoDB". Once you understand that a MyISAM PK is separate from the data, and that InnoDB's secondary keys implicitly include the PK field(s), the rules for the two engines are essentially the same. - Postmark
index myindex (col4, col3) <<-- will not work. -- that is referring to SELECT ... WHERE col3=..., not the other SELECT. - Postmark
@Johan, you stated that select * from table1 where col2 = 200 will not use the index, but when I ran the tests above, EXPLAIN shows that the indexes are indeed used. Which OS are you on? - Comanchean
For the last SQL, I think you mean: SELECT id,indexed_field_1 FROM table1 WHERE indexed_field_2 = something. Here id (that is, the cluster key or primary key) is in the SELECT clause. - Microbalance
@SurasinTancharoen - If it used that index, it would have to scan the entire index to find the col3 value(s). This case is not even considered by the Optimizer. - Postmark
S
5

I upvoted Johan's answer for completeness, but I think the following statement he makes regarding secondary indexes is incorrect and/or confusing:

Note that in InnoDB the primary key is included in all secondary indexes, 
so in a way all secondary indexes are compound indexes.

This means that if you run the following query on InnoDB:

SELECT indexed_field FROM table1 WHERE pk = something

MySQL will always use a covering index and will not access the actual table.

While I agree the primary key is INCLUDED in the secondary index, I do not agree MySQL "will always use a covering index" in the SELECT query specified here.

To see why, note that a full index "scan" is always required in this case. This is not the same as a "seek" operation, but is instead a 100% scan of the secondary index contents. This is due to the fact the secondary index is not ordered by the primary key; it is ordered by "indexed_field" (otherwise it would not be much use as an index!).

In light of this latter fact, there will be cases where it is more efficient to "seek" the primary key, and then extract indexed_field "from the actual table," not from the secondary index.
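
One way to compare the two access paths is to look at both plans. This is a rough sketch: the table demo_pk, its columns, the index name idx_field, and the sample rows are assumptions, and the plans on a toy table may not match what a large production table shows.

```sql
-- Hypothetical scratch table; names, types and rows are assumptions.
CREATE TABLE demo_pk (
  pk            INT PRIMARY KEY,
  indexed_field INT,
  other_field   INT,
  INDEX idx_field (indexed_field)
) ENGINE=InnoDB;

INSERT INTO demo_pk (pk, indexed_field, other_field) VALUES
  (1, 10, 100), (2, 20, 200), (123, 30, 300);

-- Left to itself, the optimizer typically seeks the primary key:
-- look for PRIMARY in the "key" column of the plan.
EXPLAIN SELECT indexed_field FROM demo_pk
WHERE pk = 123;

-- For comparison, restrict the optimizer to the secondary index.
-- idx_field is ordered by indexed_field, not by pk, so finding
-- pk = 123 in it means reading through the index entries.
EXPLAIN SELECT indexed_field FROM demo_pk FORCE INDEX (idx_field)
WHERE pk = 123;
```

Comparing the "type" and "rows" columns of the two plans shows the difference between a seek on the primary key and a scan of the secondary index.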

Sestina answered 21/11, 2013 at 19:46 Comment(5)
How does it matter what it's ordered by(?) if there's an explicit equality: pk = something - Phenomenon
@Phenomenon Where is the corresponding 'pk' to be found in the secondary index? The database engine has no idea. It has to begin at the start of the secondary index and 'scan' every entry in it, from beginning to end. Think about the index section of a recipe book: it is ordered alphabetically, not by page number. If you wanted to find the title of a recipe and you already knew the page number, you might simply turn to that page and read the title directly, rather than scanning the index from beginning to end for all occurrences of the page number. - Sestina
Here is my favorite book on the subject, incidentally: amazon.com/Server-Execution-Plans-Grant-Fritchey/dp/1906434026. It refers directly to MS SQL Server, but the concepts are virtually the same in MySQL and other relational database engines. - Sestina
"where is the corresponding 'pk' to be found in the secondary index?" - "Note that in InnoDB the primary key is included in all secondary indexes, so in a way all secondary indexes are compound indexes." - Phenomenon
@Phenomenon What I mean is, if I am looking for the pk with the value e.g. "123", where sequentially do I find this "123" in the secondary index? The database engine has no idea, as the secondary index is not ordered by the primary key. It has to scan every entry from beginning to end. - Sestina
C
1

This is a question I hear a lot and there is a lot of confusion around the issues due to:

  • The differences in MySQL over the years. Index support, including the use of multiple indexes, has changed over the years (towards being supported).

  • The InnoDB / MyISAM differences. There are some key differences (below), but I do not believe multiple indexes are one of them.

MyISAM is older but proven. Data in MyISAM tables is split between three different files: table format, data, and indexes.
InnoDB is newer than MyISAM and is transaction safe. InnoDB also provides row-level locking as opposed to table-level locking, which increases multi-user concurrency and performance. InnoDB also has foreign-key constraints.
Because of its row-level locking, InnoDB is well suited to high-load environments.

To be sure about things, use EXPLAIN to analyze the query execution plan.

Clothe answered 21/11, 2011 at 14:36 Comment(2)
Will you consider InnoDB "proven" now? - Comanchean
@Comanchean - I would say that InnoDB was "proven" well before the Answer was posted. 8.0 gets rid of MyISAM. That's how confident Oracle is with InnoDB today. - Postmark
S
-1

Compound index is not the same as a composite index.

  • Composite index covers all the columns in your filter, join and select criteria. All of these columns are stored on all of the index pages accordingly throughout the index B-tree.
  • Compound index covers all the filter and join key columns in the B-tree, but keeps the select columns only on the leaf pages, as they will not be searched, only extracted. This saves space and consequently creates fewer index pages, hence faster I/O.
Sphery answered 3/6, 2013 at 12:34 Comment(1)
No. A "Covering" index contains all the columns used throughout the SELECT. A "Composite" or "Compound" index has more than one column in the index. - Postmark
