Force MySQL to use two indexes on a Join
Asked Answered
W

2

19

I am trying to force MySQL to use two indexes. I am joining a table and I want to utilize the cross between the two indexes. The specific term is Using intersect and here is a link to MySQL documentation:

http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html

Is there any way to force this implementation? My query was using it (and it sped stuff up), but now for whatever reason it has stopped.

Here is the JOIN I want to do this on. The two indexes I want the query to use are scs.CONSUMER_ID_1 and scs_CONSUMER_ID_2

JOIN survey_customer_similarity AS scs
    ON cr.CONSUMER_ID=scs.CONSUMER_ID_2 
    AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1 
    OR cr.CONSUMER_ID=scs.CONSUMER_ID_1 
    AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
Wooden answered 30/1, 2011 at 3:50 Comment(2)
Please post MySql version, table definitions and the EXPLAIN output.Cutlerr
Additionally, can you show the entire query... or at least most of the rest that might not be some consideration of "confidential"... select X from WhatTable... Join... Where... group by...Hower
B
24

See MySQL Docs for FORCE INDEX.

JOIN survey_customer_similarity AS scs 
FORCE INDEX (CONSUMER_ID_1,CONSUMER_ID_2)
ON
cr.CONSUMER_ID=scs.CONSUMER_ID_2 
AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1 
OR cr.CONSUMER_ID=scs.CONSUMER_ID_1 
AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2

As TheScrumMeister has pointed out below, it depends on your data, whether two indexes can actually be used at once.


Here's an example where you need to force the table to appear twice to control the query execution and intersection.

Use this to create a table with >100K records, with roughly 1K rows matching the filter i in (2,3) and 1K rows matching j in (2,3):

drop table if exists t1;
create table t1 (id int auto_increment primary key, i int, j int);
create index ix_t1_on_i on t1(i);
create index ix_t1_on_j on t1(j);
insert into t1 (i,j) values (2,2),(2,3),(4,5),(6,6),(2,6),(2,7),(3,2);
insert into t1 (i,j) select i*2, j*2+i from t1;
insert into t1 (i,j) select i*2, j*2+i from t1;
insert into t1 (i,j) select i*2, j*2+i from t1;
insert into t1 (i,j) select i*2, j*2+i from t1;
insert into t1 (i,j) select i*2, j*2+i from t1;
insert into t1 (i,j) select i*2, j*2+i from t1;
insert into t1 (i,j) select i*2, j*2+i from t1;
insert into t1 (i,j) select i*2, j*2+i from t1;
insert into t1 (i,j) select i*2, j*2+i from t1;
insert into t1 (i,j) select i*2, j*2+i from t1;
insert into t1 (i,j) select i*2, j*2+i from t1;
insert into t1 (i,j) select i*2, j*2+i from t1;
insert into t1 (i,j) select i, j from t1;
insert into t1 (i,j) select i, j from t1;
insert into t1 (i,j) select 2, j from t1 where not j in (2,3) limit 1000;
insert into t1 (i,j) select i, 3 from t1 where not i in (2,3) limit 1000;

When doing:

select t.* from t1 as t where t.i=2 and t.j=3 or t.i=3 and t.j=2

you get exactly 8 matches:

+-------+------+------+
| id    | i    | j    |
+-------+------+------+
|     7 |    3 |    2 |
| 28679 |    3 |    2 |
| 57351 |    3 |    2 |
| 86023 |    3 |    2 |
|     2 |    2 |    3 |
| 28674 |    2 |    3 |
| 57346 |    2 |    3 |
| 86018 |    2 |    3 |
+-------+------+------+

Use EXPLAIN on the query above to get:

id | select_type | table | type  | possible_keys         | key        | key_len | ref  | rows | Extra
1  | SIMPLE      | t     | range | ix_t1_on_i,ix_t1_on_j | ix_t1_on_j | 5       | NULL | 1012 | Using where

Even if we add FORCE INDEX to the query on two indexes EXPLAIN will return the exact same thing.

To make it collect across two indexes, and then intersect them, use this:

select t.* from t1 as a force index(ix_t1_on_i)

join t1 as b force index(ix_t1_on_j) on a.id=b.id

where a.i=2 and b.j=3 or a.i=3 and b.j=2

Use that query with explain to get:

id | select_type | table | type  | possible_keys | key        | key_len | ref  | rows | Extra
1  | SIMPLE      | a     | range | ix_t1_on_i    | ix_t1_on_i | 5       | NULL | 1019 | Using where
1  | SIMPLE      | b     | range | ix_t1_on_j    | ix_t1_on_j | 5       | NULL | 1012 | Using where; Using index

This proves that the indexes are being used. But that may or may not be faster depending on many other factors.

Baucom answered 30/1, 2011 at 3:58 Comment(6)
I believe if the original explain was using 1 of the indexes only, FORCE INDEX wont force it to use both.Cutlerr
I thought force index is stronger than use index, i.e. if it can be used at all, it will be. So wouldn't 1+2 be hashed for the intersection?Baucom
force index is stronger, and it tells the query optimizer to use any/all the indexes in the list. So if the original plan used a table scan because it decided a scan is cheaper, it will work. however if - for whatever reason - the optimizer was unable to use both indexes, force index... won't do the trick.Cutlerr
@RichardTheKiwi, Weird, I'm getting 116k rows when I run your inserts above. How did you get 400k?Blackpoll
@Blackpoll Thanks for pointing that out. I added two more inserts to beef it up to 460K.Baucom
@RichardTheKiwi, Hmm, why don't we change the "> 400k" to "> 100k" instead? Because the select t.* from t1 as t where t.i=2 and t.j=3 or t.i=3 and t.j=2 is now showing 32 rows instead of 8.Blackpoll
T
2

MySQL only supports using a single index per join. If you want it to utilize two columns as indices in the join, you should create a single index over those two columns. Note that this isn't as bad as it seems, because an index over (a,b) doubles as an index over just a.

See the MySQL manual

MySQL cannot use an index if the columns do not form a leftmost prefix of the index.

Teratology answered 30/1, 2011 at 7:28 Comment(2)
Not true due to index merge optimizations.Blackpoll
This is an obsolete comment. Just for reference.Existential

© 2022 - 2024 — McMap. All rights reserved.