Why the EXTRA is NULL in Mysql EXPLAIN? Why >= is Using index condition?
Asked Answered
M

2

7
mysql> CREATE TABLE `t` (
     `id` int(11) NOT NULL,
     `a` int(11) DEFAULT NULL,
     `b` int(11) DEFAULT NULL,
     PRIMARY KEY (`id`),
     KEY `a` (`a`),
     KEY `b` (`b`)
   ) ENGINE=InnoDB

there is a table named t and it has two indexes named a and b. Insert into t 100000 rows data

mysql> create procedure idata()
  begin
   declare i int;
     set i=1;
     while(i<=100000)do
       insert into t values(i, i, i);
       set i=i+1;
     end while;
   end;
Query OK, 0 rows affected (0.01 sec)

mysql> delimiter ;
mysql> call idata();

I do some experiments, some are as follows

there are some experiments

Now, i want to know;

(1)why explain select * from t where a >= 90000; extra is Using index condition? it has index key, but it doesn't have index filter and table filter, so why it is Using index condition?

(2)why explain select * from t where a = 90000; extra is NULL? is needs to have an access to the table,if the first case is Using index condition, why the second can't be Using index condition?

(3)why explain select a from t where a >= 90000; extra is Using where; Using index? i know it uses cover index, so extra has Using index;but why extra has Using where? it means server needs to filter the data? but storage engine has already return the correct, why server needs to filer?

Masqat answered 18/6, 2020 at 16:35 Comment(1)
Some helpful info on Covering indexes: #609843 and a MySQL doc on explaining "explain": dev.mysql.com/doc/refman/8.0/en/execution-plan-information.htmlCirone
B
3

First, terminology...

"Using index" means that the (in this case) INDEX(a) contains all the columns needed. That is "the index is covering".

"Using index condition" is quite different. Internally, it is called ICP (Index Condition Pushdown). This refers to whether the "handler" checks the expression or whether the "condition" (a >= 90000) is handed off to the Engine (InnoDB) to do the work.

As for "Using where"; that is still a mystery to me, even after using MySQL for 20 years and looking thousands of Explains. I ignore it.

In all 3 of your cases, INDEX(a) is used. This is indicated primarily by "key" ("a"--the name of the key, not the column), "key_len" ("5": 4-byte INT plus 1 for NULLable), and secondarily by "type" (which does not say "All").

Further

  • If you change the 90000 to 70000, you may find that it will switch to a table scan. Why bounce back and forth between the Index's BTree and the data's BTree (via the PRIMARY KEY). The Optimizer will assume that it will be faster to simply scan all the table, ignoring the rows that fail the WHERE clause.

  • EXPLAIN FORMAT=JSON SELECT -- Gives you a lot more information. (Perhaps not much more info for this simple query.) One useful surprise is that it will show how many sorts the single mention of "filesort" really refers to. (A possibly easy way to make this happen is GROUP BY x ORDER BY y; that is group and order by different columns.)

  • Explain rarely has such clean numbers, like your "10001". Usually, the "rows" columns is an approximation, sometimes a terrible approx.

  • The slowlog records "Rows examined"; it will probably say 10001 (or maybe only 10000) and 1 for your tests. For a table scan, it would be a full 100K.

  • Another way to get "Rows examined" is via the "Handler" STATUS values. See http://mysql.rjweb.org/doc.php/index_cookbook_mysql#handler_counts

Blueweed answered 20/6, 2020 at 3:34 Comment(0)
E
0

Your first and last query make use of WHERE with implicit comparison to other rows, in that case it makes use of the index and shows it in the extra field (type range).

When you make a condition with 0-1 results, it can directly access them (O(1) lookup). No comparison or ordering happens, just take one row, return it.

Eley answered 18/6, 2020 at 17:14 Comment(5)
so...The type of both the first and last query is range, but in my view, storage engine has already filtered the correct data by the index, so why the server needs to have a implicit comparison?Masqat
@Masqat because with > you compare the rows to the number 90000Eley
@Masqat - "range" refers to a bunch of consecutive rows in the BTree (index or data). In your case, it is "the items starting with 90000 and going until the end".Blueweed
@DanielW. so, in your opinion, the data that you select has already been acquired from secondary index(in the third case), but because of the > in the where clause, server must have an unnecessary compare?Masqat
@RickJames I really know the meaning of "range", but I am confused in the third case why the extra has Using Where, in my opinion "Using Where" means server has a filter according to the "table filter", but column "a" has returned to the server, why server needs to have an unnecessary comparison...I think it has no pointMasqat

© 2022 - 2024 — McMap. All rights reserved.