Fix Using where; Using temporary; Using filesort
Asked Answered
V

1

6

I have two simple tables:

CREATE TABLE cat_urls (
  Id int(11) NOT NULL AUTO_INCREMENT,
  SIL_Id int(11) NOT NULL,
  SiteId int(11) NOT NULL,
  AsCatId int(11) DEFAULT NULL,
  Href varchar(2048) NOT NULL,
  ReferrerHref varchar(2048) NOT NULL DEFAULT '',
  AddedOn datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  GroupId int(11) DEFAULT NULL,
  PRIMARY KEY (Id),
  INDEX SIL (SIL_Id, AsCatId)
)

CREATE TABLE products (
  Id int(11) NOT NULL AUTO_INCREMENT,
  CatUrlId int(11) NOT NULL,
  Href varchar(2048) NOT NULL,
  SiteIdentity varchar(2048) NOT NULL,
  Price decimal(12, 2) NOT NULL,
  IsAvailable bit(1) NOT NULL,
  ClientCode varchar(256) NOT NULL,
  PRIMARY KEY (Id),
  INDEX CatUrl (CatUrlId)
)

And I have pretty simple query:

SELECT cu.Href, COUNT(p.CatUrlId) FROM cat_urls cu
       JOIN products p ON p.CatUrlId=cu.Id
       WHERE sil_id=4601038
GROUP by cu.Id

EXPLAIN says:

id  select_type table   type    possible_keys   key     key_len ref                     rows    Extra
1   SIMPLE      cu      ref     PRIMARY,SIL     SIL     4       const                   303     Using where; Using temporary; Using filesort
1   SIMPLE      p       ref     CatUrl          CatUrl  4       blue_collar_logs.cu.Id  6       Using index

Please tell me is there any way to fix "Using where; Using temporary; Using filesort" and improve perfomance of this query?

Vanvanadate answered 25/12, 2014 at 8:57 Comment(3)
How about explain select href, sum from cat_urls c join ( SELECT cu.id, sum(sil_id=4601038) as sum FROM cat_urls cu JOIN products p ON p.CatUrlId=cu.Id GROUP by cu.Id ) t on t.id = c.idDirigible
id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY <derived2> ALL (null) (null) (null) (null) 178525398 (null) 1 PRIMARY c eq_ref PRIMARY PRIMARY 4 t.Id 1 (null) 2 DERIVED cu index PRIMARY,SIL PRIMARY 4 (null) 29754233 (null) 2 DERIVED p ref CatUrl CatUrl 4 blue_collar_logs.cu.Id 6 Using indexVanvanadate
this query executes very long timeVanvanadate
C
2

It looks that, for some reason, MySQL chooses to use the index SIL on the first table and it uses it both for lookup (WHERE sil_id = 4601038) and grouping (GROUP BY cu.Id).

You can tell it to use the PK of the table

SELECT cu.Href, COUNT(p.CatUrlId) FROM cat_urls cu
    USE INDEX FOR JOIN (PRIMARY)
JOIN products p ON p.CatUrlId=cu.Id
WHERE sil_id=4601038
GROUP by cu.Id

and it will produce this execution plan:

id | select_type | table | type  | possible_keys | key     | key_len | ref              | rows | Extra
---+-------------+-------+-------+---------------+---------+---------+------------------+------+-------------
1  | SIMPLE      | cu    | index | PRIMARY       | PRIMARY | 4       | NULL             | 1    | Using where
1  | SIMPLE      | p     | ref   | CatUrl        | CatUrl  | 4       | cbs-test-1.cu.Id | 1    | Using index

Ignore the values reported in column rows; they are not correct because my tables are empty.

Notice the Extra column now contains only Using where but also notice that the join type column changed from ref (very good) to index (full index scan, not quite good).

A better solution is to add an index on column SIL_Id. I know, SIL_Id is a prefix of index SIL(SIL_Id, AsCatId) and in theory another index on column SIL_Id is completely useless. But it seems it solves the issue on this case.

ALTER TABLE cat_urls
  ADD INDEX (SIL_Id)
;

Now use it in the query:

SELECT cu.Href, COUNT(p.CatUrlId) FROM cat_urls cu
    USE INDEX FOR JOIN (SIL_Id)
JOIN products p ON p.CatUrlId=cu.Id
WHERE sil_id=4601038
GROUP by cu.Id

The query execution plan looks much better now:

id | select_type | table | type | possible_keys | key    | key_len | ref              | rows | Extra
---+-------------+-------+------+---------------+--------+---------+------------------+------+-------------
1  | SIMPLE      | cu    | ref  | SIL_Id        | SIL_Id | 4       | const            | 1    | Using where
1  | SIMPLE      | p     | ref  | CatUrl        | CatUrl | 4       | cbs-test-1.cu.Id | 1    | Using index

The drawback is that we have an extra index that is (theoretically) useless. It occupies storage space and it consumes processor cycles every time a row is added, deleted or have its SIL_Id field modified.

Coffeng answered 13/1, 2015 at 12:21 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.