MySQL Explain: what's causing 'Using temporary; Using filesort'
Asked Answered
S

3

15

I'm planning on creating a view using this SQL SELECT, but the explain for it shows it's using temporary and using filesort. I can't figure out what indices I need in order to fix this problem. Mostly, I'm wondering why it's using filesort intead of using an index to sort.

Here are my tables:

CREATE TABLE `learning_signatures` (
  `signature_id` int(11) NOT NULL AUTO_INCREMENT,
  `signature_file` varchar(100) NOT NULL,
  `signature_md5` varchar(32) NOT NULL,
  `image_file` varchar(100) NOT NULL,
  PRIMARY KEY (`signature_id`),
  UNIQUE KEY `unique_signature_md5` (`signature_md5`)
) ENGINE=InnoDB AUTO_INCREMENT=640 DEFAULT CHARSET=latin1

CREATE TABLE `learning_user_suggestions` (
  `user_suggestion_id` int(11) NOT NULL AUTO_INCREMENT,
  `signature_id` int(11) NOT NULL,
  `ch` char(1) NOT NULL,
  `time_suggested` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `user_id` int(11) NOT NULL,
  PRIMARY KEY (`user_suggestion_id`),
  KEY `char_index` (`ch`),
  KEY `ls_sig_id_indx` (`signature_id`),
  KEY `user_id_indx` (`user_id`),
  KEY `sig_char_indx` (`signature_id`,`ch`)
) ENGINE=InnoDB AUTO_INCREMENT=1173 DEFAULT CHARSET=latin1

And here is the problematic SQL statement I'm planning on using in my view:

select ls.signature_id, ls.signature_file, ls.signature_md5, ls.image_file, sug.ch , count(sug.ch) AS suggestion_count
from (`learning_signatures` `ls` left join `learning_user_suggestions` `sug` on(ls.signature_id = sug.signature_id))
group by ls.signature_id, sug.ch;

Output from explain:

id  select_type table   type    possible_keys                   key             key_len ref                 rows    Extra
1   SIMPLE      ls      ALL     NULL                            NULL            NULL    NULL                514     "Using temporary; Using filesort"
1   SIMPLE      sug     ref     ls_sig_id_indx,sig_char_indx    ls_sig_id_indx  4       wwf.ls.signature_id 1

Another example, this time using a where clause:

explain select ls.signature_id, ls.signature_file, ls.signature_md5, ls.image_file, sug.ch , count(sug.ch) AS suggestion_count
from (`learning_signatures` `ls` left join `learning_user_suggestions` `sug` on(ls.signature_id = sug.signature_id))
WHERE signature_md5 = '75f8a5b1176ecc2487b90bacad9bc4c'
group by ls.signature_id, sug.ch;

Explain output:

id  select_type table   type    possible_keys                key                    key_len ref     rows    Extra
1   SIMPLE      ls      const   unique_signature_md5         unique_signature_md5   34      const   1       "Using temporary; Using filesort"
1   SIMPLE      sug     ref     ls_sig_id_indx,sig_char_indx ls_sig_id_indx         4       const   1   
Sadoff answered 2/5, 2011 at 15:28 Comment(6)
Post EXPLAIN output, post your CREATE VIEW statement..Whiles
Good point, I just added the explain output. I haven't created the view yet, I want to optimize the SELECT firstSadoff
So what is exactly wrong here? You don't have a factor that limits the dataset, entire learning_signatures table will be pulled. Using temporary is probably because of your innodb_buffer_pool size.Whiles
@Michael, you're right. There is no where clause in that query. However if I add a WHERE, it still says using temporary and filesortSadoff
See the updated example with the where clause. Why is it using filesort?Sadoff
Here's some reading if you want to know what "using filesort" is: mysqlperformanceblog.com/2009/03/05/…Whiles
Y
18

In your first query, what you do is join your signatures table with user suggestions, getting lots of rows, and then group results using some columns from user suggestions. But there is no index for the joined table to help with grouping as it would have to be defined on previously joined table. What you should instead do is try to create a derived table from user suggestions that is already groupped by ch and signature_id and then join it:

SELECT ls.signature_id, ls.signature_file, ls.signature_md5, ls.image_file, 
       sug.ch, sug.suggestion_count
FROM learning_signatures ls
LEFT JOIN 
  (SELECT s.signature_id, s.ch, count(s.ch) as suggestion_count
    FROM learning_user_suggestions s 
    GROUP BY s.signature_id, s.ch ) as sug
ON ls.signature_id = sug.signature_id

Optimizer should be able now to use your sig_char_indx index for groupping, the derived table will be not bigger than your signatures table and you join both using unique column. You will still have to do a full scan over signatures table, but that cannot be avoided because you are selecting all of it anyway.

As for the second query, if you want to restrict signatures to a single one just append

WHERE ls.signature_md5='75f8a5b1176ecc2487b90bacad9bc4c'

to the end of previous query and group by only s.ch, because only one signature_id will match your md5 anyway. Optimizer should now use md5 index for where and char_index for grouping.

Yeanling answered 2/5, 2011 at 17:49 Comment(1)
Excellent. This worked well, and the optimizer worked as you described with your query. Thanks for the detailed explanation as well, I now see the difference.Sadoff
N
0

Maybe it'll help if you create an index on learning_signatures that contains both signature_md5 and signature_id (and in that order)

`KEY `md5_id` (`signature_md5`,`signature_id`)?

I'm no MySQL expert but I found that created keys that encapsulate both the where clause and the join clause usually helps get rid of temporary and filesort

Nasalize answered 2/5, 2011 at 16:13 Comment(1)
Thanks for the tip, but I just tried it and it didn't get rid of the temporary or filesort.Sadoff
A
-2

Use indexes. Find the fields where you need them by using EXPLAIN on your queries.

If you however have a mostly write-only DB (few reads), you may want to refrain from using indexes, as they may have a negative impact on your write performance.

Autoicous answered 22/6, 2017 at 16:0 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.