Mysql Order by clause using "FileSort"

Asked 12/8, 2012 at 8:51 Answered 12/8, 2012 at 10:28

Solved mysql indexing database-performance explain

I have a table structure like

comment_id primary key
comment_content 
comment_author
comment_author_url

When I fire query like

explain SELECT * FROM comments  ORDER BY comment_id

It outputs the results as

id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
1   SIMPLE  comments    ALL     NULL    NULL    NULL    NULL    22563   Using filesort

Why is not able to find the index that I have defined as Primary Key?

Asturias answered 12/8, 2012 at 8:51 Comment(0)

Anytime a sort can’t be performed from an index, it’s a filesort.

The strange thing here is that you should have the index on that field since it is a primary key(and a primary key column is implicitly indexed), testing on a test database i just noticed that MySQL use FileSort anytime you perform a SELECT *, this is a no sense behaviour (i know) but if you rewrite your query in this way :

SELECT comment_id, comment_content, comment_author, comment_author_url 
FROM comments  
ORDER BY comment_id

it will use the index correctly . Maybe could be a bug of mysql ...

Laveralavergne answered 12/8, 2012 at 9:12 Comment(6)

thanks for investigation, I tried it out in Mysql version 5.1, Would upgrading help? planning to use MariaDB – Asturias 12/8, 2012 at 9:21

I don't know i have the version : 5.1.63, you can try upgrading or on MariaDB and see ... – Laveralavergne 12/8, 2012 at 9:27

@Asturias maybe it is better migrate to PostgreSQL :-) – Laveralavergne 12/8, 2012 at 9:50

Changed, storage engine to InnoDB from MyISAM in MariaDB and it seems to work great now – Asturias 12/8, 2012 at 9:51

Sure, their default storage engine seems to be InnoDB – Asturias 12/8, 2012 at 10:8

I Had the same with mariadb10.3.11, Aria table type. Just not using the '*' e.g just selecting a single column, e.g. the 'comment_id' than it uses the index. – Lani 18/10, 2019 at 12:30

It's not because it can't use the index. It's because the optimizer thinks it's faster not to use the index and do a filesort¹. You should see different behaviour in MyiSAM and InnoDB tables.

InnoDB creates the PRIMARY key as a clustered one (or the first UNIQUE if no primary is defined) and this can be used for queries that have ORDER BY pk or WHERE pk BETWEEN low AND high because all the values needed are in this clustered key and in consecutive locations (the clustered key is the table).

MyISAM tables have only B-tree indices so if the query used this index, it would have to read that entire index and it would have the comment_id values in the wanted order (that's really good) but it would then have to read the table as well (not so good) to get all the other wanted columns. So, the optimizer thinks that since it's going to read the table, why not scan it all and do the filesort? You can test that by trying:

SELECT comment_id FROM comments  ORDER BY comment_id ;

It will use the index and do no filesort because the query needs only the values that are stored in the index.

If you want a similar (to InnoDB) behaviour in MyiSAM, you coud try creating an index on (comment_id, comment_content, comment_author, comment_author_url) and then try your query. All the needed values would be found on the index and in correct order so no filesort would be performed.

The additional index will need of course almost as much space on disk as the table.

¹: filesort is not always bad and it does not mean that a file is saved on disk. If the size of the data is small, it is performed in memory.

Aesir answered 12/8, 2012 at 10:28 Comment(10)

Why if you run the query listing all fields by name in the select clause instead of using star operator it use the index ? See my answer ... – Laveralavergne 12/8, 2012 at 10:49

@aleroot: Did you test with both MyISAM and InnoDB? – Siegel 12/8, 2012 at 10:59

@aleroot: Then I could be wrong. How many rows in the table? I'll do some testing, too. – Siegel 12/8, 2012 at 11:3

Do you mean to say that MyISAM performs better than InnoDB in such queries, also the production server has more than 1,00,000 rows, would it be right to use index in that case too? – Asturias 12/8, 2012 at 11:29

No, I didn't mean that. If the table is InnoDB, a SELECT * FROM table ORDER BY pk will just have to read the table in sequential order and do no filesort. But why would you need to read all the million rows for a query, anyway? – Siegel 12/8, 2012 at 12:1

I actually use pagination for comments, so need to fire the select with a limit clause, the current query is been logged in slow queries log, hence planned to optimize the query, I guess changing the storage engine to InnoDB shall help – Asturias 12/8, 2012 at 14:4

The query in your question, compared to ORDER BY comment_id LIMIT 100 and to ORDER BY comment_id LIMIT 100 OFFSET 10000 are three different beasts. The 3rd one is not easy to opimize. – Siegel 12/8, 2012 at 14:8

using something like limit 20,20. Do you feel using InnoDB shall increase the performance? – Asturias 12/8, 2012 at 14:9

With small limits, it's not hard. It's as hard as a LIMIT 40. Moving to InnoDB is sensible in any case as it's the default engine in MySQL now. You can have referential constraints (Foreign Keys) and it usually performs better than myISAM. – Siegel 12/8, 2012 at 14:11

also planning to use MariaDB over MySQL, thanks for your help – Asturias 12/8, 2012 at 14:12