Improve performance on MySQL fulltext search query
Asked Answered
L

3

8

I have a following MySQL query:

SELECT p.*, MATCH (p.description) AGAINST ('random text that you can use in sample web pages or typography samples') AS score 
FROM posts p 
WHERE p.post_id <> 23 
AND MATCH (p.description) AGAINST ('random text that you can use in sample web pages or typography samples') > 0 
ORDER BY score DESC LIMIT 1

With 108,000 rows, it takes ~200ms. With 265,000 rows, it takes ~500ms.

Under performance testing(~80 concurrent users) it shows ~18sec average latency.

Is any way to improve performance for this query ?

EXPLAIN OUTPUT:

enter image description here

UPDATED

We have added one new mirror MyISAM table with post_id, description and synchronized it with posts table via triggers. Now, fulltext search on this new MyISAM table works ~400ms(with the same performance load where InnoDB shows ~18sec.. this is a huge performance boost) Look like MyISAM is much more quicker for fulltext in MySQL than InnoDB. Could you please explain it ?

MySQL profiler results:

Tested on AWS RDS db.t2.small instance

Original InnoDB posts table:

enter image description here

MyISAM mirror table with post_id, description only:

enter image description here

Laky answered 2/7, 2015 at 10:38 Comment(2)
Which table engine are you using? MyISAM or InnoDB?Erode
Did you understand why the InnoDB query took 18 seconds while the profiling info adds up to less than 0.5 seconds?Loredo
E
6

Here are a few tips what to look for in order to maximise the speed of such queries with InnoDB:

  1. Avoid redundant sorting. Since InnoDB already sorted the result according to ranking. MySQL Query Processing layer does not need to sort to get top matching results.

  2. Avoid row by row fetching to get the matching count. InnoDB provides all the matching records. All those not in the result list should all have ranking of 0, and no need to be retrieved. And InnoDB has a count of total matching records on hand. No need to recount.

  3. Covered index scan. InnoDB results always contains the matching records' Document ID and their ranking. So if only the Document ID and ranking is needed, there is no need to go to user table to fetch the record itself.

  4. Narrow the search result early, reduce the user table access. If the user wants to get top N matching records, we do not need to fetch all matching records from user table. We should be able to first select TOP N matching DOC IDs, and then only fetch corresponding records with these Doc IDs.

I don't think you cannot get that much faster looking only at the query itself, maybe try removing the ORDER BY part to avoid unnecessary sorting. To dig deeper into this, maybe profile the query using MySQLs inbuild profiler.

Other than that, you might look into the configuration of your MySQL server. Have a look at this chapter of the MySQL manual, it contains some good informations on how to tune the fulltext index to your needs.

If you've already maximized the capabilities of your MySQL server configuration, then consider looking at the hardware itself - sometimes even a lost cost solution like moving the tables to another, faster hard drive can work wonders.

Erode answered 2/7, 2015 at 10:57 Comment(1)
That sounds weird, but might be something entirely different, like query cache, bad indexing or something else. Please try profiling the query using MySQLs inbuild profiler with InnoDB and MyISAM to look for the differences.Erode
V
4

My best guess for the performance hit is the number of rows being returned by the query. To test this, simply remove the order by score and see if that improves the performance.

If it does not, then the issue is the full text index. If it does, then the issue is the order by. If so, the problem becomes a bit more difficult. Some ideas:

  • Determine a hardware solution to speed up the sorts (getting the intermediate files to be in memory).
  • Modifying the query so it returns fewer values. This might involve changing the stop-word list, changing the query to boolean mode, or other ideas.
  • Finding another way of pre-filtering the results.
Verify answered 2/7, 2015 at 10:52 Comment(0)
T
2

The issue here is WHERE p.post_id <> 23

Design your system in such a way so that non-indexed columns — like post_id — need not be added to the WHERE clause.

Basically MySQL will search for the full-text indexed column and then filter the post_id. Hence, if there are a lot of matches returned by the full text search, the response time will not be as expected.

Tenenbaum answered 15/10, 2019 at 9:53 Comment(1)
Great man, this should be the right answerCampball

© 2022 - 2024 — McMap. All rights reserved.