mysql select query optimization and how limit works in mysql
Asked Answered
K

1

7

I am using mysql database which has only one table "data" with 17,151257 rows.This table has a column string. i want to print all the rows in which string column contains a particular query string (stored in "entered_query" variable), so i used following:

SELECT DISTINCT * from data WHERE string LIKE '%".$entered_query."%' limit 10

As obvious above query is taking too much time to execute.

I have read that indexing can be used but how in this case ?
I have also thought of dividing whole data into 10 different rows and then executing 10 parallel queries using
perl DBI.

Now I have following questions :

  1. How to reduce the execution time ?
  2. I have heard that "LIKE" in mysql query avoids the indexing so is there any better alternative for above query ?
  3. When we use limit 10 in mysql query then does mysql stops executing as soon as it finds first 10 results or first it searches whole data for given query then it returns first 10 results
Kinswoman answered 9/7, 2013 at 5:47 Comment(5)
I hope that $entered_query doesn't contain quite what its name suggest. Otherwise that might be a way to speed up your queries that you wouldn't want.Dishwater
What is DISTINCT * supposed to do and is there a lower limit to the size of your entered string?Dishwater
@Dishwater DISTINCT * is used to select only unique tuple. $entered_query contains the query which is entered by users to search across "string" column od table "data". *sorry for selecting confusing example.Kinswoman
I was hinting at three things with my questions: 1. if $entered_query contains exactly what the user entered, you have a sql-injection problem. 2. If your table has a primary key (like id), the DISTINCT is unnecessary, yet expensive. 3. If there is a lower limit to the length of the search terms your users can enter, you might have a chance to store your data differently without drowning in combinatorial explosion: you might consider storing each possible substring.Dishwater
@Dishwater thanks for suggestions, i ll definitely consider your points 1 and 2 but for 3rd all search queries are already stored in my database and i think storing all permutations of each string is not feasible.Kinswoman
B
11

Regular indexing can't be used to improve that query. MySQL indexes are B-trees, which means they can find a prefix of the indexed column very quickly. But since your LIKE query has % at the beginning, there's no unique prefix to search for. So every row has to be scanned to match against the pattern.

However, MySQL also supports full-text searching. This creates an index of all the words in the column, and it can find these words quickly. See the documentation for details.

If you use LIMIT 10, it will stop scanning as soon as it finds the first 10 rows that satisfy the conditions. Unless you also use ORDER BY -- then it has to find all the rows so that it can sort them before selecting the first 10.

Burglarious answered 9/7, 2013 at 5:54 Comment(2)
Thanks....Is there any why to achieve the same result without using LIKE? and what about dividing whole data into 10 different rows and then executing 10 parallel queries using perl DBI.Kinswoman
I told you how to do it without using LIKE, use full-text searching. Splitting it up into separate queries might help, it depends on how many cores you have on the database server. But the bottleneck is likely to be the disk, so it won't help too much.Burglarious

© 2022 - 2024 — McMap. All rights reserved.