MySQL Match ... Against query very slow
Asked Answered
B

1

6

I work on a website running on Propel ORM and I have this query:

if(isset($args["QueryText"]) && $args["QueryText"] != "") {
  $query = $query
    ->withColumn("(MATCH (Request.Subject, Request.Detail) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE) + MATCH (Response.Response) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE))", "RequestRelevance")
    ->condition('cond1', "(MATCH (Request.Subject, Request.Detail) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE) + MATCH (Response.Response) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE)) > 0.2")
    ->condition('cond2', 'Request.Id = ?', $args["QueryText"])
    ->where(array('cond1', 'cond2'), 'or')
    ->orderBy("RequestRelevance", Criteria::DESC);
}

which translates to the following in SQL:

SELECT DISTINCT 
(MATCH (requests.subject, requests.detail) AGAINST ('46104' IN BOOLEAN MODE) + 
    MATCH (Response.response) AGAINST ('46104' IN BOOLEAN MODE)) 
  AS RequestRelevance, requests.requestID AS "Id", requests.subject AS "Subject", requests.detail AS "Detail", requests.created AS "CreatedDate", requests.lastresponsedate AS "LastResponseDate", SupportStatus.supportstatusID AS "SupportStatus.Id", SupportStatus.supportstatus AS "SupportStatus.Name", SupportStatus.isnew AS "SupportStatus.IsNew", SupportStatus.isclosed AS "SupportStatus.IsClosed", CustomerGroup.customergroupID AS "CustomerGroup.Id", CustomerGroup.customergroup AS "CustomerGroup.Name", Site.siteID AS "Site.Id", Site.site AS "Site.Name", InternalUser.userID AS "InternalUser.Id", InternalUser.username AS "InternalUser.Username", User.userID AS "User.Id", User.username AS "User.Username", Customer.customerID AS "Customer.Id", Customer.customer AS "Customer.Name", Customer.customergroupID AS "Customer.CustomerGroupId", Customer.rate AS "Customer.Rate" 
FROM requests 
  LEFT JOIN responses Response ON (requests.requestID=Response.requestID) 
  INNER JOIN supportstatus SupportStatus ON (requests.supportstatusID=SupportStatus.supportstatusID) 
  INNER JOIN customergroups CustomerGroup ON (requests.customergroupID=CustomerGroup.customergroupID)
  INNER JOIN customers Customer ON (requests.customerID=Customer.customerID) 
  INNER JOIN sites Site ON (requests.siteID=Site.siteID) 
  LEFT JOIN users InternalUser ON (requests.internaluserID=InternalUser.userID) 
  LEFT JOIN users User ON (requests.userID=User.userID)
WHERE ((MATCH (requests.subject, requests.detail) AGAINST ('46104' IN BOOLEAN MODE) + 
  MATCH (Response.response) AGAINST ('46104' IN BOOLEAN MODE)) > 0.2 OR requests.requestID = '46104') 
ORDER BY requests.created ASC,RequestRelevance DESC

It takes a good 20 second to load on the website using Propel and 7.020 seconds when running the SQL query.

I tried the following one instead:

SELECT DISTINCT 
  requests.requestID AS "Id", requests.subject AS "Subject", requests.detail AS "Detail", requests.created AS "CreatedDate", requests.lastresponsedate AS "LastResponseDate", SupportStatus.supportstatusID AS "SupportStatus.Id", SupportStatus.supportstatus AS "SupportStatus.Name", SupportStatus.isnew AS "SupportStatus.IsNew", SupportStatus.isclosed AS "SupportStatus.IsClosed", CustomerGroup.customergroupID AS "CustomerGroup.Id", CustomerGroup.customergroup AS "CustomerGroup.Name", Site.siteID AS "Site.Id", Site.site AS "Site.Name", InternalUser.userID AS "InternalUser.Id", InternalUser.username AS "InternalUser.Username", User.userID AS "User.Id", User.username AS "User.Username", Customer.customerID AS "Customer.Id", Customer.customer AS "Customer.Name", Customer.customergroupID AS "Customer.CustomerGroupId", Customer.rate AS "Customer.Rate" 
FROM requests 
  LEFT JOIN responses Response ON (requests.requestID=Response.requestID) 
  INNER JOIN supportstatus SupportStatus ON (requests.supportstatusID=SupportStatus.supportstatusID) 
  INNER JOIN customergroups CustomerGroup ON (requests.customergroupID=CustomerGroup.customergroupID)
  INNER JOIN customers Customer ON (requests.customerID=Customer.customerID) 
  INNER JOIN sites Site ON (requests.siteID=Site.siteID) 
  LEFT JOIN users InternalUser ON (requests.internaluserID=InternalUser.userID) 
  LEFT JOIN users User ON (requests.userID=User.userID) 
WHERE (requests.subject LIKE '%46104%' OR requests.detail LIKE '%46104%' OR  Response.response LIKE '%46104%' OR requests.requestID = '46104') 
ORDER BY requests.created

which takes 3.308 seconds to execute. Removing OR Response.response LIKE '%46104%' from it reduces the time to 0.140 seconds. The responses table contains 288317 rows and the responses.responses column is a TEXT() column with a FULLTEXT index.

What would be the best way to reduce the execution time of this search? I have tried using this https://dba.stackexchange.com/questions/15214/why-is-like-more-than-4x-faster-than-match-against-on-a-fulltext-index-in-mysq answer however when I execute:

SELECT responseID FROM
(
 SELECT * FROM responses
 WHERE requestID = 15000
 AND responseID != 84056
) A
WHERE MATCH(response) AGAINST('twisted');

I get this error:

Error Code: 1191. Can't find FULLTEXT index matching the column list

Help will be greatly appreciated!

EDIT 1:

Tried @Richard EB's query:

ALTER TABLE responses ADD FULLTEXT(response)
288317 row(s) affected Records: 288317  Duplicates: 0  Warnings: 0  78.967 sec

However:

SELECT responseID FROM (     SELECT * FROM responses     WHERE requestID = 15000     AND responseID != 84056 ) A WHERE MATCH(response) AGAINST('twisted') LIMIT 0, 2000 
Error Code: 1191. Can't find FULLTEXT index matching the column list    0.000 sec

Removing DISTINCT reduces the execution time to 0.952 seconds however it doesn't retrieve the results I need.

EDIT 2:

Executing this query:

SELECT DISTINCT 
responses.requestID AS "Id" FROM responses WHERE MATCH(response) AGAINST('twisted')

takes 0.062 secs.

Executing this:

SELECT DISTINCT responses.requestID
  FROM responses 
WHERE (
  MATCH (Responses.response) AGAINST ('twisted' IN BOOLEAN MODE)
) 
ORDER BY responses.requestID ASC

only takes 0.046 secs. However, selecting from Requests and joining Responses is what slows down the query. I'm not sure if it means that the whole query has to be completely rewritten to select from Responses and joining Requests instead?

EDIT 3:

Here are the indexes that I have on Requests and Responses tables:

    # Table, Non_unique, Key_name, Seq_in_index, Column_name, Collation, Cardinality, Sub_part, Packed, Null, Index_type, Comment, Index_comment
    responses, 0, PRIMARY, 1, responseID, A, 288317, , , , BTREE, , 
    responses, 1, requestID, 1, requestID, A, 48052, , , YES, BTREE, , 
    responses, 1, response, 1, response, , 1, , , YES, FULLTEXT, , 
    responses, 1, response_2, 1, response, , 1, , , YES, FULLTEXT, , 
    responses, 1, response_3, 1, response, , 1, , , YES, FULLTEXT, , 

    # Table, Non_unique, Key_name, Seq_in_index, Column_name, Collation, Cardinality, Sub_part, Packed, Null, Index_type, Comment, Index_comment
    requests, 0, PRIMARY, 1, requestID, A, 46205, , , , BTREE, , 
    requests, 1, supportstatusID, 1, supportstatusID, A, 14, , , YES, BTREE, , 
    requests, 1, internaluserID, 1, internaluserID, A, 344, , , YES, BTREE, , 
    requests, 1, customergroupID, 1, customergroupID, A, 198, , , , BTREE, , 
    requests, 1, userID, 1, userID, A, 1848, , , YES, BTREE, , 
    requests, 1, siteID, 1, siteID, A, 381, , , YES, BTREE, , 
    requests, 1, request, 1, subject, , 1, , , YES, FULLTEXT, , 
    requests, 1, request, 2, detail, , 1, , , YES, FULLTEXT, , 
    requests, 1, request, 3, ponumber, , 1, , , YES, FULLTEXT, , 
Berthold answered 21/10, 2015 at 16:2 Comment(0)
R
3

A LIKE search will go through every record and perform non-exact string comparison, which is why it's so slow.

The mysql error you've pasted indicates that the column referenced in the MATCH clause doesn't have a fulltext index (or it does, but it's not as it's referenced in the MATCH clause).

Assuming you're using MyISAM or have MySQL 5.6 and InnoDB, try:

ALTER TABLE responses ADD FULLTEXT(response);

Edit: Answer (in comments): "If there are two columns mentioned in the MATCH statement, then there should be one fulltext index on both of those columns, not two fulltext indexes on each column"

Rickettsia answered 21/10, 2015 at 16:21 Comment(12)
Thank you, I have tried that however it did not resolve the issue. I have edited my original post. The thing is, the LIKE search takes half the time that the MATCH ... AGAINST search does however I assume that by optimising/improving the latter, it could be much faster than the LIKE search. Am I correct?Berthold
What happens when you try just: SELECT * FROM responses WHERE MATCH(response) AGAINST('twisted') If that doesn't work, try: ALTER TABLE table responses ENABLE KEYS If that does work, the problem is not with the fulltext index, but in the way/order you have written the query.Rickettsia
Thanks, I have edited my original post with the queries and the results.Berthold
LIKE is slow. if you're saying that the MATCH query is fast, but it's slow when you join it with another query that does LIKE (as in your post), then that's not surprising. If its slow and all you're doing is joining tables (and not using LIKE), then the problem must be with the joins (are the ID columns indexed?)Rickettsia
Edited the post. I am not the one who created the tables or wrote the query, but should I not have an index on responseID in Requests?Berthold
Please ignore my comment above - the Requests table doesn't have a responseID column.Berthold
This has become very confusing with the various edits. It seems that you have a MATCH query in edit2 that is working as it should. Rebuild your full query from that point in individual steps, and where it slows down, that's your problem. My last observation here is that where you MATCH against multiple fields: "MATCH (requests.subject, requests.detail)", there should be an equivalent fulltext index on the multiple fields.Rickettsia
Basically it slows down when using a WHERE query on a joint table. I rebuilt the same query but by selecting from responses and joining requests, so the other way round, but it didn't make any difference.Berthold
This is where I've got: pastebin.com/q83YHdXz - it uses relevance but takes 2.3 secs. Removing relevance reduces the time to 1.2 secs. It seems that the only FULLTEXT index used is responses.response, while requests.detail and requests.subject indexes aren't used.Berthold
Did you see my comment about how indexes must match the MATCH statement? If there are two columns mentioned in the MATCH statement, then there should be one fulltext index on both of those columns, not two fulltext indexes on each column.Rickettsia
Sorry, I misunderstood that. I had a fulltext index on subject, detail and ponumber. I created another one, just for subject and detail (I don't want to alter or remove that one as someone else using the database might need it) and the execution time has dropped by about 40%. I still cannot see the created index when I call EXPLAIN and I'm not sure why though the execution time is definitely pretty good now (and that includes Relevance).Berthold
If the query time is reduced to acceptable limits then that's what matters. To understand why the EXPLAIN isn't showing the index being used, you'll have to start the query from scratch with just the MATCH clause, and then add query clauses until you see the problem you see now. I've edited the answer, please accept.Rickettsia

© 2022 - 2024 — McMap. All rights reserved.