MySQL full text search plural/singular form of words
Asked Answered
H

3

8

I have a table like this

CREATE TABLE jobs(
    id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
    title VARCHAR(200),
    body TEXT,
    FULLTEXT (title,body)
) ENGINE=MyISAM;

And two records in this table

...
7. 10 Senior PHP Developers (Leaders) 
8. 30 PHP Developers..
...

And two queries:

  1. Return 2 records above

    SELECT * FROM jobs WHERE MATCH (title,body) AGAINST ('developers')

  2. Return empty set

    SELECT * FROM jobs WHERE MATCH (title,body) AGAINST ('developer')

I thought that MySQL can found these records with 'developer'. But why it didn't work?

Hesky answered 27/3, 2012 at 15:57 Comment(0)
H
6

You can switch to full text with boolean operators: http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html

And search for:

SELECT * FROM jobs WHERE MATCH (title,body) AGAINST ('developer*' IN BOOLEAN MODE)

You'll get matches for 'developer' first, then 'developers' or any string starting with 'developer'. It's fine for long precise words as 'developer', but searching 'car*' for example could lead to plenty of unexpected results like 'card', 'cardamon', ....

I don't think there is grammatical analysis in MySQL full text search. I was actually researching this issue today and haven't found anything better.

EDIT

This method won't work for words such as 'party' (plural: 'parties'). If your search form is limited to english, i think you can cover most cases with some simple grammatical rules, and for each word, search the word and its plural. And it it's wrong it will probably lead to an invalid word, which should be neutral in your search.

Hydrocortisone answered 27/3, 2012 at 16:12 Comment(1)
thank you, I guess that there is no stemmer inside mysql fulltext search engine. I try to understand and find a best solution for MySQL not for this special case ^^.Hesky
H
4
select * from index_table where item_name rlike '[[:<:]]preform[s]*[es]*[ies]*[[:>:]]';

Check if this helps. My case it worked out. Although will not cover for all plurals but yes for 90-95% cases.

Cheers, Ashish

Holmes answered 14/5, 2014 at 7:20 Comment(1)
Please be aware that there is a big difference between using the fulltext index and something like this. The fulltext index is just that, an index. It stores the words in way that is efficient for search. If you use rlike instead you are not using an index at all, and reduce the search to a full scan of all records.Turnip
T
1

MySQL full-text search functionality is amazing but does not implement several must-to-have features like stemming. If it is critical for you, consider the following options:

  1. Use boolean mode: this has been recommended in one of the previous answer, I will not run into more details, pros and cons here
  2. Pre-processing the input query parameters: add or remove plural/singular before run the query. Of course it is not perfect, several cons to take in mind
  3. Use lucene: although mysql fulltext is cool, a real indexing system is better in dealing with natural language inherent complexities
Thoracoplasty answered 25/3, 2022 at 2:53 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.