mysql match against not return case insensitive results
Asked Answered
S

2

8

I have two tables:

CREATE TABLE IF NOT EXISTS `test1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `bucket_id` int(10) unsigned NOT NULL COMMENT 'folder this component belongs to',
  `test1_name` varchar(81) NOT NULL COMMENT 'Name of this component',
  `test1_desc` varchar(1024) NOT NULL COMMENT 'Component Description',
  PRIMARY KEY (`id`),
  FULLTEXT KEY `test1_search` (`test1_name`,`test1_desc`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=3 ;

CREATE TABLE IF NOT EXISTS `bucket` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `bkt_name` varchar(81) NOT NULL COMMENT 'The name of this bucket',
  `bkt_desc` varchar(1024) NOT NULL COMMENT 'A description of this bucket',
  `bkt_keywords` varchar(512) DEFAULT NULL COMMENT 'keywords for searches',
  PRIMARY KEY (`id`),
  FULLTEXT KEY `fldr_search` (`bkt_desc`,`bkt_keywords`,`bkt_name`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=8 ;

Bucket is just a holder while test1 contains all the things that would go into a bucket. For example:

INSERT INTO `bucket` (`id`, `bkt_name`, `bkt_desc`, `bkt_keywords`) VALUES
(1, 'Simpsons', 'The Simpsons Cartoon Family was first successful adult cartoon series', 'Homer, Marge, Lisa and Bart'),
(2, 'Griffins', 'The family from the popular family guy series', 'Peter, Lois, Meg, Chris, Stewie, Brian');

INSERT INTO `test1` (`id`, `bucket_id`, `bkt_name`, `bkt_desc`) VALUES
(1, 1, 'Homer Simpson', 'Homer the figurative head of the Simpsons Family and is the husband of Marge'),
(2, 2, 'Peter Griffin', 'Peter the figurative head of the Griffin family on the hit TV seriers The family Guy');

Now, using the following query I want to look for all buckets whose name, description or keywords contain the search term "family" or whose components contain the words "family")

So far, what I have is this query and it's not returning mixed case results as in "Family" is not found while "family" is.

SELECT *
FROM bucket
RIGHT JOIN test1 ON test1.bucket_id = bucket.id
WHERE
  bucket.isvisible > 0 AND
  MATCH(bucket.bkt_keywords, bucket.bkt_desc, bucket.bkt_name)
    AGAINST('family' IN BOOLEAN MODE) OR
  MATCH(test1.test1_name, test1.test1_desc) 
    AGAINST('family' IN BOOLEAN MODE)

I should also add that all text fields have the collation of utf8_general_ci as does the entire table which is MyISAM.

Smukler answered 4/2, 2013 at 20:3 Comment(4)
Your example provided has "family" in all lower case in both the bucket description and the test description. Can you make a better example to demonstrate your problem?Minhminho
FYI, the field names in your insert statement for test1 are incorrect (should be test1_name, test1_desc). Also your SELECT has bucket.isvisible > 0, which isn't included in the example table. Once I fixed those two things though, the query returned both rows, so unfortunately I don't know the cause of your actual problem. (Unless of course the whole capital thing is a red herring, and the real problem is that bucket.isvisible parameter on the simpsons row!)Rouge
Also you could simplify the query like so: SELECT * FROM bucket RIGHT JOIN test1 ON test1.bucket_id = bucket.id WHERE MATCH(bucket.bkt_keywords, bucket.bkt_desc, bucket.bkt_name, test1.test1_desc, test1.test1_name) AGAINST('family' IN BOOLEAN MODE)Rouge
Maybe your column has a binary collation, even though your table does not? Double-check with SHOW FULL COLUMNS FROM tablename. (from user fenway)Timecard
S
2

The answer is apparently adding some parens around the two match against clauses.

SELECT *
FROM bucket
RIGHT JOIN test1 ON test1.bucket_id = bucket.id
WHERE bucket.isvisible > 0 AND
 ( MATCH(bucket.bkt_keywords, bucket.bkt_desc, bucket.bkt_name)
  AGAINST('family' IN BOOLEAN MODE) OR
  MATCH(test1.test1_name, test1.test1_desc) 
  AGAINST('family' IN BOOLEAN MODE) )
Smukler answered 22/2, 2013 at 19:20 Comment(1)
OMG! Yes, this is because of operator precedence (dev.mysql.com/doc/refman/5.5/en/logical-operators.html)... Without parens, your condition was equivalent to WHERE ( isvisible >0 AND MATCH(...) ) OR ( MATCH(...) )Deniable
D
6

I think your tables do not use utf8_general_ci as collation, but utf8_bin. I was able to reproduce the behaviour you describe after modifying the tables as follows:

ALTER TABLE test1 CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;
ALTER TABLE bucket CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;

You should perhaps set your tables' collation explicitely to:

ALTER TABLE test1 CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE bucket CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

If the above changes anything, I would guess your server or session is actually set to use another collation by default (since the collation is not specified in your tables definition). This could be checked with:

SHOW GLOBAL VARIABLES LIKE 'collation_server';
SHOW SESSION VARIABLES LIKE 'collation_server';
Deniable answered 14/2, 2013 at 22:36 Comment(1)
Using phpMyAdmin all the fields AND the tables show the collation as utf8_general_ci. If I export the tables the COLLATE is set to utf8_general_ci.Smukler
S
2

The answer is apparently adding some parens around the two match against clauses.

SELECT *
FROM bucket
RIGHT JOIN test1 ON test1.bucket_id = bucket.id
WHERE bucket.isvisible > 0 AND
 ( MATCH(bucket.bkt_keywords, bucket.bkt_desc, bucket.bkt_name)
  AGAINST('family' IN BOOLEAN MODE) OR
  MATCH(test1.test1_name, test1.test1_desc) 
  AGAINST('family' IN BOOLEAN MODE) )
Smukler answered 22/2, 2013 at 19:20 Comment(1)
OMG! Yes, this is because of operator precedence (dev.mysql.com/doc/refman/5.5/en/logical-operators.html)... Without parens, your condition was equivalent to WHERE ( isvisible >0 AND MATCH(...) ) OR ( MATCH(...) )Deniable

© 2022 - 2024 — McMap. All rights reserved.