Using SQL Server 2012 (in general using SQL Server 2008 R2 up to SQL Server 2016)
This question is a more specific re wording of SQL-Server Full Text Index Unexpected results. Please see here for how we have got to this point and what has been tried already.
I am re-posting now we have found the specific fault. Many thanks to @HoneyBadger.
His help has been invaluable getting to this point.
Table structure:
CREATE TABLE TestFullTextSearch (Id INT NOT NULL, AllText NVARCHAR(400))
CREATE UNIQUE INDEX test_tfts ON TestFullTextSearch(Id)
CREATE FULLTEXT CATALOG ftcat_tfts
CREATE FULLTEXT INDEX ON TestFullTextSearch(AllText)
KEY INDEX test_tfts ON ftcat_tfts
WITH CHANGE_TRACKING AUTO, STOPLIST OFF
Data:
INSERT INTO TestFullTextSearch
VALUES (1, ' 123_456 789 '), (2, ' 789 123_456 '),
(3, ' 123_456 ABC '), (4, ' ABC 123_456 ')
Please note this data is purely to demonstrate the issue, and is not indicative of a live data set. Our live data sets can be over 500,000 rows, with paragraphs of data in a single field being searched - hence using full text searches.
Select 1: Results as expected
SELECT *
FROM TestFullTextSearch
WHERE CONTAINS (AllText, '"123*"')
Id AllText
----------- ------------
1 123_456 789
2 789 123_456
3 123_456 ABC
4 ABC 123_456
SELECT 2: Misses row 2 in the result set
SELECT *
FROM TestFullTextSearch
WHERE CONTAINS (AllText, '"123_*"')
Id AllText
----------- ------------
1 123_456 789
3 123_456 ABC
4 ABC 123_456
SELECT 3: Only returns row 2
SELECT *
FROM TestFullTextSearch
WHERE CONTAINS (AllText, '"123\_*"')
Id AllText
----------- ------------
2 789 123_456
Conclusion: Searching for a number string suffixed with an underscore fails if preceding word is a number string.
Problem: Our clients use full text search and expect a result to be forth coming around part numbers and catalogue references, which may or may not be in a section of text including other number strings. Full text search does not seem to support this in a consistent manner.
Any help gratefully received.
NB: This issue does not happen on SQL SERVER 2008, but does on 2012+
I have also tried switching to the older version of FTS parser. Testing with
SELECT * FROM sys.dm_fts_parser (' "789 123_456" ',1033,0,0)
SELECT * FROM sys.dm_fts_parser (' "789 123_456" ',2057,0,0)
And after reverting to the legacy parser:
So it has had an effect, however I'm still getting the same results.
Are there any other differences in the full text search between 2008 & 2012 that could be having this effect?