Full text catalog/index search for %book%
Asked Answered
I

6

16

I'm trying to wrap my head around how to search for something that appears in the middle of a word / expression - something like searching for "LIKE %book% " - but in SQL Server (2005) full text catalog.

How can I do that? It almost appears as if both CONTAINS and FREETEXT really don't support wildcard at the beginning of a search expression - can that really be?

I would have imagined that FREETEXT(*, "book") would find anything with "book" inside, including "rebooked" or something like that.

Intravenous answered 17/11, 2008 at 10:28 Comment(1)
Check this link for a helper class that is very useful for FTS on Sql Server 2005: http://ewbi.blogs.com/develops/2007/05/normalizing_sql.htmlBeekeeping
A
16

unfortunately CONTAINS only supports prefix wildcards:

CONTAINS(*, '"book*"')
Acclaim answered 17/11, 2008 at 10:43 Comment(1)
Yep, this is how I have been doing my "like" searches with iFTS. Works very well.Lifeless
L
7

SQL Server Full Text Search is based on tokenizing text into words. There is no smaller unit as a word, so the smallest things you can look for are words.

You can use prefix searches to look for matches that start with certain characters, which is possible because word lists are kept in alphabetical order and all the Server has to do is scan through the list to find matches.

To do what you want a query with a LIKE '%book%' clause would probably be just as fast (or slow).

Lucky answered 17/11, 2008 at 10:45 Comment(0)
B
3

If you want to do some serious full text searching then I would (and have) use Lucene.Net. MS SQL Full Text search never seems to work that well for anything other than the basics.

Ballplayer answered 17/11, 2008 at 10:36 Comment(0)
U
1

Here's a suggestion that is a workaround for that wildcard limitation. You create a computed column that contains the same content but in reverse as the column(s) you are searching.

If, for example, you are searching on a column named 'ProductTitle', then create a column named ProductsRev. Then update that field's 'Computed Column Specification' value to be:

(reverse([ProductTitle]))

Include the 'ProductsRev' column in your search and you should now be able to return results that support a wildcard at the beginning of the word. Good luck!!

Ultraviolet answered 5/5, 2011 at 20:33 Comment(2)
that won't find "rebbooked" as in reverse it is "dekoober" and "koob*" still doesn't match.Oblation
This is creative, but as the other comment suggests, it only helps with words that end in the target, not in words that have the target in the middle. Also, it feels so much like a data/code smell that I think implementers are better off using regular 'LIKE %blah%' or switching DB engines.Aldrich
O
1

Full text has a table that lists all the words the engine has found. It should have orders-of-magnitude less rows than your full-text-indexed table. You could select from that table " where field like '%book%' " to get all the words that have 'book' in them. Then use that list to write a fulltext query. Its cumbersome, but it would work, and it would be ok in the speed department. HOWEVER, ultimately you are using fulltext wrong when you are doing this. It might actually be better to educate the source of these feature requests about what fulltext is doing. You want them to understand what it WANTS to do, so they can get high value from fulltext. Example, only use wild cards at the end of a word, which means think of the words in an ordered list.

Oblation answered 22/5, 2012 at 18:59 Comment(0)
C
0

why don't program an assembly in C# to compute all the non repeated sufixes. For example if you have the Text "eat the red meat" you can store in a field "eat at t the he e red ed d meat" (note that is not necesary to add eat at and t again) ind then in this field use full text search. A function for doing that can easily written in Csharp

x) I know it seems od... it's a workarround x) I know I'm adding overhead in the insert / update .... only justified if this overhead is insignificant besides the improvement in the search function x) I know there is also an overhead in the size of the stored data.

But I'm pretty conffident that will be quite fast

Cheddar answered 7/1, 2014 at 19:21 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.