SQL SELECT WHERE field contains words
N

15

1011

I need a select which would return results like this:

SELECT * FROM MyTable WHERE Column1 CONTAINS 'word1 word2 word3'

And I need all results, i.e. this includes strings with 'word2 word3 word1' or 'word1 word3 word2' or any other combination of the three.

All words need to be in the result.

Nickelsen answered 12/1, 2013 at 6:19 Comment(0)
M
1485

A rather slow but working method to match any of the words:

SELECT * FROM mytable
WHERE column1 LIKE '%word1%'
   OR column1 LIKE '%word2%'
   OR column1 LIKE '%word3%'

If you need all words to be present, use this:

SELECT * FROM mytable
WHERE column1 LIKE '%word1%'
  AND column1 LIKE '%word2%'
  AND column1 LIKE '%word3%'

If you want something faster, look into full-text search, which is specific to each database type.
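As a minimal sketch of the AND form above with a variable word list: this assumes SQLite through Python's sqlite3 module (chosen only so the example runs anywhere), and the helper name rows_containing_all and the sample rows are made up for illustration. The words are bound as parameters rather than concatenated into the SQL text.

```python
import sqlite3

def rows_containing_all(conn, words):
    """Return column1 values that contain every word as a substring."""
    if not words:
        raise ValueError("at least one word is required")
    # One "column1 LIKE ?" test per word, combined with AND.
    where = " AND ".join(["column1 LIKE ?"] * len(words))
    # The wildcards go into the bound values, not into the SQL text.
    params = [f"%{w}%" for w in words]
    sql = f"SELECT column1 FROM mytable WHERE {where} ORDER BY column1"
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (column1 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("word2 word3 word1",), ("word1 word3 word2",),
     ("word1 word2",), ("nothing here",)],
)

matches = rows_containing_all(conn, ["word1", "word2", "word3"])
print(matches)
```

Only the two rows that contain all three words come back, regardless of word order. Other databases use different placeholder styles and concatenation operators, so treat this as a sketch of the idea rather than portable SQL.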

Masuria answered 12/1, 2013 at 6:21 Comment(17)
+ 1 I agree it's slower but it can be mitigated with good indexingBusinesslike
@PreetSangha Indexing when you're searching for LIKE beginning with a wild card? Please show me how!Uziel
If SQL Server and if Full Text indexing is available, it might be faster to use contains -- depends on RDBMS. https://mcmap.net/q/54338/-like-vs-contains-on-sql-server Walking
In PostgreSQL 9.1 and later, you can create trigram index which can index such searches.Masuria
Should it not be: SELECT * FROM mytable WHERE column1 LIKE '%word2 word3 word1%' OR column1 LIKE '%word1 word3 word2%';Stagner
@AquaAlex: your statement will fail if text has word3 word2 word1.Masuria
I know, but original question only listed 3 conditions (I forgot one '%word1 word2 word3%'). In the question he only listed 3 options, where as the answer caters for any combination of all 3 words. (looking at your AND option). So either his question must be rephrased or your answer will give positives for cases other than 3 he listedStagner
@Masuria (But your answer is awesome and helped me with a issue i had) just trying to be true to the question in my suggestionStagner
What do these % symbols represent? And would the CONTAINS() function do the same thing as LIKE?Genius
@Tom: They're wildcards: % matches any sequence of zero or more characters. E.g., to match a value that ends with 'word', with no characters after it, you'd write '%word'.Pacifica
Another downside of this approach: '%word%' will also find 'words', 'crosswordpuzzle' and 'sword' (just as an example). I'd have to do a column1 LIKE 'word' OR column1 LIKE 'word %' OR column1 LIKE '% word' OR column1 LIKE '% word %' to just find exact word matches - and it would still fail for entries where words are not just separated with spaces.Stasny
@Stasny you can prevent matching "words", "sword", "swords" etc. with ' '+column1+' ' LIKE '%[^a-z]word[^a-z]%' - slow/inefficient though ...Morphine
agree with @BlaM, this will unnecessarily increase the result count.Diamond
Helped me in implementing a search boxStiles
how can I put variable inside the pattern meaning %word% - here the word would be a variable not a static thing, so that it would be a dynamic searching based on the variable value.Juridical
here is the solution that worked for my question(in spring boot)- @Query(value = " SELECT * FROM emp_table\n" + "WHERE emp_table.first_name LIKE concat('%',?1,'%') \n" + " OR emp_table.last_name LIKE concat('%',?1,'%') \n" + " OR emp_table.email LIKE concat('%',?1,'%') \n", nativeQuery = true)Juridical
List<Employee> searchEmpByQueryWord(String queryWord); Juridical
C
149

Note that if you use LIKE to determine if a string is a substring of another string, you must escape the pattern matching characters in your search string.
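To illustrate the escaping point, here is a small sketch assuming SQLite via Python's sqlite3 module; the helper escape_like and the sample data are hypothetical. It escapes % and _ and uses an ESCAPE clause. Some dialects have additional pattern characters (SQL Server also treats [ specially), which this sketch does not cover.

```python
import sqlite3

def escape_like(term, esc="\\"):
    """Escape the LIKE wildcards % and _ (and the escape character itself)."""
    return (term.replace(esc, esc + esc)
                .replace("%", esc + "%")
                .replace("_", esc + "_"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Column1 TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?)",
    [("50% off",), ("50 percent off",), ("500 items",)],
)

term = "50%"

# Unescaped: the % inside the search term acts as a wildcard.
naive = [r[0] for r in conn.execute(
    "SELECT Column1 FROM MyTable WHERE Column1 LIKE ? ORDER BY Column1",
    (f"%{term}%",))]

# Escaped: the % inside the search term is matched literally.
escaped = [r[0] for r in conn.execute(
    "SELECT Column1 FROM MyTable WHERE Column1 LIKE ? ESCAPE '\\' ORDER BY Column1",
    (f"%{escape_like(term)}%",))]

print(naive)
print(escaped)
```

The unescaped search matches every row containing "50", while the escaped one matches only the row that really contains "50%".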

If your SQL dialect supports CHARINDEX, it's a lot easier to use it instead:

SELECT * FROM MyTable
WHERE CHARINDEX('word1', Column1) > 0
  AND CHARINDEX('word2', Column1) > 0
  AND CHARINDEX('word3', Column1) > 0

Also, please keep in mind that this and the method in the accepted answer only cover substring matching rather than word matching. So, for example, the string 'word1word2word3' would still match.
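If whole-word matching is what you need, one common workaround is to pad both the column and the search word with the delimiter. The sketch below assumes SQLite via Python's sqlite3 module, words separated by single spaces only, and search words that contain no LIKE wildcards; the function name rows_with_all_words is made up for illustration.

```python
import sqlite3

def rows_with_all_words(conn, words):
    """Return rows whose Column1 contains every word as a space-delimited word."""
    if not words:
        raise ValueError("at least one word is required")
    # Padding the column with spaces delimits words at the start and end too.
    clause = "(' ' || Column1 || ' ') LIKE ?"
    where = " AND ".join([clause] * len(words))
    params = [f"% {w} %" for w in words]
    sql = f"SELECT Column1 FROM MyTable WHERE {where} ORDER BY Column1"
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Column1 TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?)",
    [("word1 word2 word3",), ("word3 word1 word2",),
     ("word1word2word3",), ("sword1 word2 word3",)],
)

whole_word_matches = rows_with_all_words(conn, ["word1", "word2", "word3"])
print(whole_word_matches)
```

Here 'word1word2word3' and 'sword1 word2 word3' are rejected. Punctuation, tabs or other separators would still defeat this; that is where full-text search becomes the better tool.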

Cavit answered 5/9, 2014 at 0:21 Comment(6)
This seems much easier if your search term is a variable rather than having to add the '%' chars before searchingAntechoir
In Microsoft SQL servers and engines we should use InStr() instead CHARINDEXResource
@Resource There is no InStr in MS SQLAged
@Antechoir Rather than adding the % to the variable, just add it in the search: '%'+var+'%'. Yes, it is a bit more typing and quite ugly, but probably better than changing your variable's value.Recurve
SELECT * FROM MyTable WHERE (CHARINDEX('word1', Column1) + CHARINDEX('word2', Column1) + CHARINDEX('word3', Column1)) > 0 is a shorter form. It's not very nice, and I don't know if it performs better, than the version with ANDsFirstling
@Firstling Interesting observation, but + would be the analogue to OR, while the analogue of AND would be *. With that said, I think, removing semantics makes the query less readableMcmurray
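The + versus * point from the last two comments can be checked with a quick sketch. It assumes SQLite via Python's sqlite3 module, where instr(haystack, needle) plays the role of CHARINDEX (note the reversed argument order); the helper select_where and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Column1 TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?)",
    [("word1 word2 word3",), ("word1 only",), ("none",)],
)

def select_where(condition):
    """Run a fixed, trusted WHERE condition and return the matching values."""
    sql = f"SELECT Column1 FROM MyTable WHERE {condition} ORDER BY Column1"
    return [row[0] for row in conn.execute(sql)]

# instr returns a 1-based position, or 0 when the substring is absent.
a = "instr(Column1, 'word1')"
b = "instr(Column1, 'word2')"
c = "instr(Column1, 'word3')"

sum_matches = select_where(f"({a} + {b} + {c}) > 0")      # behaves like OR
product_matches = select_where(f"({a} * {b} * {c}) > 0")  # behaves like AND

print(sum_matches)
print(product_matches)
```

The sum is positive as soon as any one word is found, while the product is positive only when all three are found.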
H
26

With SQL Server (T-SQL):

Auxiliary function

-- Split @str by @sep
-- Returns all parts
CREATE FUNCTION [dbo].[fnSplit] (
  @sep CHAR(1),
  @str VARCHAR(512)
) RETURNS TABLE AS RETURN (
  WITH Pieces(pn, start, stop) AS (
    SELECT
      1,
      1,
      CHARINDEX(@sep, @str)
    UNION ALL
    SELECT
      pn + 1,
      stop + 1,
      CHARINDEX(@sep, @str, stop + 1)
    FROM Pieces
    WHERE stop > 0
  )

  SELECT
    pn AS Id,
    SUBSTRING(@str, start, CASE
      WHEN stop > 0
      THEN stop - start
      ELSE 512
    END) AS Data
  FROM Pieces
)

Query Example

Search for the words word1, word2, word3 in MyTable.Column1:

-- Create a table variable (the Data size depends on the length of the words)
DECLARE @FilterTable TABLE (Data VARCHAR(512))

-- Get different and unique words for the search
INSERT INTO @FilterTable (Data)
SELECT DISTINCT S.Data
FROM fnSplit(' ', 'word1 word2 word3') S -- Contains words

-- Search into "MyTable" by "Column1"
SELECT DISTINCT
  T.*
FROM
  MyTable T
  -- Matching records
  INNER JOIN @FilterTable F1 ON T.Column1 LIKE '%' + F1.Data + '%'
  -- Is some word not present?
  LEFT JOIN @FilterTable F2 ON T.Column1 NOT LIKE '%' + F2.Data + '%'
WHERE
  -- Is some word not present?
  F2.Data IS NULL;
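The gist of the query above is "keep the rows for which no search word is missing". Here is a minimal sketch of that idea, assuming SQLite via Python's sqlite3 module; it swaps the INNER JOIN / LEFT JOIN pair for a NOT EXISTS subquery and splits the search string in Python instead of using fnSplit. FilterTable as a regular table and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (Column1 TEXT);
    INSERT INTO MyTable VALUES
        ('word3 word2 word1'),
        ('word1 and word2'),
        ('word1 word2 word3 extra');
    CREATE TABLE FilterTable (Data TEXT);
""")

# Split the search string and keep each distinct word once.
search = "word1 word2 word3 word1"
distinct_words = sorted(set(search.split()))
conn.executemany("INSERT INTO FilterTable VALUES (?)",
                 [(w,) for w in distinct_words])

# Keep a row only if there is no filter word that it fails to contain.
rows = [r[0] for r in conn.execute("""
    SELECT T.Column1
    FROM MyTable T
    WHERE NOT EXISTS (
        SELECT 1
        FROM FilterTable F
        WHERE T.Column1 NOT LIKE '%' || F.Data || '%'
    )
    ORDER BY T.Column1
""")]
print(rows)
```

One difference to be aware of: with an empty filter table this variant returns every row, whereas the INNER JOIN in the original returns none.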
Hospitalet answered 30/12, 2014 at 18:23 Comment(4)
Exellent! How to start to learn about this function, Sir? what is Pieces? and can You tell me pseudocode about this line? SUBSTRING(@str, start, CASE WHEN stop > 0 THEN stop - start ELSE 512 END) AS DataPrissie
This move was incredible ,, I am Really JEALOUS :( _______________________________________________________________________________________ INNER JOIN (@FilterTable F1 ON T.Column1 LIKE '%' + F1.Data + '%' LEFT JOIN (@FilterTable F2 ON T.Column1 NOT LIKE '%' + F2.Data + '%'Backcross
An explanation would be in order. E.g., what is the idea/gist? Why does it need to be so complex? Does it actually answer the question? What was it tested on? What SQL flavour and version is assumed, if any? From the Help Center: "...always explain why the solution you're presenting is appropriate and how it works". Please respond by editing (changing) your answer, not here in comments (without "Edit:", "Update:", or similar - the answer should appear as if it was written today).Equilibrate
Brilliant solution and lightning fast!Crifasi
D
24

Instead of SELECT * FROM MyTable WHERE Column1 CONTAINS 'word1 word2 word3', use the CONTAINS predicate and put AND between the words:

SELECT * FROM MyTable WHERE CONTAINS(Column1, 'word1 AND word2 AND word3')

For details, see CONTAINS (Transact-SQL).

To search for phrases, wrap them in double quotes:

SELECT * FROM MyTable WHERE CONTAINS(Column1, '"Phrase one" AND word2 AND "Phrase Two"')

P.S.: You must first create a full-text index on the table before using the CONTAINS predicate. For more details, see Get Started with Full-Text Search.

Diamond answered 26/7, 2016 at 16:42 Comment(1)
SQL Server, I presume?Equilibrate
C
17
SELECT * FROM MyTable
WHERE Column1 LIKE '%word1%'
  AND Column1 LIKE '%word2%'
  AND Column1 LIKE '%word3%'

Changed OR to AND based on edit to question.

Chaffinch answered 12/1, 2013 at 6:24 Comment(1)
I need all words to be contained in the result in any combinationNickelsen
T
10

If you are using Oracle Database, you can achieve this with a CONTAINS query (Oracle Text). CONTAINS queries are typically faster than LIKE queries because they use an index.

If you need all of the words

SELECT * FROM MyTable WHERE CONTAINS(Column1,'word1 and word2 and word3', 1) > 0

If you need any of the words

SELECT * FROM MyTable WHERE CONTAINS(Column1,'word1 or word2 or word3', 1) > 0

CONTAINS requires an index of type CONTEXT on the column:

CREATE INDEX SEARCH_IDX ON MyTable(Column1) INDEXTYPE IS CTXSYS.CONTEXT
Thinia answered 29/6, 2015 at 12:15 Comment(2)
@downvoters A comment is appreciated telling what is wrong with the answer. This same query is running in our enterprise solution more than 1000 times per day, without any issues :)Thinia
OP does not specify which database is using and everyone has assumed that is Sql Server. But since you have specified Oracle in your response I don't understand downvoters.Mycostatin
D
7

If you just want to find a match (this checks whether the value of Column1 occurs inside the word list):

SELECT * FROM MyTable WHERE INSTR('word1 word2 word3', Column1)<>0

SQL Server:

CHARINDEX(Column1, 'word1 word2 word3', 1)<>0

To get an exact match, wrap both sides in delimiters. For example, (';a;ab;ac;', ';b;') will not match.

SELECT * FROM MyTable WHERE INSTR(';word1;word2;word3;', ';'||Column1||';')<>0
Deventer answered 11/11, 2015 at 20:32 Comment(1)
'INSTR' is not a recognized built-in function name. In my SQL Server.Conyers
H
5

One of the easiest ways to achieve what the question asks is to use CONTAINS with NEAR or '~'. For example, the following queries return all rows whose Column1 includes word1, word2 and word3.

SELECT * FROM MyTable WHERE CONTAINS(Column1, 'word1 NEAR word2 NEAR word3')

SELECT * FROM MyTable WHERE CONTAINS(Column1, 'word1 ~ word2 ~ word3')

In addition, CONTAINSTABLE returns a rank for each document based on the proximity of "word1", "word2" and "word3". For example, if a document contains the sentence, "The word1 is word2 and word3," its ranking would be high because the terms are closer to one another than in other documents.

We can also use a proximity term to find rows where the words occur within a specific distance of each other in the column.

Hawkins answered 1/5, 2018 at 16:6 Comment(1)
Great answer, but note that this won't work if the table or view is not full-text indexed. Contains() will throw an error: Cannot use a CONTAINS or FREETEXT predicate on table or indexed view 'TABLENAME' because it is not full-text indexed.Carbamidine
P
1

The best way is making a full-text index on a column in the table and use contain instead of LIKE

SELECT * FROM MyTable WHERE 
contains(Column1, N'word1')
AND contains(Column1, N'word2')
AND contains(Column1, N'word3')
Pemphigus answered 14/10, 2017 at 6:34 Comment(1)
The text says "contain". The SQL says "contains". What is correct?Equilibrate
D
0

Use "in" instead:

Select *
from table
where columnname in (word1, word2, word3)
Darvon answered 9/11, 2017 at 23:41 Comment(3)
Because it doesn't work. Have you actually tried it?Masuria
I believe this will return only exact matches.Favata
I also misunderstood the original question: they don't want to find an exact match, but a word being part of a (possibly) larger string. For the more simple "exact-matching" case, this works provided the words are between single quotes (cf. SQLfiddle)Shetler
C
0

This should ideally be done with the help of SQL Server full text search if using that.

However, if you can't get that working on your DB for some reason, here is a performance-intensive solution:

-- table to search in
CREATE TABLE dbo.myTable
    (
    myTableId int NOT NULL IDENTITY (1, 1),
    code varchar(200) NOT NULL,
    description varchar(200) NOT NULL -- this column contains the values we are going to search in
    )  ON [PRIMARY]
GO

-- function to split space separated search string into individual words
CREATE FUNCTION [dbo].[fnSplit] (@StringInput nvarchar(max),
@Delimiter nvarchar(1))
RETURNS @OutputTable TABLE (
  id nvarchar(1000)
)
AS
BEGIN
  DECLARE @String nvarchar(100);

  WHILE LEN(@StringInput) > 0
  BEGIN
    SET @String = LEFT(@StringInput, ISNULL(NULLIF(CHARINDEX(@Delimiter, @StringInput) - 1, -1),
    LEN(@StringInput)));
    SET @StringInput = SUBSTRING(@StringInput, ISNULL(NULLIF(CHARINDEX
    (
    @Delimiter, @StringInput
    ),
    0
    ), LEN
    (
    @StringInput)
    )
    + 1, LEN(@StringInput));

    INSERT INTO @OutputTable (id)
      VALUES (@String);
  END;

  RETURN;
END;
GO

-- this is the search script which can be optionally converted to a stored procedure /function


declare @search varchar(max) = 'infection upper acute genito'; -- enter your search string here
-- the searched string above should give rows containing the following
-- infection in upper side with acute genitointestinal tract
-- acute infection in upper teeth
-- acute genitointestinal pain

if (len(trim(@search)) = 0) -- if search string is empty, just return records ordered alphabetically
begin
 select 1 as Priority ,myTableid, code, Description from myTable order by Description
 return;
end

declare @splitTable Table(
wordRank int Identity(1,1), -- individual words are assigned a priority order (in order of occurrence/position)
word varchar(200)
)
declare @nonWordTable Table( -- table to trim out auxiliary verbs, prepositions etc. from the search
id varchar(200)
)

insert into @nonWordTable values
('of'),
('with'),
('at'),
('in'),
('for'),
('on'),
('by'),
('like'),
('up'),
('off'),
('near'),
('is'),
('are'),
(','),
(':'),
(';')

insert into @splitTable
select id from dbo.fnSplit(@search,' '); -- this function gives you a table with rows containing all the space separated words of the search like in this e.g., the output will be -
--  id
-------------
-- infection
-- upper
-- acute
-- genito

delete s from @splitTable s join @nonWordTable n  on s.word = n.id; -- trimming out non-words here
declare @countOfSearchStrings int = (select count(word) from @splitTable);  -- count of space separated words for search
declare @highestPriority int = POWER(@countOfSearchStrings,3);

with plainMatches as
(
select myTableid, @highestPriority as Priority from myTable where Description like @search  -- exact matches have highest priority
union
select myTableid, @highestPriority-1 as Priority from myTable where Description like  @search + '%'  -- then with something at the end
union
select myTableid, @highestPriority-2 as Priority from myTable where Description like '%' + @search -- then with something at the beginning
union
select myTableid, @highestPriority-3 as Priority from myTable where Description like '%' + @search + '%' -- then if the word falls somewhere in between
),
splitWordMatches as( -- give each searched word a rank based on its position in the searched string
                     -- and calculate its char index in the field to search
select myTable.myTableid, (@countOfSearchStrings - s.wordRank) as Priority, s.word,
wordIndex = CHARINDEX(s.word, myTable.Description)  from myTable join @splitTable s on myTable.Description like '%'+ s.word + '%'
-- and not exists(select myTableid from plainMatches p where p.myTableId = myTable.myTableId) -- need not look into myTables that have already been found in plainmatches as they are highest ranked
                                                                              -- this one takes a long time though, so commenting it, will have no impact on the result
),
matchingRowsWithAllWords as (
 select myTableid, count(myTableid) as myTableCount from splitWordMatches group by(myTableid) having count(myTableid) = @countOfSearchStrings
)
, -- trim off the CTE here if you don't care about the ordering of words to be considered for priority
wordIndexRatings as( -- reverse the char indexes retrieved above so that words occurring earlier have higher weight
                     -- and then normalize them to sequential values
select s.myTableid, Priority, word, ROW_NUMBER() over (partition by s.myTableid order by wordindex desc) as comparativeWordIndex
from splitWordMatches s join matchingRowsWithAllWords m on s.myTableId = m.myTableId
)
,
wordIndexSequenceRatings as ( -- need to do this to ensure that if the same set of words from search string is found in two rows,
                              -- their sequence in the field value is taken into account for higher priority
    select w.myTableid, w.word, (w.Priority + w.comparativeWordIndex + coalesce(sequncedPriority ,0)) as Priority
    from wordIndexRatings w left join
    (
     select w1.myTableid, w1.priority, w1.word, w1.comparativeWordIndex, count(w1.myTableid) as sequncedPriority
     from wordIndexRatings w1 join wordIndexRatings w2 on w1.myTableId = w2.myTableId and w1.Priority > w2.Priority and w1.comparativeWordIndex>w2.comparativeWordIndex
     group by w1.myTableid, w1.priority,w1.word, w1.comparativeWordIndex
    )
    sequencedPriority on w.myTableId = sequencedPriority.myTableId and w.Priority = sequencedPriority.Priority
),
prioritizedSplitWordMatches as ( -- this calculates the cumulative priority for a field value
select  w1.myTableId, sum(w1.Priority) as OverallPriority from wordIndexSequenceRatings w1 join wordIndexSequenceRatings w2 on w1.myTableId =  w2.myTableId
where w1.word <> w2.word group by w1.myTableid
),
completeSet as (
select myTableid, priority from plainMatches -- get plain matches which should be highest ranked
union
select myTableid, OverallPriority as priority from prioritizedSplitWordMatches -- get ranked split word matches (which are ordered based on word rank in search string and sequence)
),
maximizedCompleteSet as( -- set the priority of a field value = maximum priority for that field value
select myTableid, max(priority) as Priority  from completeSet group by myTableId
)
select priority, myTable.myTableid , code, Description from maximizedCompleteSet m join myTable  on m.myTableId = myTable.myTableId
order by Priority desc, Description -- order by priority desc to get highest rated items on top
--offset 0 rows fetch next 50 rows only -- optional paging

Calamite answered 13/2, 2019 at 12:17 Comment(0)
B
-1

Try to use the "Tesarus search" in a full text index in SQL Server. This is much better than using "%" in search if you have millions of records. Tesarus has a smaller amount of memory consumption than the others.

Try to search this functions :)

Brethren answered 31/3, 2017 at 2:35 Comment(1)
What is Tesarus? Not thesaurus?Equilibrate
R
-3
DECLARE @SearchStr nvarchar(100)
SET @SearchStr = ' '



CREATE TABLE #Results (ColumnName nvarchar(370), ColumnValue nvarchar(3630))

SET NOCOUNT ON

DECLARE @TableName nvarchar(256), @ColumnName nvarchar(128), @SearchStr2 nvarchar(110)
SET  @TableName = ''
SET @SearchStr2 = QUOTENAME('%' + @SearchStr + '%','''')

WHILE @TableName IS NOT NULL

BEGIN
    SET @ColumnName = ''
    SET @TableName = 
    (
        SELECT MIN(QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME))
        FROM     INFORMATION_SCHEMA.TABLES
        WHERE         TABLE_TYPE = 'BASE TABLE'
            AND    QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME) > @TableName
            AND    OBJECTPROPERTY(
                    OBJECT_ID(
                        QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME)
                         ), 'IsMSShipped'
                           ) = 0
    )

    WHILE (@TableName IS NOT NULL) AND (@ColumnName IS NOT NULL)

    BEGIN
        SET @ColumnName =
        (
            SELECT MIN(QUOTENAME(COLUMN_NAME))
            FROM     INFORMATION_SCHEMA.COLUMNS
            WHERE         TABLE_SCHEMA    = PARSENAME(@TableName, 2)
                AND    TABLE_NAME    = PARSENAME(@TableName, 1)
                AND    DATA_TYPE IN ('char', 'varchar', 'nchar', 'nvarchar', 'int', 'decimal')
                AND    QUOTENAME(COLUMN_NAME) > @ColumnName
        )

        IF @ColumnName IS NOT NULL

        BEGIN
            INSERT INTO #Results
            EXEC
            (
                'SELECT ''' + @TableName + '.' + @ColumnName + ''', LEFT(' + @ColumnName + ', 3630) FROM ' + @TableName + ' (NOLOCK) ' +
                ' WHERE ' + @ColumnName + ' LIKE ' + @SearchStr2
            )
        END
    END   
END

SELECT ColumnName, ColumnValue FROM #Results

DROP TABLE #Results
Romanticize answered 5/3, 2018 at 10:41 Comment(1)
Thank you for this code snippet, which might provide some limited, immediate help. A proper explanation would greatly improve its long-term value by showing why this is a good solution to the problem, and would make it more useful to future readers with other, similar questions. Please edit your answer to add some explanation, including the assumptions you've made.Cheder
S
-4

Use:

SELECT * FROM MyTable WHERE Column1 Like "*word*"

This will display all the records where column1 has a partial value containing word.

Shopworn answered 27/12, 2016 at 8:2 Comment(1)
% not * (Oh man the minimum char count for edits and comments is punishing me today)Canyon
B
-9
select * from table where name regexp '^word[1-3]$'

or

select * from table where name in ('word1','word2','word3')
Bonney answered 12/1, 2013 at 7:27 Comment(3)
Is "regexp" standard SQL?Equilibrate
This code seems to check if the column equals one of the three words. The question is about checking if the column contains all of the three words.Cavit
Hiya, this may well solve the problem... but it'd be good if you could edit your answer and provide a little explanation about how and why it works :) Don't forget - there are heaps of newbies on Stack overflow, and they could learn a thing or two from your expertise - what's obvious to you might not be so to them.Geary

© 2022 - 2024 — McMap. All rights reserved.