An execution plan for the example query reveals the root cause of the problem:
[nt:base] as [s] /* lucene:lucene(/oak:index/lucene) +:fulltext:my +:fulltext:search +:fulltext:expression ft:("my/search-expression") where contains([s].[*], 'my/search-expression') */
The CONTAINS
operator triggers a full text search. Non-word characters, like "/" or "-", are used as word delimiters. As a result, the query looks for all nodes that contain the words: "my", "search" and "expression".
What can be done with it? There are several options.
1. Use double quotes
If you want to limit results to phrases with given words in exact order and without any other words between them, put the search expression inside double quotes:
SELECT * FROM [nt:base] AS s WHERE CONTAINS(s.*, '"my/search-expression"')
Now, the execution plan is different:
[nt:base] as [s] /* lucene:lucene(/oak:index/lucene) :fulltext:"my search expression" ft:("my/search-expression") where contains([s].[*], '"my/search-expression"') */
The query will now look for the whole phrase, not single words. However, it still ignores non-word characters, so such phrases would also be found: "my search expression" or "my-search-expression".
2. Use LIKE expression (not recommended)
If you want to find only the exact phrase, keeping non-word characters, you can use the LIKE
expression:
SELECT * FROM [nt:base] AS s WHERE s.* LIKE '%my/search-expression%'
This is, however, much slower. I needed to add another condition to avoid timeout during explaining the execution plan. For this query:
SELECT * FROM [nt:base] AS s WHERE s.* LIKE '%my/search-expression%' AND ISDESCENDANTNODE([/content/my/content])
the execution plan is:
[nt:base] as [s] /* traverse "/content/my/content//*" where ([s].[*] like '%my/search-expression%') and (isdescendantnode([s], [/content/my/content])) */
It would find only nodes with this phrase: "my/search-expression".
3. Use double quotes and refine the results
It would be probably better to use the first approach (CONTAINS
with double quotes) and refine the results later, for example in application code if the query is run from an application.
4. Mix CONTAINS and LIKE
Another option is to mix full-text search and LIKE
expression with AND
:
SELECT * FROM [nt:base] AS s WHERE CONTAINS(s.*, '"my/search-expression"') AND s.* LIKE '%my/search-expression%'
The execution plan is now:
[nt:base] as [s] /* lucene:lucene(/oak:index/lucene) :fulltext:"my search expression" ft:("my/search-expression") where (contains([s].[*], '"my/search-expression"')) and ([s].[*] like '%my/search-expression%') */
Now, it should be fast and strict in the same time.