If you use the method onlyWholeWords(), it should return no results for your example above.
For example:
Trie trie = Trie.builder()
.onlyWholeWords()
.addKeyword("He")
.build();
Collection<Emit> emits = trie.parseText("Hello World");
emits in this case will be empty.
It will only return matches where "He" stands as a whole word.
Beware of characters outside [a-zA-Z], though. For example, if you parse:
"He//Is"
it would pick up the "He" and ignore the "//".
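To make the caveat concrete, here is a minimal stand-alone sketch of the kind of boundary test that would produce this behaviour. It does not call the library; the class and method names are hypothetical, and it assumes that a word boundary is either the edge of the text or any non-alphabetic character.

```java
// Hypothetical helper, not part of the Aho-Corasick library.
final class WholeWordBoundary {

    // Assumption: a boundary is the text edge or any non-alphabetic character.
    static boolean isBoundary(String text, int index) {
        return index < 0
                || index >= text.length()
                || !Character.isAlphabetic(text.charAt(index));
    }

    // start and end are the inclusive indices of the candidate match.
    static boolean isWholeWord(String text, int start, int end) {
        return isBoundary(text, start - 1) && isBoundary(text, end + 1);
    }

    public static void main(String[] args) {
        // "He" in "Hello World" is followed by 'l', so it is a partial match.
        System.out.println(isWholeWord("Hello World", 0, 1)); // false
        // "He" in "He//Is" is followed by '/', which counts as a boundary here.
        System.out.println(isWholeWord("He//Is", 0, 1));      // true
    }
}
```

Under this assumption, punctuation and digits count as boundaries too, which is why "He//Is" still yields a match.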
Two things to add:
If you want to assert word boundaries on whitespace, you can use:
onlyWholeWordsWhiteSpaceSeparated()
instead of
onlyWholeWords()
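As a rough illustration of what the whitespace-separated variant changes, the sketch below swaps the boundary rule so that only the text edge or a whitespace character counts. It is stand-alone Java with hypothetical names, written under the assumption that this is how the stricter mode decides; it does not call the library.

```java
// Hypothetical helper, not part of the Aho-Corasick library.
final class WhitespaceBoundary {

    // Assumption: a boundary is the text edge or a whitespace character.
    static boolean isBoundary(String text, int index) {
        return index < 0
                || index >= text.length()
                || Character.isWhitespace(text.charAt(index));
    }

    // start and end are the inclusive indices of the candidate match.
    static boolean isWhitespaceSeparated(String text, int start, int end) {
        return isBoundary(text, start - 1) && isBoundary(text, end + 1);
    }

    public static void main(String[] args) {
        // '/' is not whitespace, so "He" in "He//Is" is rejected.
        System.out.println(isWhitespaceSeparated("He//Is", 0, 1)); // false
        // A space follows "He", so this one is accepted.
        System.out.println(isWhitespaceSeparated("He is", 0, 1));  // true
    }
}
```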
If you want to "white-list" some characters, this excerpt might be helpful:
The word characters used are the default ones modified by the ones
provided and boolean flags signal where characters are turned on and
off. This is useful when you just want to turn off a specific
character in the set of default characters. For example:
new WholeWordMatchSet(keywords, true, ['_', '='], [false, true])
Will produce a set where letters and digits and - and = are considered
word characters, but not _.
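The toggling idea in the quoted passage can be sketched in plain Java as follows. This is not the WholeWordMatchSet API; the class WordChars is hypothetical, and the default set (letters, digits, '-' and '_') is an assumption inferred from the result described above.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of "defaults modified by flags".
final class WordChars {
    private final Set<Character> extras = new HashSet<>();

    WordChars(char[] chars, boolean[] flags) {
        // Assumed defaults besides letters and digits: '-' and '_'.
        extras.add('-');
        extras.add('_');
        // Each flag turns the character at the same position on (true) or off (false).
        for (int i = 0; i < chars.length; i++) {
            if (flags[i]) {
                extras.add(chars[i]);
            } else {
                extras.remove(chars[i]);
            }
        }
    }

    boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || extras.contains(c);
    }

    public static void main(String[] args) {
        WordChars wc = new WordChars(new char[] {'_', '='}, new boolean[] {false, true});
        System.out.println(wc.isWordChar('_')); // false: turned off
        System.out.println(wc.isWordChar('=')); // true: turned on
        System.out.println(wc.isWordChar('-')); // true: untouched default
    }
}
```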