Autocomplete by most frequent words - Postgres or Lucene?
We're using Postgres and its full-text search feature to search documents (post content) in our system, and it works really well.

For autocomplete, we want to build an index (a dictionary?) of all the words used in our documents and suggest the most frequent ones that match what the user has typed. We will always search for a single word, never for a phrase.

So if I write:

  • "th"

I should receive (assuming these are the most frequent matching words in our documents):

  • "this"
  • "there"
  • "thoughts"
  • ...
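
To make the expected behaviour concrete, here is a minimal Python sketch of what I have in mind. It is only an illustration of the requirement, not a Postgres solution: the function names build_word_counts and suggest are made up, the tokenizer is a naive regex, and stop-word filtering is left out entirely.

```python
import re
from collections import Counter


def build_word_counts(documents):
    """Count how often each lowercased word occurs across all documents."""
    counts = Counter()
    for text in documents:
        # Naive tokenizer (an assumption): runs of ASCII letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def suggest(counts, prefix, limit=3):
    """Return up to `limit` words starting with `prefix`, most frequent first."""
    prefix = prefix.lower()
    matches = [(word, n) for word, n in counts.items() if word.startswith(prefix)]
    # Sort by descending frequency, then alphabetically to break ties.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [word for word, _ in matches[:limit]]


# Hypothetical sample data, just to show the shape of the result.
sample_docs = [
    "this and there",
    "this thoughts",
    "this there thoughts thing",
]
sample_counts = build_word_counts(sample_docs)
print(suggest(sample_counts, "th"))  # ['this', 'there', 'thoughts']
```

A linear scan like this obviously would not scale to a real corpus; the point is only that the words stay unnormalized and are ranked by how often they occur.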

How can this be done with Postgres? Or do we need a more advanced solution such as Apache Lucene / Solr?

Neither Postgres full-text search (which provides lexemes) nor Postgres trigrams seem suitable for this job. Or am I wrong?

I don't want to parse the text manually and filter out all English stop words myself, which would be error-prone. Postgres does a good job of this when building the lexeme index. But instead of lexemes, we need to build and search a dictionary of words without normalization.

Thank you for your assistance

Starofbethlehem answered 26/11, 2016 at 22:36
