I'm trying to get good natural language search going on a website, and I'm trying to understand the advantages of Apache Solr vs. Xapian. Xapian seems easier to set up. Do both offer good natural language search? Any insight appreciated.
Xapian is more like Lucene than like Solr: it is a library that you integrate with your application, not a standalone server. If you have a C++ app, then Xapian might be a better match. If you have a Java application, Lucene is almost certainly the best choice.
If you want a search server, then compare Omega (built on Xapian) to Solr (built on Lucene). I have not used Omega or Xapian, but Solr has a few features that I have come to depend on, especially the per-field analysis chains. That is a brilliant idea, and one that I wish I had thought of when I was working on Ultraseek.
It is quite easy to extend the Solr analysis chain with your own Java class. I expect that would be more difficult in C++ with Omega/Xapian.
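To make "per-field analysis chain" concrete, here is a minimal Python sketch of the idea. It is not Solr's API or configuration format; every name in it (`FIELD_CHAINS`, `analyze`, the filter functions) is made up for illustration. Each field gets its own tokenizer plus an ordered list of token filters, so a title, a body, and a part number can all be processed differently.

```python
import re

# Hypothetical tokenizers (illustration only, not Solr classes).
def word_tokenizer(text):
    """Split text into runs of letters and digits."""
    return re.findall(r"[A-Za-z0-9]+", text)

def keyword_tokenizer(text):
    """Treat the whole value as a single token (useful for IDs)."""
    return [text]

# Hypothetical token filters: each takes a token list and returns a new one.
def lowercase(tokens):
    return [t.lower() for t in tokens]

STOPWORDS = {"a", "an", "the", "of", "and"}

def remove_stopwords(tokens):
    return [t for t in tokens if t not in STOPWORDS]

def strip_plural_s(tokens):
    """Very crude stemmer: drop a trailing 's' from longer tokens."""
    return [t[:-1] if len(t) > 3 and t.endswith("s") else t for t in tokens]

# One chain per field: (tokenizer, [filters applied in order]).
FIELD_CHAINS = {
    "title": (word_tokenizer, [lowercase]),
    "body": (word_tokenizer, [lowercase, remove_stopwords, strip_plural_s]),
    "sku": (keyword_tokenizer, []),
}

def analyze(field, text):
    """Run the text through the chain configured for this field."""
    tokenizer, filters = FIELD_CHAINS[field]
    tokens = tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens
```

With this sketch, `analyze("body", "The Engines of Search")` drops the stopwords and stems the plural, while `analyze("sku", "AB-123")` leaves the value untouched. Adding your own step is just writing one more filter function and listing it in the chain for the fields that need it, which is roughly the kind of extension the answer has in mind for Solr with a Java class.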
The two engines use different underlying relevance models: Xapian is a probabilistic engine, while Lucene is a vector space engine. I have seen both models tuned to perform well, so the relevance model alone is probably not a reason to choose one over the other.
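For a feel of how the two families differ, here is a small Python sketch that scores the same toy documents with a vector space measure (TF-IDF cosine similarity) and a probabilistic one (BM25). These are textbook forms of the formulas, not the exact implementations inside Lucene or Xapian; the function names, the idf smoothing, and the `k1`/`b` defaults are assumptions for illustration. As far as I know, Xapian's default weighting is a BM25 variant, and newer Lucene releases also default to BM25, so the gap between the two is smaller today than this answer suggests.

```python
import math
from collections import Counter

def doc_freq(docs):
    """Count how many documents each term appears in."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    return df

def tfidf_cosine(query, doc, docs):
    """Vector space score: cosine between TF-IDF vectors of query and doc."""
    n = len(docs)
    df = doc_freq(docs)

    def vector(tokens):
        # Weight = term frequency * smoothed idf; terms unseen in the corpus are skipped.
        return {t: c * math.log(1 + n / df[t])
                for t, c in Counter(tokens).items() if df[t] > 0}

    q, d = vector(query), vector(doc)
    dot = sum(w * d.get(t, 0.0) for t, w in q.items())
    norm = (math.sqrt(sum(w * w for w in q.values()))
            * math.sqrt(sum(w * w for w in d.values())))
    return dot / norm if norm else 0.0

def bm25(query, doc, docs, k1=1.2, b=0.75):
    """Probabilistic score: textbook BM25 with length normalization."""
    n = len(docs)
    df = doc_freq(docs)
    avgdl = sum(len(d) for d in docs) / n
    tf = Counter(doc)
    score = 0.0
    for term in query:
        if tf[term] == 0:
            continue
        idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
        length_norm = k1 * (1 - b + b * len(doc) / avgdl)
        score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
    return score

# Toy corpus of pre-tokenized documents (made up for this example).
DOCS = [
    ["solr", "search", "server"],
    ["xapian", "search", "library", "search"],
    ["cooking", "recipes"],
]
QUERY = ["search", "library"]

cosine_scores = [tfidf_cosine(QUERY, d, DOCS) for d in DOCS]
bm25_scores = [bm25(QUERY, d, DOCS) for d in DOCS]
```

On this tiny corpus both models rank the documents in the same order: the document matching both query terms first, the one matching only "search" second, and the unrelated one last with a score of zero. The absolute numbers differ, and on real data the tuning knobs (`k1`, `b`, field boosts, length normalization) matter more than which family the formula comes from.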
The Solr/Lucene community is large and very helpful.