I was browsing the web looking for an indexing and search framework and stumbled upon Solr. One feature we absolutely need is the ability to boost results based on which field contained the hit.
As a small example, consider a record like this:
<movie>
<title>The Dark Knight</title>
<alternative_title>Batman Begins 2</alternative_title>
<year>2008</year>
<director>Christopher Nolan</director>
<plot>Batman, Gordon and Harvey Dent are forced to deal with the chaos unleashed by an anarchist mastermind known only as the Joker, as it drives each of them to their limits.</plot>
</movie>
I want to combine, for example, the title, alternative_title and plot fields into one search field, which isn't too difficult after looking at the Solr/Lucene documentation and tutorials.
However, I also want movies with a hit in title to score higher than movies with a hit in alternative_title, which in turn should score higher than movies with a hit only in the plot field.
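To make the intended ordering concrete, here is a toy Python sketch of the behaviour I'm after. It is not Solr code and says nothing about how Solr computes scores; the weights and the toy_score helper are made up purely for illustration, and the record is the example movie above.

```python
# Hypothetical per-field weights; the numbers are illustrative only.
FIELD_WEIGHTS = {"title": 3.0, "alternative_title": 2.0, "plot": 1.0}


def toy_score(record, term):
    """Return the weight of the highest-weighted field containing the term."""
    term = term.lower()
    return max(
        (
            weight
            for field, weight in FIELD_WEIGHTS.items()
            if term in record.get(field, "").lower()
        ),
        default=0.0,  # no field matched
    )


movie = {
    "title": "The Dark Knight",
    "alternative_title": "Batman Begins 2",
    "plot": (
        "Batman, Gordon and Harvey Dent are forced to deal with the chaos "
        "unleashed by an anarchist mastermind known only as the Joker, as it "
        "drives each of them to their limits."
    ),
}

for term in ("knight", "begins", "joker"):
    print(term, toy_score(movie, term))
```

A search for "knight" hits the title, "begins" only hits the alternative title, and "joker" only hits the plot, so the three scores come out in descending order. That is the ranking I would like to get from the search engine.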
Is there any way to indicate this kind of scoring in the XML or do we need to develop some custom scoring algorithm?
Please also note that the example I've given is fictional and the real data will probably contain 100+ fields.