I recognise this is a moot point on the web database, so this question applies to the master db...
I have a custom index set up in Sitecore 6.4.1 as follows:
<index id="search_content_US" type="Sitecore.Search.Index, Sitecore.Kernel">
  <param desc="name">$(id)</param>
  <param desc="folder">_search_content_US</param>
  <Analyzer ref="search/analyzer" />
  <locations hint="list:AddCrawler">
    <search_content_home type="Sitecore.Search.Crawlers.DatabaseCrawler, Sitecore.Kernel">
      <Database>master</Database>
      <Root>/sitecore/content/usa home</Root>
      <Tags>home content</Tags>
    </search_content_home>
  </locations>
</index>
I query the index like this (I am using techphoria414's SortableIndexSearchContext from this answer: How to sort/filter using the new Sitecore.Search API):
private SearchHits GetSearchResults(SortableIndexSearchContext searchContext, string searchTerm)
{
    CombinedQuery query = new CombinedQuery();
    query.Add(new FullTextQuery(searchTerm), QueryOccurance.Must);
    return searchContext.Search(query, Sort.RELEVANCE);
}
...
SearchHits hits = GetSearchResults(searchContext, searchTerm);
hits is a collection of search hits from my index. When I iterate through hits I can see that there are many duplicates of the same items in Sitecore, 1 per version of the item.
I then do the following to get a SearchResultCollection:
SearchResultCollection results = hits.FetchResults(0, hits.Length);
This combines all of the duplicates into a single SearchResult object. This object represents 1 version of a particular item, and has a property called SubResults which is a collection of SearchResults that represent all of the other item versions.
Here's my problem:
The version of the item represented by the SearchResult is NOT the current published version of the item! It appears to be a randomly selected version (whichever the search method hit first in the index). The latest version is included in the SubResults collection, however.
E.g.:
SearchResult
|
|- Version 8 // main result
...
|- SubResults
|
|- Version 9 // latest version
|- Version 3
|- Version 5
... // all versions in random order
How do I prevent this from happening on the master db? Either by preventing Lucene from indexing old versions of items, or by doing some manipulation of the result set to get the latest version from the SubResults?
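To make the second option concrete, here is a minimal sketch of the kind of result-set manipulation I have in mind. It deliberately uses hypothetical stand-in types (VersionedResult and LatestVersionPicker are made up for illustration and are not the Sitecore.Search classes), and it assumes each result can expose its version number as an integer. How that number would be read from a real SearchResult is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a search result. This is NOT Sitecore.Search.SearchResult;
// it only models "one version of an item, plus the other versions as sub-results".
public sealed class VersionedResult
{
    public string ItemId { get; private set; }
    public int Version { get; private set; }
    public IList<VersionedResult> SubResults { get; private set; }

    public VersionedResult(string itemId, int version, IEnumerable<VersionedResult> subResults = null)
    {
        ItemId = itemId;
        Version = version;
        SubResults = (subResults ?? Enumerable.Empty<VersionedResult>()).ToList();
    }
}

// Hypothetical helper: picks the highest-numbered version from a result and its sub-results.
public static class LatestVersionPicker
{
    public static VersionedResult PickLatest(VersionedResult main)
    {
        if (main == null) throw new ArgumentNullException("main");

        // Consider the main result together with its sub-results,
        // because the main result may itself be the latest version.
        return main.SubResults
            .Concat(new[] { main })
            .OrderByDescending(r => r.Version)
            .First();
    }
}
```

With the example tree above (main result at version 8, sub-results at versions 9, 3 and 5), PickLatest would return the version 9 entry. This only post-processes the results; it does nothing about the old versions still sitting in the index.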
As an aside, why does Lucene bother to index old versions of items anyway? Surely this is pointless for searching content on your website as the old versions are not visible?