Azure Search scoring

I have sets of three items in Azure Search that are identical in text and differ only in Price and Points. A scoring profile boosts cheaper products with more points: Price is boosted more than Points, and inversely (the lower the price, the higher the boost).

However, I keep seeing search results similar to this.

Search is on ‘john milton’.

I get

Product="Id = 2-462109171829-1, Price=116.57, Points=  7, Name=Life of Schamyl / John Milton Mackie, Description=.", Score=32.499783
Product="Id = 2-462109171829-2, Price=116.40, Points=  9, Name=Life of Schamyl / John Milton Mackie, Description=.", Score=32.454872
Product="Id = 2-462109171829-3, Price=115.64, Points=  9, Name=Life of Schamyl / John Milton Mackie, Description=.", Score=32.316270

I expect the scoring order to be something like this, with the lowest price first.

Product="Id = 2-462109171829-3, Price=115.64, Points=  9, Name=Life of Schamyl / John Milton Mackie, Description=.", Score=
Product="Id = 2-462109171829-2, Price=116.40, Points=  9, Name=Life of Schamyl / John Milton Mackie, Description=.", Score=
Product="Id = 2-462109171829-1, Price=116.57, Points=  7, Name=Life of Schamyl / John Milton Mackie, Description=.", Score=

What am I missing, or are minor scoring variations like this to be expected?

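The index is defined as:

// Namespaces the definition relies on (Microsoft.Azure.Search SDK)
open System.Collections.Generic
open Microsoft.Azure.Search.Models

let ProductDataIndex = 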

        let fields = 
                    [|
                        new Field (
                            "id", 
                            DataType.String,
                            IsKey           = true, 
                            IsSearchable    = true);


                        new Field (
                            "culture", 
                            DataType.String,
                            IsSearchable    = true);

                        new Field (
                            "gran", 
                            DataType.String,
                            IsSearchable    = true);

                        new Field (
                            "name", 
                            DataType.String,
                            IsSearchable    = true);

                        new Field (
                            "description", 
                            DataType.String, 
                            IsSearchable    = true);

                        new Field (
                            "price", 
                            DataType.Double, 
                            IsSortable      = true,
                            IsFilterable    = true);

                        new Field (
                            "points", 
                            DataType.Int32, 
                            IsSortable      = true,
                            IsFilterable    = true)
                    |]

        let weightsText = 
            new TextWeights(
                Weights =   ([|  
                                ("name",        4.); 
                                ("description", 2.) 
                            |]
                            |> dict))

        let priceBoost = 
            new MagnitudeScoringFunction(
                new MagnitudeScoringParameters(
                    BoostingRangeStart  = 1000.0,
                    BoostingRangeEnd    = 0.0,
                    ShouldBoostBeyondRangeByConstant = true),
                "price",
                10.0)

        let pointsBoost = 
            new MagnitudeScoringFunction(
                new MagnitudeScoringParameters(
                    BoostingRangeStart  = 0.0,
                    BoostingRangeEnd   = 10000000.0,
                    ShouldBoostBeyondRangeByConstant = true),
                "points",
                2.0)

        let scoringProfileMain = 
            new ScoringProfile (
                            "main", 
                            TextWeights =
                                weightsText,
                            Functions = 
                                new List<ScoringFunction>(
                                        [
                                            priceBoost      :> ScoringFunction
                                            pointsBoost     :> ScoringFunction
                                        ]),
                            FunctionAggregation = 
                                ScoringFunctionAggregation.Sum)

        new Index 
            (Name               =   ProductIndexName
            ,Fields             =   fields 
            ,ScoringProfiles    =   new List<ScoringProfile>(
                                        [
                                            scoringProfileMain
                                        ]))
Torture answered 23/4, 2015 at 5:1 Comment(5)
Hi Hocho, quick clarifying question: how many documents are in your index? Scoring in indexes with a low document count may be a little off. This is a result of how they are internally organized to enable efficient scale-ups and scale-downs of your distributed service. - Abyss
30+ million documents. I am doing some proof-of-concept testing, so each document is replicated 3 times with all fields identical except for the identifying field and the Price and Points fields, which are randomly generated within 10% of each other respectively. - Torture
Thanks! Do you see the same behavior when you issue a query that's less selective? For example: "John" (assuming you have more than one John in your dataset :)) - Abyss
Yes, I see the same behavior on all queries. Most results show up in the expected order, but about 5 to 10% show up in the unexpected order. - Torture
Thanks. I'll need more information to answer this. I'll follow up over email and then summarize my findings here once we find the root cause. - Abyss

All indexes in Azure Search are split into multiple shards, which lets us scale the service up and down quickly. When a search request is issued, it is run against each shard independently. The result sets from the shards are then merged and ordered by score (if no other ordering is defined). It is important to know that the scoring function weighs each query term's frequency in a document against that term's frequency across all documents in the same shard, not across the whole index.

This means that in your scenario, with three copies of every document, if one copy lands on a different shard than the other two, its score will differ slightly, even with scoring profiles disabled. The more data in your index, the smaller the differences become, because terms are distributed more evenly across shards. You cannot predict which shard a given document will be placed on.
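To make the shard effect concrete, here is a minimal F# sketch. It uses a simplified TF-IDF formula as an assumption, not the formula Azure Search actually applies, and the shard sizes and document frequencies are invented for illustration. It scores the same text on two shards whose term statistics differ slightly.

```fsharp
// Simplified TF-IDF sketch; NOT the actual Azure Search scoring formula.
// Shard sizes and document frequencies are made-up illustration values.

/// Per-shard statistics for one query term (hypothetical numbers).
type ShardStats = { Name : string; DocCount : int; DocsWithTerm : int }

/// Inverse document frequency, computed from one shard's statistics only.
let idf (shard : ShardStats) =
    1.0 + log (float shard.DocCount / float (shard.DocsWithTerm + 1))

/// Score of a document in which the term occurs `termFreq` times.
let score (shard : ShardStats) (termFreq : int) =
    sqrt (float termFreq) * idf shard

let shardA = { Name = "A"; DocCount = 1000000; DocsWithTerm = 120 }
let shardB = { Name = "B"; DocCount = 1000000; DocsWithTerm = 135 }

// The same text (term occurs once) scored on two different shards.
for shard in [ shardA; shardB ] do
    printfn "shard %s: score = %.4f" shard.Name (score shard 1)
```

The copy on shard A scores slightly higher because the term is rarer there. When the prices of the copies are very close, as in your test data, a text-score gap of this kind can plausibly outweigh the small difference in price boost.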

In general, the document score is not the best attribute for ordering documents. It should only give you a general sense of a document's relevance relative to the other documents in the result set. In your scenario, you can order the results by price and/or points directly, provided those fields are marked as sortable (IsSortable = true in the index definition). You can find more information on how to use the $orderby query parameter here: https://msdn.microsoft.com/en-us/library/azure/dn798927.aspx
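As a rough sketch of what such a request could look like, the F# snippet below builds a Search Documents URL with $orderby. The service name and index name are hypothetical placeholders, and the api-version shown is an assumption you should adjust to the one you target. The request also needs an api-key header, which is not shown here.

```fsharp
open System

// Hypothetical service and index names; substitute your own.
let serviceName = "my-service"
let indexName   = "products"

/// Builds a Search Documents URL that sorts by field values instead of score.
let searchUrl (searchText : string) (orderBy : string list) =
    let query =
        [ ("search",      searchText)
          ("$orderby",    String.concat "," orderBy)
          ("api-version", "2015-02-28") ]   // assumed API version
        |> List.map (fun (k, v) -> k + "=" + Uri.EscapeDataString(v))
        |> String.concat "&"
    sprintf "https://%s.search.windows.net/indexes/%s/docs?%s" serviceName indexName query

// Cheapest first; ties broken by highest points.
searchUrl "john milton" [ "price asc"; "points desc" ] |> printfn "%s"
```

With an explicit sort like this, the order of the three copies no longer depends on which shard each one landed on.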

Abyss answered 28/4, 2015 at 1:15 Comment(3)
Just one question here: if I order on score, it should not vary from one call to the next if my search criteria are the same and the data in the index does not change. But for me, using pagination, moving from one page to the next and coming back to the first one, I see different scores. How so? - Watts
Do you see a different score for the same item, or a different order of items with the same score? Take a look at this question if it's the latter: #43593276 - Abyss
This is a hard requirement for us, so we can't use Azure Search. Does anyone know if other online search providers have the same limitation? Thanks - Bashuk
