Indexing the partition key in Azure Cosmos DB
Asked Answered
J

2

11

Suppose I've the following data in my container:

{
    "id": "1DBF704E-1623-4844-DC86-EFA729A5C048",
    "firstName": "Wylie",
    "lastName": "Ramsey",
    "country": "AZ",
    "city": "Tucson"
}

Where I use the field "id" as the item id, and the field "country" as the partition key, when I query on specific partition key:

SELECT * FROM c WHERE c.country = "AZ"

(get all the people in "AZ")

Should I add "country" as an index or I will get it by default, since I declared "country" as my partition key? Is there a difference when using the SDK (meaning: adding the new PartitionKey("AZ") option and then sending the query as mentioned above)?

Joggle answered 9/5, 2021 at 13:42 Comment(8)
By default, Azure Cosmos DB automatically indexes every property for all items in your container without having to define any schema or configure secondary indexes. See thisFourfold
correct, but I don't want to index all my properties...Joggle
And on the query statement, you can refer to azure official sample, and you'll find it really no need. C# sapmle, nodejs sampleFourfold
don't worry about the index, but if you wanna custom index policy, refer to this doc may help.Fourfold
The Indexing policies doc doesn't mention anything about indexing the partition keyJoggle
Using partition key as the filtering key is a good practice. If most of your workload's requests are queries and most of your queries have an equality filter on the same property, this property can be a good partition key choice.Fourfold
If you'd like to set custom index policy, you should set index according to your querying requirement, index is used to improve query performance, right? So I think it's necessary to set index on partition key in your scenario :)Fourfold
Not sure why this question is downvoted, it's a perfectly valid question. Also, for the replies above, I suppose the question is not whether partition key should have an index or not, it should definitely have an index. The question here is whether, it should be explicitly declared in the indexing policy json or it would automatically exist, even if /* is added to the excludedPaths. Correct me, if I am wrong @JoggleHouseboat
C
19

I created a collection with 50,000 records and disabled indexing on all properties.

Indexing policy:

{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [], // Included nothing
    "excludedPaths": [
        {
            "path": "/\"_etag\"/?"
        },
        {
            "path": "/*" // Exclude all paths
        }
    ]
}

Querying by id cost 2.85 RUs. Querying by PartitionKey cost 580 RUs.

Indexing policy with PartitionKey (country) added:

{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/country/*"
        }
    ],
    "excludedPaths": [
        {
            "path": "/\"_etag\"/?"
        },
        {
            "path": "/*" // Exclude all paths
        }
    ]
}

Adding an index on the PartitionKey brought it down to 2.83 RUs.

So the answer to this is Yes, if you have disabled default indexing policies and need to search by partition key then you should add an index to it.

Choanocyte answered 5/8, 2021 at 22:23 Comment(1)
Documentation is not super clear about it but seems to recommend indexing the PK learn.microsoft.com/en-us/azure/cosmos-db/…: "Exclude the root path to selectively include paths that need to be indexed. The partition key property path isn't indexed by default with the exclude strategy and should be explicitly included if needed."Hartwig
F
-1

In my opinion, it's a good practice to query with partition key in cosmosdb sql api, here's the offical doc related to it.

By the way, cosmosdb sql api indexes all the properties by default. If you'd like to cover the default setting and customise the indexing policy, this doc may provide more details.

Fourfold answered 9/5, 2021 at 14:44 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.