Why doesn't routing work with the Elasticsearch Bulk API?
C

3

8

I am sending a bulk request to Elasticsearch and specifying the routing value that determines the target shard.

But when I run it, the documents end up on different shards.

Is this a bug in the Elasticsearch bulk API? Routing works when I index a single document, and it works when I search, but not when I do a bulk import.

To reproduce:

curl -XPOST 'http://192.168.1.115:9200/_bulk?routing=a' -d '
{ "index" : { "_index" : "articles", "_type" : "article", "_id" : "1" } }
{ "title" : "value1" }
{ "delete" : { "_index" : "articles", "_type" : "article", "_id" : "2" } }
{ "create" : { "_index" : "articles", "_type" : "article", "_id" : "3" } }
{ "title" : "value3" }
{ "update" : {"_id" : "1", "_type" : "article", "_index" : "index1"} }
{ "doc" : {"field2" : "value2"} }'
Clavus answered 2/11, 2013 at 18:47 Comment(0)
C
16

So adding the "routing" parameter to the end of the URL doesn't work.

I need to add the "_routing" field to the action metadata line of each item (the line that carries "_index", "_type" and "_id") to specify which shard it will go to.

Very unintuitive, and I wish ElasticSearch would've documented this! Sometimes I wish I just chose Solr :*(

Hope this helps anyone else looking for this in the future

curl -XPOST 'http://192.168.1.115:9200/_bulk?routing=a' -d '
{ "index" : { "_index" : "articles", "_type" : "article", "_id" : "1", "_routing" : "b"} }
{ "title" : "value1" }
{ "delete" : { "_index" : "articles", "_type" : "article", "_id" : "2", "_routing" : "b" } }
{ "create" : { "_index" : "articles", "_type" : "article", "_id" : "3", "_routing" : "b" } }
{ "title" : "value3" }
{ "update" : {"_id" : "1", "_type" : "article", "_index" : "index1", "_routing" : "b"} }
{ "doc" : {"field2" : "value2"} }'
Clavus answered 2/11, 2013 at 19:2 Comment(3)
Just a heads up: the issue you reported was fixed in no time and the release that got out today contains the fix. Thanks for reporting it, you may want to update your answer. We are also working hard on the documentation, just so you know. - Quarrier
Thanks javanna. But now I can't specify the _routing field for each individual document when bulk importing??? - Clavus
If memory serves that wasn't the idea, the one in the url is the default one, used as fallback when the per-item one is not specified. - Quarrier
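
The fallback rule described in the last comment can be sketched in a few lines of TypeScript. This is only a model of the behaviour as the comment describes it, not Elasticsearch's implementation; `ActionMeta` and `effectiveRouting` are hypothetical names made up for the illustration.

```typescript
// Hypothetical model of the fallback rule from the comment above;
// this is not code from Elasticsearch itself.
interface ActionMeta {
  _index?: string;
  _id?: string;
  routing?: string; // per-item routing, if the item specifies one
}

// Returns the routing value that would apply to one bulk item:
// the per-item value when present, otherwise the URL-level default.
function effectiveRouting(
  urlRouting: string | undefined,
  meta: ActionMeta,
): string | undefined {
  return meta.routing ?? urlRouting;
}

// Item without its own routing falls back to the URL default.
const fromUrl = effectiveRouting("a", { _index: "articles", _id: "1" });
// Item with its own routing overrides the URL default.
const fromItem = effectiveRouting("a", { _index: "articles", _id: "3", routing: "b" });
console.log(fromUrl, fromItem);
```

Under this reading, `?routing=a` in the URL only matters for items whose metadata carries no routing value of their own.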
Y
11

@Henley Chiu has given the correct answer. I'll add one detail:

  • before ES 6.1, you can use either the _routing or the routing field for each individual document in a bulk request
  • from ES 6.1 onward (inclusive), you can only use routing

So it is better to use routing for forward compatibility.
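
If you build the bulk body yourself, one way to stay on the newer key is to normalize the metadata before serializing. The following is a minimal sketch, assuming you assemble the newline-delimited body as a string; `BulkItem`, `normalizeMeta` and `toBulkBody` are hypothetical helpers, not part of any Elasticsearch client.

```typescript
type ActionName = "index" | "create" | "update" | "delete";

// Hypothetical shape for one bulk item (not a client library type).
interface BulkItem {
  action: ActionName;
  meta: Record<string, unknown>;
  source?: Record<string, unknown>; // omitted for delete actions
}

// Rename a legacy `_routing` key to `routing`.
// An explicit `routing` key, if present, is kept as-is.
function normalizeMeta(meta: Record<string, unknown>): Record<string, unknown> {
  const { _routing, ...rest } = meta;
  if (_routing !== undefined && rest["routing"] === undefined) {
    rest["routing"] = _routing;
  }
  return rest;
}

// Serialize items into the newline-delimited JSON body that _bulk expects.
// The body has to end with a trailing newline.
function toBulkBody(items: BulkItem[]): string {
  const lines: string[] = [];
  for (const item of items) {
    lines.push(JSON.stringify({ [item.action]: normalizeMeta(item.meta) }));
    if (item.source !== undefined) {
      lines.push(JSON.stringify(item.source));
    }
  }
  return lines.join("\n") + "\n";
}

const ndjson = toBulkBody([
  { action: "index", meta: { _index: "articles", _id: "1", _routing: "b" }, source: { title: "value1" } },
  { action: "delete", meta: { _index: "articles", _id: "2", routing: "b" } },
]);
console.log(ndjson);
```

The resulting string can be sent as the request body of a `_bulk` call; every action line then carries `routing` rather than `_routing`.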

Yolanthe answered 9/10, 2019 at 2:9 Comment(0)
S
-1

With the Node.js client, put routing in the metadata of each action:

// Each user contributes two entries: the action metadata, then the document source.
const body = users.flatMap((doc: UserDoc) => [
  {
    // routing belongs in the action metadata, not in the document source
    index: { _id: new mongoose.Types.ObjectId(), _index: ElasticIndex.UserData001, routing: activity.sk },
  },
  {
    user: doc.id,
    journey: activity.id,
    relation_type: {
      name: ElasticRelationType.CustomUser,
      parent: activity.id,
    },
  },
]);

const res = await elasticWrapper.client.bulk({ refresh: true, body });
Sleepless answered 22/12, 2023 at 6:22 Comment(0)
