Solr AutoScaling - Add replicas on new nodes

Using Solr version 7.3.1
Starting with 3 nodes:

I have created a collection like this:

wget "localhost:8983/solr/admin/collections?action=CREATE&autoAddReplicas=true&collection.configName=my_col_config&maxShardsPerNode=1&name=my_col&numShards=1&replicationFactor=3&router.name=compositeId&wt=json" -O /dev/null

In this way I have a replica on each node.

GOAL:

  • Each shard should add a replica to new nodes joining the cluster.
  • When a node is shut down, it should just go away.
  • Only one replica for each shard on each node.

I know that it should be possible with the new Autoscaling API, but I am having a hard time finding the right syntax. The API is very new and all I can find is the documentation. It's not bad, but I am missing some more examples.

This is how it looks today. There are many small shards, each with a replication factor that matches the number of nodes. Right now there are 3 nodes.

This video was uploaded yesterday (2018-06-13), and about 30 minutes in there is an example of the Solr.HttpTriggerListener, which can be used to call any kind of service, for example an AWS Lambda that adds new nodes.
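As a rough sketch of what registering such a listener might look like: the snippet below only builds the JSON body of a `set-listener` command. It assumes the listener class is referenced as `solr.HttpTriggerListener` in the configuration, that a trigger named `node_added_trigger` already exists, and that the webhook URL (a placeholder here) points at whatever service should be notified. The listener name and URL are made up for illustration.

```python
import json


def build_http_listener(name, trigger, url, stages=("SUCCEEDED",)):
    """Build the body of a set-listener command for an HTTP trigger listener.

    The property names (name, trigger, stage, class, url) follow the
    autoscaling listener docs as I understand them; treat them as assumptions.
    """
    return {
        "set-listener": {
            "name": name,
            "trigger": trigger,
            "stage": list(stages),
            "class": "solr.HttpTriggerListener",
            "url": url,
        }
    }


# Hypothetical listener name and placeholder URL.
listener_command = build_http_listener(
    name="notify_scaling_hook",
    trigger="node_added_trigger",
    url="https://example.invalid/scale-hook",
)
print(json.dumps(listener_command, indent=2))
```

The resulting JSON would then be POSTed to the autoscaling write API of the cluster.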


Hosbein answered 13/6, 2018 at 13:55 Comment(5)
Did you get anywhere with this? I've also seen lots of documentation but no one actually using autoscaling in anger. I'd be interested to know what you have done since you posted this question. - Neck
I promise to write an answer in a couple of weeks, I am close (-: - Hosbein
Ooh nice, happy to help test / work on things if you want. - Neck
@MartinAndersen are you a step further? - Decennial
Sorry, no progress. I am waiting for version 7.6, which should have events for node added. - Hosbein

The short answer is that your goals are not achievable today (as of Solr 7.4).

The NodeAddedTrigger only moves replicas from other nodes to the new node in an attempt to balance the cluster. It does not support adding new replicas. I have opened SOLR-12715 to add this feature.

Similarly, the NodeLostTrigger adds new replicas on other nodes to replace the ones on the lost node. It, too, has no support for merely deleting replicas from cluster state. I have opened SOLR-12716 to address that issue. I hope to release both enhancements in Solr 7.5.
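For reference, here is a minimal sketch of how the two triggers discussed above could be registered with their current default behaviour (move replicas on node added, replace replicas on node lost). The trigger names, the `waitFor` values and the endpoint path `/api/cluster/autoscaling` are assumptions on my part, so check them against the reference guide for your version. `post_command` is defined but not called, so running the snippet does not contact a cluster.

```python
import json
import urllib.request

# Assumed v2 endpoint of the autoscaling write API on a local node.
AUTOSCALING_URL = "http://localhost:8983/api/cluster/autoscaling"


def build_trigger(name, event, wait_for="30s"):
    """Build a set-trigger command for a nodeAdded or nodeLost event."""
    return {
        "set-trigger": {
            "name": name,
            "event": event,
            "waitFor": wait_for,
            "enabled": True,
        }
    }


def post_command(command, url=AUTOSCALING_URL):
    """POST one autoscaling command as JSON and return the parsed response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(command).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical trigger names; the event types are the ones named above.
trigger_commands = [
    build_trigger("node_added_trigger", "nodeAdded", wait_for="5s"),
    build_trigger("node_lost_trigger", "nodeLost", wait_for="120s"),
]
for command in trigger_commands:
    print(json.dumps(command))
```

Note that, as described above, neither trigger adds replicas on new nodes or simply drops replicas of lost nodes in 7.4, so this only sets up the existing behaviour.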

As for the third goal:

Only one replica for each shard on each node.

To achieve this, the policy rule given in the "Limit Replica Placement" example should suffice. However, looking at the screenshot you've posted, you actually mean one replica per (collection, shard) pair on each node, which is unsupported today. You'd need a policy rule like the following (note that this rule does not work, because collection:#EACH is not supported):

{"replica": "<2", "collection": "#EACH", "shard": "#EACH", "node": "#ANY"}

I have opened SOLR-12717 to add this feature.
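To make the intended semantics concrete, here is a small sketch. It builds a `set-cluster-policy` command with the supported per-shard rule from the "Limit Replica Placement" example (as I recall it from the docs), and separately checks a list of placements for the constraint the unsupported rule is meant to express: at most one replica per (collection, shard, node). The placement data and the `find_violations` helper are made up for illustration and are not part of Solr.

```python
import json
from collections import Counter

# Supported today: at most one replica of each shard per node.
SUPPORTED_RULE = {"replica": "<2", "shard": "#EACH", "node": "#ANY"}
set_policy_command = {"set-cluster-policy": [SUPPORTED_RULE]}


def find_violations(placements):
    """Return (collection, shard, node) keys that hold two or more replicas."""
    counts = Counter(placements)
    return sorted(key for key, count in counts.items() if count >= 2)


# Hypothetical cluster state: one tuple per replica.
placements = [
    ("my_col", "shard1", "node1"),
    ("my_col", "shard1", "node2"),
    ("my_col", "shard1", "node2"),     # second replica on node2: a violation
    ("other_col", "shard1", "node2"),  # different collection: allowed
]

print(json.dumps(set_policy_command))
print(find_violations(placements))
```

A check like this could run outside Solr (for example in a monitoring script) until the per-collection rule is supported.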

Thank you for these excellent use cases. I'd recommend asking questions such as these on the solr-user mailing list, because not many Solr developers frequent Stack Overflow. I only found this question because it was posted on the docker-solr project.

Malherbe answered 29/8, 2018 at 11:55 Comment(2)
Thanks, I will join the Solr mailing list. My cluster runs on AWS, and I will try to create a Lambda that can add new replicas. AWS has lifecycle hooks, so I can attach the Lambda to the server-created event. - Hosbein
I see that 2 of the 3 issues are implemented in 7.5. The main functionality, adding replicas when a new node comes up, is not implemented yet. Correct? - Decennial
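For anyone attempting the Lambda workaround mentioned in the comments, a minimal sketch of the request it would need to make: one Collections API `ADDREPLICA` call per shard, targeting the new node. The base URL, the node name (in Solr's `host:port_solr` form) and the shard list are placeholders, and the snippet only builds the URLs without sending them.

```python
from urllib.parse import urlencode


def add_replica_url(base_url, collection, shard, node):
    """Build a Collections API ADDREPLICA URL that targets a specific node."""
    query = urlencode({
        "action": "ADDREPLICA",
        "collection": collection,
        "shard": shard,
        "node": node,
        "wt": "json",
    })
    return f"{base_url}/solr/admin/collections?{query}"


# Placeholder values: a new node that just joined, and the shards of my_col.
new_node = "10.0.0.4:8983_solr"
shards = ["shard1"]
urls = [
    add_replica_url("http://localhost:8983", "my_col", shard, new_node)
    for shard in shards
]
for url in urls:
    print(url)
```

In a real hook you would also need to discover the shard list (for example via CLUSTERSTATUS) and wait until the new node has registered with the cluster before issuing the calls.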
