Elasticsearch cluster load balancing best practices

Asked 8/2, 2021 at 8:26 Answered 8/2, 2021 at 10:28

Solved java elasticsearch load-balancing high-availability elasticsearch-high-level-restclient

I would like to understand whether I need a load balancer as part of an Elasticsearch deployment, or whether having one is considered good practice.

As far as I understand, both the high-level REST client and the transport client of Elasticsearch can manage load balancing between the nodes. So the client only needs a comma-separated list of endpoints, and that's it.
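To illustrate what I mean by client-side balancing, here is a minimal sketch of the idea: parse a comma-separated endpoint list and hand out nodes in round-robin order. This is not the actual client code; `RoundRobinEndpoints` and the `es1:9200`-style host names are hypothetical, and the real clients presumably do more (for example, skipping nodes that failed).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not part of any Elasticsearch client library.
final class RoundRobinEndpoints {
    private final List<String> endpoints;
    private final AtomicInteger next = new AtomicInteger();

    // Parses a comma-separated list such as "es1:9200,es2:9200".
    RoundRobinEndpoints(String commaSeparated) {
        List<String> parsed = new ArrayList<>();
        for (String part : commaSeparated.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                parsed.add(trimmed);
            }
        }
        if (parsed.isEmpty()) {
            throw new IllegalArgumentException("at least one endpoint is required");
        }
        this.endpoints = List.copyOf(parsed);
    }

    // Returns the next endpoint, cycling through the list in order.
    String nextEndpoint() {
        int index = Math.floorMod(next.getAndIncrement(), endpoints.size());
        return endpoints.get(index);
    }

    // Number of endpoints that were parsed from the list.
    int size() {
        return endpoints.size();
    }
}
```

With three endpoints, four consecutive calls to `nextEndpoint()` would return the first, second, third, and then the first endpoint again.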

Is there any point in also having a load balancer in the middle? In which cases might it be useful? What are the pros and cons of each approach?

Sheathe answered 8/2, 2021 at 8:26 Comment(0)

An external load balancer in front of an ES cluster is not very common and not required, because Elasticsearch already does its own load balancing. By default, all the data nodes in an ES cluster also act as coordinating nodes, but if you want to improve performance you can add dedicated coordinating nodes as well.

If your goal is smart load balancing that improves performance, then on ES 6.X or higher (turned on by default in 7.X) you get it out of the box, without any external configuration, through adaptive replica selection.
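On 6.X, where it is not on by default, you would have to enable it through the cluster settings API using the `cluster.routing.use_adaptive_replica_selection` setting. A rough sketch with the JDK's built-in HTTP client follows; the class name `AdaptiveReplicaSelection` is made up, and the base URL `http://localhost:9200` with no authentication is an assumption you would adjust for your cluster.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper; builds (but does not send) the cluster settings request.
final class AdaptiveReplicaSelection {

    // JSON body that sets the flag as a persistent cluster setting.
    static String settingsBody(boolean enabled) {
        return "{\"persistent\":{\"cluster.routing.use_adaptive_replica_selection\":"
                + enabled + "}}";
    }

    // PUT <baseUrl>/_cluster/settings with the JSON body above.
    static HttpRequest buildRequest(String baseUrl, boolean enabled) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_cluster/settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settingsBody(enabled)))
                .build();
    }
}
```

To actually apply it, you would send the request with something like `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and check the response status.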

Adding another load balancer means extra configuration and one more layer before your request reaches ES, so IMHO it doesn't make sense to use one.

Curricle answered 8/2, 2021 at 10:28 Comment(0)

The answer depends on your architecture and on your requirements. Do you need a load balancer for high availability? Or for performance and scalability? Or both?

Elasticsearch, like many other distributed systems, comes with its own protocols and semantics to distribute load across multiple nodes and to manage failovers.

You can use these semantics to configure a node so that it performs only the coordinator role -- effectively acting as a load balancer for heavy-duty operations like search requests or bulk index requests.

Elasticsearch also has its own built-in protocol for electing a new master node in case of failures, which covers the failover role that an external load balancer would otherwise play.

In general, I would recommend using the native capabilities to achieve your goals instead of adding more complexity by introducing another technology in front of the cluster.

If you want a stable URL for your cluster, configure your DNS server to provide it. A cluster managed by a cloud provider should already have such a feature; otherwise you can set it up yourself with some effort.

Forsake answered 8/2, 2021 at 9:9 Comment(0)
