How is Cassandra designed to avoid the need for load balancers?

Asked 17/5, 2018 at 21:46 Answered 18/5, 2018 at 7:15

I read this in the official DSE documentation, but it did not go into depth about how. Can someone explain, or provide links that do?

Sterile answered 17/5, 2018 at 21:46 Comment(0)

It's better to look in the architecture guide for this kind of information.

There are several places where Cassandra and its drivers do the job of a load balancer. First, you can send a request to any node in the cluster; that node acts as the "coordinator" and forwards the request to the nodes that actually own the data. Because this extra hop is not optimal, the drivers provide a so-called token-aware load balancing policy: the driver infers from the data (the request's partition key) which nodes are responsible for it, and sends the request to one of those nodes, selected based on other information (contributed by other load balancing policies).
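To make the coordinator hop concrete, here is a minimal sketch in Python. It is a toy model, not driver code: the `ToyRing` class, the `pick_coordinator` helper, the node names, the token values, and the MD5-based hash are all assumptions made for illustration (Cassandra's default partitioner uses Murmur3, and real clusters usually have many tokens per node).

```python
import bisect
import hashlib


class ToyRing:
    """Toy token ring: one token per node. Not Cassandra's real partitioner."""

    def __init__(self, node_tokens, replication_factor=2):
        # node_tokens: dict of node name -> integer token (hypothetical values).
        self.ring = sorted((token, node) for node, token in node_tokens.items())
        self.tokens = [token for token, _ in self.ring]
        self.rf = replication_factor

    def token_for(self, partition_key):
        # Stand-in hash that maps a key to a token in the range 0..65535.
        digest = hashlib.md5(partition_key.encode("utf-8")).digest()
        return int.from_bytes(digest[:2], "big")

    def replicas(self, partition_key):
        # The first node whose token is >= the key's token owns the key
        # (wrapping around the ring); the next rf-1 nodes hold the copies.
        start = bisect.bisect_left(self.tokens, self.token_for(partition_key))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(self.rf)]


def pick_coordinator(ring, partition_key, alive, token_aware=True):
    """Return (coordinator, extra_hop) for one request."""
    if not alive:
        raise RuntimeError("no live nodes to contact")
    replicas = ring.replicas(partition_key)
    if token_aware:
        # Token-aware: go straight to a live replica, so no forwarding is needed.
        for node in replicas:
            if node in alive:
                return node, False
    # Otherwise any live node can coordinate, but it may have to forward
    # the request to the nodes that own the data.
    coordinator = sorted(alive)[0]
    return coordinator, coordinator not in replicas
```

With `token_aware=False` the helper simply takes the first live node, which stands in for whatever node a non-token-aware policy would pick next; the second element of the result says whether that node would have to forward the request to the replicas.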

With multiple data centers, the drivers and Cassandra itself are able to send requests to "remote" DCs if the "local" one isn't available (the notions of remote and local are specific to each consumer). But other factors come into play here: for example, if you use LOCAL_ consistency levels, your requests won't be sent to a "remote" data center.
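The local-first behavior and the effect of LOCAL_ consistency levels can be sketched the same way. Again this is a toy model under stated assumptions rather than the driver's implementation: `candidate_coordinators`, the DC names, and the node names are hypothetical, and only the consistency level names are real Cassandra ones.

```python
# Consistency levels that are scoped to the local data center.
LOCAL_LEVELS = {"LOCAL_ONE", "LOCAL_QUORUM", "LOCAL_SERIAL"}


def candidate_coordinators(nodes_by_dc, local_dc, alive, consistency):
    """Order the nodes a request may be sent to: live local-DC nodes first,
    then live remote-DC nodes, unless the consistency level is a LOCAL_ one."""
    local = [node for node in nodes_by_dc[local_dc] if node in alive]
    if consistency in LOCAL_LEVELS:
        # LOCAL_ levels never leave the local DC, even if it is fully down.
        return local
    remote = [
        node
        for dc, nodes in nodes_by_dc.items()
        if dc != local_dc
        for node in nodes
        if node in alive
    ]
    return local + remote
```

An empty result means the request cannot be served at all. That is the case for a LOCAL_ level when every local node is down, and it is the situation in which traffic has to be moved away from that data center by something outside the driver.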

As for application design: you can put a load balancer in front of your application layer, with each application instance connecting to the Cassandra cluster in its "local" data center and using LOCAL_ consistency levels for its operations. If one of the DCs goes down, the load balancer should stop sending traffic to the application layer in that DC.

Ginzburg answered 18/5, 2018 at 7:15 Comment(3)

Thanks for the insightful guide! For the last part about an LB in front of the application layer, what are some common health-check strategies that can be used to trigger the failover (e.g. ways to ping Cassandra to check its health)? – Sterile 18/5, 2018 at 17:11

I really need to think about it, but I would maybe go the route of getting error metrics from the driver: docs.datastax.com/en/drivers/java/3.5/com/datastax/driver/core/…, or, if you use the "circuit breaker" pattern in your app, pull this information from it... But this is a really good question, I'll put it on my TODO list to investigate... P.S. You can also ask on the DataStax Academy Slack channel – Ginzburg 18/5, 2018 at 17:56

Thanks. Can you please reply here with anything you find from your investigation so I can be notified? – Sterile 18/5, 2018 at 22:16

Load balancing is built into the drivers/connections. For example, the Java driver's round-robin behavior is explained in the documentation here:

https://docs.datastax.com/en/developer/java-driver-dse/1.6/manual/load_balancing/

Also explained here:

https://docs.datastax.com/en/developer/java-driver/3.1/manual/load_balancing/

Talton answered 17/5, 2018 at 21:51 Comment(2)

What about handling failover across multiple datacenters? Would that warrant an external load balancer? – Sterile 18/5, 2018 at 2:09

You can configure your application to go to another DC if required; see the examples in the linked documentation (it's the same doc in both cases). Whether it will work depends on things like the consistency level, etc. – Ginzburg 18/5, 2018 at 6:51
