Cassandra Java driver: how many contact points is reasonable?
Asked Answered
P

4

19

In Java I connect to Cussandra cluster as this:

Cluster cluster = Cluster.builder().addContactPoints("host-001","host-002").build();

Do I need to specify all hosts of the cluster in there? What If I have a cluster of 1000 nodes? Do I randomly choose few? How many, and do I really do that randomly?

Paschal answered 10/11, 2014 at 20:28 Comment(0)
C
16

I would say that configuring your client to use the same list of nodes as the list of seed nodes you configured Cassandra to use will give you the best results.

As you know Cassandra nodes use the seed nodes to find each other and discover the topology of the ring. The driver will use only one of the nodes provided in the list to establish the control connection, the one used to discover the cluster topology, but providing the client with the seed nodes will increase the chance for the client to continue to operate in case of node failures.

Curet answered 11/11, 2014 at 6:40 Comment(2)
«continue to operate in case of node failures.» — what if that seed node is the point of failure?!Bursarial
You usually have multiple seed nodes in a cluster and same applies for configuring the drivers.Curet
V
11

My approach is to add as many nodes as I can -- The reason is simple: seeds are necessary only for cluster boot but once the cluster is up and running seeds are just common nodes -- using only seeds may result in the impossibility to connect in a working cluster -- So I give myself the best chances to connect to the cluster keeping a more than reasonable amount of nodes -- it's enough one working node to get the current cluster configuration.

Valeda answered 11/11, 2014 at 7:46 Comment(4)
Thank you for your responce. I just want to get a confirmation, in the cluster of 1000 nodes you would try your best to add all 1000 of them?Paschal
Absolutely not ... probably in such a big cluster I would keep track of 30 nodes choosing them in a way that a rack-failure will not cause all these nodes to be unreachable.Valeda
My cluster has 10 nodes. I mention all the 10 nodes in addContactPoints. Now if I get a session from cluster object, will it always use the same one node to execute my every query in that same session?Berberine
No, node that will compute your read/write operation depends on the hash of your key. And "proxy node" is, afaik, randomValeda
M
8

Documentation from DataStax

public Cluster.Builder addContactPoint(String address)

Adds a contact point.

Contact points are addresses of Cassandra nodes that the driver uses to discover the cluster topology. Only one contact point is required (the driver will retrieve the address of the other nodes automatically), but it is usually a good idea to provide more than one contact point, because if that single contact point is unavailable, the driver cannot initialize itself correctly.

Note that by default (that is, unless you use the withLoadBalancingPolicy(com.datastax.driver.core.policies.LoadBalancingPolicy)) method of this builder), the first successfully contacted host will be use to define the local data-center for the client. If follows that if you are running Cassandra in a multiple data-center setting, it is a good idea to only provided contact points that are in the same datacenter than the client, or to provide manually the load balancing policy that suits your need.

Parameters:
    address - the address of the node to connect to
Returns:
    this Builder.
Throws:
    IllegalArgumentException - if no IP address for address could be found
    SecurityException - if a security manager is present and permission to resolve the host name is denied.

From what I understand, you should just add a single contact point and the driver will discover the rest. Hope that helps. I personally use hector you should look into that too.

Midshipman answered 10/11, 2014 at 20:56 Comment(3)
But just like documentation says, if that single point is down, the driver will not be able to discover anything. So, I obviously need to specify more than one, but how many is reasonable? Cassandra is contrasting itself to MongoDB by saying: "unlike MongoDB all nodes in Cassandra are equal." In other words there are no special "mongos" nodes. But from driver's prospective that does not hold true, the one node or few nodes you specify as contact point, become that special node...Paschal
They are only special because they necessary to discover the rest of your cluster. Once the driver connected to at least one node in a cluster, it will discover the rest of the nodes in the cluster and load balance requests across all of them (according to the load balancing policy that you have configured). Check out the load balancing docs of the Ruby Driver, the concepts are taken directly from the Java Driver - datastax.github.io/ruby-driver/features/load_balancingAlberik
+1 because you reference the correct and best description straight from the documentation, but like henri says, it clearly says you should not use a single IP, although you probably do not need hundreds.Bursarial
B
5

I read an interesting article about Netflix and their Cassandra installation.

They mention the fact that they used their Gorilla system to take down 33% of their Cassandra cluster and see that their systems were still working as expected.

They have some 2,000 Cassandra nodes and took 33% down. This means, 1 out of 3 nodes are gone. (About 660 nodes for Netflix)

If you are really unlucky, all the connections you specified are part of the 660 nodes... Ouch.

Chances are, though, that if you use just enough nodes and never expect a dramatic event to where more than 33% of your network goes down, then you should be able to use a pretty small number, such as 6 nodes because with such a number, you should always hit at least 4 that are up...

Now, it should certainly be chosen strategically if possible. That is, if you choose 6 nodes all in the same rack when you have 6 different racks, you probably chose wrong. Instead, you probably want to specify 1 node per rack. (That's once you grow that much, of course.)

Note that if you have a Replication Factor of 5 and 33% of your Cassandra nodes go down, you're in trouble anyway. In that situation, many nodes cannot access the database in a QUORUM manner. Notice that Netflix talks about that. Their replication factor is just 3! (i.e. 1/3 = 0.33, and 1/5 = 0.2 so 20% which is less than 33%.)

Finally, I do not know the Java driver, I use the C++ one. When it fails, I am told. So what I can do is try with another set of IPs if necessary, until it works... My system has one connection that stays up between client accesses, so this is a one time process and I can relay the fact that this server is connected to Cassandra and thus can accept client connections. If you reconnect to Cassandra each time a client sends you a request, it may be wise to not send many IPs at all.

Bursarial answered 1/9, 2016 at 5:25 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.