Best way to add multiple nodes to an existing Cassandra cluster

We have a 12-node cluster with 2 datacenters (each DC has 6 nodes), with RF=3 in each DC.

We are planning to increase cluster capacity by adding 3 nodes to each DC (6 nodes in total). What is the best way to add multiple nodes at once (or with, say, a 2-minute gap between them)?

  1. auto_bootstrap: false - Use auto_bootstrap: false on all new nodes (as this makes node startup quicker), start them all, and then run 'nodetool rebuild' to stream data to the new nodes from the existing nodes.

If I go this way, where do read requests go right after these new nodes start? At that point they only have token ranges assigned to them but no data streamed yet - will this cause read request failures, consistency-level issues, or any other problems?

OR

  2. auto_bootstrap: true - Use auto_bootstrap: true and start one node at a time, waiting until streaming finishes (which may take a while, as we have roughly 600 GB+ of data on each node) before starting the next node. If I go this way, I have to wait for the entire streaming process to complete on each node before adding the next one.
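For reference, the two options above can be sketched roughly like this (a sketch, not a definitive runbook: the service name, config path, and DC name "DC1" are placeholders for your environment):

```shell
# Option 1: disable bootstrap, start all new nodes, then stream explicitly.
# In each new node's cassandra.yaml:
#   auto_bootstrap: false
sudo service cassandra start     # start every new node (they join with tokens only)
nodetool rebuild -- DC1          # then, on each new node, stream data from a source DC
                                 # ("DC1" is a placeholder for your datacenter name)

# Option 2: default bootstrap, one node at a time.
# auto_bootstrap defaults to true, so just start the node...
sudo service cassandra start
nodetool netstats                # ...watch streaming progress,
nodetool status                  # and wait for the node to show UN (Up/Normal)
                                 # before starting the next one
```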

Kindly suggest the best way to add multiple nodes all at once.

PS: We are using Cassandra 2.0.3.

Thanks in advance.

Postulate answered 17/5, 2016 at 18:9 Comment(0)

As each approach depends on streaming data over the network, the answer largely depends on how distributed your cluster is, and where your data currently lives.

If you have a single-DC cluster and latency is minimal between all nodes, then bringing up a new node with auto_bootstrap: true should be fine for you. Also, if at least one copy of your data has been replicated to your local datacenter (the one you are joining the new node to) then this is also the preferred method.

On the other hand, for multiple DCs I have found more success with setting auto_bootstrap: false and using nodetool rebuild. The reason is that nodetool rebuild allows you to specify a datacenter as the source of the data. This path gives you the control to confine streaming to a specific DC (and, more importantly, away from the other DCs). Similarly, if you are building a new datacenter and your data has not yet been fully replicated to it, you will need to use nodetool rebuild to stream data from a different DC.
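As a short sketch of that control (the source DC name "DC1" is a placeholder), the rebuild run on each new node looks like:

```shell
# Run on each NEW node after it has joined with auto_bootstrap: false.
# The trailing argument names the source datacenter to stream from,
# keeping the streaming traffic off the cross-DC links.
nodetool rebuild -- DC1
nodetool netstats    # monitor streaming progress while it runs
```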

How would read requests be handled?

In both of these scenarios, the token ranges are computed for your new node when it joins the cluster, regardless of whether the data is actually there yet. So if a read request were sent to the new node at CL ONE, it should be routed to a node containing a secondary replica (assuming RF>1). If you queried at CL QUORUM (with RF=3), it should find the other two replicas. That is, of course, assuming that the nodes picking up the slack are not so overwhelmed by their streaming activities that they cannot also serve requests. This is a big reason why the "2 minute rule" exists.
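The arithmetic behind that can be sketched quickly (a toy illustration of the quorum formula, not Cassandra code):

```shell
# A quorum needs floor(RF/2) + 1 replicas to respond.
rf=3
quorum=$(( rf / 2 + 1 ))            # with RF=3, 2 replicas must answer
echo "QUORUM with RF=$rf needs $quorum replicas"

# A brand-new, not-yet-streamed node is effectively one "empty" replica;
# the read can still succeed as long as the remaining replicas can answer.
replicas_with_data=$(( rf - 1 ))
[ "$replicas_with_data" -ge "$quorum" ] && echo "QUORUM read can still succeed"
```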

The bottom line is that your queries do have a higher chance of failing before the new node is fully streamed. Your chances of query success increase with the size of your cluster (more nodes = more scalability, and each bears that much less of the streaming burden). Basically, if you are going from 3 nodes to 4, you might get failures. If you are going from 30 nodes to 31, your app probably won't notice a thing.

Will the new node try to pull data from nodes in the other data centers too?

Only if your query isn't using a LOCAL consistency level.
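For example, pinning a read to the local DC from cqlsh (the keyspace and table names are placeholders):

```shell
# LOCAL_* consistency levels confine the request to replicas in the
# coordinator's own datacenter, so the new node never pulls the read
# across DCs ("my_ks.my_table" is a placeholder).
cqlsh -e "CONSISTENCY LOCAL_QUORUM; SELECT * FROM my_ks.my_table LIMIT 1;"
```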

Boaz answered 17/5, 2016 at 18:37 Comment(3)
As I asked, what would happen if I go with auto_bootstrap: false - how would read requests be handled? Do they go to these new nodes? At that point the new nodes will only have token ranges assigned to them but no data streamed yet - will this cause read request failures, consistency-level issues, or any other problems? Thanks for your reply.Postulate
Thanks once again for your reply. A small doubt about one of your lines: "Also, if at least one copy of your data has been replicated to your local datacenter (the one you are joining the new node to) then this is also the preferred method." Will the new node try to pull data from nodes in the other datacenters too (other than the one to which it belongs)? If so, why, given that the nodes in its own DC have all the data? Or is it required that the cluster be in a perfect state, not needing any repair, at the time of adding a new node? Could that be the reason?Postulate
@Postulate StackOverflow is a Q&A site, not a discussion forum. First, please ask only one question per question. Second, if you have another question, please ask a new one.Boaz

I'm not sure this was ever answered:

If I go this way, where do read requests go right after these new nodes start? At that point they only have token ranges assigned to them but no data streamed yet - will this cause read request failures, consistency-level issues, or any other problems?

And the answer is yes. The new node will join the cluster and receive its token assignments, but since auto_bootstrap: false, the node will not receive any streamed data. Thus, it will be a member of the cluster but will not have any of the old data. New writes will be received and processed, but data written before the node joined will not be available on this node.

With that said, with appropriate consistency levels, your new node will still trigger background and foreground read repair, so it shouldn't respond any differently to requests. However, I would not go this route. With 2 DCs, I would divert traffic to DCA, add all of the nodes with auto_bootstrap: false to DCB, and then rebuild the nodes from DCA. The rebuild needs to use DCA as the source because the tokens have changed in DCB, and with auto_bootstrap: false, the data may no longer exist there. You could also run repair, which should resolve any discrepancies as well. Lastly, after all of the nodes have been rebuilt, run cleanup on all of the nodes in DCB.
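The procedure above could be sketched as follows (a sketch only: the service name and the DC names "DCA"/"DCB" are placeholders, and traffic diversion happens in your client drivers, not on the cluster):

```shell
# 1. Divert client traffic to DCA (via the driver's DC-aware
#    load-balancing policy / application config).

# 2. On each new DCB node: set auto_bootstrap: false in cassandra.yaml,
#    then start it so it joins with tokens only.
sudo service cassandra start

# 3. On each new DCB node, stream the existing data from DCA:
nodetool rebuild -- DCA

# 4. Optionally reconcile any remaining discrepancies:
nodetool repair

# 5. Finally, on ALL DCB nodes, drop data they no longer own:
nodetool cleanup
```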

Daron answered 22/2, 2021 at 19:23 Comment(0)
