How to restart one live node from a multi node cassandra cluster?

I have a production Cassandra cluster of 6 nodes. I have changed the cassandra.yaml file on one node and therefore need to restart it. How can I do that without losing any data or causing cluster-related issues? Can I just kill the Cassandra process on that particular node and start it again? Cluster info: 6 nodes, all active. I am using the AWS Ec2Snitch.

Thanks.

Housen answered 5/6, 2017 at 12:5 Comment(0)
A
5

If you use a replication factor greater than 1 and do not use consistency level ALL for your reads/writes, you can perform the steps listed below without any downtime or data loss. If either of those conditions does not hold, you will need to increase your replication factor or change the consistency level of your requests before you continue.

  1. Perform nodetool drain on that node (http://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsDrain.html).
  2. Stop the service.
  3. Start the service.
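
The three steps above could be wrapped in a small shell function, roughly like the sketch below. It assumes a package install where the service is called cassandra and is controlled through sudo service (as in the other answer); the function name restart_cassandra_node is made up for illustration, so adjust it to your own setup.

```shell
#!/bin/sh
# Sketch only: drain, stop, start for the local node.
# Assumptions: the service is named "cassandra" and is managed via
# `sudo service`; restart_cassandra_node is a hypothetical helper name.

restart_cassandra_node() {
    # Step 1: flush memtables to disk and stop accepting writes on this node.
    nodetool drain || return 1

    # Step 2: stop the service.
    sudo service cassandra stop || return 1

    # Step 3: start the service again.
    sudo service cassandra start || return 1
}
```

Run restart_cassandra_node on the node whose cassandra.yaml you changed. Each step aborts the sequence if it fails, so the service is not stopped when the drain did not succeed.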

In Cassandra, if durable writes are enabled, you should not lose data in any case: the commit log is replayed after an unexpected restart, so a plain restart is safe, although replaying the commit log can take some time.

The steps above are part of the official upgrade procedure and should be the "safest" option. Alternatively, you can run nodetool flush and then restart; this keeps commit log replay to a minimum and can be faster than the drain approach.
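
After the restart it may be worth confirming that the node has rejoined the ring before touching any other node. A minimal sketch, assuming the usual nodetool status layout in which each node line starts with a two-letter state (UN meaning Up/Normal) followed by the address; the helper names and the 10.0.0.11 address are hypothetical.

```shell
#!/bin/sh
# Sketch only: check a node's state in `nodetool status` output.
# Assumption: node lines look like "UN  <address>  <load> ...".

# Reads `nodetool status` output on stdin; succeeds only if the given
# address is listed with state UN (Up/Normal).
node_is_up_normal() {
    awk -v addr="$1" '$1 == "UN" && $2 == addr { ok = 1 } END { exit !ok }'
}

# Polls `nodetool status` until the address reports UN.
# Arguments: address, number of attempts (default 30), seconds between
# attempts (default 10). Returns non-zero if the node never shows up as UN.
wait_for_node() {
    wfn_addr=$1
    wfn_tries=${2:-30}
    wfn_delay=${3:-10}
    while [ "$wfn_tries" -gt 0 ]; do
        if nodetool status | node_is_up_normal "$wfn_addr"; then
            return 0
        fi
        wfn_tries=$((wfn_tries - 1))
        sleep "$wfn_delay"
    done
    return 1
}
```

For example, wait_for_node 10.0.0.11 would poll for up to about five minutes with the defaults.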

Abernathy answered 5/6, 2017 at 13:4 Comment(2)
Thanks @nevsv, I will try these on a staging cluster and then go ahead with production. The replication factor that I am using is 3 and read consistency is 1. I will share my experience here once I am done with the maintenance activity. - Housen
This solved an issue where nodetool describecluster showed different schema versions on different nodes. I went node by node and followed your 3 steps, but I only moved on to the next node after nodetool status showed the previous node was up again. - Jerboa
E
2

Can I just kill the cassandra process on that particular node and start it again?

Essentially, yes. I'm assuming you have an RF of 3 with 6 nodes, so it shouldn't be a big deal. If you want to do what I call a "clean shutdown", you could run the following commands first:

nodetool disablegossip
nodetool drain

And then (depending on your install):

sudo service cassandra stop

Or:

kill `cat cassandra.pid`

Note that if you do not complete these steps, you should still be ok. The drain just flushes the memtables to disk. If that doesn't happen, the commit log is reconciled against what's on disk at boot-time anyway. Those steps will just make your boot go faster.

Ecthyma answered 5/6, 2017 at 13:33 Comment(1)
Thanks @aaron. Yes, I am indeed using replication factor 3 and read consistency is 1. If I understand correctly, I need not remove or touch the data directory; I just stop and start the process once I am done with the steps you mentioned. - Housen
