Understand cassandra replication factor versus consistency level
Asked Answered
M

2

66

I want to clarify very basic concept of replication factor and consistency level in Cassandra. Highly appreciate if someone can provide answer to below questions.

RF- Replication Factor RC- Read Consistency WC- Write Consistency

2 cassandra nodes (Ex: A, B) RF=1, RC=ONE, WC=ONE or ANY

  • can I write data to node A and read from node B ?
  • what will happen if A goes down ?

3 cassandra nodes (Ex: A, B, C) RF=2, RC=QUORUM, WC=QUORUM

  • can I write data to node A and read from node C ?
  • what will happen if node A goes down ?

3 cassandra nodes (Ex: A, B, C) RF=3, RC=QUORUM, WC=QUORUM

  • can I write data to node A and read from node C ?
  • what will happen if node A goes down ?
Marilumarilyn answered 5/7, 2014 at 15:26 Comment(1)
These two other answers to similar questions are better than what is below here IMO: https://mcmap.net/q/297383/-read-operation-in-cassandra-at-consistency-level-of-quorum and https://mcmap.net/q/297384/-read-write-strategy-for-consistency-levelBobbie
P
110

Short summary: Replication factor describes how many copies of your data exist. Consistency level describes the behavior seen by the client. Perhaps there's a better way to categorize these.

As an example, you can have a replication factor of 2. When you write, two copies will always be stored, assuming enough nodes are up. When a node is down, writes for that node are stashed away and written when it comes back up, unless it's down long enough that Cassandra decides it's gone for good.

Now say in that example you write with a consistency level of ONE. The client will receive a success acknowledgement after a write is done to one node, without waiting for the second write. If you did a write with a CL of ALL, the acknowledgement to the client will wait until both copies are written. There are very many other consistency level options, too many to cover all the variants here. Read the Datastax doc, though, it does a good job of explaining them.

In the same example, if you read with a consistency level of ONE, the response will be sent to the client after a single replica responds. Another replica may have newer data, in which case the response will not be up-to-date. In many contexts, that's quite sufficient. In others, the client will need the most up-to-date information, and you'll use a different consistency level on the read - perhaps a level ALL. In that way, the consistency of Cassandra and other post-relational databases is tunable in ways that relational databases typically are not.

Now getting back to your examples.

Example one: Yes, you can write to A and read from B, even if B doesn't have its own replica. B will ask A for it on your client's behalf. This is also true for your other cases where the nodes are all up. When they're all up, you can write to one and read from another.

For writes, with WC=ONE, if the node for the single replica is up and is the one you're connect to, the write will succeed. If it's for the other node, the write will fail. If you use ANY, the write will succeed, assuming you're talking to the node that's up. I think you also have to have hinted handoff enabled for that. The down node will get the data later, and you won't be able to read it until after that occurs, not even from the node that's up.

In the other two examples, replication factor will affect how many copies are eventually written, but doesn't affect client behavior beyond what I've described above. The QUORUM will affect client behavior in that you will have to have a sufficient number of nodes up and responding for writes and reads. If you get lucky and at least (nodes/2) + 1 nodes are up out of the nodes you need, then writes and reads will succeed. If you don't have enough nodes with replicas up, reads and writes will fail. Overall some QUORUM reads and writes can succeed if a node is down, assuming that that node is either not needed to store your replica, or if its outage still leaves enough replica nodes available.

Pounce answered 5/7, 2014 at 20:25 Comment(5)
"As an example, you can have a replication factor of 2. When you write, two copies will always be stored" - Meaning that three 'instances' of the data will be stored (an 'original' and two 'replicas'), or two? I suppose what I'm asking is, is the replication factory 0-based or 1-based?Harmonyharmotome
Replication factor of 2 means that a total of 2 instances will be stored.Pounce
Very well explained - Thank youOmaomaha
@Omaomaha - Thanks. :)Pounce
Here is the Datastax doc link: docs.datastax.com/en/archived/cassandra/2.0/cassandra/dml/… The one linked in the answer no longer works.Decorate
S
52

Check out this simple calculator which allows you to simulate different scenarios:

http://www.ecyrd.com/cassandracalculator/

For example with 2 nodes, a replication factor of 1, read consistency = 1, and write consistency = 1:

Your reads are consistent 
You can survive the loss of no nodes. 
You are really reading from 1 node every time. 
You are really writing to 1 node every time. 
Each node holds 50% of your data.
Sunset answered 1/7, 2015 at 23:49 Comment(4)
this answer is really awsomeWeight
even if the number of nodes are high - ex 10, if RF =1 and WC =1 and RC =1. why are the reads strongly consistent. With WC =1,say that node 1 is written to and then if node 1 goes down, with RC of one, read from another node will not return the content written ?Centuplicate
Since replication factor is 1, therefore, there will be only 1 copy of data. Therefore, if the node containing the data goes down, we'll not be able to fetch the data(This doesn't mean that we'll be getting wrong data). Always remember, strong consistency is ensured if R+W>RF(R->read consistency, W->Write consistency and RF is replication factor). In your case R+W=2>1(RF). Therefore it is a case of strong consistency. But as you've pointed out, in some cases availability will suffer.Concurrence
@Sunset if Replication factor is 1 which means single copy of data, how it will be consistent with Read consistency(RC) of 1 ? IF RC is 1 and WC 1 then it means data can be read from other node which does not have latest copy of data. Isn't it ?Imbecilic

© 2022 - 2024 — McMap. All rights reserved.