Cassandra uses leaderless replication. This means no single node is the authority for the most recent or correct value. So a read has to use more democratic means, i.e. ask multiple nodes and then derive the correct value from their responses.
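As a rough sketch of what "derive the correct value" can look like: Cassandra resolves conflicting copies by write timestamp (last write wins). The `Versioned` type, the `resolve` helper, and the integer timestamps below are hypothetical names made up for illustration, not Cassandra's API.

```python
from typing import NamedTuple


class Versioned(NamedTuple):
    value: int
    timestamp: int  # hypothetical logical write time; larger means newer


def resolve(responses):
    """Return the response with the newest timestamp (last write wins)."""
    return max(responses, key=lambda r: r.timestamp)


# Two replicas answered: one still has the old value, one has the new one.
responses = [Versioned(3, timestamp=1), Versioned(5, timestamp=2)]
print(resolve(responses).value)  # prints 5
```

Note that this only helps if at least one of the responses actually carries the latest write, which is what the R and W settings below are about.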
Let's understand it through examples:
Assume for all examples that there are 3 replicas, i.e. N = 3, and that the nodes are A, B, and C.
R = 1, W = 1, N = 3
This means we store 3 copies of the same data, but a read or a write is considered successful as soon as one node responds.
Now, take the case of updating the value of x from its current value of 3 to 5.
During the write, assume that for some reason it succeeded only on node A. Since W is 1, it is still considered a successful write.
Now, during the read, we can get either of the values below:
- if the read is served by node A, the client reads the value 5 (i.e. gets the correct value).
- if node A is unreachable/down, or the read is served by B or C, the client gets the stale value 3.
So clearly, this configuration (R + W < N) does not provide consistent reads.
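The R = 1, W = 1 scenario can be replayed with a toy in-memory model. This is only a sketch: the `replicas` dict and the `write` helper are hypothetical stand-ins for real nodes, not anything Cassandra exposes.

```python
# Toy model: each replica holds its own copy of x, all starting at 3.
replicas = {"A": 3, "B": 3, "C": 3}


def write(value, reachable):
    """Apply the write to the reachable replicas; return the number of acks."""
    for node in reachable:
        replicas[node] = value
    return len(reachable)


W = 1
acks = write(5, reachable=["A"])  # only node A receives the write
write_ok = acks >= W              # one ack is enough when W = 1

# With R = 1, whichever single node answers decides what the client sees.
reads = {node: replicas[node] for node in replicas}
print(write_ok, reads)  # prints True {'A': 5, 'B': 3, 'C': 3}
```

The write is reported as successful, yet two of the three possible single-node reads return the stale value 3.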
R = 1, W = 2, N = 3
Here, although the write goes to two nodes, the read is still confirmed by only one node. So the read can still be served by a node that doesn't have the latest value.
So clearly, this configuration (R + W = N) does not provide consistent reads either.
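To see which combinations go wrong for R = 1, W = 2, one can enumerate every possible write set and read set and keep the pairs that share no node. This is a minimal sketch; the variable names are made up for illustration.

```python
from itertools import combinations

nodes = ["A", "B", "C"]
W, R = 2, 1

# A read is stale when the nodes it contacts share nothing with the write set.
stale_cases = [
    (write_set, read_set)
    for write_set in combinations(nodes, W)
    for read_set in combinations(nodes, R)
    if not set(write_set) & set(read_set)
]

for write_set, read_set in stale_cases:
    print("write to", write_set, "read from", read_set, "=> stale read")
```

Each of the three possible write sets leaves exactly one node without the new value, so 3 of the 9 write/read combinations produce a stale read.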
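R = 2, W = 2, N = 3
- Best case (read and write use the same set of nodes): write to A, B and read from A, B => consistent read, i.e. the latest value.
- Worst case (only one node is common): write to A, B and read from B, C => still a consistent read, since the two sets overlap on node B, which returns the latest value.
So only R + W > N guarantees a consistent read: any R nodes that are read must then share at least one node with the W nodes that acknowledged the latest write.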
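The overlap argument can be checked by brute force for small clusters. The sketch below uses a hypothetical helper, `always_overlaps`, that tries every possible write set and read set and reports whether they always share a node.

```python
from itertools import combinations


def always_overlaps(n, r, w):
    """True if every set of r nodes shares a node with every set of w nodes."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


# The three configurations discussed above, all with N = 3.
for r, w in [(1, 1), (1, 2), (2, 2)]:
    print(f"R={r} W={w} N=3 -> overlap guaranteed: {always_overlaps(3, r, w)}")
```

Only the R = 2, W = 2 case reports a guaranteed overlap, which matches the R + W > N rule.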
You can explore more options here.