Read Operation in Cassandra at Consistency level of Quorum?

I am reading this post on read operations and consistency level in Cassandra. According to this post:

For example, in a cluster with a replication factor of 3, and a read consistency level of QUORUM, 2 of the 3 replicas for the given row are contacted to fulfill the read request. Supposing the contacted replicas had different versions of the row, the replica with the most recent version would return the requested data. In the background, the third replica is checked for consistency with the first two, and if needed, the most recent replica issues a write to the out-of-date replicas.

So even with a consistency level of QUORUM, it is not guaranteed that you won't get a stale read. According to the above paragraph, if the third replica holds the latest timestamp, the coordinator has already returned the more recent version of the two replicas it queried. But that is not the latest version, since the third replica has a later timestamp.

Hellbent answered 30/7, 2014 at 17:26 Comment(1)
The 'post' link returns 404. Can you fix it, please? - Neoclassicism

A QUORUM read on its own does not guarantee the consistency of your data. What guarantees consistency is the following inequality:

(WRITE CL + READ CL) > REPLICATION FACTOR

In terms of consistency levels, the minimal write/read combinations that satisfy it are:

WRITE ALL + READ ONE
WRITE ONE + READ ALL
WRITE QUORUM + READ QUORUM
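
The inequality can be checked mechanically. The following is a minimal sketch, not Cassandra code: the function names are hypothetical, only the levels ONE, QUORUM and ALL are modelled, and a quorum is assumed to be a strict majority of the replication factor (floor(RF / 2) + 1).

```python
def quorum(rf: int) -> int:
    """Number of replicas in a quorum: a strict majority of rf."""
    return rf // 2 + 1


def replicas_for(level: str, rf: int) -> int:
    """Map a consistency level name to the number of replicas that must respond."""
    counts = {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}
    return counts[level]


def reads_see_latest_write(write_cl: str, read_cl: str, rf: int) -> bool:
    """True when (WRITE CL + READ CL) > REPLICATION FACTOR."""
    return replicas_for(write_cl, rf) + replicas_for(read_cl, rf) > rf


# Hypothetical usage with a replication factor of 3.
for w, r in [("ALL", "ONE"), ("ONE", "ALL"), ("QUORUM", "QUORUM"), ("ONE", "QUORUM")]:
    print(f"WRITE {w} + READ {r}: {reads_see_latest_write(w, r, rf=3)}")
```

The last combination, WRITE ONE + READ QUORUM, gives 1 + 2 = 3, which is not greater than 3, so it is the only one of the four that fails the check.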

As the post says, if you have a replication factor of 3 and you write with CL ONE, only 1 node is guaranteed to have the fresh data, while the other 2 might still hold old data. If you then ask Cassandra for a CL QUORUM read, you might retrieve the data from those other 2 nodes (old data), and that is what is returned to the client. But since the coordinator sent the read request to all nodes (and waited for only 2 before responding to the client), it will find out which node has the freshest data and update the other nodes.
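
The stale read in this scenario can be illustrated with a toy model. This is only a sketch under stated assumptions: the replica names, the `coordinator_read` helper, and the integer timestamps are all made up for illustration, and the coordinator is assumed to simply return the value with the highest timestamp among the replicas that answered.

```python
from typing import Dict, Iterable, Tuple

# Each replica stores (write_timestamp, value) for a single row.
Replica = Tuple[int, str]


def coordinator_read(replicas: Dict[str, Replica], responders: Iterable[str]) -> str:
    """Return the value with the highest timestamp among the replicas that responded."""
    _timestamp, value = max(replicas[name] for name in responders)
    return value


# RF = 3; a CL ONE write at timestamp 2 has reached only n1 so far.
replicas = {"n1": (2, "new"), "n2": (1, "old"), "n3": (1, "old")}

stale = coordinator_read(replicas, ["n2", "n3"])  # a quorum that misses n1
fresh = coordinator_read(replicas, ["n1", "n2"])  # a quorum that includes n1
print(stale, fresh)
```

Both reads contact 2 of the 3 replicas, yet only the second one happens to include the node that received the write.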

On the other hand, in an RF 3 situation, if you write with QUORUM, at least 2 nodes will have the fresh data. A read with CL QUORUM contacts 2 of the 3 nodes, so at least one of those two nodes has the fresh data.
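
The overlap argument can also be verified by brute force for small clusters. The sketch below (the `always_overlap` name is hypothetical) enumerates every possible set of replicas that acknowledged the write and every possible set contacted by the read, and checks that they always share at least one node.

```python
from itertools import combinations


def always_overlap(rf: int, w: int, r: int) -> bool:
    """Check every write set of size w against every read set of size r."""
    nodes = range(rf)
    return all(
        set(write_set) & set(read_set)  # an empty intersection is falsy
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


print(always_overlap(rf=3, w=2, r=2))  # QUORUM write, QUORUM read
print(always_overlap(rf=3, w=1, r=2))  # ONE write, QUORUM read
```

With RF 3, two sets of 2 nodes always intersect, while a single written node can be missed by a 2-node read.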

Lamdin answered 30/7, 2014 at 17:50 Comment(9)
But again, what the coordinator sends to the client is old data when you write with CL ONE and read with QUORUM. - Hellbent
Yes, because in this situation you are not satisfying the inequality: with RF = 3, CL ONE = 1 and CL QUORUM = 2, so (1 + 2) is not greater than 3, it is just equal to 3. - Lamdin
Although the client gets the non-updated data in this case, if it queries again it will get the latest update, correct? (Because an update has occurred in the background.) - Hellbent
The client "should" receive the fresh data, and in most cases it will. But imagine this case: the coordinator sends the response back to the client because it reached the needed CL, but it is still waiting for other nodes that are very busy and slow to answer. Now you repeat the read; your "new" coordinator queries the same nodes and again gets a fast answer from the minimum needed to reach the CL. Here you can still get old data, because the read repair of the first read will happen only after your second read request. It's an "edge case", but it might happen. - Lamdin
Why do you say the read repair of the first read will happen only after the second read request? I don't understand this. - Hellbent
Also, read_repair_chance with its default value of 0.1 does not guarantee that a read repair will happen for every read request. - Hellbent
The problem I'm talking about does not depend on the read_repair_chance value. Before performing a read repair, a coordinator waits for a response from all invoked nodes. Imagine an RF 3 cluster (N = node) where N1, N2 and N3 own the key you are querying. Your coordinator is N5 and you read with CL ONE: N5 contacts N1, N2 and N3 and waits for their replies. N1 replies, so N5 sends you the response because the CL is ONE. N5 is still waiting for a response from N2 and N3 (so read repair has surely not happened yet). Now you perform the same query and N6 is your new coordinator. N6 contacts N1, N2 and N3; N1 answers (again), so you receive (again) old data. - Lamdin
Let us continue this discussion in chat. - Hellbent
Can you look at this when you get time: #25102482 - Hellbent
