Can the WAIT command provide strong consistency in Redis?

Greetings overflowers,

In a Redis Sentinel/Cluster setup, can we use the WAIT command with the total number of slaves to ensure strong consistency across the Redis servers? If not, why not?

Kind regards

Recorder answered 10/11, 2015 at 11:51 Comment(0)

WAIT implements synchronous replication for Redis. Synchronous replication is required but not sufficient to achieve strong consistency. Strong consistency is, in practice, the sum of two things:

  1. Synchronous replication to the majority of nodes on a distributed system.
  2. A way to orchestrate change of leadership (failover basically) so that it is guaranteed that only a node preserving the full history of acknowledged operations in the previous leader can be elected.

WAIT does not provide "2". The failover procedure in Redis is performed by Sentinel or Redis Cluster, and neither is able to provide property 2 (since synchronous replication in Redis is the exception, not the rule, there was not much focus on that aspect). However, what the Redis failover does do is attempt to promote the slave that appears to preserve the greatest amount of data. While this does not change the theoretical guarantees of Redis failover, which can still lose acknowledged writes, it means that if you use WAIT, more slaves hold a given operation in their memory, and in turn it is a lot more likely that in the event of a failover the operation will be retained. However, while this makes a failure mode that discards the acknowledged operation hard to trigger, a failure mode with these properties always exists.

TLDR: WAIT does not make Redis linearizable. What it does is make sure the specified number of slaves receive the write, which in turn makes failover more robust, but without any hard guarantee.
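To make the semantics concrete, here is a minimal sketch of how a client typically issues WAIT after a write, using the redis-py library; the host, port, replica count, and timeout are illustrative assumptions, not values from the question.

```python
import redis

# Connect to the current master (host/port are assumptions for this sketch).
r = redis.Redis(host="localhost", port=6379)

# Perform a write on the master.
r.set("some:key", "some-value")

# Block until at least 2 replicas have acknowledged the write, or until
# 1000 ms have passed. WAIT returns the number of replicas that actually
# acknowledged, which may be lower than requested if the timeout expires.
acked = r.execute_command("WAIT", 2, 1000)

if acked < 2:
    # The write reached fewer replicas than requested; the caller decides
    # how to react (retry, roll back, report an error, ...).
    print(f"only {acked} replicas acknowledged the write")
```

Note that the return value only tells you how many replicas acknowledged the write; it says nothing about which node a later failover will promote, which is exactly why this does not add up to linearizability.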

Middelburg answered 10/11, 2015 at 12:58 Comment(19)
Thank you antirez for your answer. I will definitely accept it. However, I have this scenario in mind and I would like to hear your thoughts about it. I am trying to implement distributed locks using the WAIT operation, such that if WAIT fails with any error, the lock will not be acquired, and instead the lock keys will be deleted immediately or will automatically expire eventually. Other clients will also simply fail to acquire the failed locks until they are deleted/expired. Assuming that I have handled other issues such as timing, is this a safe approach?Recorder
If you want distributed locks with hard guarantees, there is a much better way to do this! The Redlock algorithm is an implementation using N different masters, with synchronous replication implemented client-side, and it is available for several languages. I hope this helps!Middelburg
Thanks antirez. I am currently using Redlock. However, I would still love to understand whether my suggestion is safe. Currently, our solution does not require a multi-master setup (except when using Redlock), and we would like to explore using WAIT with a single-master, multi-slave setup, as it is simpler to manage.Recorder
I don't think your algorithm is safe. For example, what happens after a failover? If you pick a slave without a given lock, even if the lock itself reached N/2+1 other nodes, the SETNX in the (new) master will succeed and WAIT will also succeed, since the other instances are now replicating from the new master. If you don't fail over, then you can just use a single instance :-) since you have a non-available system during failures anyway. Btw, handling a multi-master system is IMHO simpler, so I would pick Redlock regardless.Middelburg
Thank you for your time and effort in answering my questions. After a failover, all WAIT operations will fail with an error, causing clients to also fail to acquire the locks. Hence, new clients can now safely acquire the same locks regardless of the newly selected master. Am I wrong? :)Recorder
The problem is with the locks already provided to other clients. Example: <old master> gives out a lock and writes to N/2+1 servers. FAILOVER HAPPENS HERE. <new master> was not among those N/2+1 servers; now all the slaves replicate from it, so the old lock no longer exists inside Redis. However, a client is still using it, and <new master> will provide the same lock to another client: safety violation, game over.Middelburg
I really appreciate your patience with me :) But the WAIT command can be given a parameter that requires all slaves to be synced and to acknowledge the sync before the WAIT is considered successful. This way we can guarantee that all slaves have the same view of the locks. Am I missing something here?Recorder
Yep, as I said, you are not considering that after a failover you may elect a slave which did not receive the last lock (or any other past lock still valid from the POV of the client that received it).Middelburg
But that is impossible, since the client is WAITing for all slaves (using the number-of-slaves parameter that I mentioned before) to register the lock before the client claims it; otherwise the client considers the lock as not acquired. Could you please elaborate on how this can happen in this specific approach?Recorder
Three servers A, B, C. A is the current master. Some client acquires a lock L1 by writing to A and WAITing with replication factor 2. The write is received by A and B, since C was disconnected at that moment, or there was a lag in the replication channel, or whatever. Then A fails, and the failover mechanism elects C as the new master since, due to a temporary network partition, B was not available (and A is still failing). Now the new master is C, which has no notion of the lock L1; B and A become slaves of C, losing the L1 key as well, so L1 no longer exists. A new client asks for L1 and acquires it again. FAILMiddelburg
I edited the above comment multiple times, please reload & re-read.Middelburg
Here is my understanding of this scenario. The client attempts to acquire L1 by writing to the master A and WAITing for the write to be replicated to both slaves B and C. Since C is disconnected at this moment, WAIT will return 1 as the number of slaves that acknowledged the write (namely B). At this point the client will see that not all slaves have acknowledged the write, hence it will fail to acquire the lock. Any acknowledged write can now be rolled back by the client via immediate deletion, or automatically by the servers via eventual expiry. Both are safe.Recorder
In general, any failing server (master or slave) will cause the locking to fail because the number of acknowledged writes will be less than 100%, which is required for strong consistency among all servers. I hope that I am not missing a very obvious thing here :)Recorder
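For concreteness, here is a minimal sketch of the scheme described in the two comments above (acquire with SET NX, WAIT for every slave, roll back on partial acknowledgement), written with redis-py; the slave count, TTL, timeout, and key/token names are illustrative assumptions. As the following comments point out, this variant gives up availability and is not a substitute for Redlock.

```python
import redis

TOTAL_SLAVES = 2          # assumed number of slaves in the setup
LOCK_TTL_MS = 30_000      # assumed lock expiry
WAIT_TIMEOUT_MS = 1_000   # assumed WAIT timeout

r = redis.Redis(host="localhost", port=6379)  # assumed master address

def try_acquire(lock_key: str, token: str) -> bool:
    # SET NX PX: create the lock only if it does not already exist, with a TTL.
    if not r.set(lock_key, token, nx=True, px=LOCK_TTL_MS):
        return False
    # Require every slave to acknowledge the write.
    acked = r.execute_command("WAIT", TOTAL_SLAVES, WAIT_TIMEOUT_MS)
    if acked < TOTAL_SLAVES:
        # Partial replication: treat the lock as not acquired and roll back.
        # (The key would also expire on its own via the TTL.)
        r.delete(lock_key)
        return False
    return True
```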
Yep, you missed something (this violates availability), but this discussion is too long at this point :-) I suggest reading a few introductions to distributed systems like this one (not perfect at all, but a decent start): book.mixu.net/distsys/single-page.html, the Raft paper, and in general how consistency is achieved in a replicated state machine. This will give you some perspective and tools to easily spot the problems your scheme has, and why agreement is needed to solve this problem in the general case (if you want to use replication).Middelburg
Redlock is explicitly a way to make the problem a lot simpler compared to replicated state machines, exploiting the fact that we need only specific guarantees in order for the distributed lock to have certain safety and availability properties. But in order to control the synchronous replication needed for this use case, it is implemented client-side instead of relying on Redis replication, which is designed in a different way.Middelburg
Hi @antirez, what if I wait for the entire cluster (1 master and 2 slaves) to get the writes, and only then acknowledge the client? Will that make it strongly consistent? (Given that I'm fine with the cluster not accepting writes if even one of the machines goes down.)Duchess
Yes @ptntialunrlsd, by totally sacrificing availability of any kind during failures, you are indeed safe. So your replicas will not allow you to survive failures, but they will make writes safer, because you can run the different nodes in different geographical locations. All this makes sense if you run AOF in fsync=always mode; otherwise, during restarts, each replica may lose writes, making the scheme weak.Middelburg
Thanks @antirez! Yes, my master will be writing AOF with fsync=always, and RDB will be disabled. The 2 slaves of this master won't be writing any logs. Will this be fully safe (given the master's hard disk never crashes), or should I also write RDB on one of the slaves? (I don't feel the need, though.)Duchess
@antirez, consider merging your last comment into your answer, since it is pretty crucial. That is, the following two conditions guarantee strong consistency (at the expense of availability): 1. WAIT for every single replica to get writes; 2. Every single replica uses AOF persistence with fsync=always.Lewis
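Putting the last few comments together, the setup being described is: every write must be acknowledged by every replica via WAIT, and every node runs AOF with fsync=always. Below is a brief sketch of the write path under those assumptions; the node count, address, timeout, and error handling are made up for illustration.

```python
import redis

# Assumed redis.conf settings on every node (master and both slaves):
#   appendonly yes
#   appendfsync always

TOTAL_REPLICAS = 2        # 1 master + 2 slaves, as in the scenario above
WAIT_TIMEOUT_MS = 1_000   # illustrative timeout

r = redis.Redis(host="localhost", port=6379)  # assumed master address

def write_or_refuse(key: str, value: str) -> None:
    r.set(key, value)
    acked = r.execute_command("WAIT", TOTAL_REPLICAS, WAIT_TIMEOUT_MS)
    if acked < TOTAL_REPLICAS:
        # With any node unreachable, refuse the write instead of acknowledging
        # it to the client: safety is kept at the expense of availability.
        raise RuntimeError(f"write reached only {acked} of {TOTAL_REPLICAS} replicas")
```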
