Is there a strongly consistent group membership protocol?
Asked Answered
A

1

8

I'm looking for an algorithm where groups of connected nodes can be merged together to form a new group (by creating links between nodes from different groups). And where a group can be partitioned to form new partitions.

Unlike with consensus style of membership protocols (e.g. the one described in the Raft paper) where only one group can remain after the partition, I'd like each new partition to form a new group.

Also I'd like that for each partition, each of its members is going to agree on who belongs to that partition with a strong consistency guarantee.

Or put differently, I'd like the following property to hold: After a group undergoes a membership change, if two nodes that belonged to the original group can still communicate (there is a path between the two), they should agree on the sequence of changes that happened to the group.

My understanding is that the fact that each new partition is going to agree on a different set of members in a sense implies that the Consistency part from the CAP theorem is relaxed. Giving me hope that such protocol may exist(?).

Afterwards answered 7/10, 2016 at 14:41 Comment(0)
T
1

No consensus protocol (such as Paxos, Raft etc.) can be leveraged to develop a multi-group membership protocol. This is due to the fact that all consensus protocols are based on the fundamental idea that any decision can be taken only if the majority of members have "agreed-accepted" it. In this way, the "split-brain" phenomenon is avoided, because there could be no 2 partitions (of size greater than the majority: (n/2)+1) that have agreed on a different leader (and thus member set), since at least one member would be member of both partitions and would have voted for only one of the partitions (the one that asked first for vote).

One protocol that could possibly be leveraged to create a multi-group membership protocol is the Virtual Synchrony. However, note that virtual synchrony is a protocol used to send messages to (statically) predefined process groups, aka to the currently existing members of these groups. As a result, it is not designed for cases, where new process groups should be created (dynamically) at each new partition. Also note that the virtual synchrony is a protocol that does not scale to bigger members, since the message latency is proportional to the groups size.

I believe that by using the virtual synchrony protocol, you could develop such a membership protocol, which could satisfy the condition

After a group undergoes a membership change, if two nodes that belonged to the original group can still communicate (there is a path between the two), they should agree on the sequence of changes that happened to the group

However, note that this membership is not strongly consistent in a strict sense, because the failure of a node might be propagated inside the group eventually. Nonetheless, the message deliveries (which is what matters most) will be delivered in a way that ensures these deliveries obey the memberships of the group. This is achieved through imposing order on message deliveries in the members side.


Another alternative approach for a membership protocol are gossip-based membership protocols, with real-life implementations being integrated in various tools in industry, such as Consul. In order to leverage this approach, you could emit multiple different classes of messages from each member, depending on the different groups that you would like to monitor. However, again these groups are statically defined inside the protocol and eventually consistent (meaning that every failure will be finally detected by all live members).


As a conclusion, I think that a strongly-consistent membership protocol is not feasible in a strict definition, since you cannot distinguish between a member that has failed and a member that is responding really-really slow (basis of FLP and CAP theorem).

Thrips answered 16/10, 2016 at 9:54 Comment(7)
Thanks, though, I deliberately did not not use the word consensus as that would imply that only one group can remain. I wasn't aware that Virtual Synchrony needs a statically predefined set of nodes, so that's interesting. "... and these are much more complex to be elaborated on an SO post" must say I was hoping for a link or a name of an algorithm to have something to google + some high level description if I got lucky (e.g. I wouldn't expect a deep explanation of the Raft paper to someone who asked for about consensus algorithms).Afterwards
Since virtual synchrony is quite complex, I provided the wiki link as a starting point, I hope this will help finding useful resources. I skipped explaining virtual synchrony, because there are multiple versions of it and again would require a lot of space. Regarding consensus algorithms, note that that they are used to agree about a single value (not necessarily a single group), so they could theoretically be used to create multiple groups, using a mapping of groups as this value (if there was not the majority-vote limitation described above).Thrips
I slightly misinterpreted your question initially, so I fixed a bit my initial response. I also added some more alternatives, I hope my updated response helps.Thrips
I'm sorry but I still don't think I can accept this as an answer. In the first paragraph you mention that consensus can't be used, this is what I'm also explicitly saying that is not what I'm looking for (second paragraph in the question). Then, I skimmed through a ISIS paper some time ago, I remember it did not satisfy the properties I listed, though can't remember what it actual problem was (I'll have a second look). But in either case, it having to use a static list of nodes makes it not an answer to this question either. Lastly, aren't all gossip based protocols eventually consistent?Afterwards
"However, note that this membership is not strongly consistent in a strict sense, because the failure of a node might be propagated inside the group eventually.". Could you please elaborate on this one? I understand that it is not strongly consistent in the strict sense, but for a different reason (see the last paragraph in the question). "... (basis of FLP and CAP theorem)." do consensus theorems even apply here? Note that nodes from different groups are allowed to agree on a different sequences of changes that happened to the groups they are in.Afterwards
By nature, all gossip-based protocols are eventually consistent (with optimisations improving just the propagation delay). CAP theorem is not consensus-focused, while FLP is. However, the membership problem is equivalent and can be reduced to a consensus problem. To straightly answer your question, let's talk about the simplest form of a gossip-based protocol, an epidemic protocol, where each member multicasts membership table to B members at each stage. The propagation will take log_B(N) iterations, so your condition holds true for nodes that both stay alive for these iterations.Thrips
"CAP theorem is not consensus-focused" true, though the 'C' stands for strictly consistent, which is not this case. "However, the membership problem is equivalent" I beg to differ here, the agreement needs not to hold, which should then allow a bigger set of solutions.Afterwards

© 2022 - 2024 — McMap. All rights reserved.