How can a CA distributed system exist according to the CAP theorem?

4

24

How can a distributed system be consistent and available (CA)?

I would argue that when a network partition occurs, CA is not possible in the sense that every node of the network, including the partitioned nodes that users are connected to, stays available and answers with consistent data.

Pean answered 28/11, 2017 at 19:13 Comment(0)
39

It can't.

As often mentioned, the CAP theorem in its original form is a little misleading. It can be restated as

in the presence of a network partition, a distributed system is either available or consistent

so you are right. Generally, systems cannot simply be classified as CA, CP or AP, since partition tolerance is a property of the system that describes what it chooses in case of a network partition. A system may therefore behave as AP at some times and as CP at others (although this is not common).

Another interesting point is that relational databases are often placed on the CA side of the triangle. This is only the case in a single-node setup. Even with a master (write) / slave (read) setup, the system is not CA. If it is termed "CA" for some reason and cannot recover from network partitions, a split-brain scenario may happen: a new master is elected for the partition and chaos ensues, possibly breaking the consistency of the system.

Useful read: https://codahale.com/you-cant-sacrifice-partition-tolerance/.

Biting answered 24/3, 2018 at 12:31 Comment(2)
The CAP theorem is about distributed systems; I don't understand why people talk about single-node setups having something to do with it. Furthermore, in a distributed system, a CP system loses availability if some wire connecting two parts of your cluster loses electrical power or whatever. When a CA system loses electrical power, it loses the letter A, so it's not really a CA system anymore, is it? – Hedvig
Yeah, I don't think CAP should be used to describe non-distributed systems; I can't really see the practical point of it. "When a CA system loses electrical power..." - I think a presupposition of CAP is a working system, so if all nodes go down it's not really a distributed system anymore from a philosophical viewpoint, I guess, but a bunch of metal and wires :D – Biting
13

It can, but it won't.

The CAP theorem reasons about guarantees when one or more nodes get isolated from the rest of the cluster. In such cases a node has three options, which result in the three known CAP trade-offs: (i) it keeps responding to any request it receives (AP); (ii) it stops responding to requests until it can reach the others again (CP); (iii) it shuts down before receiving any requests, eliminating the partition along with itself (CA).

In other words, you can achieve CA by having your nodes shut down instead of tolerating the partition. Bear in mind, though, that partitions are likely to keep happening, so this converges to a scenario in which you have a single node left in your cluster. I assume this is the opposite of what you want, i.e. having a cluster with multiple nodes is kind of the whole point.
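The three options can be sketched as a toy model. This is only an illustration under simplifying assumptions, not how any real database behaves: the names `Policy`, `Node`, `on_partition`, `on_heal` and `read` are made up for this example, and "not responding" is modelled as an exception.

```python
from enum import Enum


class Policy(Enum):
    AP = "keep answering"
    CP = "refuse until the partition heals"
    CA = "shut down"


class Node:
    """Hypothetical node that reacts to being cut off from its cluster."""

    def __init__(self, policy: Policy, value: str) -> None:
        self.policy = policy
        self.value = value        # last value this node has seen
        self.running = True
        self.partitioned = False

    def on_partition(self) -> None:
        self.partitioned = True
        if self.policy is Policy.CA:
            # Option (iii): leave the cluster so clients go elsewhere.
            self.running = False

    def on_heal(self) -> None:
        # A node that shut down stays down until someone restarts it.
        self.partitioned = False

    def read(self) -> str:
        if not self.running:
            raise ConnectionError("node is down, try another node")
        if self.partitioned and self.policy is Policy.CP:
            # Option (ii): stay up, but give no answer while isolated.
            raise TimeoutError("cannot reach the rest of the cluster")
        # Option (i), or no partition at all: answer, possibly with stale data.
        return self.value
```

In this model an isolated AP node still returns its (possibly outdated) value, a CP node answers again after `on_heal()`, and a CA node simply disappears, which shrinks the cluster by one node per partition.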

Therefore, in practice you end up choosing between CP and AP. See this answer for more illustrative examples.

Bickering answered 23/3, 2022 at 18:26 Comment(11)
What good reasoning! I've never thought of that. – Treiber
The nodes would of course need some pre-agreement on who shuts down in the case of a partition, since no communication will be possible at that point, and they can't all shut down or you wouldn't have availability. For example, supposing the nodes have unique numeric IDs established prior to the partition, they could agree: any node which cannot reach all nodes with a greater ID must shut down. – Alitaalitha
What is the difference between ii and iii? Aren't they indistinguishable to an outside observer? – Smallsword
One is refusing to respond and the other is not existing at all; they are not necessarily indistinguishable. – Acaleph
Nodes shutting down instead of tolerating partitioning means that there's no availability, so your system is not CA. It's just C. If you call a system that is unavailable in the event of partitioning a CA system, then we can call any CP system a CAP system, because when there's no partitioning it is available and consistent. Also, the CAP theorem is about distributed systems, not single-node setups, and the guy who came up with it did so in the context of distributed systems; he's a distributed systems researcher. – Hedvig
@pavel_orekhov "that there's no availability, so, your system is not CA" -> "CAP-availability" states that for each request R you get a response R'. So if, for example, you have an N-node cluster in which one node got partitioned and shuts down, you'll still receive R' from any of the other N-1 nodes, so availability stands. – Acaleph
"Also, CAP theorem is about distributed systems, not single node setups" -> that's what I'm trying to say in the second paragraph: shutting down nodes goes against what we try to achieve, and that's why you won't have CA. – Acaleph
I disagree with you, because you only mention a corner case where one node gets removed. Partitioning should be looked at in the general case. Furthermore, that one node that was removed from the cluster due to partitioning could be serving a whole city, and the service could become unavailable to them. – Hedvig
@pavel_orekhov, if you have only one node available for an entire city, then you have a single point of failure from the perspective of those who live there, which for all intents and purposes is the same as having a single-node cluster. – Acaleph
The definition of CAP-availability, "for each request R you get a response R'", is applicable no matter how many nodes you remove. – Acaleph
The CAP theorem is applicable to the cluster formed by all nodes that are reachable by the client. Looking at that set of nodes, one may stop hearing back from the other members, and from that moment on it has only 3 choices, which lead to the 3 CAP settings: 1) it keeps working and responding to requests (AP); 2) it keeps working but won't provide an answer before hearing back from the cluster (CP); 3) it stops working so that upcoming requests reach its colleague nodes (CA). – Acaleph
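The pre-agreement suggested in the comments ("any node which cannot reach all nodes with a greater ID must shut down") can be tried out with a small sketch. The function names `must_shut_down` and `survivors` are hypothetical, and a partition is modelled simply as a list of disjoint groups of node IDs in which every node reaches exactly the other members of its own group.

```python
from typing import List, Set


def must_shut_down(node_id: int, all_ids: Set[int], reachable: Set[int]) -> bool:
    """Apply the rule: shut down unless every higher-ID node is reachable."""
    higher = {i for i in all_ids if i > node_id}
    return not higher <= reachable


def survivors(groups: List[Set[int]]) -> Set[int]:
    """Return the node IDs that stay up after the cluster splits into groups."""
    all_ids: Set[int] = set().union(*groups)
    alive: Set[int] = set()
    for group in groups:
        for node_id in group:
            reachable = group - {node_id}
            if not must_shut_down(node_id, all_ids, reachable):
                alive.add(node_id)
    return alive


print(survivors([{1, 2}, {3, 4}]))  # {3, 4}
print(survivors([{1, 4}, {2, 3}]))  # {4}
```

Under this rule the node with the highest ID never shuts down, and every other survivor must be able to reach it, so all survivors end up on the same side of the partition. The second call also shows the cost: a single split can take out most of the cluster.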
2

Dr. Stonebraker says: The guidance from the CAP theorem is that you must choose either A or C, when a network partition is present. As is obvious in the real world, it is possible to achieve both C and A in this failure mode.

See this for thoughts on why CA can exist:

CA is a specification of the operating range: you specify that the system does not work well under partition or, more precisely, that partitions are outside the operating range of the system.

My background is far from these theoretical considerations, and I must say it is highly confusing. I am researching distributed blockchain systems and I don't see why those "generalized" definitions of C, A and P must always apply. If, say, 5% of nodes fail or are otherwise partitioned, the consensus still functions. If an end user is connected to a partitioned node, the node could let the user know it lost connection. I don't even see how any major blockchain network is CP without defining conditions such as "if a certain number of nodes fail or get partitioned, the consensus halts".
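A condition of the form "consensus halts once too many nodes are unreachable" can be written down as a threshold check. The sketch below is only an assumption-laden illustration: `can_make_progress` is a made-up name, and the default threshold of more than two thirds is loosely modelled on BFT-style protocols, not taken from any particular blockchain.

```python
from fractions import Fraction


def can_make_progress(
    reachable: int,
    total: int,
    threshold: Fraction = Fraction(2, 3),  # assumed, illustrative value
) -> bool:
    """Return True if the reachable share of nodes exceeds the threshold."""
    if total <= 0 or not 0 <= reachable <= total:
        raise ValueError("need 0 <= reachable <= total and total > 0")
    # Fractions avoid floating-point surprises right at the boundary.
    return Fraction(reachable, total) > threshold


print(can_make_progress(95, 100))  # True: losing 5% of nodes is tolerated
print(can_make_progress(5, 100))   # False: the cut-off minority must stop
print(can_make_progress(60, 100))  # False: a large split halts consensus
```

With such a rule the side that keeps a large enough share of the nodes continues, while any side below the threshold stops answering; if no side is large enough, the whole network halts.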

Ellington answered 11/12, 2018 at 13:54 Comment(0)
0

CA is not practical, but I don't see any issues with the statement of the CAP theorem. CA simply means the system can offer both consistency and availability only when there is no network partition issue. However, when there is a network partition issue, the system cannot function properly (e.g., it may eventually lose availability or consistency); otherwise, it's CAP, not CA.

What would be an example of a CA system? Just something like what @João Matos mentioned: a node can shut itself down when it cannot communicate with the others. In an extreme case, all nodes other than the master node shut themselves down, and eventually the master node fails for some reason, causing the whole system to fail (it is no longer CA).

Sessler answered 5/5 at 0:52 Comment(0)
