How would you program a strong read-after-write consistency in a distributed system?

Recently, S3 announced strong read-after-write consistency. I'm curious how one can program that. Doesn't it violate the CAP theorem?

In my mind, the simplest way is to wait for the replication to happen and then return, but that would result in performance degradation.
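
To make that "wait for replication, then return" idea concrete, here is a minimal Python sketch of a write path that fans out to every replica and only acknowledges once all of them respond. The replica names and write_to_replica() are invented for illustration and say nothing about how S3 is actually implemented:

```python
# Hypothetical sketch of "wait for replication, then return".
from concurrent.futures import ThreadPoolExecutor

REPLICAS = ["replica-a", "replica-b", "replica-c"]  # assumed triple replication

def write_to_replica(replica: str, key: str, value: bytes) -> bool:
    # Stand-in for a real RPC to one storage node.
    return True

def put(key: str, value: bytes) -> None:
    # Fan the write out to every replica and block until all of them ack,
    # so a subsequent read from any replica sees the new value.
    with ThreadPoolExecutor(max_workers=len(REPLICAS)) as pool:
        acks = list(pool.map(lambda r: write_to_replica(r, key, value), REPLICAS))
    if not all(acks):
        raise RuntimeError("replication failed; write not acknowledged")
```

The obvious cost is that PUT latency becomes the latency of the slowest replica, which is exactly the performance hit I would expect.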

AWS says that there is no performance difference. How is this achieved?

Another thought is that Amazon has a giant index table that keeps track of all S3 objects and where they are stored (triple replication, I believe), and it would need to update this index on every PUT/DELETE. Is that technically feasible?
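
To illustrate what such an index table might look like (purely hypothetical; the class and method names below are invented), a metadata store could map each key to its latest version and the replicas known to hold it, updated as part of every PUT/DELETE before success is returned:

```python
# Hypothetical "giant index table": key -> (latest version, replicas holding it).
import threading

class MetadataIndex:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._entries: dict[str, tuple[int, set[str]]] = {}

    def record_put(self, key: str, version: int, replicas: set[str]) -> None:
        # Updated atomically as part of every PUT before success is returned.
        with self._lock:
            self._entries[key] = (version, set(replicas))

    def record_delete(self, key: str) -> None:
        # A DELETE removes the entry so readers no longer see the object.
        with self._lock:
            self._entries.pop(key, None)

    def replicas_with_latest(self, key: str) -> set[str]:
        # A read is only routed to replicas known to hold the newest version.
        with self._lock:
            _version, replicas = self._entries.get(key, (0, set()))
            return set(replicas)
```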

Pigeonwing answered 16/12, 2020 at 17:37 Comment(4)
One guess is S3 has virtual storage (memory) that guarantees read after write consistency. The system can take its time (so to speak) actually writing the contents of the virtual storage to disk. – Dramatist
This question is also discussed on reddit: reddit.com/r/aws/comments/k4yknz/s3_strong_consistency/gecdohv. Unfortunately no official aws statement AFAIK. – Chillon
There is also now this blog post from AWS, giving more details about it: allthingsdistributed.com/2021/04/s3-strong-consistency.html – Pedagogue
Thanks for the update @JoãoAlmeida. If you want to summarize the answer and put it as your own answer, I can update this question. – Pigeonwing

As indicated by Martin above, there is a link to a Reddit thread that discusses this. The top response, from u/ryeguy, gives this answer:

If I had to guess, s3 synchronously writes to a cluster of storage nodes before returning success, and then asynchronously replicates it to other nodes for stronger durability and availability. There used to be a risk of reading from a node that didn't receive a file's change yet, which could give you an outdated file. Now they added logic so the lookup router is aware of how far an update is propagated and can avoid routing reads to stale replicas.

I just pulled all this out of my ass and have no idea how s3 is actually architected behind the scenes, but given the durability and availability guarantees and the fact that this change doesn't lower them, it must be something along these lines.
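
Here is a minimal sketch of that routing idea, assuming per-key versions and a router that tracks how far asynchronous replication has progressed. Every name below is invented; nothing here reflects actual S3 internals:

```python
# Toy version-aware read router: reads are only sent to replicas that are
# known to have applied the latest committed version of a key.
class ReadRouter:
    def __init__(self, replicas: list[str]) -> None:
        self.replicas = replicas
        self.latest_version: dict[str, int] = {}   # key -> newest committed version
        self.applied = {r: {} for r in replicas}   # replica -> {key: applied version}

    def on_write_committed(self, key: str, version: int, synced_replicas: list[str]) -> None:
        # Called when the synchronous write finishes on the initial set of nodes.
        self.latest_version[key] = version
        for replica in synced_replicas:
            self.applied[replica][key] = version

    def on_replication_progress(self, replica: str, key: str, version: int) -> None:
        # Called as asynchronous replication catches the remaining replicas up.
        self.applied[replica][key] = version

    def route_read(self, key: str) -> str:
        # Only replicas holding at least the newest version are eligible targets.
        wanted = self.latest_version.get(key, 0)
        fresh = [r for r in self.replicas if self.applied[r].get(key, 0) >= wanted]
        if not fresh:
            raise RuntimeError("no up-to-date replica available for " + key)
        return fresh[0]
```

The appeal of this approach is that reads never have to wait for replication to finish; they are simply steered away from replicas that have not yet caught up, which would explain why there is no read-latency penalty.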

Better answers are welcome.

Pigeonwing answered 25/3, 2021 at 11:43 Comment(0)

Our usual assumptions do not carry over to cloud systems. Many factors feed into the design trade-offs: availability, consistency, disaster recovery, backup mechanisms, maintenance burden, cost, and so on. Theorems like CAP are reference points for a design rather than recipes; a system can combine ideas from several of them. So I would like to share the link provided by AWS, which illustrates the process in detail.

https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-consistent-view.html

When you create a cluster with consistent view enabled, Amazon EMR uses an Amazon DynamoDB database to store object metadata and track consistency with Amazon S3. You must grant the EMRFS role permissions to access DynamoDB. If consistent view determines that Amazon S3 is inconsistent during a file system operation, it retries that operation according to rules that you can define. By default, the DynamoDB database has 400 read capacity units and 100 write capacity units. You can configure read/write capacity settings depending on the number of objects that EMRFS tracks and the number of nodes concurrently using the metadata. You can also configure other database and operational parameters. Using consistent view incurs DynamoDB charges, which are typically small, in addition to the charges for Amazon EMR.
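
As a rough sketch of the retry pattern that passage describes (this is not actual EMRFS code; fetch_expected_keys and list_s3_keys are placeholders for the DynamoDB and S3 calls):

```python
# Compare what the metadata store says should exist with what S3 actually
# lists, and retry with backoff until they agree (or give up).
import time

def fetch_expected_keys(prefix: str) -> set[str]:
    # Placeholder: read the tracked object keys for this prefix from DynamoDB.
    return set()

def list_s3_keys(prefix: str) -> set[str]:
    # Placeholder: LIST the prefix from S3.
    return set()

def consistent_list(prefix: str, retries: int = 5, delay: float = 1.0) -> set[str]:
    expected = fetch_expected_keys(prefix)
    for attempt in range(retries):
        seen = list_s3_keys(prefix)
        if expected <= seen:                # everything tracked is visible
            return seen
        time.sleep(delay * (2 ** attempt))  # back off and retry, per configurable rules
    raise RuntimeError(f"S3 still inconsistent for prefix {prefix!r} after {retries} retries")
```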

Pantalets answered 23/12, 2020 at 7:28 Comment(1)
From Review: This is not a link-only answer, because the explanation text in the reference was also posted here. – Rydder
