Amazon S3 conditional put object

I have a system that receives a lot of messages. Each message has a unique ID, but it can also receive updates during its lifetime. Because the time between a message being sent and being handled can be very long (weeks), the messages are stored in S3. For each message, only the latest version is needed. My problem is that occasionally two messages with the same ID arrive at the same time, carrying two different versions (an older and a newer one).

Is there a way for S3 to have a conditional PutObject request where I can declare "put this object unless I have a newer version in S3"?

Discrepant answered 7/2, 2013 at 8:57 Comment(3)
How are you going to identify which one is newer or older? You could insert a custom header storing the timestamp information and then check that to see if it's older or newer. – Southeastwards
I use a timestamp that I receive embedded in each message. Checking against S3 on every request will hurt performance, and it does not solve the race condition. I need an atomic operation here. – Discrepant
It doesn't appear that S3 supports your use case. The closest you might be able to get is versioning, which would mean both versions get stored. You would then have to figure out, when requesting the object, which version is actually the newest one. If your object is within the size limits, something like SimpleDB might work. – Southeastwards
M
5

> I need an atomic operation here

That's not the use-case for S3, which is eventually-consistent. Some ideas:

  • You could try to partition your messages - all messages that start with A-L go to one box, M-Z go to another box. Then each box locally checks that there are no duplicates.

  • Your best bet is probably some kind of database. Depending on your use case, you could use a regular SQL database, or maybe a simple RAM-only database like Redis. Write to multiple Redis DBs at once to avoid a single point of failure (SPOF).

  • There is SWF which can make a unique processing queue for each item, but that would probably mean more HTTP requests than just checking in S3.

  • David's idea about turning on versioning is interesting. You could have a daemon that periodically trims off the old versions. When reading, you would have to do "read repair" where you search the versions looking for the newest object.
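
As a rough illustration of the "read repair" step in the last bullet, here is a minimal Python sketch. It assumes each version was written with the message's own timestamp stored as user metadata under a made-up key, `message-timestamp` (epoch seconds as a string), and that the caller has already collected the versions into plain dicts, for example via boto3's `list_object_versions` plus a `head_object` call per version. The function name and dict layout are hypothetical, and the ordering is only as trustworthy as the timestamps the writers supplied.

```python
# Hypothetical user-metadata key; S3 does not define it, the writer must set it.
TIMESTAMP_KEY = "message-timestamp"


def pick_newest(versions):
    """Split versions into (newest, stale) by the embedded message timestamp.

    `versions` is a list of dicts shaped like
    {"version_id": "...", "metadata": {"message-timestamp": "1372755420"}}.
    """
    if not versions:
        raise ValueError("no versions to choose from")
    # Newest first; S3 returns metadata values as strings, so parse them.
    ordered = sorted(
        versions,
        key=lambda v: float(v["metadata"][TIMESTAMP_KEY]),
        reverse=True,
    )
    return ordered[0], ordered[1:]
```

A reader would serve `newest`, and the trimming daemon could delete the entries in `stale`.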

Means answered 7/5, 2013 at 2:39 Comment(3)
This is not about eventual consistency. The issue is that S3 supports neither If-Unmodified-Since nor If-Match on PUT requests. Amazingly, both are supported for GET requests. See s3.amazonaws.com/doc/s3-developer-guide/RESTObjectPUT.html and s3.amazonaws.com/doc/s3-developer-guide/RESTObjectGET.html – Emotive
And here in late 2022, it's still true that even though S3 is now strongly consistent, it still doesn't support Check-And-Set style operations. – Carver
The issue with "read repair" as mentioned above is that there is no consistent way of knowing which version is the newest and which is "conflicting". Relying on metadata or a timestamp isn't reliable, since it is set by the client. Read repair is risky and, in my opinion, goes against the concept of versioning, which is about protecting the user's data. At best it can be used to free up some space by deleting versions with the same ETag. Simply put, If-Match or If-Unmodified-Since conditions would solve this: they would reject the write and report that back to the client, which could then handle the case. – Tugman
C
5

Conditional writes are, as of August 2024, supported by S3: https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/
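
For the question's "put unless S3 already has something newer" case, a check-and-set loop could look roughly like the sketch below. It assumes a boto3 S3 client built against a version recent enough to accept `IfNoneMatch` and `IfMatch` on `put_object` (as far as I know, the August 2024 announcement covers `If-None-Match`, and `If-Match` on PutObject was announced separately later in 2024, so check the current documentation). The metadata key `message-timestamp` and the function names are made up for this example, and the error codes it looks for are my understanding of what S3 returns rather than something taken from the announcement.

```python
# Hypothetical user-metadata key holding the message's own timestamp.
TS_KEY = "message-timestamp"


def error_code(exc):
    """Best-effort S3 error code from a botocore ClientError-like exception."""
    return getattr(exc, "response", {}).get("Error", {}).get("Code", "")


def put_if_newer(s3, bucket, key, body, timestamp, max_attempts=5):
    """Write `body` unless the stored object is at least as new.

    Returns True if this call wrote the object, False if it was skipped.
    `s3` is expected to behave like boto3.client("s3").
    """
    for _ in range(max_attempts):
        try:
            head = s3.head_object(Bucket=bucket, Key=key)
        except Exception as exc:  # broad on purpose: avoids importing botocore
            if error_code(exc) not in ("404", "NoSuchKey", "NotFound"):
                raise
            # Nothing stored yet: only create if the key is still absent.
            condition = {"IfNoneMatch": "*"}
        else:
            stored = float(head["Metadata"].get(TS_KEY, "-inf"))
            if stored >= timestamp:
                return False
            # Only overwrite the exact version we just inspected.
            condition = {"IfMatch": head["ETag"]}
        try:
            s3.put_object(
                Bucket=bucket,
                Key=key,
                Body=body,
                Metadata={TS_KEY: str(timestamp)},
                **condition,
            )
            return True
        except Exception as exc:
            if error_code(exc) not in ("PreconditionFailed", "ConditionalRequestConflict"):
                raise
            # Another writer got in between the HEAD and the PUT: re-check.
    raise RuntimeError("gave up after repeated conditional-write conflicts")
```

This still costs a HEAD per write, but the PUT itself is now atomic with respect to the version that was checked, which closes the race described in the question.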

Countrified answered 21/8 at 7:10 Comment(1)
Be careful to phrase this so that your post does not get awkward around 2040. – Typewriting
C
0

Couldn't this be solved by using tags, and using a Condition on that when using PutObject? See "Example 3: Allow a user to add object tags that include a specific tag key and value" here: https://docs.aws.amazon.com/AmazonS3/latest/dev/object-tagging.html#tagging-and-policies

Carrageen answered 5/2, 2021 at 15:31 Comment(2)
Are you sure a PUT operation can have a tag-based Condition? – Aerobe
"S3 Object Tagging is strongly consistent. For more information, see Amazon S3 data consistency model." And with docs.aws.amazon.com/AmazonS3/latest/userguide/… you could perhaps use policies with a numeric condition (if the version in the request is greater than the one the policy was last updated with, then grant the Put operation). The problem has now moved to updating the policy with an atomic conditional operation. :-( – Carrageen
