Data integrity check during upload to S3 with server-side encryption

The AWS Java SDK claims to provide a data integrity check by default. Either the client calculates the object's MD5 checksum itself and sets it as the “Headers.CONTENT_MD5” header in the S3 client, or, if that value is null or not set, the S3 client computes an MD5 checksum on the client side and compares it with the ETag (which is the MD5 of the created object) returned in the object creation response, throwing an error back to the caller if the two do not match. Note that in the second case the integrity check happens on the client side and not on the S3 server side, which means that the object will still be created successfully and the client would need to clean it up explicitly.
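To make the two checks concrete, here is a minimal sketch using only the JDK. It computes the Base64 value that a Content-MD5 header carries and the hex form that a plain single-part ETag is compared against. The class and method names (ContentMd5Sketch, contentMd5Header, matchesEtag) are my own for illustration and are not part of the SDK.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, not part of the AWS SDK.
final class ContentMd5Sketch {

    // Raw 16-byte MD5 digest of the payload.
    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // Content-MD5 carries the Base64 encoding of the raw digest.
    static String contentMd5Header(byte[] data) {
        return Base64.getEncoder().encodeToString(md5(data));
    }

    // Lowercase hex form of the digest, the shape of a plain single-part ETag.
    static String md5Hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : md5(data)) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Client-side check: compare the local MD5 with the returned ETag.
    // ETags are usually wrapped in double quotes, so strip them first.
    static boolean matchesEtag(byte[] data, String etag) {
        return md5Hex(data).equalsIgnoreCase(etag.replace("\"", ""));
    }

    public static void main(String[] args) {
        byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println("Content-MD5: " + contentMd5Header(body));
        System.out.println("Hex MD5:     " + md5Hex(body));
    }
}
```

With the header approach the Base64 value travels with the request and S3 rejects a mismatching body; with the ETag approach the comparison in matchesEtag only runs after the object already exists.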

Hence, using the header is recommended (the check then happens at the S3 end itself and fails early), but because TransferManager uploads in parts, it is not possible for the client to explicitly set the MD5 for a specific part. TransferManager should take care of computing the MD5 of each part and setting the header, but I don't see that happening in the code.
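For reference, this is roughly what I would expect a per-part computation to look like: split the payload into fixed-size parts and compute a Base64 MD5 for each, which is the value a Content-MD5 header on each part upload would carry. This is a sketch with made-up names (PartMd5Sketch, partDigests) and a tiny part size for demonstration; it is not taken from TransferManager, and real S3 parts other than the last must be at least 5 MiB.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

// Hypothetical helper, not part of the AWS SDK.
final class PartMd5Sketch {

    // Base64-encoded MD5 of one part, the format Content-MD5 expects.
    static String base64Md5(byte[] part) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(part);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // Splits data into parts of at most partSize bytes and returns
    // one Base64 MD5 per part, in part order.
    static List<String> partDigests(byte[] data, int partSize) {
        if (partSize <= 0) {
            throw new IllegalArgumentException("partSize must be positive");
        }
        List<String> digests = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += partSize) {
            int end = Math.min(offset + partSize, data.length);
            digests.add(base64Md5(Arrays.copyOfRange(data, offset, end)));
        }
        return digests;
    }

    public static void main(String[] args) {
        byte[] data = "hellohello!!".getBytes(StandardCharsets.UTF_8);
        List<String> digests = partDigests(data, 5);
        for (int i = 0; i < digests.size(); i++) {
            System.out.println("part " + (i + 1) + ": " + digests.get(i));
        }
    }
}
```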

Since we want to use TransferManager for multipart uploads, we would need to depend on the client-side check, which is enabled by default. However, there is a caveat to that too. When SSE-KMS or SSE-C is enabled on the object in S3, this data integrity check is skipped. As one of the comments in the SDK code mentions, this seems to be because in that case S3 returns an MD5 of the ciphertext, which can't be verified against the MD5 computed on the client side.
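My understanding of the skip condition can be sketched as a small predicate. This is not the SDK's actual logic, only my reading of it: the class name, method name, and the two boolean flags are hypothetical, and the rule that a multipart ETag (containing a "-") is not a plain MD5 is my assumption based on the S3 documentation.

```java
// Hypothetical helper, not part of the AWS SDK.
final class EtagCheckSketch {

    // Returns true only when the ETag can be expected to equal the
    // hex MD5 of the plaintext that the client uploaded.
    static boolean etagIsPlaintextMd5(String etag, boolean sseKms, boolean sseC) {
        if (etag == null) {
            return false;
        }
        // With SSE-KMS or SSE-C the ETag is not the MD5 of the plaintext.
        if (sseKms || sseC) {
            return false;
        }
        String value = etag.replace("\"", "");
        // Multipart uploads produce ETags of the form "<hex>-<partCount>".
        if (value.contains("-")) {
            return false;
        }
        // A plain MD5 is exactly 32 hex characters.
        return value.matches("[0-9a-fA-F]{32}");
    }

    public static void main(String[] args) {
        String etag = "\"5d41402abc4b2a76b9719d911017c592\"";
        System.out.println(etagIsPlaintextMd5(etag, false, false));
        System.out.println(etagIsPlaintextMd5(etag, true, false));
    }
}
```

If this reading is right, an upload with SSE-KMS or SSE-C never reaches a point where the client-side comparison is meaningful, which is the gap I am asking about.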

What should I use to enable the data integrity check with SSE in S3?

Note: Please verify that the above understanding is correct.
