Amazon S3 - Batch File Upload Using Java API?

We're looking to begin using S3 for some of our storage needs and I'm looking for a way to perform a batch upload of 'N' files. I've already written code using the Java API to perform single file uploads, but is there a way to provide a list of files to pass to an S3 bucket?

I did look at the question "is-it-possible-to-perform-a-batch-upload-to-amazon-s3", but it is two years old and I'm curious whether the situation has changed at all. I can't seem to find a way to do this in code.

What we'd like to do is set up an internal job (probably using scheduled tasks in Spring) to transfer groups of files every night. I'd like a way to do this other than looping over the files and issuing a put request for each one, or zipping batches up to place on S3.

Amphi answered 28/7, 2015 at 20:7 Comment(4)
Can you script it with awscli or s3cmd rather than writing it in Java? Using Java seems heavy-handed here. - Tribade
Things haven't changed in this regard. People have developed libraries that use the S3 APIs and parallelize the uploads. - Soaring
@Soaring Can you provide an example? - Amphi
github.com/tj---/s3-parallel - Soaring
T
5

The easiest way to go if you're using the AWS SDK for Java is the TransferManager. Its uploadFileList method takes a list of files and uploads them to S3 in parallel, or uploadDirectory will upload all the files in a local directory.

Tradelast answered 29/7, 2015 at 12:57 Comment(2)
Does it spawn N upload processes that run in parallel, or a single upload process for all of the objects (therefore needing only one connection)? I hope it's the latter. - Gabi
It performs N independent uploads; how many run at a time depends on the kind of ExecutorService you pass to the constructor. S3 does not expose a way to upload multiple objects in a single HTTP request, short of manually zipping them up. Even then you'd probably want a multipart upload that splits the zip over multiple HTTP requests, so that a transient failure halfway through doesn't force you to restart the whole upload from scratch. - Tradelast
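To make the point about the ExecutorService concrete, here is a minimal stdlib-only sketch, not SDK code. It assumes a stand-in "upload" that just sleeps; the names BoundedUploadSketch, runUploads and fakeUpload are hypothetical. It only illustrates that N independent tasks submitted to a fixed-size pool never run more than poolSize at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedUploadSketch {

    /**
     * Submits one task per file name to a fixed-size pool and returns the
     * highest number of tasks observed running at the same moment.
     */
    static int runUploads(List<String> fileNames, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String name : fileNames) {
                pending.add(pool.submit(() -> fakeUpload(name, inFlight, peak)));
            }
            for (Future<?> upload : pending) {
                upload.get(); // wait for each task; surfaces task failures
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for uploads", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("an upload task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    /** Hypothetical stand-in for one PUT request; a real task would upload the named file. */
    static void fakeUpload(String name, AtomicInteger inFlight, AtomicInteger peak) {
        int now = inFlight.incrementAndGet();
        peak.accumulateAndGet(now, Math::max); // remember the highest concurrency seen
        try {
            Thread.sleep(20); // simulated network time
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            inFlight.decrementAndGet();
        }
    }

    public static void main(String[] args) {
        List<String> files = List.of("a.txt", "b.txt", "c.txt", "d.txt", "e.txt", "f.txt");
        System.out.println("peak concurrent uploads: " + runUploads(files, 2));
    }
}
```

In SDK v1 the pool is supplied through the TransferManager constructor mentioned above or, as far as I know, through TransferManagerBuilder.withExecutorFactory; check the Javadoc for the SDK version you are on.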
B
0
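import java.io.File;
import java.util.List;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.transfer.MultipleFileUpload;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;

// Uploads every file in the list; each file becomes its own S3 object (AWS SDK for Java v1).
// SDK failures surface as unchecked AmazonClientException / AmazonServiceException.
public void uploadDocuments(List<File> filesToUpload) throws InterruptedException {
    AmazonS3 s3 = AmazonS3ClientBuilder.standard()
            .withCredentials(getCredentials())
            .withRegion(Regions.AP_SOUTH_1)
            .build();
    TransferManager transfer = TransferManagerBuilder.standard().withS3Client(s3).build();
    String bucket = Constants.BUCKET_NAME;
    try {
        // "" is the key prefix; new File(".") is the common parent directory of the files.
        MultipleFileUpload upload = transfer.uploadFileList(bucket, "", new File("."), filesToUpload);
        upload.waitForCompletion(); // blocks until all uploads finish or one fails
    } finally {
        transfer.shutdownNow(); // release the transfer threads (also shuts down the s3 client)
    }
}

// Builds a static credentials provider from the keys held in Constants.
private AWSCredentialsProvider getCredentials() {
    BasicAWSCredentials awsCredentials =
            new BasicAWSCredentials(Constants.ACCESS_KEY, Constants.SECRET_KEY);
    return new AWSStaticCredentialsProvider(awsCredentials);
}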
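A note on the second and third arguments to uploadFileList above: the SDK v1 Javadoc describes object keys as being constructed relative to the given directory and the key prefix. The stdlib-only sketch below illustrates that naming rule as I read it. It is not the SDK's implementation, the names KeyDerivationSketch and deriveKeys are hypothetical, and the SDK may treat edge cases (such as a prefix without a trailing slash) differently, so it is worth listing the bucket after a test run.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class KeyDerivationSketch {

    /**
     * Builds one key per file as keyPrefix + (file path relative to commonParent),
     * using forward slashes regardless of the local path separator.
     */
    static List<String> deriveKeys(String keyPrefix, Path commonParent, List<Path> files) {
        Path base = commonParent.toAbsolutePath().normalize();
        List<String> keys = new ArrayList<>();
        for (Path file : files) {
            Path absolute = file.toAbsolutePath().normalize();
            if (!absolute.startsWith(base)) {
                throw new IllegalArgumentException(file + " is not under " + commonParent);
            }
            String relative = base.relativize(absolute).toString().replace('\\', '/');
            keys.add(keyPrefix + relative);
        }
        return keys;
    }

    public static void main(String[] args) {
        List<Path> files = List.of(
                Paths.get("/data/out/a.txt"),
                Paths.get("/data/out/sub/b.csv"));
        // prints [nightly/a.txt, nightly/sub/b.csv]
        System.out.println(deriveKeys("nightly/", Paths.get("/data/out"), files));
    }
}
```

Under this reading, an empty prefix with the current directory as the common parent would give each file a key equal to its path relative to the working directory.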
Boil answered 30/9, 2019 at 9:38 Comment(1)
I know this is an old post, but could you share some details if you know how this can be done with version 2 of the AWS SDK for Java? - Freedwoman
