I would like to read a file from S3 in my EMR Hadoop job. I am using the Custom JAR option.
I have tried two solutions:
org.apache.hadoop.fs.S3FileSystem: throws a NullPointerException.
com.amazonaws.services.s3.AmazonS3Client: throws an exception saying "Access denied".
What I fail to grasp is why access is denied: I am starting the job from the Console, so I would expect it to have the necessary permissions. However, the AWS_*_KEY keys are missing from the environment variables (System.getenv()) that are available to the mapper.
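For reference, the environment check amounts to something like the following. This is a minimal sketch rather than my exact code: the class name EnvKeyCheck and the helper awsKeys are made up for illustration, and the pattern is just my reading of AWS_*_KEY.

```java
import java.util.Map;
import java.util.TreeMap;

public class EnvKeyCheck {
    // Returns the entries whose names match the AWS_*_KEY pattern.
    // (Hypothetical helper, named here only for illustration.)
    static Map<String, String> awsKeys(Map<String, String> env) {
        Map<String, String> found = new TreeMap<>();
        for (Map.Entry<String, String> e : env.entrySet()) {
            if (e.getKey().matches("AWS_.*_KEY")) {
                found.put(e.getKey(), e.getValue());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, String> found = awsKeys(System.getenv());
        // Print only the variable names, never the secret values.
        System.out.println("AWS_*_KEY variables visible: " + found.keySet());
    }
}
```

Calling awsKeys(System.getenv()) from inside the mapper gives an empty map in my case.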
I am sure I am doing something wrong; I am just not sure what.