Cannot create temp dir with proper permission: /mnt1/s3

The following is the log dump from one of the containers. I got an exception stating that a folder can't be created due to permissions. I have troubleshot it several times, but it still exists.

16/12/19 09:44:05 WARN ConfigurationUtils: Cannot create temp dir with proper permission: /mnt1/s3

java.nio.file.AccessDeniedException: /mnt1
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
    at java.nio.file.Files.createDirectory(Files.java:674)
    at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
    at java.nio.file.Files.createDirectories(Files.java:767)
    at com.amazon.ws.emr.hadoop.fs.util.ConfigurationUtils.getTestedTempPaths(ConfigurationUtils.java:224)
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.initialize(S3NativeFileSystem.java:449)
    at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.initialize(EmrFileSystem.java:111)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2717)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2751)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2733)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:377)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:230)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:201)
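
A quick way to see what the container is being denied on (a sketch; run it on the node that produced this log, with the paths taken from the exception above):

ls -ld /mnt1 /mnt1/s3    # compare the owner and mode here with a node where the job works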

Billat answered 19/12, 2016 at 11:41 Comment(2)
Did you ever get a fix for this? We're running into the same problem. See also forums.aws.amazon.com/thread.jspa?threadID=57967, which looks similarLian
I don't have a concrete answer for this, because the problem got solved automatically when I created a new cluster; for me it was a cluster-specific issue. I guess you could compare all the configuration files of Hadoop and YARN. I tried that and found a few differences, but it was a long time ago and I don't remember them nowBillat

You should use the same instance type for your master and task/core nodes.

See: https://forums.aws.amazon.com/thread.jspa?threadID=57967
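
For example, a minimal sketch of launching with matching instance types via the AWS CLI (the cluster name, counts, and the r3.xlarge type are placeholders, and other required options such as applications are omitted):

aws emr create-cluster \
    --name "my-cluster" \
    --release-label emr-5.12.0 \
    --use-default-roles \
    --instance-groups \
        InstanceGroupType=MASTER,InstanceCount=1,InstanceType=r3.xlarge \
        InstanceGroupType=CORE,InstanceCount=2,InstanceType=r3.xlarge \
        InstanceGroupType=TASK,InstanceCount=2,InstanceType=r3.xlarge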

Chiles answered 24/12, 2017 at 9:8 Comment(1)
@Tomer I'm already using the same instance type [r3.xlarge] for my master, core and task nodes on EMR 5.12.0 and still running into this error while running my job [Spark 2.3.0] (though only occasionally; i.e., this error appears only sometimes). My master, task and core nodes have these configurations: 8 vCores, 30.5 GiB memory, 80 GB SSD storage, 32 GiB EBS storage; with one exception: the task nodes don't have any EBS storageVinni

I had a similar error when I had an AWS EMR cluster running and was trying to connect to it from an EMR edge node using RStudio and sparklyr, running a simple "select * from abc" query.

The query worked on the master nodes but not on the edge node. So I looked at the permissions of /mnt/s3 on the EMR cluster's master nodes and compared them with the permissions of that folder on the edge node. The difference was that on the master nodes the permissions were rwxrwxrwt (like those of /tmp), while on the edge node they were rwxrwxr--.
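
A quick way to compare is to list the directory on each node (the modes in the comments are the ones described above, not verbatim output):

ls -ld /mnt/s3
# master node: drwxrwxrwt ... /mnt/s3   (same as /tmp)
# edge node:   drwxrwxr-- ... /mnt/s3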

When I gave the edge node the same expansive permissions as on the master node, the problem disappeared (I also set the sticky bit so that it was exactly the same as on the master node). The commands I used were:

sudo chmod 777 /mnt/s3
sudo chmod o+t /mnt/s3
Scandian answered 20/4, 2020 at 19:27 Comment(1)
I am facing the same problem and am using different instance types, but my task nodes have the same drwxrwxrwt permissions on /mnt1/s3 as the master node. I am using PySpark. I saw the above error in the YARN logs, yet the script continues to run and doesn't halt. So, what do I do?Mute

I figured it out for my environment. The master node on EMR may have the /mnt1 and /mnt1/s3 directories, but if you SSH into the worker nodes, they're not configured with the proper directories.

I had to change my setup script from an EMR step to a bootstrap action in the cluster launch command. This applied the fix to all worker nodes, even ones that scaled up after the initial cluster launch; a sketch is below.
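
A minimal sketch of that change, assuming /mnt1 is already mounted when bootstrap actions run and that the directory just needs to exist with /tmp-like permissions (the script name and bucket are placeholders):

#!/bin/bash
# fix-mnt1-s3.sh -- bootstrap action, runs on every node (including ones added later) before Hadoop starts
sudo mkdir -p /mnt1/s3
sudo chmod 1777 /mnt1/s3   # world-writable plus the sticky bit, like /tmp

Upload the script to S3 and reference it at launch, e.g. aws emr create-cluster ... --bootstrap-actions Path=s3://my-bucket/fix-mnt1-s3.sh,Name="fix mnt1 s3", instead of running it as a step.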

Decencies answered 20/7, 2023 at 6:8 Comment(0)
