Exception while deleting Spark temp dir in Windows 7 64 bit

I am trying to run unit tests of a Spark job on Windows 7 64-bit. I have

HADOOP_HOME=D:/winutils

winutils path= D:/winutils/bin/winutils.exe

I ran below commands:

winutils ls \tmp\hive
winutils chmod -R 777  \tmp\hive

But when I run my test I get the below error.

Running com.dnb.trade.ui.ingest.spark.utils.ExperiencesUtilTest
Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.132 sec
17/01/24 15:37:53 INFO Remoting: Remoting shut down
17/01/24 15:37:53 ERROR ShutdownHookManager: Exception while deleting Spark temp dir: C:\Users\415387\AppData\Local\Temp\spark-b1672cf6-989f-4890-93a0-c945ff147554
java.io.IOException: Failed to delete: C:\Users\415387\AppData\Local\Temp\spark-b1672cf6-989f-4890-93a0-c945ff147554
        at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:929)
        at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:65)
        at .....

Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=786m; support was removed in 8.0

Caused by: java.lang.RuntimeException: java.io.IOException: Access is denied
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:525)
        ... 28 more
Caused by: java.io.IOException: Access is denied
        at java.io.WinNTFileSystem.createFileExclusively(Native Method)

I have tried to change the permissions manually. Every time I get the same error.

Please help!

Ellora answered 24/1, 2017 at 10:33 Comment(0)

The issue is in the ShutdownHook, which tries to delete the temp files but fails. Though you cannot solve the issue itself, you can simply hide the exceptions by adding the following two lines to your log4j.properties file in %SPARK_HOME%\conf. If the file does not exist, copy log4j.properties.template and rename it.

log4j.logger.org.apache.spark.util.ShutdownHookManager=OFF
log4j.logger.org.apache.spark.SparkEnv=ERROR

Out of sight is out of mind.
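
If editing log4j.properties is inconvenient (for example, inside a unit test), the same loggers can presumably be silenced programmatically. A minimal sketch, assuming Spark 2.x with log4j 1.x on the classpath:

import org.apache.log4j.{Level, Logger}

// Silence the shutdown-hook cleanup errors and reduce SparkEnv logging to ERROR (same loggers as above)
Logger.getLogger("org.apache.spark.util.ShutdownHookManager").setLevel(Level.OFF)
Logger.getLogger("org.apache.spark.SparkEnv").setLevel(Level.ERROR)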

Stickler answered 8/3, 2018 at 7:44 Comment(5)
The problem with ignoring this is that the undeleted temporary files pile up and use up disk space, and most people are not likely to look in this directory to find them.Maigre
Correct buddy - out of sight is out of mind. Very true.Retroflex
@UncleLongHair It's always a good habit to remove the temp folder every once in a while by going to cmd, typing %temp%, and deleting everything, so it's fine I believe.Retroflex
@Sumukh, can you please update your answer with how this works with log4j2? There is a log4j2.properties.template file but copy/rename + adding the two lines does not seem to hide the exceptions.Heinz
@KaiRoesner see a more recent answer in the same post, below: https://mcmap.net/q/482195/-exception-while-deleting-spark-temp-dir-in-windows-7-64-bit I had the same problem as you and this solution worked for me.Afflictive

The lines below should hide the error on Windows for log4j2

logger.shutdownhookmanager.name = org.apache.spark.util.ShutdownHookManager
logger.shutdownhookmanager.level = OFF

logger.sparkenv.name = org.apache.spark.SparkEnv
logger.sparkenv.level = ERROR
Subvene answered 24/7, 2023 at 10:38 Comment(0)

I'm facing the same problem after trying to run the WordCount example with the spark-submit command. Right now, I'm ignoring it because it returns the results before the error happens.

I found some old issues in the Spark Jira but didn't find any fixes. (BTW, one of them has the status Closed.)

https://issues.apache.org/jira/browse/SPARK-8333

https://issues.apache.org/jira/browse/SPARK-12216

Unfortunately, it seems that they don't care about Spark on Windows at all.

One bad solution is to give everyone permission on the Temp folder (in your case *C:\Users\415387\AppData\Local\Temp*).

So it will be like that:

winutils chmod -R 777 C:\Users\415387\AppData\Local\Temp\

But I strongly recommend that you not do that.
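
A less drastic alternative (also mentioned in the comments below) is to point Spark's scratch space at a directory you own instead of the shared Temp folder. A sketch, assuming a directory D:\spark-temp that you created, with your.MainClass and your-app.jar as placeholders; judging by the comments this does not fix the failed delete by itself, but it at least keeps the leftovers somewhere easy to find and clean up:

spark-submit --conf spark.local.dir=D:\spark-temp --conf "spark.driver.extraJavaOptions=-Djava.io.tmpdir=D:\spark-temp" --class your.MainClass your-app.jar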

Linin answered 11/5, 2017 at 2:47 Comment(3)
I changed the permissions of the Temp folder to see if that was the issue, but it didn't work. Also, I tried to specify a different working dir with --conf spark.local.dir, with a directory where I know I have permissions, but it didn't work either. So this is definitely not a permissions issue. If someone has a solution, please share.Frontality
You can also change the location of this temp dir by setting the "java.io.tmpdir" Java System property, so you can put it someplace else that you feel comfortable changing the permissions.Maigre
You can try MD D:\temp and SET SPARK_JAVA_OPTS=-Djava.io.tmpdir=D:\temp before starting spark-shell.Devilmaycare

I have a workaround for this: instead of letting Spark's ShutdownHookManager delete the temporary directories, you can issue Windows commands to do it.

Steps:

  1. Change the temp directory using spark.local.dir in the spark-defaults.conf file

  2. Set log4j.logger.org.apache.spark.util.ShutdownHookManager=OFF in the log4j.properties file

  3. spark-shell internally calls the spark-shell.cmd file, so add rmdir /q /s "your_dir\tmp" to that file (see the sketch after these steps)

This should work!
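
Putting the three steps together, a rough sketch (the D:\spark-temp path is just an example; use whatever directory you picked in step 1):

In spark-defaults.conf:
spark.local.dir    D:\spark-temp

In log4j.properties:
log4j.logger.org.apache.spark.util.ShutdownHookManager=OFF

At the end of %SPARK_HOME%\bin\spark-shell.cmd:
rmdir /q /s "D:\spark-temp"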

Ingathering answered 12/4, 2020 at 14:27 Comment(0)

I've set the HADOOP_HOME variable in the same way as you have. (On Windows 10)

Try using the complete path when setting permissions, i.e.

D:> winutils/bin/winutils.exe chmod 777 \tmp\hive

This worked for me.

Also, just a note on the exception - I'm getting the same exception on exiting spark from cmd by running "sys.exit".

But... I can exit cleanly when I use ":q" or ":quit". So, not sure what's happening here, still trying to figure out...

Luger answered 9/3, 2017 at 1:35 Comment(3)
what is "D:>" ?Yukyukaghir
That's just a prompt on Windows cmd, currently sitting in D: drive. That's where I had kept my winutils.Luger
I see. Thanks! Mine is "C:\>"Yukyukaghir

Running Spark on Windows has this issue with deleting the Spark temp directory. You can set the log level as follows to hide it:

import org.apache.log4j.{Level, Logger}
Logger.getLogger("org").setLevel(Level.FATAL)

Barron answered 20/12, 2017 at 6:54 Comment(0)

I was facing a similar problem. I changed the permissions on the \tmp folder instead of \tmp\hive:

D:>winutils/bin/winutils.exe chmod 777 \tmp

I'm not seeing any error after this, and there is a clean exit.

Abele answered 8/9, 2017 at 11:16 Comment(0)

After following the suggestions above, I made the changes below:

Update spark-defaults.conf, or create a copy of spark-defaults.conf.template
and rename it to spark-defaults.conf.

Add a line like spark.local.dir=E:\spark2.4.6\tempDir. This line sets the temp folder for Spark to use.

Similarly, update log4j.properties in your Spark setup as above, with the lines below:

log4j.logger.org.apache.spark.util.ShutdownHookManager=OFF
log4j.logger.org.apache.spark.SparkEnv=ERROR

Now the ShutdownHookManager errors are no longer printed to the console during exit.

Now, how do you clean the temp folder?
For that, add the lines below to the bin/spark-shell.cmd file:

rmdir /q /s "E:/spark2.4.6/tempDir"
del C:\Users\nitin\AppData\Local\Temp\jansi*.*

With the above updates, I see a clean exit, with the temp folders cleaned up as well.

Georgettageorgette answered 4/7, 2020 at 13:23 Comment(0)

I created a directory d:\spark\temp.

I gave Everyone full control of this directory.

I ran:

set TEMP=d:\spark\temp

then I submitted my jar to Spark and watched the directory in Explorer.

Many files and directories are created and deleted, but for one of them there is an exception.

IMHO this is not a rights (permissions) problem.

java.io.IOException: Failed to delete: D:\data\temp\spark\spark-9cc5a3ad-7d79-4317-8990-f278e63cb40b\userFiles-4c442ed7-83ba-4724-a533-5f171d830913\simple-app_2.11-1.0.jar

This happens when trying to delete the submitted package. It may not have been released by all of the processes involved.

Kelle answered 22/5, 2019 at 10:23 Comment(0)

My Hadoop environment on Windows 10:

HADOOP_HOME=C:\hadoop

Spark and Scala versions:

Spark-2.3.1 and Scala-2.11.8

Below is my spark-submit command:

spark-submit --class SparkScalaTest --master local[*] D:\spark-projects\SparkScalaTest\target\scala-2.11\sparkscalatest_2.11-0.1.jar D:\HDFS\output

Based on my Hadoop environment on Windows 10, I defined the following system properties in my Scala main class:

System.setProperty("hadoop.home.dir", "C:\\hadoop\\")
System.setProperty("hadoop.tmp.dir", "C:\\hadoop\\tmp")

Result: I am getting the same error, but my outputs are generated in the output path D:\HDFS\output that was passed to spark-submit.

Hope this helps to bypass this error and get the expected result for Spark running locally on Windows.

Aurist answered 17/9, 2019 at 6:31 Comment(0)

For Python:

Create an empty directory tmp\hive, then run winutils against it:

import os
# The two "path to" fragments are placeholders for your winutils\bin directory and the tmp\hive directory.
os.system(command=f"path to \\bin\\winutils.exe chmod -R 777 path to \\tmp\\hive")
Sevenup answered 24/10, 2020 at 13:53 Comment(0)

For log4j2.properties, the following snippet seems to do the job (of hiding the error)

logger.1.name = org.apache.spark.SparkEnv
logger.1.level = ERROR
Coalfish answered 14/10, 2022 at 10:1 Comment(0)

One solution for this issue is to run the parent command prompt
(in which you call spark-submit) as an Administrator. So when opening
that parent command prompt on Windows, do "Run as Administrator"
instead of just "Run".

For example (if you're using Anaconda) you can right click
on "Anaconda Prompt", then select "More", then "Run as Administrator".

Now in that command prompt, you run your spark-submit command.

This way your script will have enough permissions to delete
that Spark temp folder (when the script is on its way to shutting down).


Gitagitel answered 9/7, 2024 at 9:39 Comment(0)
