Property spark.yarn.jars - how to deal with it?

My knowledge of Spark is limited, and you will sense it after reading this question. I have just one node, with Spark, Hadoop and YARN installed on it.

I was able to code and run a word-count job in cluster mode with the command below:

 spark-submit --class com.sanjeevd.sparksimple.wordcount.JobRunner \
              --master yarn \
              --deploy-mode cluster \
              --driver-memory=2g \
              --executor-memory 2g \
              --executor-cores 1 \
              --num-executors 1 \
              SparkSimple-0.0.1-SNAPSHOT.jar \
              hdfs://sanjeevd.br:9000/user/spark-test/word-count/input \
              hdfs://sanjeevd.br:9000/user/spark-test/word-count/output

It works just fine.

Now I understand that 'Spark on YARN' requires the Spark jar files to be available on the cluster, and if I don't do anything then every time I run my program it will copy hundreds of jar files from $SPARK_HOME to each node (in my case just one node). I can see that execution pauses for some time until the copying finishes. See below -

16/12/12 17:24:03 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/12/12 17:24:06 INFO yarn.Client: Uploading resource file:/tmp/spark-a6cc0d6e-45f9-4712-8bac-fb363d6992f2/__spark_libs__11112433502351931.zip -> hdfs://sanjeevd.br:9000/user/sanjeevd/.sparkStaging/application_1481592214176_0001/__spark_libs__11112433502351931.zip
16/12/12 17:24:08 INFO yarn.Client: Uploading resource file:/home/sanjeevd/personal/Spark-Simple/target/SparkSimple-0.0.1-SNAPSHOT.jar -> hdfs://sanjeevd.br:9000/user/sanjeevd/.sparkStaging/application_1481592214176_0001/SparkSimple-0.0.1-SNAPSHOT.jar
16/12/12 17:24:08 INFO yarn.Client: Uploading resource file:/tmp/spark-a6cc0d6e-45f9-4712-8bac-fb363d6992f2/__spark_conf__6716604236006329155.zip -> hdfs://sanjeevd.br:9000/user/sanjeevd/.sparkStaging/application_1481592214176_0001/__spark_conf__.zip

Spark's documentation suggests setting the spark.yarn.jars property to avoid this copying, so I set the property below in my spark-defaults.conf file.

spark.yarn.jars hdfs://sanjeevd.br:9000//user/spark/share/lib

From http://spark.apache.org/docs/latest/running-on-yarn.html#preparations: "To make Spark runtime jars accessible from YARN side, you can specify spark.yarn.archive or spark.yarn.jars. For details please refer to Spark Properties. If neither spark.yarn.archive nor spark.yarn.jars is specified, Spark will create a zip file with all jars under $SPARK_HOME/jars and upload it to the distributed cache."

Btw, I have copied all the jar files from the local /opt/spark/jars to HDFS /user/spark/share/lib. There are 206 of them.
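
A copy like that can be done roughly as follows (a sketch, not exactly what I ran; the paths are the ones mentioned above):

    hdfs dfs -mkdir -p /user/spark/share/lib
    hdfs dfs -put /opt/spark/jars/*.jar /user/spark/share/lib/
    # sanity check: count the jars that landed in HDFS (206 in my case)
    hdfs dfs -ls /user/spark/share/lib | grep -c '\.jar$'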

With this property set, my job fails. Below is the error -

spark-submit --class com.sanjeevd.sparksimple.wordcount.JobRunner --master yarn --deploy-mode cluster --driver-memory=2g --executor-memory 2g --executor-cores 1 --num-executors 1 SparkSimple-0.0.1-SNAPSHOT.jar hdfs://sanjeevd.br:9000/user/spark-test/word-count/input hdfs://sanjeevd.br:9000/user/spark-test/word-count/output
16/12/12 17:43:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/12/12 17:43:07 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/12/12 17:43:07 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/12/12 17:43:07 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (5120 MB per container)
16/12/12 17:43:07 INFO yarn.Client: Will allocate AM container, with 2432 MB memory including 384 MB overhead
16/12/12 17:43:07 INFO yarn.Client: Setting up container launch context for our AM
16/12/12 17:43:07 INFO yarn.Client: Setting up the launch environment for our AM container
16/12/12 17:43:07 INFO yarn.Client: Preparing resources for our AM container
16/12/12 17:43:07 INFO yarn.Client: Uploading resource file:/home/sanjeevd/personal/Spark-Simple/target/SparkSimple-0.0.1-SNAPSHOT.jar -> hdfs://sanjeevd.br:9000/user/sanjeevd/.sparkStaging/application_1481592214176_0005/SparkSimple-0.0.1-SNAPSHOT.jar
16/12/12 17:43:07 INFO yarn.Client: Uploading resource file:/tmp/spark-fae6a5ad-65d9-4b64-9ba6-65da1310ae9f/__spark_conf__7881471844385719101.zip -> hdfs://sanjeevd.br:9000/user/sanjeevd/.sparkStaging/application_1481592214176_0005/__spark_conf__.zip
16/12/12 17:43:08 INFO spark.SecurityManager: Changing view acls to: sanjeevd
16/12/12 17:43:08 INFO spark.SecurityManager: Changing modify acls to: sanjeevd
16/12/12 17:43:08 INFO spark.SecurityManager: Changing view acls groups to: 
16/12/12 17:43:08 INFO spark.SecurityManager: Changing modify acls groups to: 
16/12/12 17:43:08 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(sanjeevd); groups with view permissions: Set(); users  with modify permissions: Set(sanjeevd); groups with modify permissions: Set()
16/12/12 17:43:08 INFO yarn.Client: Submitting application application_1481592214176_0005 to ResourceManager
16/12/12 17:43:08 INFO impl.YarnClientImpl: Submitted application application_1481592214176_0005
16/12/12 17:43:09 INFO yarn.Client: Application report for application_1481592214176_0005 (state: ACCEPTED)
16/12/12 17:43:09 INFO yarn.Client: 
 client token: N/A
 diagnostics: N/A
 ApplicationMaster host: N/A
 ApplicationMaster RPC port: -1
 queue: default
 start time: 1481593388442
 final status: UNDEFINED
 tracking URL: http://sanjeevd.br:8088/proxy/application_1481592214176_0005/
 user: sanjeevd
16/12/12 17:43:10 INFO yarn.Client: Application report for application_1481592214176_0005 (state: FAILED)
16/12/12 17:43:10 INFO yarn.Client: 
 client token: N/A
 diagnostics: Application application_1481592214176_0005 failed 1 times due to AM Container for appattempt_1481592214176_0005_000001 exited with  exitCode: 1
For more detailed output, check application tracking page:http://sanjeevd.br:8088/cluster/app/application_1481592214176_0005Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1481592214176_0005_01_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
    at org.apache.hadoop.util.Shell.run(Shell.java:456)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1481593388442
     final status: FAILED
     tracking URL: http://sanjeevd.br:8088/cluster/app/application_1481592214176_0005
     user: sanjeevd
16/12/12 17:43:10 INFO yarn.Client: Deleting staging directory hdfs://sanjeevd.br:9000/user/sanjeevd/.sparkStaging/application_1481592214176_0005
Exception in thread "main" org.apache.spark.SparkException: Application application_1481592214176_0005 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1132)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1175)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/12/12 17:43:10 INFO util.ShutdownHookManager: Shutdown hook called
16/12/12 17:43:10 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-fae6a5ad-65d9-4b64-9ba6-65da1310ae9f

Do you know what I am doing wrong? The task's log says the following -

Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster

I understand from the error that the ApplicationMaster class is not found, but my question is why it is not found - where is this class supposed to be? I don't have an assembly jar, since I'm using Spark 2.0.1 where no assembly comes bundled.

What does this have to do with the spark.yarn.jars property? This property is meant to help Spark run on YARN, and that should be it. What else do I need to do when using spark.yarn.jars?

Thanks for reading this question, and thanks in advance for your help.

Pourparler answered 13/12, 2016 at 2:21 Comment(1)
Hi Sanjeev, in my case only the jars in $SPARK_HOME/jars got copied. How do you get your own jar, namely SparkSimple-0.0.1-SNAPSHOT.jar, copied to HDFS as well? - Pleurodynia

I was finally able to make sense of this property. I found by trial and error that the correct syntax of this property is

spark.yarn.jars=hdfs://xx:9000/user/spark/share/lib/*.jar

I hadn't put *.jar at the end, and my path simply ended with /lib. I also tried pointing to the actual jar like this - spark.yarn.jars=hdfs://sanjeevd.brickred:9000/user/spark/share/lib/spark-yarn_2.11-2.0.1.jar - but no luck. All it said was that it was unable to load the ApplicationMaster.
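
In other words, the working entry in spark-defaults.conf looks like this (a sketch using the host and path from my setup; adjust them to yours):

    # spark-defaults.conf - note the trailing /*.jar glob, not just .../lib
    spark.yarn.jars    hdfs://sanjeevd.br:9000/user/spark/share/lib/*.jar

Before submitting, you can check that the glob actually matches something with hdfs dfs -ls 'hdfs://sanjeevd.br:9000/user/spark/share/lib/*.jar'.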

I posted my answer to a similar question at https://mcmap.net/q/538116/-spark2-yarn-nullpointerexception-while-preparing-am-container

Pourparler answered 16/12, 2016 at 7:58 Comment(1)
Is this part of spark-defaults.conf, and would the jar files already have to be available in HDFS? - Solemnity

You could also use the spark.yarn.archive option and point it at an archive (which you create) containing all the JARs from the $SPARK_HOME/jars/ folder, placed at the root level of the archive. For example (a consolidated sketch follows the steps):

  1. Create the archive: jar cv0f spark-libs.jar -C $SPARK_HOME/jars/ .
  2. Upload to HDFS: hdfs dfs -put spark-libs.jar /some/path/.
    2a. For a large cluster, increase the replication count of the Spark archive so that you reduce the number of times a NodeManager has to do a remote copy: hdfs dfs -setrep -w 10 hdfs:///some/path/spark-libs.jar (change the number of replicas in proportion to the total number of NodeManagers).
  3. Set spark.yarn.archive to hdfs:///some/path/spark-libs.jar
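
Put together, the whole sequence might look like this (a sketch; /some/path and the replication factor of 10 are placeholders from the steps above):

    # 1. create an archive with all the Spark runtime jars at its root
    jar cv0f spark-libs.jar -C $SPARK_HOME/jars/ .
    # 2. upload it to HDFS
    hdfs dfs -mkdir -p /some/path
    hdfs dfs -put spark-libs.jar /some/path/
    # 2a. optionally raise replication so NodeManagers fetch from nearby replicas
    hdfs dfs -setrep -w 10 /some/path/spark-libs.jar
    # 3. then in spark-defaults.conf:
    #    spark.yarn.archive    hdfs:///some/path/spark-libs.jar
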
Killam answered 15/1, 2017 at 20:44 Comment(0)

If you look at the spark.yarn.jars documentation, it says the following:

List of libraries containing Spark code to distribute to YARN containers. By default, Spark on YARN will use Spark jars installed locally, but the Spark jars can also be in a world-readable location on HDFS. This allows YARN to cache it on nodes so that it doesn't need to be distributed each time an application runs. To point to jars on HDFS, for example, set this configuration to hdfs:///some/path. Globs are allowed.

This means that you are actually overriding SPARK_HOME/jars and telling YARN to pick up all the jars required to run the application from your path. If you set the spark.yarn.jars property, all the jars Spark depends on to run must be present at this path. If you look inside the spark-assembly.jar present in SPARK_HOME/lib, the org.apache.spark.deploy.yarn.ApplicationMaster class is present, so make sure that all of the Spark dependencies are present in the HDFS path that you specify as spark.yarn.jars.
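
For Spark 2.x, where there is no assembly jar (see the comments below), the ApplicationMaster class ships in the spark-yarn jar instead, so a quick sanity check could look like this (a sketch; the jar name assumes Scala 2.11 and Spark 2.0.1, as in the question):

    # is the spark-yarn jar present in the HDFS path used for spark.yarn.jars?
    hdfs dfs -ls /user/spark/share/lib | grep spark-yarn
    # does that jar actually contain the ApplicationMaster class? (checking the local copy)
    jar tf /opt/spark/jars/spark-yarn_2.11-2.0.1.jar | grep ApplicationMaster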

Appling answered 14/12, 2016 at 12:57 Comment(3)
Thanks! I modified my question toward the end. I'm using Spark 2.0.1, which no longer ships a bundled assembly jar, so I can't find the ApplicationMaster class. Why doesn't Spark complain when I unset the spark.yarn.jars property? The moment I upload all of /spark/jars into HDFS and set spark.yarn.jars to point at this HDFS location, Spark goes crazy and asks for ApplicationMaster. Btw, I don't have a /spark/lib folder either; I guess they changed that too in the 2.x versions. Any help please? - Pourparler
From Spark 2.x they have stopped creating an assembly jar. If you look in the /jars folder you will find spark-yarn_<Scala_version>-<sparkVersion>.jar, which should contain the ApplicationMaster class; verify that you have this jar in your /jars folder. If you have it and you have copied it to the HDFS location, then I have no idea why you are getting this error. :) - Appling
Thanks for your help. I upvoted your comment; it looks like I had a syntax problem in specifying this property. - Pourparler
