Enabling dynamic allocation on spark on YARN mode

This question is similar to an earlier one, but that one received no answer.

I am trying to enable dynamic allocation in Spark in YARN mode. I have an 11-node cluster with 1 master node and 10 worker nodes. I am following the links below for instructions:

For setup in YARN: http://spark.apache.org/docs/latest/running-on-yarn.html#configuring-the-external-shuffle-service

Config variables that need to be set in spark-defaults.conf: https://spark.apache.org/docs/latest/configuration.html#dynamic-allocation https://spark.apache.org/docs/latest/configuration.html#shuffle-behavior

I have also referred to the link below and a few other resources: https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-dynamic-allocation.html#spark.dynamicAllocation.testing

Here are the steps I am following:

  1. Setting the config variables in spark-defaults.conf. The settings in my spark-defaults.conf related to dynamic allocation and the shuffle service are:

    spark.dynamicAllocation.enabled=true
    spark.shuffle.service.enabled=true
    spark.shuffle.service.port=7337
    
  2. Adding the following properties to yarn-site.xml:

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>spark_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.auxservices.spark_shuffle.class</name>
        <value>org.apache.spark.network.yarn.YarnShuffleService</value>
    </property>
    <property>
        <name>yarn.nodemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.application.classpath</name>
        <value> $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/common/*,$HADOOP_MAPRED_HOME/share/hadoop/common/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/hdfs/*,$HADOOP_MAPRED_HOME/share/hadoop/hdfs/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/yarn/*,$HADOOP_MAPRED_HOME/share/hadoop/yarn/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/tools/*,$HADOOP_MAPRED_HOME/share/hadoop/tools/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/client/*,$HADOOP_MAPRED_HOME/share/hadoop/client/lib/*,/home/hadoop/spark/common/network-yarn/target/scala-2.11/spark-2.2.2-SNAPSHOT-yarn-shuffle.jar </value>
    </property>
    

    These steps are replicated on all worker nodes, i.e. spark-defaults.conf has the values mentioned above and yarn-site.xml has these properties. I have made sure that /home/hadoop/spark/common/network-yarn/target/scala-2.11/spark-2.2.2-SNAPSHOT-yarn-shuffle.jar exists on all worker nodes.

  3. Then I run $SPARK_HOME/sbin/start-shuffle-service.sh on the worker nodes and the master node. On the master node, I restart YARN using stop-yarn.sh and then start-yarn.sh.

  4. Then I run yarn node -list -all to see the worker nodes, but no nodes are listed.

  5. When I remove the following property:

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>spark_shuffle</value>
    </property>
    

    I can see all the worker nodes as normal, so it seems the shuffle service is not configured properly.
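
For comparison, the Spark documentation on configuring the external shuffle service (first link above) spells the class property with a hyphen, yarn.nodemanager.aux-services.spark_shuffle.class, whereas the property in step 2 uses "auxservices". A minimal sketch of the two properties in the documented form follows. It assumes no other auxiliary service is required on the cluster, and whether this alone makes the NodeManagers register again is untested here:

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <!-- Assumption: only the Spark shuffle service is needed.
             Clusters that also run MapReduce usually list mapreduce_shuffle,spark_shuffle -->
        <value>spark_shuffle</value>
    </property>
    <property>
        <!-- Note the hyphen in "aux-services", matching the property name above -->
        <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
        <value>org.apache.spark.network.yarn.YarnShuffleService</value>
    </property>

The same documentation asks for the yarn-shuffle jar to be on the classpath of every NodeManager, which may not be the same thing as yarn.application.classpath. In YARN mode the shuffle service is loaded inside each NodeManager when it restarts, so the NodeManager log on a worker node is the likely place to find a class-loading or port-binding error. A shuffle service started separately with start-shuffle-service.sh may also be holding port 7337; that is a guess, not something confirmed from the logs.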

Diverticulum answered 24/11, 2018 at 12:40
