Why does vcore always equal the number of nodes in Spark on YARN?

Asked 1/11, 2015 at 0:33 Answered 12/9, 2023 at 13:34

I have a Hadoop cluster with 5 nodes, each of which has 12 cores with 32GB memory. I use YARN as MapReduce framework, so I have the following settings with YARN:

yarn.nodemanager.resource.cpu-vcores=10
yarn.nodemanager.resource.memory-mb=26100

Then the cluster metrics shown on my YARN cluster page (http://myhost:8088/cluster/apps) displayed that VCores Total is 40. This is pretty fine!

Then I installed Spark on top of it and use spark-shell in yarn-client mode.

I ran one Spark job with the following configuration:

--driver-memory 20480m
--executor-memory 20000m
--num-executors 4
--executor-cores 10
--conf spark.yarn.am.cores=2
--conf spark.yarn.executor.memoryOverhead=5600

I set --executor-cores as 10, --num-executors as 4, so logically, there should be totally 40 Vcores Used. However, when I check the same YARN cluster page after the Spark job started running, there are only 4 Vcores Used, and 4 Vcores Total

I also found that there is a parameter in capacity-scheduler.xml - called yarn.scheduler.capacity.resource-calculator:

"The ResourceCalculator implementation to be used to compare Resources in the scheduler. The default i.e. DefaultResourceCalculator only uses Memory while DominantResourceCalculator uses dominant-resource to compare multi-dimensional resources such as Memory, CPU etc."

I then changed that value to DominantResourceCalculator.

But then when I restarted YARN and run the same Spark application, I still got the same result, say the cluster metrics still told that VCores used is 4! I also checked the CPU and memory usage on each node with htop command, I found that none of the nodes had all 10 CPU cores fully occupied. What can be the reason?

I tried also to run the same Spark job in fine-grained way, say with --num executors 40 --executor-cores 1, in this ways I checked again the CPU status on each worker node, and all CPU cores are fully occupied.

Collide answered 1/11, 2015 at 0:33 Comment(3)

Could you check on the Spark UI website (tab Environment) that all the config options were really propagated to the Spark app? You can also check YARN Resource Manager logs if there is any problem with allocation. – Haemophilia 7/11, 2015 at 22:55

Have you ever solved this issue? I'm running into the same problem right now. – Littoral 20/3, 2019 at 14:41

Confirming this still exists – Rhys 1/6, 2023 at 9:24

I was wondering the same but changing the resource-calculator worked for me.
This is how I set the property:

    <property>
        <name>yarn.scheduler.capacity.resource-calculator</name>      
        <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>       
    </property>

Check in the YARN UI in the application how many containers and vcores are assigned, with the change the number of containers should be executors+1 and the vcores should be: (executor-cores*num-executors) +1.

Jobholder answered 15/2, 2016 at 12:42 Comment(1)

That YARN even shows vCores in the UI with the DefaultResourceCalculator enabled should be considered a bug. – Kieger 29/6, 2020 at 14:1

Without setting the YARN scheduler to FairScheduler, I saw the same thing. The Spark UI showed the right number of tasks, though, suggesting nothing was wrong. My cluster showed close to 100% CPU usage, which confirmed this.

After setting FairScheduler, the YARN Resources looked correct.

Genuflection answered 9/4, 2017 at 2:10 Comment(2)

Please explain how you are doing that. What is the name of the config ? – Rothstein 6/9, 2018 at 23:59

@Rothstein , I used this: cloudera.com/documentation/enterprise/5-8-x/topics/… – Genuflection 14/9, 2018 at 23:8

Executors take 10 cores each, 2 cores for Application Master = 42 Cores requested when you have 40 vCores total.

Reduce executor cores to 8 and make sure to restart each NodeManager

Also modify yarn-site.xml and set these properties:

yarn.scheduler.minimum-allocation-mb
yarn.scheduler.maximum-allocation-mb
yarn.scheduler.minimum-allocation-vcores
yarn.scheduler.maximum-allocation-vcores

Repugn answered 24/11, 2016 at 11:55 Comment(0)

For people who are looking for ways to get this configuration done in EMR below sample may help. It worked for me.

It makes sure that yarn doesn't allocate cores more than configured on the node. If no scheduler is configured in the below json, the yarn defaults to capacity-scheduler

Location: Emr New console > configurations > Instance group configurations > Select Master radio button - instance group master > click on reconfigure > edit in JSON.

[
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.nodemanager.resource.cpu-vcores": "5"
    }
  },
  {
    "Classification": "capacity-scheduler",
    "Properties": {
      "yarn.scheduler.capacity.resource-calculator": "org.apache.hadoop.yarn.util.resource.DominantResourceCalculator"
    }
  }
]

Henceforth answered 12/9, 2023 at 13:34 Comment(0)

Recommended topics

Hot tags