YARN Resourcemanager not connecting to nodemanager

Thanks in advance for any help.

I am running the following versions:

Hadoop 2.2, ZooKeeper 3.4.5, HBase 0.96, Hive 0.12

When I go to http://<namenode-host>:50070 I correctly see that 2 nodes are running.

The problem is that when I go to http://<resourcemanager-host>:8088 it shows 0 nodes running.

I understand that port 8088 is the ResourceManager web UI and shows the number of NodeManagers running. The daemons all start, but it appears that the NodeManagers aren't connecting to the ResourceManager.

This is the log file:

2013-12-16 20:55:48,648 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8031
2013-12-16 20:55:49,755 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8031. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-12-16 20:55:50,756 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8031. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-12-16 20:55:51,757 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-12-16 20:55:52,758 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-12-16 20:55:53,759 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-12-16 20:55:54,760 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8031. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

I have checked and port 8031 is open.

EDIT:

For people viewing this in the future, I needed to edit my yarn-site.xml to look like the following:

<property>
   <name>yarn.nodemanager.aux-services</name>
   <value>mapreduce_shuffle</value>
</property>
<property>
   <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
   <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
   <name>yarn.resourcemanager.scheduler.address</name>
   <value>master-1:8030</value>
</property>
<property>
   <name>yarn.resourcemanager.address</name>
   <value>master-1:8032</value>
</property>
<property>
   <name>yarn.resourcemanager.webapp.address</name>
   <value>master-1:8088</value>
</property>
<property>
   <name>yarn.resourcemanager.resource-tracker.address</name>
   <value>master-1:8031</value>
</property>
<property>
   <name>yarn.resourcemanager.admin.address</name>
   <value>master-1:8033</value>
</property> 
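
Side note: after yarn-site.xml has been updated on every node, the YARN daemons have to be restarted before the NodeManagers register. With a stock Hadoop 2.2 layout (a sketch assuming the standard sbin/ scripts and that HADOOP_HOME points at the install), that looks roughly like:

# restart YARN cluster-wide, run on the ResourceManager node
$HADOOP_HOME/sbin/stop-yarn.sh
$HADOOP_HOME/sbin/start-yarn.sh

# or restart a single NodeManager on a worker node
$HADOOP_HOME/sbin/yarn-daemon.sh stop nodemanager
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager
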
Depute answered 16/12, 2013 at 21:8 Comment(6)
Did you point yarn.resourcemanager.resource-tracker.address to your resource manager's hostname?Cleaner
Do I need just the hostname? Or hostname and port? And is this just on the nodemanager's node or all the nodes? I tried this out, but it didn't change anything; it could be that I had the port wrong. I'm not sure where to check which port to use.Depute
You'll need to specify hostname:port. Yes, it has to be set on all the nodes, not just the ResourceManager node (you'll be fine just copying the same Hadoop conf dir to all of your nodes). If you need some minimal working configs to get started, take a look here: toster.ru/q/57046#answer_208326Cleaner
Thanks, this worked perfectly. Please respond with that as an "answer" so I can select it as the solution.Depute
Is yarn.nodemanager.aux-services.mapreduce.shuffle.class really required? I don't see it mentioned on hadoop.apache.org/docs/current/hadoop-project-dist/…Agaric
I'm sorry, from which dir did you obtain the log file?Burette
C
9

You'll need to specify

hostname:port

Yes, it has to be set on all the nodes, not just the ResourceManager node (you'll be fine just copying the same Hadoop conf dir to all of your nodes). If you need some minimal working configs to get started, take a look here: toster.ru/q/57046#answer_208326
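
For example (a sketch reusing the master-1 hostname from the question's edit; substitute your own ResourceManager host):

<property>
   <name>yarn.resourcemanager.resource-tracker.address</name>
   <value>master-1:8031</value>
</property>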

Cleaner answered 17/12, 2013 at 22:15 Comment(0)
A
10

I had a very similar problem, and it was solved just by specifying the ResourceManager hostname; there was no need to spell out the exact address for each service.

<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>master-1</value>
</property>
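
This works because the other ResourceManager addresses default to that hostname. A sketch of the relevant entry from the Hadoop 2.x yarn-default.xml (the resource-tracker address the NodeManagers connect to):

<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>${yarn.resourcemanager.hostname}:8031</value>
</property>
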
Aureliaaurelian answered 2/3, 2014 at 8:4 Comment(0)
Y
0

The rsync or scp command can be used to copy the configuration files from the master node to the slave nodes:

for host in $(cat "$HADOOP_CONF_DIR/slaves"); do
    # push the conf dir contents to each host listed in the slaves file
    rsync -rv "$HADOOP_CONF_DIR"/ "$host:$HADOOP_CONF_DIR/"
done

Note: this assumes all the nodes have the same Hadoop directory layout and that $HADOOP_CONF_DIR/slaves lists one worker hostname per line.
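
For completeness, an equivalent sketch using scp instead of rsync (not from the original answer; it assumes passwordless SSH to each worker):

for host in $(cat "$HADOOP_CONF_DIR/slaves"); do
    # -r copies the directory recursively, -p preserves timestamps and modes
    scp -rp "$HADOOP_CONF_DIR"/* "$host:$HADOOP_CONF_DIR/"
done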

Yates answered 26/2, 2014 at 4:23 Comment(0)
E
0

I experienced an issue with very similar symptoms, although in my case it was the NodeManager not connecting to the ResourceManager. The problem was that yarn-site.xml has (or may have) a property named yarn.nodemanager.hostname. That setting had accidentally been populated with the hostname of the HDFS NameNode, but it is supposed to contain the hostname of the YARN per-node NodeManager. Depending on what was entered for other properties, this caused various errors like "Retrying connect", "Connection refused", or resource allocation errors. Setting it back to 0.0.0.0 (the default) fixed the problem.
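
In yarn-site.xml terms, the fix described above would look like this sketch (0.0.0.0 is the documented default for the property):

<property>
   <name>yarn.nodemanager.hostname</name>
   <value>0.0.0.0</value>
</property>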

Eudemonism answered 15/2, 2017 at 17:23 Comment(0)
E
-1

I also had the same issue, but in my case only one node manager was listed in the resource manager. I placed the property below in yarn-site.xml, and after that I could see the nodes listed at the RM.

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master-1</value>
</property>
Embellish answered 11/3, 2016 at 19:51 Comment(0)
R
-2
  1. Check whether YARN HA is enabled.
  2. If it is enabled, then for each of the resource managers listed under yarn.resourcemanager.ha.rm-ids in yarn-site.xml (e.g. rm1,rm2), make sure the resourcemanager service is running (see the sketch below).
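
For reference, a minimal sketch of the properties and command being referred to (assuming a Hadoop 2.4+ YARN HA setup; the rm-ids are placeholders):

<property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
</property>

# run on each host that rm1 and rm2 map to
$HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager
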
Robalo answered 23/7, 2016 at 11:26 Comment(0)
