Namenode HA (UnknownHostException: nameservice1)
Asked Answered
H

4

7

We enable Namenode High Availability through Cloudera Manager, using

Cloudera Manager >> HDFS >> Action > Enable High Availability >> Selected Stand By Namenode & Journal Nodes Then nameservice1

Once the whole process completed then Deployed Client Configuration.

Tested from Client Machine by listing HDFS directories (hadoop fs -ls /) then manually failover to standby namenode & again listing HDFS directories (hadoop fs -ls /). This test worked perfectly.

But When I ran hadoop sleep job using following command it failed

$ hadoop jar /opt/cloudera/parcels/CDH-4.6.0-1.cdh4.6.0.p0.26/lib/hadoop-0.20-mapreduce/hadoop-examples.jar sleep -m 1 -r 0
java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice1
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:448)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:410)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:128)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2308)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:87)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2342)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2324)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:351)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:980)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:974)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:974)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:948)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1410)
at org.apache.hadoop.examples.SleepJob.run(SleepJob.java:174)
at org.apache.hadoop.examples.SleepJob.run(SleepJob.java:237)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.examples.SleepJob.main(SleepJob.java:165)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.net.UnknownHostException: nameservice1
... 37 more

I dont know why its not able to resolved nameservice1 even after deploying client configuration.

When I google this issue I found only one solution to this issue

Add the below entry in configuration entry for fix the issue dfs.nameservices=nameservice1 dfs.ha.namenodes.nameservice1=namenode1,namenode2 dfs.namenode.rpc-address.nameservice1.namenode1=ip-10-118-137-215.ec2.internal:8020 dfs.namenode.rpc-address.nameservice1.namenode2=ip-10-12-122-210.ec2.internal:8020 dfs.client.failover.proxy.provider.nameservice1=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

My impression was Cloudera Manager take cares of it. I checked client for this configuration & configuration was there (/var/run/cloudera-scm-agent/process/1998-deploy-client-config/hadoop-conf/hdfs-site.xml).

Also some more details of config files :

[11:22:37 [email protected]:~]# ls -l /etc/hadoop/conf.cloudera.*
/etc/hadoop/conf.cloudera.hdfs:
total 16
-rw-r--r-- 1 root root  943 Jul 31 09:33 core-site.xml
-rw-r--r-- 1 root root 2546 Jul 31 09:33 hadoop-env.sh
-rw-r--r-- 1 root root 1577 Jul 31 09:33 hdfs-site.xml
-rw-r--r-- 1 root root  314 Jul 31 09:33 log4j.properties

/etc/hadoop/conf.cloudera.hdfs1:
total 20
-rwxr-xr-x 1 root root  233 Sep  5  2013 container-executor.cfg
-rw-r--r-- 1 root root 1890 May 21 15:48 core-site.xml
-rw-r--r-- 1 root root 2546 May 21 15:48 hadoop-env.sh
-rw-r--r-- 1 root root 1577 May 21 15:48 hdfs-site.xml
-rw-r--r-- 1 root root  314 May 21 15:48 log4j.properties

/etc/hadoop/conf.cloudera.mapreduce:
total 20
-rw-r--r-- 1 root root 1032 Jul 31 09:33 core-site.xml
-rw-r--r-- 1 root root 2775 Jul 31 09:33 hadoop-env.sh
-rw-r--r-- 1 root root 1450 Jul 31 09:33 hdfs-site.xml
-rw-r--r-- 1 root root  314 Jul 31 09:33 log4j.properties
-rw-r--r-- 1 root root 2446 Jul 31 09:33 mapred-site.xml

 /etc/hadoop/conf.cloudera.mapreduce1:
total 24
-rwxr-xr-x 1 root root  233 Sep  5  2013 container-executor.cfg
-rw-r--r-- 1 root root 1979 May 16 12:20 core-site.xml
-rw-r--r-- 1 root root 2775 May 16 12:20 hadoop-env.sh
-rw-r--r-- 1 root root 1450 May 16 12:20 hdfs-site.xml
-rw-r--r-- 1 root root  314 May 16 12:20 log4j.properties
-rw-r--r-- 1 root root 2446 May 16 12:20 mapred-site.xml
[11:23:12 [email protected]:~]# 

I doubt its issue with old configuration in /etc/hadoop/conf.cloudera.hdfs1 & /etc/hadoop/conf.cloudera.mapreduce1 , but not sure.

looks like /etc/hadoop/conf/* never got updated

# ls -l /etc/hadoop/conf/
total 24
-rwxr-xr-x 1 root root  233 Sep  5  2013 container-executor.cfg
-rw-r--r-- 1 root root 1979 May 16 12:20 core-site.xml
-rw-r--r-- 1 root root 2775 May 16 12:20 hadoop-env.sh
-rw-r--r-- 1 root root 1450 May 16 12:20 hdfs-site.xml
-rw-r--r-- 1 root root  314 May 16 12:20 log4j.properties
-rw-r--r-- 1 root root 2446 May 16 12:20 mapred-site.xml

Anyone has any idea about this issue?

Heal answered 31/7, 2014 at 15:19 Comment(1)
Dont know why its symlink to wrong config pastebin.com/mv1ehRCm Can we change thoes symlink to correct config manually ?Heal
W
7

Looks like you are using wrong client configuration in /etc/hadoop/conf directory. Sometimes Cloudera Manager (CM) deploy client configurations option may not work.

As you have enabled NN HA, you should have valid core-site.xml and hdfs-site.xml files in your hadoop client configuration directory. For getting the valid site files, Go to HDFS service from CM Choose Download client configuration option from the Actions Button. you will get configuration files in zip format, extract the zip files and replace /etc/hadoop/conf/core-site.xml and /etc/hadoop/conf/hdfs-site.xml files with the extracted core-site.xml,hdfs-site.xml files.

Wulfenite answered 1/8, 2014 at 5:51 Comment(3)
then use the second option(Download client configuration and replace site files) as mentioned in the answerWulfenite
I fixed it by removing alternative linkHeal
I do not have such problem but still it does not work in prestoOrdinal
H
1

Got it resolved. wrong config was linked to "/etc/hadoop/conf/" --> "/etc/alternatives/hadoop-conf/" --> "/etc/hadoop/conf.cloudera.mapreduce1"

It has to be "/etc/hadoop/conf/" --> "/etc/alternatives/hadoop-conf/" --> "/etc/hadoop/conf.cloudera.mapreduce"

Heal answered 1/8, 2014 at 21:15 Comment(0)
M
0

below statement in my code resolved problem by specifying the host and port

val dfs = sqlContext.read.json("hdfs://localhost:9000//user/arvindd/input/employee.json")
Minutiae answered 22/8, 2016 at 18:17 Comment(0)
C
-4

I resolved this issue my putting the complete line to create RDD

myfirstrdd = sc.textFile("hdfs://192.168.35.132:8020/BUPA.txt")

and then I was able to do other RDD transformation .. Make sure you have the w/r/x to the file or you can do chmod 777

Carilla answered 23/2, 2016 at 10:0 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.