HDFS_NAMENODE_USER, HDFS_DATANODE_USER & HDFS_SECONDARYNAMENODE_USER not defined
E

5

35

I am new to Hadoop.
I'm trying to install Hadoop on my laptop in pseudo-distributed mode.
I am running it as the root user, but I get the error below.

root@debdutta-Lenovo-G50-80:~# $HADOOP_PREFIX/sbin/start-dfs.sh
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
Starting namenodes on [localhost]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. 
Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. 
Aborting operation.
Starting secondary namenodes [debdutta-Lenovo-G50-80]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
WARNING: HADOOP_PREFIX has been replaced by  HADOOP_HOME. Using value of HADOOP_PREFIX.

Also, I have to run Hadoop as root, because Hadoop is not able to access the SSH service as any other user.
How can I fix this?

Endocrine answered 6/1, 2018 at 15:50 Comment(4)
Please edit your question to clarify how you've installed Hadoop. Are you reading the official documentation? - Mcfarland
Below is the link I am following to install Hadoop. - Endocrine
dzone.com/articles/getting-hadoop-and-running - Endocrine
That article is 4 years old, so that's not Hadoop 3. All of the startup scripts changed. - Mcfarland
G
52

Just do what it asks and define the variables:

export HDFS_NAMENODE_USER="root"
export HDFS_DATANODE_USER="root"
export HDFS_SECONDARYNAMENODE_USER="root"
export YARN_RESOURCEMANAGER_USER="root"
export YARN_NODEMANAGER_USER="root"
Griceldagrid answered 9/1, 2018 at 14:31 Comment(2)
It works, but every time I close the terminal I lose the values and have to rerun all the export commands. - Endocrine
Add all these commands to hadoop-env.sh and you will be good to go! - Ben
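One way to follow that suggestion is to append the exports to hadoop-env.sh once, so they survive a new terminal. This is a minimal sketch, assuming HADOOP_HOME points at a Hadoop 3 install with its config under etc/hadoop. The scratch-directory fallback is only there so the snippet can be tried without touching a real install.

```shell
# Assumption: HADOOP_HOME is your Hadoop 3 install directory.
# If it is unset, a scratch directory stands in so nothing real is modified.
HADOOP_HOME="${HADOOP_HOME:-$(mktemp -d)}"
ENV_FILE="$HADOOP_HOME/etc/hadoop/hadoop-env.sh"
mkdir -p "$(dirname "$ENV_FILE")"
touch "$ENV_FILE"

for var in HDFS_NAMENODE_USER HDFS_DATANODE_USER HDFS_SECONDARYNAMENODE_USER \
           YARN_RESOURCEMANAGER_USER YARN_NODEMANAGER_USER; do
  # Append each export only if it is not already present, so reruns are harmless.
  grep -q "^export ${var}=" "$ENV_FILE" || printf 'export %s="root"\n' "$var" >> "$ENV_FILE"
done
```

Because the Hadoop start scripts read hadoop-env.sh themselves, the values no longer depend on what was exported in the current shell.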
S
14

The root cause of this problem is one of the following:

  1. Hadoop was installed for one user and you are starting the YARN service as a different user, or
  2. hadoop-env.sh in the Hadoop config directory sets HDFS_NAMENODE_USER and HDFS_DATANODE_USER to some other user.

So the user needs to be made consistent everywhere. A simple fix is to edit your hadoop-env.sh file and set the user name under which you want to start the services. Edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh and add the following lines:

export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root

Now save the file, start the HDFS and YARN services, and check that it works.
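Before starting the services, it can help to confirm that none of the five variables was missed. The helper below is a hypothetical sketch (check_hadoop_users is not part of Hadoop): it only looks for export lines in the file you pass it and does not validate the user names themselves.

```shell
# Hypothetical helper: report which of the five *_USER variables have no
# export line in the given hadoop-env.sh. Returns non-zero if any is missing.
check_hadoop_users() {
  env_file="$1"
  missing=0
  for var in HDFS_NAMENODE_USER HDFS_DATANODE_USER HDFS_SECONDARYNAMENODE_USER \
             YARN_RESOURCEMANAGER_USER YARN_NODEMANAGER_USER; do
    if ! grep -q "^[[:space:]]*export[[:space:]]\{1,\}${var}=" "$env_file"; then
      echo "missing: $var"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling it as check_hadoop_users "$HADOOP_HOME/etc/hadoop/hadoop-env.sh" prints one line per missing variable and succeeds only when all five are present.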

Substantive answered 4/11, 2018 at 11:43 Comment(2)
Make sure the SSH service is up and running, and that the datanode and namenode hosts can be accessed without a password. You can verify this by running ssh against each IP or host name. - Substantive
How can I make the datanodes and namenodes accessible without a password? - Douai
M
3

Based on the first warning, about HADOOP_PREFIX, it sounds like you've not defined HADOOP_HOME correctly.

This would be done in your /etc/profile.d.
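As an illustration, a file under /etc/profile.d could look like the sketch below. The file name hadoop.sh and the path /opt/hadoop are assumptions, so substitute the directory where Hadoop is actually unpacked.

```shell
# /etc/profile.d/hadoop.sh  (hypothetical file name)
# Assumption: Hadoop is unpacked at /opt/hadoop; adjust to your install path.
export HADOOP_HOME="/opt/hadoop"
# Drop the old variable; the deprecation warning is printed while it is set.
# If it is also set elsewhere (for example in ~/.bashrc), remove it there too.
unset HADOOP_PREFIX
# Put the hadoop/hdfs commands and the start/stop scripts on PATH.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

With this in place, the scripts can be started as start-dfs.sh or $HADOOP_HOME/sbin/start-dfs.sh rather than through $HADOOP_PREFIX.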

hadoop-env.sh is where the remainder of those variables are defined.

Please refer to the UNIX Shell Guide

hadoop is not able to access ssh service with other user

This has nothing to do with Hadoop itself. It's basic SSH account management. You need to

  1. Create the hadoop account (and any others, such as yarn) on all machines of the cluster (see the adduser command documentation)
  2. Copy a passwordless SSH key to each of them, for example with ssh-copy-id hadoop@localhost
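For the single-machine case, passwordless login to localhost can be sketched as below, run as the user that will start Hadoop. SSH_DIR is a variable name introduced here for illustration and defaults to ~/.ssh. The snippet assumes the OpenSSH client tools are installed.

```shell
# SSH_DIR is an illustrative variable; by default the real ~/.ssh is used.
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"
KEY="$SSH_DIR/id_rsa"
mkdir -p "$SSH_DIR" && chmod 700 "$SSH_DIR"

# Generate a key with an empty passphrase only if none exists yet.
[ -f "$KEY" ] || ssh-keygen -q -t rsa -P '' -f "$KEY"

# Authorize the key for logins to this same account
# (ssh-copy-id does the equivalent on a remote host).
touch "$SSH_DIR/authorized_keys"
grep -qxF "$(cat "$KEY.pub")" "$SSH_DIR/authorized_keys" || cat "$KEY.pub" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"
```

Afterwards, ssh localhost should log in without prompting for a password, provided the SSH server is running.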

If you don't need distributed mode and just want to use Hadoop locally, you can use a Mini Cluster.

The documentation also recommends making a single-node installation before continuing to pseudo-distributed mode.

Mcfarland answered 6/1, 2018 at 22:45 Comment(0)
F
2

Open ${HADOOP_HOME}/sbin/start-dfs.sh and ${HADOOP_HOME}/sbin/stop-dfs.sh in an editor such as Vim, then add:

HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs 
HDFS_NAMENODE_USER=root 
HDFS_SECONDARYNAMENODE_USER=root 
Frailty answered 25/4, 2018 at 7:54 Comment(0)
S
0
  1. Check pdsh's default rcmd type, which may be rsh:

pdsh -q -w localhost

You should get something like this:

-- DSH-specific options --
Separate stderr/stdout  Yes
Path prepended to cmd   none
Appended to cmd         none
Command:                none
Full program pathname   /usr/bin/pdsh
Remote program path     /usr/bin/pdsh

-- Generic options --
Local username          enock
Local uid               1000
Remote username         enock
Rcmd type               rsh
one ^C will kill pdsh   No
Connect timeout (secs)  10
Command timeout (secs)  0
Fanout                  32
Display hostname labels Yes
Debugging               No

-- Target nodes --
localhost

  2. Change the pdsh default rcmd type to ssh. Open ~/.bashrc (for example with nano ~/.bashrc), add this line towards the end:

export PDSH_RCMD_TYPE=ssh

and then reload the file:

source ~/.bashrc

  3. Run sbin/start-dfs.sh again.

That should solve your problem.

Steward answered 26/9, 2020 at 13:30 Comment(0)
