Hadoop : start-dfs.sh Connection refused

I have a Vagrant box running debian/stretch64, and I am trying to install Hadoop 3 following this documentation: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html

When I run start-dfs.sh I get this message:

vagrant@stretch:/opt/hadoop$ sudo sbin/start-dfs.sh
Starting namenodes on [localhost]
pdsh@stretch: localhost: connect: Connection refused
Starting datanodes
pdsh@stretch: localhost: connect: Connection refused
Starting secondary namenodes [stretch]
pdsh@stretch: stretch: connect: Connection refused
vagrant@stretch:/opt/hadoop$

Of course I tried updating my hadoop-env.sh with: export HADOOP_SSH_OPTS="-p 22"

ssh localhost works (without a password).

I have no idea what to change to solve this problem.

Cloistered answered 10/1, 2018 at 14:48 Comment(1)
I'm having the same problem. I tried a command like pdsh -w node date and got the same error. Then I did export PDSH_RCMD_TYPE=ssh, which solved the problem for the command run from the terminal, but the error from the Hadoop scripts remains... – Dannadannel
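
A sketch of that diagnostic, assuming pdsh is installed:

pdsh -w localhost date                      # fails with "connect: Connection refused" while rcmd defaults to rsh
pdsh -q -w localhost | grep -i rcmd         # -q lists pdsh's option values, including the rcmd type in use
PDSH_RCMD_TYPE=ssh pdsh -w localhost date   # forcing ssh for a single invocation succeeds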

There is a problem with the way pdsh works by default (see the edit below), but Hadoop can run without it. Hadoop checks whether pdsh exists at /usr/bin/pdsh and uses it if so. An easy way to stop using pdsh is to edit $HADOOP_HOME/libexec/hadoop-functions.sh:

Replace this line:

if [[ -e '/usr/bin/pdsh' ]]; then

with:

if [[ ! -e '/usr/bin/pdsh' ]]; then

Then Hadoop runs without pdsh and everything works.
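
A one-liner sketch of that edit (assumes GNU sed and that HADOOP_HOME is set; keeps a .bak backup):

sed -i.bak "s|\[\[ -e '/usr/bin/pdsh' \]\]|[[ ! -e '/usr/bin/pdsh' ]]|" \
    "$HADOOP_HOME/libexec/hadoop-functions.sh"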

EDIT:

A better solution is to keep pdsh but make it use ssh instead of rsh, as explained here. Replace this line in $HADOOP_HOME/libexec/hadoop-functions.sh:

PDSH_SSH_ARGS_APPEND="${HADOOP_SSH_OPTS}" pdsh \

with:

PDSH_RCMD_TYPE=ssh PDSH_SSH_ARGS_APPEND="${HADOOP_SSH_OPTS}" pdsh \
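
The same edit as a sed sketch (again assuming GNU sed and that HADOOP_HOME is set):

sed -i.bak 's|^\([[:space:]]*\)PDSH_SSH_ARGS_APPEND=|\1PDSH_RCMD_TYPE=ssh PDSH_SSH_ARGS_APPEND=|' \
    "$HADOOP_HOME/libexec/hadoop-functions.sh"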

Note: only doing export PDSH_RCMD_TYPE=ssh, as I mentioned in the comment, doesn't work. I don't know why...

I've also opened an issue and submitted a patch for this problem: HADOOP-15219

Dannadannel answered 9/2, 2018 at 2:24 Comment(1)
Your answer edit is very helpful!! The issue was solved. – Obi

I fixed this problem on Hadoop 3.1.0 by adding

export PDSH_RCMD_TYPE=ssh

to my .bashrc as well as to $HADOOP_HOME/etc/hadoop/hadoop-env.sh (without export the variable is not inherited by the pdsh child process).
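
A sketch of the two additions (assumes HADOOP_HOME is set; re-source the files, or log out and back in, before rerunning start-dfs.sh):

echo 'export PDSH_RCMD_TYPE=ssh' >> ~/.bashrc
echo 'export PDSH_RCMD_TYPE=ssh' >> "$HADOOP_HOME/etc/hadoop/hadoop-env.sh"
source ~/.bashrc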

Decease answered 8/4, 2018 at 8:49 Comment(0)

Check whether your /etc/hosts file contains mappings for both the hostname stretch and localhost; see the sketch below.
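
A minimal sketch of such a file; the 127.0.1.1 entry is the Debian installer's usual self-hostname mapping and is an assumption here:

127.0.0.1   localhost
127.0.1.1   stretch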

Repute answered 26/7, 2020 at 7:42 Comment(0)
C
1

Go to your Hadoop home directory:

~$ cd libexec

~$ nano hadoop-functions.sh

Edit this line:

if [[ -e '/usr/bin/pdsh' ]]; then

with:

if [[ ! -e '/usr/bin/pdsh' ]]; then

Caveman answered 23/4, 2020 at 12:24 Comment(0)

"Additionally, it is recommended that pdsh also be installed for better ssh resource management." (from Hadoop: Setting up a Single Node Cluster)

We can remove pdsh to solve this problem:

sudo apt-get remove pdsh
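
Hadoop's scripts only use pdsh when /usr/bin/pdsh exists, so a quick check that the fallback to plain ssh will kick in:

[[ -e /usr/bin/pdsh ]] && echo "pdsh still present" || echo "ok: Hadoop will fall back to ssh"
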
Merideth answered 30/3, 2021 at 5:34 Comment(0)

Check whether a firewall is running on your Vagrant box:

chkconfig iptables off
/etc/init.d/iptables stop

If not, have a look at the underlying logs in /var/log/...
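
Those are RHEL-style commands; a sketch of the equivalent checks on a Debian box (ufw may not be installed there):

sudo iptables -L -n                        # list active firewall rules; empty chains with policy ACCEPT mean no filtering
sudo systemctl stop ufw                    # only if ufw is present
grep -i refused /var/log/syslog | tail     # scan the system log for related messages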

Royden answered 10/1, 2018 at 15:24 Comment(2)
Unfortunately, I found nothing conclusive. I keep looking for a solution. Thank you for your answer. – Eisenhart
Did you find the problem? – Royden

I was dealing with my colleague's problem. He had configured ssh using the hostname from the hosts file but specified the IP address in the workers file. After I rewrote the workers file to use the hostname, everything worked.

/etc/hosts file:

10.0.0.1 slave01

Copy the key so the hadoop user can ssh in without a password:

ssh-copy-id hadoop@slave01

~/hadoop/etc/workers:

slave01
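
A quick loop to confirm that every entry in the workers file accepts passwordless ssh (a sketch, assuming the hadoop user from above):

while read -r host; do
    ssh "hadoop@$host" hostname   # should print the worker's hostname without prompting
done < ~/hadoop/etc/workers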

Kutzenco answered 27/5, 2021 at 13:18 Comment(0)

I added export PDSH_RCMD_TYPE=ssh to my .bashrc file, logged out and back in, and it worked.

For some reason, simply exporting it and running the scripts right away did not work for me.

Laborer answered 14/6, 2021 at 4:39 Comment(0)
