My configuration:
A cluster of four server-class machines, each running RHEL with 8GB RAM and quad-core processors. I set up machine 'B1' as the master and the rest (B2, B3, B4) as slaves. Kicked off start-dfs.sh; the namenode came up on port 53410 on B1. The rest of the nodes are not able to connect to B1 on 53410!
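For reference, /etc/hosts is the same on all four machines and looks roughly like this (the addresses below are placeholders, not my real ones):

```
# /etc/hosts (hypothetical addresses for illustration)
192.168.0.1   B1
192.168.0.2   B2
192.168.0.3   B3
192.168.0.4   B4
```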
Here's what I've done so far:
- Tried "telnet B1 53410" from B2, B3, B4 - Connection refused.
- Tried ssh to B1 from B2, B3, B4 and vice versa - no problem, works fine.
- Changed 53410 to 55410 and restarted DFS - same issue, connection refused on this port too.
- Disabled the firewall on B1 (service iptables stop) and tried connecting from B2, B3, B4 - telnet still fails.
- Disabled the firewall on all nodes and tried again - still fails to connect to 53410.
- Checked that FTP was working from B2, B3, B4 to B1. Stopped the FTP service (service vsftpd stop) and tried bringing up DFS on the standard FTP port (21): the namenode comes up, but the rest of the nodes still fail. Can't even telnet to port 21 from B2, B3, B4.
- "telnet localhost 53410" works fine on B1.
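Since telnet works on localhost but is refused remotely even with the firewall down, my suspicion is that the namenode socket is bound to the loopback address only. A minimal Python sketch of that exact symptom (Linux loopback behavior assumed; the addresses are illustrative, not my cluster's):

```python
import socket

# A server socket bound to 127.0.0.1 only accepts connections on that
# address - any other address gets "connection refused", firewall or not.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))   # loopback only, like a namenode bound to localhost
srv.listen(1)
port = srv.getsockname()[1]

# Connecting via the bound loopback address works fine.
socket.create_connection(("127.0.0.1", port), timeout=2).close()

# Connecting to the same port on a different local address (127.0.0.2
# also routes to loopback on Linux) is refused, because nothing listens there.
try:
    socket.create_connection(("127.0.0.2", port), timeout=2).close()
    refused = False
except OSError:
    refused = True
print("connection on other address refused:", refused)
srv.close()
```

On B1 I plan to check the actual bind address with netstat: if the local-address column for 53410 shows 127.0.0.1 rather than 0.0.0.0 or B1's IP, that would explain everything above.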
All nodes are reachable from one another, and /etc/hosts on every machine has the correct IP address mappings. So I am pretty much clueless at this point. Why on earth would the namenode reject connections? Is there a setting in the Hadoop conf that I should be aware of to allow external clients to connect remotely on the namenode port?
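The only relevant setting I know of is fs.default.name (core-site.xml on recent releases, hadoop-site.xml on older ones). Mine looks roughly like the following; B1 and 53410 are my values, not defaults:

```xml
<!-- core-site.xml, identical on every node. The namenode resolves the
     hostname in this value and binds to the resulting address, so if
     /etc/hosts maps that hostname to 127.0.0.1, the daemon would end up
     listening on loopback only - which is what my symptoms look like. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://B1:53410</value>
  </property>
</configuration>
```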