Hadoop release missing /conf directory

I am trying to install a single node setup of Hadoop on Ubuntu. I started following the instructions on the Hadoop 2.3 docs.

But I seem to be missing something very simple.

First, it says:

To get a Hadoop distribution, download a recent stable release from one of the Apache Download Mirrors.

Then,

Unpack the downloaded Hadoop distribution. In the distribution, edit the file conf/hadoop-env.sh to define at least JAVA_HOME to be the root of your Java installation.

However, I can't seem to find the conf directory.

I downloaded a 2.3 release from one of the mirrors and unpacked the tarball. Running ls inside it returns:

$ ls
bin  etc  include  lib  libexec  LICENSE.txt  NOTICE.txt  README.txt  sbin  share

I was able to find the file they were referencing, just not in a conf directory:

$ find . -name hadoop-env.sh
./etc/hadoop/hadoop-env.sh

Am I missing something, or am I grabbing the wrong package? Or are the docs just outdated?

If so, does anyone know where to find more up-to-date docs?

Faultfinding answered 19/3, 2014 at 4:46 Comment(0)
H
13

I am trying to install Hadoop in pseudo-distributed mode and ran into the same issue.

The book Hadoop: The Definitive Guide (Third Edition) says on page 618:

In Hadoop 2.0 and later, MapReduce runs on YARN and there is an additional configuration file called yarn-site.xml. All the configuration files should go in the etc/hadoop subdirectory

Hope this confirms that etc/hadoop is the correct place.
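With the location settled, the JAVA_HOME step from the docs can be sketched roughly as below. This is only a minimal sketch: the function name set_java_home is hypothetical, and the Hadoop and JDK paths in the usage comment are assumptions that will differ on your machine.

```shell
# Hypothetical helper: define JAVA_HOME in etc/hadoop/hadoop-env.sh.
# $1 = unpacked Hadoop directory, $2 = root of the Java installation.
set_java_home() {
    env_file="$1/etc/hadoop/hadoop-env.sh"
    if [ ! -f "$env_file" ]; then
        echo "not found: $env_file" >&2
        return 1
    fi
    # Append instead of editing in place; when the file is sourced,
    # the last assignment of JAVA_HOME is the one that takes effect.
    printf '\nexport JAVA_HOME="%s"\n' "$2" >> "$env_file"
}

# Example call (both paths are assumptions, adjust to your system):
# set_java_home "$HOME/hadoop-2.3.0" /usr/lib/jvm/java-7-openjdk-amd64
```

Appending keeps the original file contents intact, so the change is easy to undo by deleting the last line.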

Herta answered 19/5, 2014 at 19:20 Comment(0)
A
6

I think the docs need to be updated. Although the directory structure has changed, the names of important files like hadoop-env.sh, core-site.xml and hdfs-site.xml have not. You may find the following link useful for getting started.

http://codesfusion.blogspot.com/2013/10/setup-hadoop-2x-220-on-ubuntu.html

Ampulla answered 19/3, 2014 at 6:28 Comment(1)
Thanks, that was a great blog post. It got me much further, but I am still hitting some issues. It is a bit absurd that the official docs are outdated for even the most basic setup. This seems to be the case for all the 2.x versions, even the current "stable" release's docs. – Faultfinding
C
5

In Hadoop 1,

${HADOOP_HOME}/conf/

In Hadoop 2,

${HADOOP_HOME}/etc/hadoop
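
If a script has to work with either layout, one possible approach is to probe for both directories. The function name hadoop_conf_dir below is hypothetical, and the sketch assumes an unpacked binary release rather than a source tree.

```shell
# Hypothetical helper: print the configuration directory of a Hadoop install.
# $1 = Hadoop install directory (defaults to $HADOOP_HOME if omitted).
hadoop_conf_dir() {
    home="${1:-${HADOOP_HOME:-}}"
    if [ -d "$home/etc/hadoop" ]; then
        echo "$home/etc/hadoop"    # Hadoop 2 layout
    elif [ -d "$home/conf" ]; then
        echo "$home/conf"          # Hadoop 1 layout
    else
        echo "no configuration directory under '$home'" >&2
        return 1
    fi
}
```

The Hadoop 2 path is checked first, so an install that happens to contain both directories resolves to etc/hadoop.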
Colorfast answered 14/1, 2017 at 8:39 Comment(0)
H
3

In the Hadoop 2.7.3 source distribution, the file is in hadoop-common/src/main/conf/:

$ sudo find . -name hadoop-env.sh
./hadoop-2.7.3-src/hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh
Hawkins answered 10/1, 2017 at 13:3 Comment(0)
J
2

Just adding a note on the blog post http://codesfusion.blogspot.com/2013/10/setup-hadoop-2x-220-on-ubuntu.html. The blog post is fantastic and very useful, and it is how I got started. One thing that took me a little time to figure out is that the blog seems to use a simplified way of writing configuration in the Hadoop conf files such as "conf/core-site.xml", hdfs-site.xml, etc., as follows:

<!--fs.default.name is the name node URI -->
<configuration>
    fs.default.name
    hdfs://localhost:9000
</configuration>

According to the official docs, there is a more rigorous form, which is useful when you have more than one property. Each property is added as follows (note that the description is optional):

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
        <description>the name node URI</description>
    </property>
    <!-- Add more configuration properties here -->
</configuration>
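
If you prefer to generate the file from the shell, something along these lines might work. It is a sketch only: write_core_site is a hypothetical name, the function overwrites any existing core-site.xml, and it keeps the fs.default.name property used above (in Hadoop 2 that name is deprecated in favor of fs.defaultFS, though the old name is still accepted).

```shell
# Hypothetical helper: write a minimal core-site.xml in the <property> format.
# $1 = configuration directory, $2 = name node URI.
# Note: this overwrites an existing core-site.xml in that directory.
write_core_site() {
    cat > "$1/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>$2</value>
    </property>
</configuration>
EOF
}

# Example call (the directory is an assumption, adjust to your install):
# write_core_site "$HOME/hadoop-2.3.0/etc/hadoop" hdfs://localhost:9000
```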
Jenks answered 6/8, 2014 at 15:10 Comment(0)
S
0

In Hadoop 3.3.1 (as of 2022), the conf directory of the source tree is located under src/main:

$HOME/hadoop/hadoop3.3/hadoop-common-project/hadoop-common/src/main/

Synovia answered 10/2, 2022 at 7:35 Comment(0)
