What is the path to directory within Hadoop filesystem?
Recently I started learning Hadoop and Mahout. I want to know where a directory in the Hadoop filesystem actually lives on disk.

In hadoop-1.2.1/conf/core-site.xml, I have specified:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/Users/Li/File/Java/hdfstmp</value>
  <description>A base for other temporary directories.</description>
</property>

In Hadoop filesystem, I have the following directories:

lis-macbook-pro:Java Li$ hadoop fs -ls
Found 4 items
drwxr-xr-x   - Li supergroup          0 2013-11-06 17:25 /user/Li/output
drwxr-xr-x   - Li supergroup          0 2013-11-06 17:24 /user/Li/temp
drwxr-xr-x   - Li supergroup          0 2013-11-06 14:50 /user/Li/tweets-seq
-rw-r--r--   1 Li supergroup    1979173 2013-11-05 15:50 /user/Li/u.data

Now where is /user/Li/output directory?

I tried:

lis-macbook-pro:usr Li$ cd /user/Li/output
-bash: cd: /user/Li/output: No such file or directory

So I think /user/Li/output is a relative path, not an absolute one.

Then I searched for it in /Users/Li/File/Java/hdfstmp. There are two folders:

dfs

mapred

But I still can't find /user/Li/output within /Users/Li/File/Java/hdfstmp.

Privett answered 12/11, 2013 at 19:52 Comment(0)
Your first call to hadoop fs -ls is a relative directory listing for the current user, typically rooted in a directory called /user/${user.name} in HDFS. So your hadoop fs -ls command lists files and directories relative to this location: in your case, /user/Li/.

You should be able to confirm this by running an absolute listing and checking that the output matches: hadoop fs -ls /user/Li/
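As a quick check (a sketch assuming the single-node setup from the question, with a running HDFS), both commands should print the same four entries:

```shell
# Relative listing: resolved against the user's HDFS home directory, /user/Li
hadoop fs -ls

# Absolute listing of the same directory; the output should match the above
hadoop fs -ls /user/Li/
```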

As these files are in HDFS, you will not be able to find them on the local filesystem: they are distributed across your cluster nodes as blocks (for actual file data), with metadata entries (for files and directories) kept in the NameNode.
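If you do want to see where a file's blocks live, you can ask the NameNode with fsck (a sketch, assuming the Hadoop 1.x install from the question, where fsck is invoked through the hadoop command):

```shell
# Report the blocks, and the DataNodes holding them, for one HDFS file
hadoop fsck /user/Li/u.data -files -blocks -locations
```

On a single-node setup the block files themselves sit under the dfs/data subtree of hadoop.tmp.dir, named blk_&lt;id&gt;; they are raw block contents, not whole recognizable files.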

Whippersnapper answered 13/11, 2013 at 0:14 Comment(5)
Thanks a lot for your explanation. Assuming I am running Hadoop on a single node (machine), I still cannot access HDFS through the local filesystem because of the architecture, right? – Intake
You can install a FUSE connector or something similar to allow you to mount HDFS as a filesystem, but otherwise no, not directly. You can query the NameNode metadata for a file, discover the block names, and then find the blocks on the local filesystem. – Whippersnapper
You say "typically rooted in a directory called /user/${user.name}": how can this be configured? Is there any way I can pwd in case I think I am somewhere else? Or ls a file inside my current directory and get the files' absolute paths in the results? – Lanell
@TheRedPea - Googling your question led me back to another one of my answers: https://mcmap.net/q/454042/-hdfs-home-directory. The answer and the linked source still seem valid. – Whippersnapper
@ChrisWhite OK, "seems to be hard-coded", and I'm not sure I can use ls to get this absolute path, but now that I know it's hard-coded, I should be able to figure out the path myself. Thanks! – Lanell

All the files are present in HDFS, the Hadoop Distributed File System, so they are not present in your local filesystem or directory structure.

Inside HDFS they are addressed as:

Path("hdfs://host:port/file")

The port setting is in the XML file under Hadoop's configuration directory, $HADOOP_HOME/etc/hadoop/core-site.xml:

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9010</value>
</property>
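With that setting in place, a bare path is shorthand for the full URI. A sketch, assuming the localhost:9010 value above:

```shell
# These two commands refer to the same HDFS directory
hdfs dfs -ls /user/Li
hdfs dfs -ls hdfs://localhost:9010/user/Li
```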

You can view the files present in HDFS from the command line:

hdfs dfs -ls

Basic Linux-style commands can be run in the same way:

hdfs dfs -<Command>

With these you can create directories, delete files or directories, and perform other filesystem operations.
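For example (hypothetical paths; assumes a running HDFS and Hadoop 2.x shell syntax):

```shell
hdfs dfs -mkdir /user/Li/demo          # create a directory in HDFS
hdfs dfs -put u.data /user/Li/demo/    # copy a local file into HDFS
hdfs dfs -ls /user/Li/demo             # list the directory's contents
hdfs dfs -rm -r /user/Li/demo          # remove the directory recursively
```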

Sideboard answered 28/1, 2016 at 12:22 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.