HDFS Questions
1
A typical way to start working with HDFS using the org.apache.hadoop.fs classes in Scala is:
val conf = new Configuration()
val fs = FileSystem.get(conf)
Do we need to call something like
IOUtils.closeStream(...
4
Solved
I am a newbie in the Spark SQL world. I am currently migrating my application's ingestion code, which includes ingesting data into the stage, raw, and application layers in HDFS and doing CDC (change data capture...
Jannet asked 1/8, 2017 at 6:49
5
Solved
In Hadoop fs, how do I look up the block size for a particular file?
I was primarily interested in a command line, something like:
hadoop fs ... hdfs://fs1.data/...
But it looks like that does not ...
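One possible approach, offered as a hedged sketch rather than the thread's confirmed answer: the `hadoop fs -stat` command accepts a format string, and `%o` prints a file's block size in bytes. The helper names below (`block_size_command`, `parse_block_size`, `block_size`) are hypothetical, and actually running the command assumes a `hadoop` binary is on the PATH.

```python
import subprocess
from typing import List


def block_size_command(path: str) -> List[str]:
    """Build the `hadoop fs -stat` invocation that prints the block size (%o)."""
    return ["hadoop", "fs", "-stat", "%o", path]


def parse_block_size(stdout: str) -> int:
    """Convert the command's textual output into an integer byte count."""
    return int(stdout.strip())


def block_size(path: str) -> int:
    """Run the command (requires a working Hadoop client) and return the size."""
    result = subprocess.run(
        block_size_command(path), check=True, capture_output=True, text=True
    )
    return parse_block_size(result.stdout)
```

Keeping the command construction separate from its execution makes the format string easy to check without a cluster.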
2
Solved
I have a shell script like the one below. This script prints the path of a file located in HDFS.
#!/bin/bash
TIMESTAMP=$(date "+%Y-%m-%d")
path=/user/$USER/logs/${TIMESTAMP}.fail_log
path1=/user/$USER/logs...
3
User rok uploaded a file and set the permissions to 770. The file on HDFS looks like this:
-rw-rw---- 3 rok hdfs filename1
I'm using the ksc user to consume the data uploaded by the rok user. So first, I'd...
4
I want to read file paths irrespective of whether they are HDFS or local. Currently, I pass the local paths with the prefix file:// and HDFS paths with the prefix hdfs:// and write some code as the...
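A minimal sketch of one way to tell the two kinds of path apart, assuming the `file://` and `hdfs://` prefixes described above and treating a path with no prefix as local. The function name `split_scheme` and the host name in the usage lines are hypothetical, and the asker's own code is not shown in the excerpt.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_scheme(path: str, default_scheme: str = "file") -> Tuple[str, str]:
    """Return (scheme, path) for a URI-style or bare path."""
    parsed = urlparse(path)
    # Bare paths such as "/tmp/a.txt" carry no scheme; fall back to the default.
    scheme = parsed.scheme or default_scheme
    return scheme, parsed.path


# Hypothetical usage: dispatch on the scheme to pick a reader.
scheme, plain_path = split_scheme("hdfs://namenode:8020/data/a.txt")
use_hdfs_reader = scheme == "hdfs"
```

Note that a Windows-style path such as `C:\data\a.txt` would be parsed with `c` as its scheme, so this sketch only suits POSIX-style local paths.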
2
Solved
Problem:
I am running a query in AWS EMR. It fails with the exception:
java.io.FileNotFoundException: File s3://xxx/yyy/internal_test_automation/2016/09/17/17156/data/feed/commerce_feed...
Wolter asked 17/9, 2016 at 12:47
2
I just started reading about Hadoop and came across the CAP theorem. Can you please shed some light on which two components of CAP apply to an HDFS system?
Priedieu asked 11/11, 2019 at 5:55
4
Solved
I am trying to understand how Spark runs on YARN in cluster/client mode. I have the following question in mind.
Is it necessary for Spark to be installed on all the nodes in the YARN cluster? I think it sho...
Callida asked 23/7, 2014 at 12:0
1
Solved
What is the reason for having column families? Example:
Scenario 1 :
Table Row-Key ColumnFamily1 ColumnFamily2 ColumnFamily3
Scenario 2 :
Table1 Row-Key Column1...ColumnN
Table2 Row-Key Column1......
Moonier asked 21/11, 2020 at 15:5
3
Solved
I am installing HDFS on my local Windows machine. The installation guide I am following is https://github.com/MuhammadBilalYar/Hadoop-On-Window/wiki/Step-by-step-Hadoop-2.8.0-installation-on-Window...
Dextrosinistral asked 28/7, 2018 at 20:12
4
From time to time I get the following errors in Cloudera Manager:
This DataNode is not connected to one or more of its NameNode(s).
and
The Cloudera Manager agent got an unexpected response fr...
2
Solved
Is the following code for mappers, which reads a text file from HDFS, correct? And if it is:
What happens if two mappers in different nodes try to open the file at almost the same time?
Isn't there a ne...
10
When I set up the Hadoop cluster, I read that the NameNode runs on port 50070, so I configured it accordingly and it's running fine.
But in some books I have come across the NameNode address:
hdfs://localhost...
6
Solved
I just downloaded the Hortonworks sandbox VM, which ships with Hadoop version 2.7.1. I am adding some files by using the
hadoop fs -put /hw1/* /hw1
...command. After that I am deleting the ad...
Enchase asked 7/12, 2015 at 18:12
4
Solved
How do I find the Hadoop HDFS directory on my system?
I need this to run the following command:
hadoop dfs -copyFromLocal <local-dir> <hdfs-dir>
In this command I don't know my hdfs-dir.
No...
Emblem asked 2/4, 2016 at 20:50
3
I have created an Oozie workflow for a Hive script to load data into a table.
My workflow.xml contains:
<workflow-app xmlns="uri:oozie:workflow:0.4" name="Hive-Table-Insertion">
<start to...
5
I get the error
Cannot create directory /home/hadoop/hadoopinfra/hdfs/namenode/current
while trying to install Hadoop on my local Mac.
What could be the reason for this? Just for reference, I'...
3
Solved
I have 3 DataNodes and 1 NameNode on a machine inside LXC containers. The DataNode on the same node as the NameNode works fine, but the other two don't. I get:
Initialization failed for Block pool BP...
1
I'm trying to connect to HDFS through PyArrow, but it does not work because the libhdfs library cannot be loaded.
libhdfs.so is in $HADOOP_HOME/lib/native as well as in $ARROW_LIBHDFS_DIR.
print(os.e...
Acrylonitrile asked 31/10, 2018 at 16:11
2
Solved
This query returns in 10 seconds most of the time, but occasionally it needs 40 seconds or more.
There are two executor nodes in the swarm, and there is no remarkable difference between profiles of...
4
I am facing a problem with the Hive default partition (null partition).
I will explain the situation briefly here. I have a main Hive table, and data ingestion is happening to that table everyd...
1
Solved
I find that my Impala swarm's performance is not stable: normally it takes only a few seconds (less than 10s) to finish a query, but occasionally it will take more than 40s (and this situation will last fo...
3
Is there a simple Hadoop command that can change the name of a file (in HDFS) from its old name to a new name?
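As a hedged sketch, not the thread's accepted answer: the Hadoop file system shell provides `hadoop fs -mv <src> <dst>`, which renames a file when source and destination are on the same file system. The wrapper names `build_rename_command` and `rename` below are hypothetical, and executing the command assumes a configured `hadoop` client on the PATH.

```python
import subprocess
from typing import List


def build_rename_command(old_path: str, new_path: str) -> List[str]:
    """Build the `hadoop fs -mv` invocation that renames old_path to new_path."""
    return ["hadoop", "fs", "-mv", old_path, new_path]


def rename(old_path: str, new_path: str) -> None:
    """Run the rename; raises CalledProcessError if the command fails."""
    subprocess.run(build_rename_command(old_path, new_path), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.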
1
I am trying to read a Parquet file, perform some operations on it, and save the result as Parquet on HDFS. I am doing this using Spark. While doing so I get the following exception.
java.io.EOFEx...
Luz asked 22/4, 2016 at 11:25
© 2022 - 2024 — McMap. All rights reserved.