HDFS Questions
3
Solved
What's the easiest way to find file associated with a block in HDFS given a block Name/ID
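A sketch of the usual approach: on Hadoop 2.7 and later, `hdfs fsck` accepts a `-blockId` option that reports the file owning a block; on older releases you can grep a full fsck listing. The block ID shown is a placeholder.

```shell
# Hadoop 2.7+: ask fsck about the block directly (block ID is a placeholder)
hdfs fsck -blockId blk_1073741825

# Older releases: list every file with its blocks and search for the ID;
# -B 10 keeps the preceding lines, which include the file path
hdfs fsck / -files -blocks | grep -B 10 "blk_1073741825"
```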
3
Solved
Does the parameter "mapred.min.split.size" change the size of the blocks in which the file was written earlier?
Assuming a situation where I, when starting my job, pass the parameter "mapred.min.split.s...
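No: the HDFS block size is fixed when a file is written (via `dfs.blocksize`); split-size parameters only control how the InputFormat carves the stored file into map splits at read time. A sketch of passing it per job (jar and driver names are placeholders; newer releases spell the property `mapreduce.input.fileinputformat.split.minsize`):

```shell
# Passed at job submission; affects input splits only, never the stored block size
hadoop jar my-job.jar MyDriver \
  -D mapred.min.split.size=268435456 \
  /input /output
```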
7
Solved
How do I find the size of an HDFS file? What command should be used to find the size of any file in HDFS?
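The `du` subcommand reports sizes; a sketch with placeholder paths:

```shell
# Human-readable size of a single file
hadoop fs -du -h /path/to/file

# Summarized total (-s) for everything under a directory
hadoop fs -du -s -h /path/to/dir
```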
2
Solved
I need to loop over all csv files in a Hadoop file system. I can list all of the files in an HDFS directory with
> hadoop fs -ls /path/to/directory
Found 2 items
drwxr-xr-x - hadoop hadoop 2 201...
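One way to iterate over the listing (a sketch; it assumes paths contain no spaces and that the last column of `hadoop fs -ls` output is the path, which holds for standard Hadoop releases):

```shell
# List the directory, keep only .csv paths, and loop over them
for f in $(hadoop fs -ls /path/to/directory | awk '{print $NF}' | grep '\.csv$'); do
  echo "processing $f"
  # hadoop fs -cat "$f" | ...   # replace with the real per-file work
done
```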
5
Solved
I have a Hadoop cluster setup and working under a common default username "user1". I want to put files into hadoop from a remote machine which is not part of the hadoop cluster. I configu...
Christachristabel asked 7/7, 2012 at 0:5
2
Solved
Question
On a Flink standalone cluster, running on a server, I am developing a Flink streaming job in Scala. The job consumes data from more than one Kafka topic, (does some formatting,) and writes re...
Kinghorn asked 2/5, 2018 at 7:8
7
Solved
We are using Cloudera CDH 4 and we are able to import tables from our Oracle databases into our HDFS warehouse as expected. The problem is we have tens of thousands of tables inside our databases a...
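For importing everything in one run, Sqoop ships an `import-all-tables` tool; a sketch (the connection string, username, and warehouse path are placeholders):

```shell
# Import every table in the schema into HDFS in one run
sqoop import-all-tables \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username myuser -P \
  --warehouse-dir /user/hive/warehouse \
  --num-mappers 4
```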
2
I'm using Kafka Connect HDFS.
When I try to run my connector I get the following exception:
ERROR Failed creating a WAL Writer: Failed to create file[/path/log] for [DFSClient_NONMAPREDUC...
Cheston asked 14/8, 2018 at 13:6
6
I have the following directory structure in HDFS,
/analysis/alertData/logs/YEAR/MONTH/DATE/HOURS
That is, data arrives on an hourly basis and is stored in the format year/month/day/hour.
I have writ...
6
Solved
I have tried all the different solutions provided on Stack Overflow on this topic, but to no avail.
Asking again with the specific log and the details.
Any help is appreciated.
I have one master node...
6
Solved
I have a bunch of .gz files in a folder in HDFS. I want to unzip all of these .gz files to a new folder in HDFS. How should I do this?
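HDFS has no in-place unzip, but each file can be streamed through gunzip and written back; a sketch assuming plain single-member `.gz` files, space-free paths, and that the placeholder folders `/old/folder` and `/new/folder` exist:

```shell
# Stream each .gz through gunzip and put the result back into HDFS
for f in $(hadoop fs -ls /old/folder/*.gz | awk '{print $NF}'); do
  name=$(basename "$f" .gz)
  # "-put -" reads the decompressed stream from stdin
  hadoop fs -cat "$f" | gunzip | hadoop fs -put - "/new/folder/$name"
done
```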
2
For example, I want to save a table, what is the difference between the two strategies?
bucketBy:
someDF.write.format("parquet")
.bucketBy(4, "country")
.mode(SaveMode.OverWri...
Costar asked 19/5, 2021 at 8:21
5
With:
Java 1.8.0_231
Hadoop 3.2.1
Flume 1.8.0
Have created a hdfs service on 9000 port.
jps:
11688 DataNode
10120 Jps
11465 NameNode
11964 SecondaryNameNode
12621 NodeManager
12239 ResourceMa...
3
Solved
I have a set of directories created recursively in HDFS. How can I list all the directories? For a normal Unix file system I can do that using the command below
find /path/ -type d -print
But I ...
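A close HDFS equivalent is a recursive listing filtered to directory entries (their permission string starts with `d`); a sketch:

```shell
# Recursive listing, keep only directories, print just the path column
hdfs dfs -ls -R /path | grep '^d' | awk '{print $NF}'
```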
1
Solved
We are testing our Hadoop applications as part of migrating from Hortonworks Data Platform (HDP v3.x) to Cloudera Data Platform (CDP) version 7.1. While testing, we found the below issue while trying t...
Vaporescence asked 13/4, 2021 at 7:26
4
Solved
I would like to do some cleanup at the start of my Spark program (PySpark). For example, I would like to delete the data from a previous run in HDFS. In Pig this can be done using commands such as
fs -cop...
Tobacconist asked 1/12, 2015 at 4:45
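The Pig `fs` shortcut maps to ordinary `hadoop fs` invocations, so the usual cleanup is a recursive delete that tolerates a missing path. A sketch (the path is a placeholder; from PySpark this could be run via `subprocess` before the job starts):

```shell
# Recursive delete; -f suppresses the error when the path does not exist
hadoop fs -rm -r -f /user/me/previous_output
```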
4
Solved
I would like to navigate in HDFS.
First I looked at the directories in the HDFS "root":
[cloudera@localhost ~]$ sudo -u hdfs hadoop fs -ls hdfs:/
Found 5 items
drwxr-xr-x - hbase hbase 0 2015-10-10 07:...
3
Solved
Most questions/answers on SO and the web discuss using Hive to combine a bunch of small ORC files into a larger one, however, my ORC files are log files which are separated by day and I need to kee...
5
Solved
I see there is hdfs3, snakebite, and some others. Which one is the best supported and comprehensive?
Adenoid asked 27/10, 2016 at 12:57
5
Solved
I'm studying Hadoop and currently I'm trying to set up a Hadoop 2.2.0 single node. I downloaded the latest distribution, uncompressed it, and now I'm trying to set up the Hadoop Distributed File Syste...
Anders asked 26/1, 2014 at 22:25
1
I am trying to set up a Spark + HDFS deployment on a small cluster using Docker Swarm as a stack deployment. I have it generally working, but I ran into an issue that is preventing Spark from takin...
Heir asked 9/11, 2019 at 21:0
9
Solved
How do I copy a file from HDFS to the local file system? There is no physical location of the file, not even a directory. How can I move them to my local machine for further validation? I am tri...
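The standard command is `get` (or its alias `copyToLocal`); a sketch with placeholder paths:

```shell
# Copy a single file from HDFS to the local file system
hadoop fs -get /hdfs/path/file.txt /local/dir/

# Or an entire directory
hadoop fs -copyToLocal /hdfs/path/dir /local/dir/
```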
5
I am getting this error while running start-dfs.sh
Starting namenodes on [localhost]
pdsh@Gaurav: localhost: rcmd: socket: Permission denied
Starting datanodes
pdsh@Gaurav: localhost: rcmd: s...
Counterbalance asked 13/3, 2017 at 4:18
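This particular error usually means pdsh is defaulting to rsh as its remote command. A commonly reported workaround (an assumption about this setup, not verified here) is forcing pdsh onto ssh before rerunning the script:

```shell
# Force pdsh to use ssh instead of rsh, then retry the start script
export PDSH_RCMD_TYPE=ssh
start-dfs.sh
```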
2
Solved
When Spark was writing a large file to HDFS using saveAsTextFile, I got an error: java.lang.IllegalArgumentException: Self-suppression not permitted at java.lang.Throwable.addSuppressed(Throwab...
Ophthalmoscope asked 12/6, 2017 at 2:24
4
I want to create a file in HDFS and write data to it. I used this code:
Configuration config = new Configuration();
FileSystem fs = FileSystem.get(config);
Path filenamePath = new Path("input....
© 2022 - 2024 — McMap. All rights reserved.