hdfs Questions
7
I opened localhost:9870 and tried to upload a .txt file to HDFS.
I see the error message below:
Failed to retrieve data from /webhdfs/v1/?op=LISTSTATUS: Server Error
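One way to look at this error outside the browser is to hit the WebHDFS REST endpoint directly. A minimal sketch in Python, assuming the NameNode web UI runs on localhost:9870 (the Hadoop 3.x default):

import requests

# op=LISTSTATUS lists the given directory; "/" here is the HDFS root.
resp = requests.get("http://localhost:9870/webhdfs/v1/?op=LISTSTATUS")
print(resp.status_code)
print(resp.text)  # FileStatuses JSON on success, a RemoteException body on failure

A 500 response with a RemoteException body usually carries the server-side cause that the web UI collapses into "Server Error".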
2
Solved
I know that both under-replicated blocks and mis-replicated blocks occur when the DataNode count is lower than the configured replication factor.
But what is the difference between them?
On re-sett...
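For what it's worth, hdfs fsck reports both counters in its summary, which makes it easy to watch them change as DataNodes are added or the replication factor is re-set. A small sketch, assuming the hdfs CLI is on PATH and you can read /:

import subprocess

# The fsck summary includes "Under-replicated blocks" and
# "Mis-replicated blocks" lines; filter those out of the report.
report = subprocess.run(["hdfs", "fsck", "/"],
                        capture_output=True, text=True).stdout
for line in report.splitlines():
    if "replicated blocks" in line.lower():
        print(line.strip())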
4
Solved
Hadoop has the configuration parameter hadoop.tmp.dir which, per the documentation, is "A base for other temporary directories." I presume this path refers to the local file system.
I set this value to /...
2
When configuring my Hadoop namenode for the first time, I know I need to run
bin/hadoop namenode -format
but running this a second time, after loading data into HDFS, will wipe out everything an...
2
I installed Spark on three nodes successfully. I can visit the Spark web UI and see that every worker node and the master node is active.
I can run the SparkPi example successfully.
My cluster info:
10.45.10.3...
Bluegill asked 12/9, 2016 at 12:3
6
16
I am trying to install Hadoop on Ubuntu 16.04, but while starting Hadoop it gives me the following error:
localhost: ERROR: Cannot set priority of datanode process 32156.
Starting secondary nam...
4
I have multiple small parquet files generated as the output of a Hive QL job; I would like to merge the output files into a single parquet file.
What is the best way to do it using some hdfs or linux comman...
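One option is to read the whole output directory as a single table with pyarrow and rewrite it. A sketch, assuming pyarrow can reach the cluster through libhdfs and the paths below are placeholders:

import pyarrow.parquet as pq

# Reading a directory treats all part files inside it as one dataset.
table = pq.read_table("hdfs://namenode:8020/path/to/hive_output/")
# Rewrite the combined rows as a single parquet file.
pq.write_table(table, "hdfs://namenode:8020/path/to/merged.parquet")

This pulls everything through one machine, so it only suits outputs that fit in memory there.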
3
Solved
I have a large image classification dataset stored in the format .hdf5. The dataset has the labels and the images stored in the .hdf5 file. I am unable to view the images as they are stored in form ...
Schleswigholstein asked 2/12, 2023 at 7:34
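HDF5 datasets come back as numpy arrays, so viewing them is mostly a matter of finding the right keys and handing an array to an image viewer. A sketch with h5py and matplotlib; the key names "images" and "labels" are assumptions, not the file's real layout:

import h5py
import matplotlib.pyplot as plt

with h5py.File("dataset.hdf5", "r") as f:
    print(list(f.keys()))   # discover the actual dataset names first
    img = f["images"][0]    # hypothetical key: first image as a numpy array
    label = f["labels"][0]  # hypothetical key: its label

plt.imshow(img)
plt.title(str(label))
plt.show()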
6
Solved
While running the wordcount example in Hadoop, I am facing the following error saying "JAR does not exist or is not a normal file:
/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduceexamp...
8
Solved
Is there a way to delete files older than 10 days on HDFS?
In Linux I would use:
find /path/to/directory/ -type f -mtime +10 -name '*.txt' -execdir rm -- {} \;
Is there a way to do this on HDFS...
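Hadoop's hdfs dfs -find has no -mtime test, so one workaround is to list the tree programmatically and compare modification times yourself. A sketch with pyarrow, assuming libhdfs is available and namenode:8020 stands in for your cluster:

import time
from pyarrow import fs

hdfs = fs.HadoopFileSystem("namenode", 8020)
cutoff = time.time() - 10 * 24 * 3600  # ten days ago, in epoch seconds

for info in hdfs.get_file_info(fs.FileSelector("/path/to/directory", recursive=True)):
    # Mirror the find predicate: regular *.txt files older than the cutoff.
    if info.is_file and info.path.endswith(".txt") and info.mtime.timestamp() < cutoff:
        hdfs.delete_file(info.path)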
14
I am getting this error when I try to boot up a DataNode. From what I have read, the RPC parameters are only used for an HA configuration, which I am not setting up (I think).
2014-05-18 18:05:00,5...
Division asked 18/5, 2014 at 8:19
2
Solved
I am working on Apache Hadoop 2.7.1 and I have a cluster that consists of 3 nodes
nn1
nn2
dn1
nn1 is the dfs.default.name, so it is the master name node.
I have installed httpfs and started it o...
4
Solved
I'm using pydoop to read in a file from hdfs, and when I use:
import pydoop.hdfs as hd
with hd.open("/home/file.csv") as f:
    print f.read()
It prints the file contents to stdout.
Is there any way for...
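Since hd.open gives back a file-like object, the contents don't have to go to stdout; they can be kept in a variable or fed straight into a parser. A sketch, assuming the file is a well-formed CSV:

import pydoop.hdfs as hd
import pandas as pd

with hd.open("/home/file.csv") as f:
    data = f.read()          # keep the raw contents in a variable

with hd.open("/home/file.csv") as f:
    df = pd.read_csv(f)      # or parse it directly into a DataFrame

print(df.head())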
3
Solved
I know I can connect to an HDFS cluster via pyarrow using pyarrow.hdfs.connect()
I also know I can read a parquet file using pyarrow.parquet's read_table()
However, read_table() accepts a filepat...
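One workaround is to open the file through the HDFS filesystem object and pass the resulting handle to read_table(), which accepts file-like objects as well as paths. A sketch using the newer pyarrow.fs API (pyarrow.hdfs.connect() is deprecated); host, port, and path are placeholders:

import pyarrow.parquet as pq
from pyarrow import fs

hdfs = fs.HadoopFileSystem("namenode", 8020)
with hdfs.open_input_file("/data/myfile.parquet") as f:
    table = pq.read_table(f)

print(table.num_rows)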
3
Solved
I learned that if you want to copy multiple files from one Hadoop folder to another, it is better to create one big 'hdfs dfs -cp' statement with lots of components, instead of creating...
Demetria asked 16/12, 2016 at 13:52
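If the source list is generated in code, the same batching is easy to reproduce with a single subprocess call, so the JVM-backed client starts once instead of once per file. A sketch with hypothetical placeholder paths:

import subprocess

sources = ["/src/folder/file_%03d.csv" % i for i in range(100)]  # hypothetical names
# cp accepts many sources when the final argument is a directory.
subprocess.run(["hdfs", "dfs", "-cp"] + sources + ["/dst/folder/"], check=True)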
4
Can one use Delta Lake without being dependent on the Databricks Runtime? (I mean, is it possible to use Delta Lake with HDFS and Spark on-prem only?)
If not, could you elaborate on why that is, from tec...
Marko asked 23/3, 2020 at 16:5
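Open-source Delta Lake runs on plain Apache Spark, so an on-prem HDFS setup needs only the delta package and two session settings. A sketch; the delta-core version below is illustrative and has to match your Spark/Scala build:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("delta-on-prem")
    .config("spark.jars.packages", "io.delta:delta-core_2.12:2.4.0")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# A Delta table is just data files plus a _delta_log directory, here on HDFS.
spark.range(5).write.format("delta").save("hdfs:///tmp/delta_table")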
2
Solved
Is there a way to acquire lock on a directory in HDFS? Here's what I am trying to do:
I've a directory called ../latest/...
Every day I need to add fresh data into this directory, but before I co...
6
I am running Hadoop with the default configuration on a one-node cluster, and would like to find out where HDFS stores files locally.
Any ideas?
Thanks.
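The local storage directories are whatever dfs.datanode.data.dir resolves to, which in a default setup falls back under hadoop.tmp.dir (typically /tmp/hadoop-<user>). A quick way to print the effective values, assuming the hdfs CLI is on PATH:

import subprocess

for key in ("dfs.datanode.data.dir", "hadoop.tmp.dir"):
    val = subprocess.run(["hdfs", "getconf", "-confKey", key],
                         capture_output=True, text=True).stdout.strip()
    print(key, "=>", val)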
8
I am using Cloudera on a VM that I am playing around with. Unfortunately I am having issues copying data to HDFS; I am getting the following:
[cloudera@localhost ~]$ hadoop fs -mkdir i...
1
Hadoop: The Definitive Guide says:
Each Namenode runs a lightweight failover controller process whose job it is to monitor its Namenode for failures (using a simple heartbeat mechanism) and ...
5
Solved
Some characteristics of Apache Parquet are:
Self-describing
Columnar format
Language-independent
In comparison to Apache Avro, Sequence Files, RC File etc. I want an overview of the formats. I ha...
10
Solved
I have a directory of directories on HDFS, and I want to iterate over the directories. Is there any easy way to do this with Spark using the SparkContext object?
Norval asked 19/11, 2014 at 18:1
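One common approach in PySpark reaches the Hadoop FileSystem API through the JVM gateway; note that sc._jvm and sc._jsc are private attributes rather than a stable public API. A sketch, with /data as a placeholder root and sc an existing SparkContext:

hadoop = sc._jvm.org.apache.hadoop
fs = hadoop.fs.FileSystem.get(sc._jsc.hadoopConfiguration())

for status in fs.listStatus(hadoop.fs.Path("/data")):
    if status.isDirectory():
        print(status.getPath().toString())  # feed these into sc.textFile etc.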
3
Solved
I am unable to read a file from HDFS using Java:
String hdfsUrl = "hdfs://<ip>:<port>";
Configuration configuration = new Configuration();
configuration.set("fs.defaultFS", hdfsUrl);
F...
3
Solved
I'm trying to restore some historic backup files that were saved in parquet format, and I want to read from them once and write the data into a PostgreSQL database.
I know that backup files saved using...
Dobrinsky asked 10/11, 2019 at 8:5
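For a one-shot restore, reading the parquet into pandas and bulk-inserting through SQLAlchemy keeps the round trip short. A sketch; the connection string, path, and table name are placeholders:

import pyarrow.parquet as pq
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@localhost:5432/mydb")
df = pq.read_table("backup/history.parquet").to_pandas()
# append so repeated runs over multiple backup files accumulate rows
df.to_sql("history", engine, if_exists="append", index=False)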