HDFS Questions

17

I am running Spark on Windows 7. When I use Hive, I see the following error: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw- The permissions are set ...
Bronchitis asked 10/12, 2015 at 7:46

4

I upgraded to the latest version of Cloudera. Now I am trying to create a directory in HDFS: hadoop fs -mkdir data. I am getting the following error: Cannot Create /user/cloudera/data. Name Node is in Safe...
Cockcroft asked 10/6, 2017 at 3:30

4

Solved

How do you, through Java, list all files (recursively) under a certain path in HDFS? I went through the API and noticed FileSystem.listFiles(Path, boolean), but it looks like that method doesn't exis...
Dunghill asked 8/6, 2012 at 0:51

9

Solved

I have 3 data nodes running. While running a job I am getting the following error: java.io.IOException: File /user/ashsshar/olhcache/loaderMap9b663bd9 could only be replicated to 0 ...
Drinkable asked 22/3, 2013 at 13:29

0

In my Java application I have an implementation of a file-system layer, where my file class is a wrapper for Hadoop filesystem methods. I am upgrading from hadoop3-1.9.17 to hadoop3-2.2.8 and ...

12

Solved

I know du -sh in common Linux filesystems. But how do I do that with HDFS?
Dysuria asked 28/6, 2011 at 9:7
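
The HDFS analogue of `du -sh` is `hdfs dfs -du -s -h <path>`. A minimal Python wrapper around that command line (a sketch; it assumes the `hdfs` CLI is on the PATH):

```python
import subprocess

def hdfs_du_command(path, summarize=True, human_readable=True):
    """Build the `hdfs dfs -du` command line for `path`."""
    cmd = ["hdfs", "dfs", "-du"]
    if summarize:
        cmd.append("-s")   # one summarized total, like du -s
    if human_readable:
        cmd.append("-h")   # sizes in K/M/G, like du -h
    cmd.append(path)
    return cmd

# On a machine with HDFS configured:
# subprocess.run(hdfs_du_command("/user/someuser"), check=True)
```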

2

I have Spark code which saves a dataframe to an HDFS location (a date-partitioned location) in JSON format using append mode: df.write.mode("append").format('json').save(hdfsPath) Sample HDFS locat...
Detrimental asked 3/9, 2019 at 18:15

4

Solved

I am trying to connect to a Spark cluster running within a virtual machine with IP 10.20.30.50 on port 7077 from within a Java application and run the word count example: SparkConf conf = new SparkC...
Agential asked 5/11, 2016 at 14:58

8

Is it possible to save a pandas data frame directly to a parquet file? If not, what would be the suggested process? The aim is to be able to send the parquet file to another team, which they can ...
Ingeminate asked 9/12, 2016 at 18:20

6

Solved

-put and -copyFromLocal are documented as identical, while most examples use the verbose variant -copyFromLocal. Why? The same goes for -get and -copyToLocal.
Belvabelvedere asked 18/10, 2011 at 17:29

9

I am new to Spark and I have a question. I have a two-step process in which the first step writes a SUCCESS.txt file to a location on HDFS. My second step, which is a Spark job, has to verify if that ...
Melloney asked 22/5, 2015 at 20:55
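
One common way to gate the second step is to ask the `hdfs` CLI whether the marker file exists: `hdfs dfs -test -e <path>` exits 0 if and only if it does. A sketch (the path and the second-step function are hypothetical):

```python
import subprocess

def hdfs_test_command(path):
    """Build the existence-test command line for `path`."""
    return ["hdfs", "dfs", "-test", "-e", path]

def hdfs_path_exists(path):
    """Return True if `path` exists in HDFS, using the hdfs CLI."""
    return subprocess.run(hdfs_test_command(path)).returncode == 0

# if hdfs_path_exists("/jobs/step1/SUCCESS.txt"):
#     run_spark_step()   # hypothetical second step
```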

5

Solved

I'm trying to run a spark application using bin/spark-submit. When I reference my application jar inside my local filesystem, it works. However, when I copied my application jar to a directory in h...
Epiphysis asked 26/2, 2015 at 10:18

2

Could somebody give me a hint on how I can copy a file from a local filesystem to an HDFS filesystem using PyArrow's new filesystem interface (i.e. upload, copyFromLocal)? I have read the documentat...
Ejective asked 28/7, 2021 at 11:11

2

I am currently developing a Flink 1.4 application that reads an Avro file from a Hadoop cluster. Running it in local mode in my IDE works perfectly fine, but when I submit it to the Jobman...
Smoothtongued asked 14/2, 2018 at 10:12

1

The following error occurs when reading a Parquet file from HDFS: 2020-06-04 14:11:23 WARN TaskSetManager:66 - Lost task 44.0 in stage 1.0 (TID 3514, 192.168.16.41, executor 1): java.lang.Runti...
Reed asked 4/6, 2020 at 14:28

6

Solved

I am new to NoSQL solutions and want to play with Hive. But installing HDFS/Hadoop takes a lot of resources and time (maybe because of my inexperience, but I have no time to do this). Are there ways to i...
Erebus asked 24/1, 2014 at 10:10

5

Solved

Now I have some Spark applications which store output to HDFS. Since our Hadoop cluster consists of a NameNode H/A pair, and the Spark cluster is outside the Hadoop cluster (I know that is bad practice), I...
Naphthyl asked 12/6, 2015 at 6:52
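
The usual fix is to give the Spark side the NameNode-HA client settings, so it resolves the logical nameservice instead of a single physical host. A sketch of the properties involved (the nameservice `ns1` and NameNode ids/hosts below are placeholders):

```python
def ha_client_conf(nameservice, namenodes):
    """Return the dfs.* client properties for an HA nameservice.

    `namenodes` maps a logical id (e.g. "nn1") to "host:rpc-port".
    """
    conf = {
        "fs.defaultFS": f"hdfs://{nameservice}",
        "dfs.nameservices": nameservice,
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
        f"dfs.client.failover.proxy.provider.{nameservice}":
            "org.apache.hadoop.hdfs.server.namenode.ha."
            "ConfiguredFailoverProxyProvider",
    }
    for nn_id, addr in namenodes.items():
        conf[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = addr
    return conf

# Applying to a running SparkContext (hypothetical `sc`):
# hconf = sc._jsc.hadoopConfiguration()
# for k, v in ha_client_conf("ns1", {"nn1": "host1:8020",
#                                    "nn2": "host2:8020"}).items():
#     hconf.set(k, v)
```

The same key/value pairs can equally live in an `hdfs-site.xml` shipped with the Spark application.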

12

I'm getting the following error when attempting to write to HDFS as part of my multi-threaded application: could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s...
Auscultation asked 15/3, 2016 at 15:42

4

I have been using Cloudera's Hadoop (0.20.2). With this version, if I put a file into the file system but the directory structure did not exist, it automatically created the parent directories: ...
Renie asked 7/5, 2014 at 16:41
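
In Hadoop 2 and later the shell no longer creates missing parents implicitly; instead `-mkdir` gained a `-p` flag (like `mkdir -p`) to request them explicitly. A sketch of the call (assumes the `hadoop` CLI is on the PATH):

```python
import subprocess

def hdfs_mkdir_command(path, parents=True):
    """Build the `hadoop fs -mkdir` command line for `path`."""
    cmd = ["hadoop", "fs", "-mkdir"]
    if parents:
        cmd.append("-p")   # create missing parent directories
    cmd.append(path)
    return cmd

# On a machine with Hadoop configured:
# subprocess.run(hdfs_mkdir_command("/user/x/a/b/c"), check=True)
```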

7

Solved

I have a big distributed file on HDFS, and each time I use sqlContext with the spark-csv package, it first loads the entire file, which takes quite some time. df = sqlContext.read.format('com.databricks...
Larva asked 31/5, 2017 at 6:15

2

Solved

Is there a way to alter the location that a database points to? I tried the following ways: alter database <my_db> set DBPROPERTIES('hive.warehouse.dir'='<new_hdfs_loc>'); alter data...
Spoon asked 1/6, 2015 at 16:11

8

Solved

Are they supposed to be equal? If so, why do the "hadoop fs" commands show the HDFS files while the "hdfs dfs" commands show the local files? Here is the Hadoop version information: Hadoop 2.0.0-mr...
Ossicle asked 9/8, 2013 at 8:37

4

Solved

I'm storing files on HDFS in Snappy compression format. I'd like to be able to examine these files on my local Linux file system to make sure that the Hadoop process that created them has performed...
Macrocosm asked 21/5, 2013 at 16:23

28

I have Hadoop installed in this location: /usr/local/hadoop. Now I want to list the files in DFS. The command I used is: hduser@ubuntu:/usr/local/hadoop$ bin/hadoop dfs -ls This g...
Karlynkarma asked 26/3, 2014 at 7:53

2

Solved

I'm using the example in this link here to copy contents from one directory in HDFS to another directory in HDFS. The copying of the file works, but it creates a new subdirectory in the target vs. just...
Leakage asked 21/5, 2017 at 5:1

© 2022 - 2024 — McMap. All rights reserved.