hadoop2 Questions
2
I'd like to use Spark 2.4.5 (the current stable Spark version) and Hadoop 2.10 (the current stable Hadoop version in the 2.x series). Further I need to access HDFS, Hive, S3, and Kafka.
http://spa...
Andromeda asked 2/3, 2020 at 8:4
4
I am learning about Amazon EMR lately, and as I understand it an EMR cluster lets us choose among 3 node types.
Master, which runs the primary Hadoop daemons like NameNode, JobTracker, and Resource man...
Unasked asked 7/1, 2017 at 8:23
5
As per my understanding, Sqoop is used to import or export tables/data between a database and HDFS, Hive, or HBase.
We can directly import a single table or a list of tables. Internally, MapReduce p...
3
I am trying to copy some files from my hard drive to HDFS, using this command:
hadoop fs -copyFromLocal /home/hduser/Pictures/Event\ ordering/* input/
Is this the correct syntax?
PS: I...
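The backslash escape in the command above is indeed the right way to handle the space in the directory name. As a local sketch of the same quoting rule (made-up paths, with plain `cp` standing in for the HDFS copy):

```shell
# Local sketch with made-up paths; `cp` stands in for
# `hadoop fs -copyFromLocal`, but the quoting rule is identical.
mkdir -p "/tmp/Event ordering" /tmp/input
touch "/tmp/Event ordering/pic1.jpg" "/tmp/Event ordering/pic2.jpg"
# Escape the space (or quote the directory part); the * must stay
# unquoted so the shell can expand it before the command runs.
cp /tmp/Event\ ordering/* /tmp/input/
ls /tmp/input
```

Note that it is the shell, not Hadoop, that expands the glob, so the local path is resolved before `-copyFromLocal` ever sees it; the relative `input/` target is resolved against the user's HDFS home directory and must already exist.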
3
Solved
Summary
A stock Hadoop 2.6.0 install gives me No FileSystem for scheme: s3n. Adding hadoop-aws.jar to the classpath now gives me ClassNotFoundException: org.apache.hadoop.fs.s3a.S3AFileSystem.
Detai...
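A common resolution for this pair of symptoms (hedged sketch; the jar paths and versions below are placeholders that must match the installed release) is that S3AFileSystem lives in hadoop-aws, which in turn depends on the matching aws-java-sdk jar, so both have to be on the classpath:

```shell
# Placeholder location and versions; in a stock distribution both jars
# ship under share/hadoop/tools/lib.
HADOOP_TOOLS_LIB="/opt/hadoop/share/hadoop/tools/lib"
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$HADOOP_TOOLS_LIB/hadoop-aws-2.6.0.jar:$HADOOP_TOOLS_LIB/aws-java-sdk-1.7.4.jar"
# Then retry the access, e.g.: hadoop fs -ls s3a://some-bucket/
```

Mixing a hadoop-aws jar from a different Hadoop version than the one installed is a frequent cause of the ClassNotFoundException stage of this error.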
2
Solved
I managed to launch a Spark application on YARN. However, memory usage is kind of weird, as you can see below:
https://i.sstatic.net/f89UP.jpg
What does memory reserved mean? How can I manage to ef...
Whiggism asked 17/2, 2015 at 16:42
2
Solved
I have Hadoop 2.6.0 installed on my Ubuntu 14.04 LTS machine. I am able to successfully connect to http://localhost:50070/.
I am trying to connect to http://localhost:50030/. I have the following in...
2
I am seeing this in the logs of the data nodes. It is probably because I am copying 5 million files into HDFS:
java.lang.IllegalStateException: com.google.protobuf.InvalidProtocolBufferException: P...
2
Solved
I am using Hadoop version 2.3.0. Sometimes when I execute a MapReduce job, the errors below are displayed:
14/08/10 12:14:59 INFO mapreduce.Job: Task Id : attempt_1407694955806_0002_m_000...
Gregorygregrory asked 10/8, 2014 at 19:23
7
I have installed Hadoop 2.6.0 and I'm playing around with it. I'm trying the pseudo-distributed setup, following the instructions at http://hadoop.apache.org/docs/current/hadoop-project-dist...
4
Solved
When I try to start DFS using:
start-dfs.sh
I get an error saying:
14/07/03 11:03:21 WARN util.NativeCodeLoader: Unable to load
native-hadoop library for your platform... using builtin-java cl...
Honestly asked 2/7, 2014 at 7:7
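For the record, this warning is usually harmless: Hadoop simply falls back to its pure-Java implementations. A hedged sketch of the usual suppression, assuming a stock install layout, is to point the JVM at the bundled native libraries (typically in hadoop-env.sh):

```shell
# Assumed install location; adjust to the real Hadoop home.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"
export HADOOP_COMMON_LIB_NATIVE_DIR="$HADOOP_HOME/lib/native"
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"
```

On some platforms the bundled binaries reportedly do not match the host architecture, in which case the warning persists until libhadoop is rebuilt locally.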
4
Solved
What is the difference between a ring (circular) buffer and a queue? Both are FIFO, so in what scenarios should I use a ring buffer over a queue, and why?
Relevance to Hadoop
The map phase uses...
Pegpega asked 16/4, 2014 at 13:53
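In short: both are FIFO, but a ring buffer has fixed capacity and overwrites its oldest slot when full, so it never allocates after start-up, whereas a queue grows without bound. A minimal bash sketch of the wrap-around behaviour:

```shell
# Capacity-3 ring buffer: the write index wraps with modulo arithmetic,
# so the 4th and 5th writes land on top of the 1st and 2nd slots.
cap=3
i=0
ring=()
for v in a b c d e; do
  ring[$((i % cap))]=$v
  i=$((i + 1))
done
echo "${ring[@]}"   # prints: d e c
```

This fixed-size, no-allocation property is what the map task's in-memory output buffer (sized by mapreduce.task.io.sort.mb) relies on: collection keeps writing into the circle while the spill thread drains it behind the write index.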
2
Solved
After reading the Apache Hadoop documentation, I have some confusion about the responsibilities of the secondary NameNode and the checkpoint node.
I am clear on the NameNode's role and responsibilities:...
Huckaback asked 17/8, 2015 at 13:12
2
I am getting the error below with respect to the container while submitting a Spark application to YARN. The Hadoop (2.7.3) / Spark (2.1) environment is running in pseudo-distributed mode on a single node ...
Duma asked 11/4, 2017 at 13:7
4
I know that from the terminal, one can use the find command to find files, such as:
find . -type d -name "*something*" -maxdepth 4
But when I am in the Hadoop file system, I have not found a way ...
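For reference, Hadoop 2.7 and later ship a native find for HDFS (supporting -name/-iname but, as far as I know, no -maxdepth); on older releases the usual workaround is a recursive listing piped through grep. A sketch, with a canned listing standing in for real `hadoop fs -ls -R` output:

```shell
# Hadoop 2.7+:  hdfs dfs -find / -name "*something*"
# Older releases: recursive listing filtered with grep. The variable
# below simulates `hadoop fs -ls -R /` output for the sake of the sketch.
listing='drwxr-xr-x   - hduser supergroup      0 2015-05-01 10:00 /data/something_dir
-rw-r--r--   1 hduser supergroup     10 2015-05-01 10:00 /data/other.txt'
echo "$listing" | grep "something"
```

The grep variant matches anywhere in the listing line (permissions, owner, date included), so a pattern anchored on the path portion is safer for real use.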
2
Solved
When and where does HDFS create the .Trash folder?
Is there any rule or logic behind it, any reference?
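For what it's worth: the .Trash directory is created lazily, under the deleting user's HDFS home (/user/&lt;username&gt;/.Trash), the first time that user deletes something after trash has been enabled. Trash is off until fs.trash.interval is set to a positive number of minutes in core-site.xml:

```xml
<!-- core-site.xml: a positive interval enables trash; HDFS then creates
     /user/<username>/.Trash on that user's first delete. -->
<property>
  <name>fs.trash.interval</name>
  <value>1440</value> <!-- keep trashed files for 24 hours -->
</property>
```

Deletes issued with `hadoop fs -rm -skipTrash` bypass this mechanism entirely and never touch the .Trash directory.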
7
I'm trying to use Hive (0.13)'s msck repair table command to recover partitions, but it only lists the partitions not added to the metastore instead of also adding them.
Here's the output o...
2
I'm a Chinese student and a beginner with Hadoop 2.7.1. I would appreciate it if you could solve my problem.
When I recently ran the Hadoop WordCount example in pseudo-distributed mode, it got stuck at map 0% a...
Brottman asked 14/5, 2017 at 9:17
1
Solved
Apache Spark supposedly supports Facebook's Zstandard compression algorithm as of Spark 2.3.0 (https://issues.apache.org/jira/browse/SPARK-19112), but I am unable to actually read a Zstandard-compr...
Hodgkins asked 15/6, 2018 at 2:16
1
How do I troubleshoot and recover a lost node in my long-running EMR cluster?
The node stopped reporting a few days ago. The host seems to be fine, and HDFS too. I noticed the issue only from the H...
2
Solved
How to read parquet files using `ssc.fileStream()`? What are the types passed to `ssc.fileStream()`?
My understanding of Spark's fileStream() method is that it takes three types as parameters: Key, Value, and Format. In the case of text files, the appropriate types are LongWritable, Text, and TextInp...
Segarra asked 15/2, 2016 at 15:49
1
Solved
Doing a quick test of the form
testfunc() {
    hadoop fs -rm /test001.txt
    hadoop fs -touchz /test001.txt
    hadoop fs -setfattr -n trusted.testfield -v $(date +"%T") /test001.txt
    hadoop fs -mv /test...
1
I'm migrating data to Hive 1.2, and I realized that, by default, I'm no longer allowed to use reserved words as column names. If you want to use reserved words, you need to explicitly set the below...
Lunette asked 11/1, 2016 at 18:12
2
I have a Spark job where I am doing an outer join between two data frames.
The size of the first data frame is 260 GB; the file format is text, split into 2200 files; and the size of the second data fram...
Joung asked 15/10, 2017 at 11:16
1
Solved
I'm running on Windows 10 with janusgraph-0.2.0-hadoop2.
I have put the winutils.exe in the bin folder.
P:\Software\DB\NoSQL\janusgraph-0.2.0-hadoop2\bin>gremlin-server.bat
Error: Could not fi...
Latvina asked 14/3, 2018 at 18:16
© 2022 - 2024 — McMap. All rights reserved.