hadoop2 Questions
3
Solved
I installed Spark on Windows, but it fails to run, showing the error below:
<console>:16: error: not found: value sqlContext
import sqlContext.implicits._
^
<console>:16: error: not fo...
Depersonalization asked 19/4, 2016 at 13:34
3
I am working on a Hadoop project, and after many visits to various blogs and reading the documentation, I realized I need to use the secondary sort feature provided by the Hadoop framework.
My input format i...
Similarity asked 4/8, 2016 at 16:54
2
I have a Hadoop/YARN cluster set up on AWS, with one master and 3 slaves. I have verified that I have 3 live nodes on ports 50070 and 8088. I tested a Spark job in client deploy-mode, and everything...
Implement asked 28/5, 2017 at 19:36
1
Hi, I have output from my Spark data frame which creates a folder structure with many part files.
Now I have to merge all the part files inside the folder and rename that one file as the folder path n...
Pelagias asked 18/10, 2017 at 14:17
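A minimal sketch of one common way to do the merge from Scala with the Hadoop FileSystem API, assuming Hadoop 2.x (FileUtil.copyMerge was removed in Hadoop 3) and placeholder paths:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}

val conf = new Configuration()
val fs = FileSystem.get(conf)
// Placeholder paths: the directory holding the part-* files and the single target file.
val srcDir = new Path("/user/output/result")
val dstFile = new Path("/user/output/result.csv")
// copyMerge concatenates every file under srcDir into dstFile on HDFS;
// pass deleteSource = true to drop the original part files afterwards.
FileUtil.copyMerge(fs, srcDir, fs, dstFile, false, conf, null)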
1
Solved
My first attempt was:
CREATE TABLE t1 (
a string )
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE ;
But the result of that is:
CREATE TABLE t1 (
a string )
ROW FORMAT DE...
0
Installation info:
Hadoop version: 2.6.5
Spark version: 2.1.0
Kerberos: enabled
I am trying to get the Spark context in YARN mode with Kerberos authentication, but I get the exception below.
Code:
publ...
Fridafriday asked 9/10, 2017 at 9:49
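The exception is cut off above; for reference only, a commonly used login pattern before creating the SparkContext looks roughly like the sketch below. The principal and keytab path are placeholders, and this is not the asker's code:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

val conf = new Configuration()
conf.set("hadoop.security.authentication", "kerberos")
UserGroupInformation.setConfiguration(conf)
// Log in from a keytab so long-running YARN jobs do not depend on a ticket cache.
UserGroupInformation.loginUserFromKeytab("user@EXAMPLE.COM", "/etc/security/keytabs/user.keytab")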
2
YARN uses the concept of a virtual core to manage CPU resources. What is the benefit of using virtual cores; is there some reason YARN uses vcores?
Devaughn asked 6/3, 2017 at 23:39
7
Solved
Hadoop is consistent and partition tolerant, i.e. it falls under the CP category of the CAP theorem.
Hadoop is not available because all the nodes are dependent on the name node. If the name node ...
Portly asked 14/11, 2013 at 5:47
1
I am submitting a job to YARN (on Spark 2.1.1 + Kafka 0.10.2.1) which connects to a secured HBase cluster. This job performs just fine when I am running in "local" mode (spark.master=local[*]).
H...
Underprop asked 30/5, 2017 at 14:53
1
Solved
I am trying to list the applications run on a Hadoop cluster. I can filter the list by application status as follows:
>yarn application -list -appStates FINISHED
But that still pulls up ...
Nominee asked 19/5, 2017 at 1:29
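The CLI only filters on -appStates; for finer filtering, one option (not the command-line tool the question uses) is the YARN client API, sketched below for Hadoop 2.x. The "my-job" name filter is just a placeholder for whatever extra condition is needed:

import java.util.EnumSet
import scala.collection.JavaConverters._
import org.apache.hadoop.yarn.api.records.YarnApplicationState
import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration

val yarn = YarnClient.createYarnClient()
yarn.init(new YarnConfiguration())
yarn.start()
// Ask the ResourceManager only for FINISHED applications...
val finished = yarn.getApplications(EnumSet.of(YarnApplicationState.FINISHED)).asScala
// ...then filter further on the client side, e.g. by application name.
finished.filter(_.getName.contains("my-job"))
  .foreach(r => println(s"${r.getApplicationId}\t${r.getName}\t${r.getFinalApplicationStatus}"))
yarn.stop()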
2
I'm a dummy on Ubuntu 16.04, desperately attempting to make Spark work.
I've tried to fix my problem using the answers found here on stackoverflow but I couldn't resolve anything.
Launching spark w...
Triceratops asked 13/10, 2016 at 8:0
1
I am trying to set up a Hadoop cluster in a single VM (for simplicity) using Cloudera Manager 5.9. Below are the details of my environment:
Host OS -> Windows 10
Virtualization software -> ...
Adellaadelle asked 17/12, 2016 at 17:23
2
Solved
I have a .txt file as follows:
This is xyz
This is my home
This is my PC
This is my room
This is ubuntu PC xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxxxxxxxxxxxxxxxxxxx
(i...
1
Is there a way to find the name of the user who killed the Hadoop job?
I have no root access on the cluster's Hadoop 2.6.0 nodes, so I can only use the Hadoop command-line tools and scrutinize the log...
Octave asked 12/10, 2015 at 9:42
5
Solved
I want to use Apache YARN as a cluster and resource manager for running a framework where resources would be shared across different tasks of the same framework. I want to use my own distributed off...
Maudiemaudlin asked 2/3, 2017 at 8:6
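A rough sketch of the client-side half of that, using the Hadoop 2.x YarnClient API; the application name and AM launch command are placeholders, and the ApplicationMaster itself (which would negotiate further containers via AMRMClient) is not shown:

import java.util.Collections
import org.apache.hadoop.yarn.api.records.{ContainerLaunchContext, Resource}
import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration
import org.apache.hadoop.yarn.util.Records

val yarn = YarnClient.createYarnClient()
yarn.init(new YarnConfiguration())
yarn.start()

// Ask the ResourceManager for a new application and fill in its submission context.
val app = yarn.createApplication()
val ctx = app.getApplicationSubmissionContext
ctx.setApplicationName("my-offheap-framework")

// Command that starts your ApplicationMaster container (placeholder script).
val amSpec = Records.newRecord(classOf[ContainerLaunchContext])
amSpec.setCommands(Collections.singletonList("./launch_am.sh"))
ctx.setAMContainerSpec(amSpec)

// Memory and vcores for the AM container itself.
val res = Records.newRecord(classOf[Resource])
res.setMemory(1024)
res.setVirtualCores(1)
ctx.setResource(res)

yarn.submitApplication(ctx)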
2
Solved
I am not able to understand what this DISTRIBUTE BY clause does in Hive. I know the definition, which says that if we have DISTRIBUTE BY (city), this would send each city to a different reducer, but I am ...
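One way to see the effect without a full Hive setup: Spark SQL accepts the same clause, so the sketch below (hypothetical city/amount data, Spark 2.x) tags each row with its partition id. Rows sharing a city land in the same partition, but nothing within a partition is sorted or aggregated:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.spark_partition_id

val spark = SparkSession.builder().appName("distribute-by-demo").master("local[4]").getOrCreate()
import spark.implicits._

Seq(("NY", 10), ("NY", 20), ("SF", 5), ("SF", 7), ("LA", 3))
  .toDF("city", "amount")
  .createOrReplaceTempView("sales")

// DISTRIBUTE BY only decides which reducer/partition each row is sent to.
val distributed = spark.sql("SELECT city, amount FROM sales DISTRIBUTE BY city")
distributed.withColumn("partition", spark_partition_id()).show()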
1
Problem
I'm trying to install pseudo-distributed CDH without the use of CDM. Everything "works" via the console. However, the second I begin using Hue, I receive an error when trying to work with ...
Arachnid asked 12/1, 2016 at 8:5
3
Solved
I'm trying to run
pyspark --master yarn
Spark version: 2.0.0
Hadoop version: 2.7.2
The Hadoop YARN web interface started successfully.
This is what happens:
16/08/15 10:00:12 DEBUG Client: ...
Duggan asked 15/8, 2016 at 10:27
2
Solved
This is a very simple question: in Spark, broadcast can be used to send variables to executors efficiently. How does this work?
More precisely:
when are values sent: as soon as I call broadcas...
Larue asked 18/11, 2016 at 20:30
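A minimal sketch of the mechanics, assuming Spark 2.x with the default TorrentBroadcast: broadcast() stores the serialized value with the driver's BlockManager and returns a handle; each executor fetches the blocks lazily the first time a task there reads bc.value, and keeps them cached locally afterwards.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("broadcast-demo").master("local[2]").getOrCreate()
val sc = spark.sparkContext

// The lookup table lives on the driver; broadcast() returns a cheap handle.
val lookup = Map("a" -> 1, "b" -> 2)
val bc = sc.broadcast(lookup)

// bc.value inside the closure triggers the (one-time per executor) fetch.
val counts = sc.parallelize(Seq("a", "b", "c")).map(x => bc.value.getOrElse(x, 0)).collect()
println(counts.mkString(","))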
2
Solved
I want to implement a Maven project that helps me unit test a Hadoop MapReduce job. My biggest problem is defining the Maven dependencies to be able to make use of the test classes: MiniDFSCluster...
Adley asked 3/7, 2014 at 12:44
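For reference, the usual test-scope dependency is org.apache.hadoop:hadoop-minicluster (matching the cluster's Hadoop version); with that on the classpath, a bare-bones in-process HDFS test looks roughly like this sketch:

import java.io.File
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hdfs.MiniDFSCluster

val conf = new Configuration()
conf.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, new File("target/hdfs").getAbsolutePath)
val cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build()
try {
  // The mini cluster exposes a real, in-process HDFS FileSystem to test against.
  val fs = cluster.getFileSystem
  val p = new Path("/test/hello.txt")
  val out = fs.create(p)
  out.writeUTF("hello")
  out.close()
  assert(fs.exists(p))
} finally {
  cluster.shutdown()
}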
2
Solved
Can anyone please tell me: if I am using a Java application to request file upload/download operations to HDFS with a NameNode HA setup, where does this request go first? I mean, how would the clien...
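A sketch of the client side, assuming an HA nameservice called "mycluster" (a placeholder) is defined in hdfs-site.xml: the client never targets a specific NameNode host; the configured failover proxy provider (typically ConfiguredFailoverProxyProvider) tries the listed NameNodes and keeps using whichever one answers as active.

import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Picks up core-site.xml / hdfs-site.xml from the classpath, including
// dfs.nameservices, dfs.ha.namenodes.mycluster and the failover proxy provider.
val conf = new Configuration()
val fs = FileSystem.get(new URI("hdfs://mycluster"), conf)
fs.copyFromLocalFile(new Path("/tmp/local.txt"), new Path("/user/data/local.txt"))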
0
I'm trying to run Spark on a working Hadoop cluster. When I run my Python job with a small dataset, everything seems to work fine. However, when I use a larger dataset, the task fails and in th...
Deathlike asked 4/9, 2016 at 12:38
2
I know that HDFS is write once and read many times.
Suppose I want to update a file in HDFS; is there any way to do it?
Thank you in advance!
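In-place edits are not possible, but Hadoop 2.x does allow appending to an existing file; anything else means rewriting the file. A small sketch with a placeholder path:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val conf = new Configuration()
val fs = FileSystem.get(conf)
// Append to an existing file; the rest of the file is untouched.
val out = fs.append(new Path("/user/data/log.txt"))
out.write("one more line\n".getBytes("UTF-8"))
out.close()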