cloudera-cdh Questions

1

I'm evaluating multiple big data tools. One of them is, of course, Impala. I would like to start an Impala cluster by manually starting the processes on the cluster nodes, as I'm currently doing for Spark, ...
Huihuie asked 22/8, 2016 at 20:3
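
In the spirit of the question above: an Impala cluster can in principle be brought up without Cloudera Manager by launching the three daemons by hand — statestored and catalogd on one node, impalad on every worker. A minimal sketch of the command lines; the flag names are Impala's, but the hostname and exact flag set here are placeholder assumptions, not a complete configuration:

```python
# Sketch: command lines for starting the Impala daemons by hand.
# The statestore hostname is a placeholder assumption.

STATESTORE_HOST = "master-node"  # hypothetical hostname

def impala_commands(statestore_host):
    """Return argv lists for statestored, catalogd, and impalad."""
    statestored = ["statestored", "-state_store_port=24000"]
    catalogd = ["catalogd", f"-state_store_host={statestore_host}"]
    impalad = [
        "impalad",
        f"-state_store_host={statestore_host}",
        f"-catalog_service_host={statestore_host}",
    ]
    return statestored, catalogd, impalad

s, c, i = impala_commands(STATESTORE_HOST)
print(" ".join(i))
```

Each list would then be handed to the node's service runner (or a subprocess call) on the appropriate host.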

4

Solved

We enabled NameNode High Availability through Cloudera Manager, using Cloudera Manager >> HDFS >> Actions >> Enable High Availability >> selected the Standby NameNode & JournalNodes. Then nameserv...
Heal asked 31/7, 2014 at 15:19
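
For reference, enabling HA this way generates hdfs-site.xml entries along the following lines; the nameservice name and hostnames below are placeholder assumptions, not values from the question:

```xml
<!-- Sketch of HA-related hdfs-site.xml entries generated by Cloudera
     Manager; "nameservice1" and the hostnames are placeholders. -->
<property>
  <name>dfs.nameservices</name>
  <value>nameservice1</value>
</property>
<property>
  <name>dfs.ha.namenodes.nameservice1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.nameservice1.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.nameservice1.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
```

Once HA is active, `hdfs haadmin -getServiceState nn1` reports which NameNode is currently active or standby.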

1

Solved

How can I check whether a file exists in an HDFS location, using Oozie? In my HDFS location I will get a file like test_08_01_2016.csv at 11 PM, on a daily basis. I want to check whether thi...
Novanovaculite asked 19/8, 2016 at 7:46
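
A common way to do this in Oozie is a decision node using the fs:exists EL function. The fragment below is a hedged sketch: the node names and the directory path are placeholders, with the filename taken from the question:

```xml
<!-- Sketch: Oozie decision node that checks whether an HDFS file
     exists. Node names and the directory path are placeholders. -->
<decision name="check-file">
  <switch>
    <case to="process-file">
      ${fs:exists('/data/input/test_08_01_2016.csv')}
    </case>
    <default to="end"/>
  </switch>
</decision>
```

In practice the date portion of the filename would be built with a coordinator parameter or EL date functions rather than hard-coded.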

2

Solved

I am able to print the data in two RDDs with the code below: usersRDD.foreach(println) empRDD.foreach(println) I need to compare the data in the two RDDs. How can I iterate and compare field data in one RDD ...
Disputant asked 5/1, 2015 at 15:45
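
For a comparison like this, Spark's usual tools are subtract (elements present in one RDD but not the other) and join on a key. Since a snippet can't spin up a real SparkContext, here is the same logic in plain Python over lists of (key, value) pairs; the (id, name) record layout is an assumption for illustration:

```python
# Plain-Python stand-in for comparing two keyed datasets, mirroring
# what rdd1.subtract(rdd2) and rdd1.join(rdd2) do in Spark.
# The (id, name) record layout is a hypothetical example.

users = [(1, "alice"), (2, "bob"), (3, "carol")]
emps  = [(2, "bob"), (3, "charlie")]

# subtract: pairs present in users but not in emps
only_in_users = [p for p in users if p not in set(emps)]

# join: match records by key, yielding (key, (left, right))
emp_by_key = dict(emps)
joined = [(k, (v, emp_by_key[k])) for k, v in users if k in emp_by_key]

print(only_in_users)  # [(1, 'alice'), (3, 'carol')]
print(joined)         # [(2, ('bob', 'bob')), (3, ('carol', 'charlie'))]
```

With real RDDs the equivalents are `usersRDD.subtract(empRDD)` and, after keying both RDDs, `usersRDD.join(empRDD)` followed by a filter on the paired values.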

4

Solved

I have a Scala implicit class from the RecordService API, which I want to use in a Java file. package object spark { implicit class RecordServiceContext(ctx: SparkContext) { def recordServiceTex...
Smiley asked 8/4, 2016 at 10:58

1

We are trying to do a proof of concept on Informatica Big Data Edition (not the cloud version), and I have seen that we might be able to use HDFS and Hive as source and target. But my question is, does Inf...

2

I am trying to submit a Spark job to the CDH YARN cluster via the following commands. I have tried several combinations and none of them work... I now have all the POI jars located in both my lo...
Doura asked 24/7, 2015 at 4:20
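
When extra jars (such as the POI jars mentioned above) fail to reach the driver or executors, the usual fix is to pass them to spark-submit as a single comma-separated list via --jars. A small sketch that only assembles such a command line; the jar paths, main class, and application jar are placeholder assumptions:

```python
# Sketch: assemble a spark-submit command with extra jars.
# The jar paths, main class, and app jar name are placeholders.

def spark_submit_cmd(main_class, app_jar, extra_jars, master="yarn"):
    """--jars takes one comma-separated list, not repeated flags."""
    return [
        "spark-submit",
        "--master", master,
        "--class", main_class,
        "--jars", ",".join(extra_jars),
        app_jar,
    ]

cmd = spark_submit_cmd(
    "com.example.Main",  # hypothetical main class
    "app.jar",
    ["/opt/libs/poi.jar", "/opt/libs/poi-ooxml.jar"],
)
print(" ".join(cmd))
```

A frequent mistake is repeating --jars once per jar or using a colon separator; both silently leave jars off the classpath.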

3

Solved

I am running the following code in pyspark: In [14]: conf = SparkConf() In [15]: conf.getAll() [(u'spark.eventLog.enabled', u'true'), (u'spark.eventLog.dir', u'hdfs://ip-10-0-0-220.ec2.interna...
Satirical asked 1/7, 2015 at 21:13
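
conf.getAll() in PySpark returns a list of (key, value) tuples rather than a dict, so inspecting one property is easier after converting it. A small sketch; the tuple list below imitates the shape of the question's output with a hypothetical log directory, not a live SparkConf:

```python
# conf.getAll() in PySpark returns [(key, value), ...]; a dict makes
# individual properties easy to look up. The pairs below imitate the
# question's output shape; the log dir value is hypothetical.
pairs = [
    ("spark.eventLog.enabled", "true"),
    ("spark.eventLog.dir", "hdfs:///var/log/spark"),  # hypothetical
]
conf_map = dict(pairs)
print(conf_map["spark.eventLog.enabled"])  # true
```

Note also that SparkConf settings only take effect if applied before the SparkContext is created; changing conf afterwards does not reconfigure a running context.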

3

Solved

I installed CDH 5.4 on a single node following the instructions here; also, I put the Hive metastore in local mode using these instructions, and everything works perfectly, except when I tried to connec...
Bulwerlytton asked 1/5, 2015 at 15:48

1

Solved

I am evaluating Hive and need to do some string field concatenation after a GROUP BY. I found a function named "concat_ws", but it looks like I have to explicitly list all the values to be concatenate...
Seaden asked 3/5, 2015 at 5:32
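
The usual Hive answer here is to combine concat_ws with the collect_set (or collect_list) aggregate — roughly `SELECT id, concat_ws(',', collect_set(name)) FROM t GROUP BY id` — so the values never need to be listed explicitly. The same aggregation in plain Python, over a hypothetical (id, name) table:

```python
# Plain-Python equivalent of Hive's
#   SELECT id, concat_ws(',', collect_set(name)) FROM t GROUP BY id
# The (id, name) rows are a hypothetical example table.
from collections import defaultdict

rows = [(1, "a"), (1, "b"), (2, "c"), (1, "b")]

groups = defaultdict(list)
for key, name in rows:
    if name not in groups[key]:   # collect_set drops duplicates
        groups[key].append(name)

result = {key: ",".join(names) for key, names in groups.items()}
print(result)  # {1: 'a,b', 2: 'c'}
```

collect_set deduplicates (like the membership check above); collect_list keeps duplicates and arrival order.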

1

Solved

In which directory is Hadoop installed in the Cloudera distribution? Is it /usr/bin/hadoop? [cloudera@quickstart opt]$ which hadoop /usr/bin/hadoop I know the software packages are to be installed i...
Starlin asked 7/4, 2015 at 21:47
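
/usr/bin/hadoop in CDH is typically just a thin launcher; one way to trace where a binary really lives is to resolve its symlink chain, the same idea as `readlink -f /usr/bin/hadoop`. Because the real layout differs between package and parcel installs, the sketch below uses a throwaway symlink in a temp directory as a stand-in:

```python
# Sketch: resolve a symlink to find the real install location — the
# same idea as `readlink -f /usr/bin/hadoop`. A temp-dir symlink
# stands in for the real CDH layout, which varies by install type.
import os
import tempfile

def resolve(path):
    """Follow symlinks to the canonical location, like readlink -f."""
    return os.path.realpath(path)

tmp = tempfile.mkdtemp()
target = os.path.join(tmp, "hadoop-real")
open(target, "w").close()
link = os.path.join(tmp, "hadoop")   # stand-in for /usr/bin/hadoop
os.symlink(target, link)

print(resolve(link))  # ends with .../hadoop-real
```

On a real CDH node the chain usually ends under /usr/lib/hadoop (package installs) or under the parcels directory (parcel installs).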

1

Solved

In our YARN cluster, which is 80% full, we are seeing some of the YARN NodeManagers marked as UNHEALTHY. After digging into the logs, I found it's because disk space is 90% full for the data dir. With fo...
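
The UNHEALTHY marking at 90% disk usage matches the NodeManager disk health checker's default threshold, which can be raised in yarn-site.xml. A sketch; the value 95 is an example, not a recommendation:

```xml
<!-- yarn-site.xml: raise the disk-utilization percentage at which the
     NodeManager marks a local dir (and hence itself) unhealthy.
     The value 95 is an example, not a recommendation. -->
<property>
  <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
  <value>95</value>
</property>
```

Freeing space on the data dirs is of course the more durable fix; raising the threshold only buys headroom.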

1

I'm working with CDH 5.1 now. It starts normal Hadoop jobs via YARN, but Hive still works with mapred. Sometimes a big query will hang for a long time and I want to kill it. I can find this big job by ...
Anaesthesia asked 12/2, 2015 at 6:28
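
Once the job id of the hanging query is known (for example from the ResourceManager UI), the underlying MapReduce job can be killed from the command line. A sketch that only assembles the command; the job id below is a placeholder:

```python
# Sketch: build the kill command for the MapReduce job backing a Hive
# query. The job id is a placeholder, not a real id.

def kill_job_cmd(job_id):
    """`mapred job -kill <id>` (the older `hadoop job -kill <id>` also works)."""
    return ["mapred", "job", "-kill", job_id]

cmd = kill_job_cmd("job_1408000000000_0001")  # placeholder id
print(" ".join(cmd))
```

Killing the MapReduce job causes the Hive client to fail the query rather than hang indefinitely.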

1

Solved

I could not find the latest MRUnit (1.1.0) in the Cloudera repository. The one available is 0.8.0-incubating. The following is my pom: <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http:...
Johathan asked 17/10, 2014 at 19:58
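
MRUnit 1.1.0 is published to Maven Central rather than the Cloudera repository and, importantly, requires a classifier selecting the Hadoop API generation. A sketch of the dependency for Hadoop 2 / YARN:

```xml
<!-- MRUnit 1.1.0 from Maven Central; the hadoop2 classifier selects
     the YARN/MRv2 build (use hadoop1 for the old API). -->
<dependency>
  <groupId>org.apache.mrunit</groupId>
  <artifactId>mrunit</artifactId>
  <version>1.1.0</version>
  <classifier>hadoop2</classifier>
  <scope>test</scope>
</dependency>
```

Without the classifier, Maven cannot resolve the artifact at all, which looks exactly like the version being absent from the repository.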

0

Recently we upgraded to YARN with CDH 5 (version 2.3.0-cdh5.1.3, r8e266e052e423af592871e2dfe09d54c03f6a0e8). I was trying to access the logs of a failed job from the ResourceManager by clicking logs...
Osage asked 6/10, 2014 at 19:6
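
Logs of finished or failed applications are only reachable from the ResourceManager links (or via `yarn logs -applicationId <id>`) when log aggregation is enabled; otherwise they stay on the individual NodeManager local disks and the links go dead. The yarn-site.xml switch:

```xml
<!-- yarn-site.xml: enable log aggregation so logs of finished
     applications can be fetched with `yarn logs -applicationId <id>`. -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
```

After enabling it (and restarting the NodeManagers), logs of newly run applications are aggregated to HDFS; logs of jobs that ran before the change remain unavailable.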

1

I am unable to pass the equality check using the Hive query below. I have 3 tables and I want to join them. I am trying as below, but get the error: FAILED: Error in semantic analysis: Line 3:40 ...
Standby asked 13/9, 2014 at 8:7

1

Solved

After upgrading our small Cloudera Hadoop cluster to CDH 5, deleting files no longer frees up available storage space. Even though we delete more data than we add, the file system keeps filling up....
Oberstone asked 14/4, 2014 at 10:52
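
The classic cause of this after a CDH 5 upgrade is the HDFS trash: deleted files are moved to a per-user .Trash directory and only purged after fs.trash.interval minutes, so space is not freed immediately (`hdfs dfs -rm -skipTrash` bypasses it). The core-site.xml setting:

```xml
<!-- core-site.xml: minutes a deleted file is kept in .Trash before
     being purged; 0 disables the trash entirely. 1440 = 1 day. -->
<property>
  <name>fs.trash.interval</name>
  <value>1440</value>
</property>
```

Running `hdfs dfs -expunge`, or simply waiting out the interval, reclaims the space already sitting in .Trash.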

© 2022 - 2024 — McMap. All rights reserved.