hadoop2 Questions
1
Solved
How to check whether a file exists in an HDFS location, using Oozie?
A file named like test_08_01_2016.csv arrives in my HDFS location at 11 PM every day.
I want to check whether thi...
Novanovaculite asked 19/8, 2016 at 7:46
2
Solved
I've got an issue while querying an ORC-format table
I was trying the query below:
INSERT INTO TABLE <db_name>.<table_name> SELECT FROM <db_name>.<table_name> WHERE CONDITIONS...
Irritate asked 23/2, 2015 at 13:19
0
I have a long-running yarn application (not m/r) with containers that occasionally exceed the yarn memory limit, at which point yarn kills the offending containers. I am finding it difficult to det...
Purposely asked 5/7, 2016 at 18:27
1
I am new to Spark and MLlib and I'm trying to call StreamingKMeans from my java application and I get an exception that I don't seem to understand. Here is my code for transforming my training data...
Meed asked 9/6, 2015 at 16:7
5
Solved
I am a bit confused with the output I get from Mapper.
For example, when I run a simple wordcount program, with this input text:
hello world
Hadoop programming
mapreduce wordcount
lets see if thi...
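As a rough illustration of what a wordcount mapper typically emits for input like this, here is a minimal plain-Python simulation. It is not the Hadoop API: `map_line` is a hypothetical helper, and only the three complete input lines from the excerpt are used because the fourth is truncated.

```python
def map_line(line):
    """Emit one (word, 1) pair per whitespace-separated token, as a wordcount mapper would."""
    return [(word, 1) for word in line.split()]


# The three complete input lines quoted in the question.
lines = ["hello world", "Hadoop programming", "mapreduce wordcount"]

# The mapper is called once per input line; its output is a flat stream of pairs.
mapper_output = [pair for line in lines for pair in map_line(line)]

for word, count in mapper_output:
    print(f"{word}\t{count}")
```

In this sketch every word appears with a count of 1, because a mapper does not aggregate. Summing per word is left to the combiner or reducer.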
0
The Hadoop V1 Java API had a HarFileSystem class for archiving and accessing small files in HDFS. I want to use the same feature in the current version of Hadoop, but cannot find any FileSystem class for the same...
1
Solved
Since Spark runs in memory, what does resource allocation mean in Spark when running on YARN, and how does it contrast with Hadoop's container allocation?
Just curious to know as hadoop's data and compu...
Slavocracy asked 3/5, 2016 at 18:41
3
Solved
I want to load a 1 GB CSV file (10 million records) into HBase. I wrote a MapReduce program for it. My code works fine but takes 1 hour to complete. The last reducer is taking more than half an hour ...
5
Solved
I am trying to reproduce an Amazon EMR cluster on my local machine. For that purpose, I have installed the latest stable version of Hadoop as of now - 2.6.0.
Now I would like to access an S3 bucket...
Boarish asked 19/1, 2015 at 16:23
1
Solved
In hadoop getmerge description
Usage: hdfs dfs -getmerge src localdst [addnl]
My question is: why does getmerge concatenate to the local destination rather than to HDFS itself? This question was ask...
Complexity asked 15/4, 2016 at 6:51
3
I am trying to learn MapReduce but I am a little lost right now.
http://hadoop.apache.org/docs/r2.6.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Usage
Particularl...
1
Solved
I ran the following command on my test Hadoop instance:
hadoop fs -du /test/data/
51179082 153537246 /test/data/9875/2016/02/03
46949272 140847816 /test/data/9875/2016/02/04
I understand du gi...
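In both rows of the output above, the second number is exactly three times the first. That is consistent with the first column being the raw file size and the second being the space consumed across all replicas with a replication factor of 3. The replication factor is an assumption here, since the excerpt does not state it. A small Python sketch to check the ratio (`parse_du_line` is a hypothetical helper):

```python
# The two output rows quoted in the question.
du_lines = [
    "51179082 153537246 /test/data/9875/2016/02/03",
    "46949272 140847816 /test/data/9875/2016/02/04",
]


def parse_du_line(line):
    """Split one du row into (size_bytes, consumed_bytes, path)."""
    size, consumed, path = line.split(maxsplit=2)
    return int(size), int(consumed), path


ratios = {}
for line in du_lines:
    size, consumed, path = parse_du_line(line)
    # Ratio of the second column to the first.
    ratios[path] = consumed / size
    print(path, ratios[path])
```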
3
Solved
The Namenode in the Hadoop architecture is a single point of failure.
How do people who have large Hadoop clusters cope with this problem?
Is there an industry-accepted solution that has worked...
Esperance asked 21/12, 2010 at 17:46
1
Solved
In EMR, is there a way to get a specific value of the configuration given the configuration key using the yarn command?
For example I would like to do something like this
yarn get-config yarn.sch...
Millican asked 7/1, 2016 at 22:31
1
Solved
I don't know how to fix this error:
Vertex failed, vertexName=initialmap, vertexId=vertex_1449805139484_0001_1_00, diagnostics=[Task failed, taskId=task_1449805139484_0001_1_00_000003, diagnostics...
Brandonbrandt asked 12/12, 2015 at 22:20
2
I'm doing some data preparation using a single node hadoop job. The mapper/combiner in my job outputs many keys (more than 5M or 6M) and obviously the job proceeds slowly or even fails. The mapping...
2
Solved
Can you please help me with the scenarios below?
1) While using Hadoop V2, do we use a Secondary NameNode in a production environment?
2) For Hadoop V2, suppose we use multiple NameNodes in active/...
0
I'm running Hive 1.0, trying to compute column statistics using the built-in analyze command. HQL script looks like:
set hive.cbo.enable=true;
set hive.compute.query.using.stats=true;
set hive.stat...
3
Solved
I am migrating my application from Hadoop 1.0.3 to Hadoop 2.2.0, and the Maven build had hadoop-core marked as a dependency. Since hadoop-core is not present for Hadoop 2.2.0, I tried replacing it with ha...
2
I am using Hadoop 2.2. I see that my jobs complete successfully, and I can browse the filesystem to find the output. However, when I browse http://NNode:8088/cluster/apps, I am unable to see any ap...
Aegean asked 1/7, 2014 at 20:1
0
I am trying to use a Spark Streaming application in Java. My Spark application reads a continuous feed from a Hadoop
directory using textFileStream() at 1-minute intervals.
I need to perform Spark ...
Brython asked 3/9, 2015 at 14:22
1
Solved
I'm new to HBase. Currently I'm using the Hortonworks sandbox HDP2. While studying HBase, I came across some questions.
Where does HBase store data?
If it stores data on HDFS, then how does it perform update ...
Pacifier asked 24/8, 2015 at 6:20
2
Solved
I have a small query regarding hadoop data writes
From Apache documentation
For the common case, when the replication factor is three, HDFS’s placement policy is to put one replica on one node ...
2
Solved
I was under the impression that combiners are just like reducers that act on the local map task. That is, a combiner aggregates the results of an individual map task in order to reduce the network bandwidth fo...
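The local aggregation described here can be sketched in plain Python. This is only an illustration of the idea, not the Hadoop Combiner API; `combine` and the sample records are hypothetical.

```python
from collections import Counter


def combine(map_output):
    """Sum the values per key within a single map task's output."""
    combined = Counter()
    for key, value in map_output:
        combined[key] += value
    return sorted(combined.items())


# Hypothetical output of one map task, before any data crosses the network.
map_task_output = [("hello", 1), ("world", 1), ("hello", 1), ("hello", 1)]

combined_output = combine(map_task_output)
print(len(map_task_output), "records before,", len(combined_output), "after")
```

Fewer records leave the map side after combining, which is where the bandwidth saving comes from. The sketch assumes a sum, which is associative and commutative, so applying it locally does not change the final result.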
3
I have followed the guide of michael-noll so far but got stuck here.
hduser@ubuntu:/usr/local/hadoop$ bin/hadoop dfs -copyFromLocal /tmp/gutenberg /user/hduser/gutenberg
DEPRECATED: Use of this sc...