mapreduce Questions

7

I am trying to create a partition for my table in order to update a value. This is my sample data: 1,Anne,Admin,50000,A 2,Gokul,Admin,50000,B 3,Janet,Sales,60000,A I want to update Janet's Departme...
Billiot asked 18/9, 2014 at 6:46

1

Solved

Our Spark aggregation jobs are taking a long time to complete. They are supposed to complete in 5 minutes but are taking 30 to 40 minutes. Dataproc cluster logging says it's trying to sca...

10

[hadoop-1.0.2] → hadoop jar hadoop-examples-1.0.2.jar wordcount /user/abhinav/input /user/abhinav/output Warning: $HADOOP_HOME is deprecated. ****hdfs://localhost:54310/user/abhinav/input 12/04/15...
Hokanson asked 15/4, 2012 at 19:58

9

I am trying to run a hadoop-streaming python job. bin/hadoop jar contrib/streaming/hadoop-0.20.1-streaming.jar -D stream.non.zero.exit.is.failure=true -input /ixml -output /oxml -mapper scrip...
Hydrocarbon asked 2/12, 2010 at 20:56

6

Where is the classpath for hadoop set? When I run the below command it gives me the classpath. Where is the classpath set? bin/hadoop classpath I'm using hadoop 2.6.0
Irrelevance asked 1/2, 2015 at 7:53

8

I am passing input and output folders as parameters to a MapReduce word count program from a web page. I am getting the error below: HTTP Status 500 - Request processing failed; nested exception is java.la...
Ankledeep asked 24/7, 2014 at 3:48

5

Solved

What are the disadvantages of mapreduce? There are lots of advantages of mapreduce. But I would like to know the disadvantages of mapreduce too.
Puritanism asked 3/9, 2013 at 6:47

12

Solved

my configurations are hduser@worker1:/usr/local/hadoop/conf$ jps The program 'jps' can be found in the following packages: * openjdk-6-jdk * openjdk-7-jdk Ask your administrator to install one o...
Chabazite asked 20/10, 2011 at 23:24

4

Solved

The data type of the field is String. I would like to find the length of the longest and shortest value for a field in MongoDB. I have 500000 documents in total in my collection.
Ophir asked 16/10, 2014 at 2:46

5

I am new to hadoop and trying to process wikipedia dump. It's a 6.7 GB gzip compressed xml file. I read that hadoop supports gzip compressed files but can only be processed by mapper on a single jo...
Gretagretal asked 12/4, 2011 at 4:0

4

Solved

I have a large CSV file containing a list of stores, in which one of the fields is ZipCode. I have a separate MongoDB database called ZipCodes, which stores the latitude and longitude for any given ...
Melon asked 6/10, 2010 at 19:9

2

Solved

I have a problem executing MapReduce Python files on Hadoop using Hadoop streaming.jar. I use: Windows 10 64-bit, Python 3.6 (my IDE is Spyder 3.2.6), Hadoop 2.3.0, jdk1.8.0_161. I can get answ...
Fragrant asked 21/2, 2018 at 22:2

3

I've been getting the following error in several cases: 2017-03-23 11:55:10,794 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report...
Illuviation asked 23/3, 2017 at 10:7

5

I'm using Spark to calculate the PageRank of user reviews, but I keep getting a Spark java.lang.StackOverflowError when I run my code on a big dataset (40k entries). When running the code on...
Sverre asked 19/6, 2016 at 16:32

3

I have just learned about MapReduce, so I wondered if there are any advantages in writing const initialValue = 0; if (this.items) { return this.items.filter(function (item) { return item &&...

5

Solved

From any node in a Hadoop cluster, what is the command to identify the running namenode? identify all running datanodes? I have looked through the commands manual and have not found this.
Alston asked 1/6, 2013 at 3:33

5

I need to find connected components for a huge dataset (the graph is undirected). One obvious choice is MapReduce. But I'm a newbie to MapReduce and am quite short of time to pick it up and to code...
Maclaine asked 20/5, 2012 at 21:30

3

Is there a way to do the following in CouchDB? A way to return unique, distinct values by a given key? SELECT DISTINCT field FROM table WHERE key="key1" 'key1' => 'somevalue' 'key1' => 'som...
Schecter asked 28/3, 2011 at 8:58

11

Solved

Are there any dependencies between Spark and Hadoop? If not, are there any features I'll miss when I run Spark without Hadoop?
Iridescence asked 15/8, 2015 at 6:51

4

On the whim of node school, I am trying to use reduce to count the number of times a string is repeated in an array. var fruits = ["Apple", "Banana", "Apple", "Durian", "Durian", "Durian"], obj ...
Dabchick asked 16/7, 2015 at 22:16
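
The counting described in this question can be sketched as a reduce-style fold. This is only an illustration under assumptions: it uses Python rather than the question's JavaScript, the helper name `count_occurrences` is hypothetical, and the fruit list is copied from the excerpt.

```python
from functools import reduce


def count_occurrences(items):
    """Fold a list of strings into a dict mapping each string to its count.

    `count_occurrences` is a hypothetical helper name, not from the question.
    """
    def step(counts, item):
        # Update the accumulator and return it so reduce can pass it along.
        counts[item] = counts.get(item, 0) + 1
        return counts

    # The third argument is the initial accumulator: an empty dict.
    return reduce(step, items, {})


fruits = ["Apple", "Banana", "Apple", "Durian", "Durian", "Durian"]
counts = count_occurrences(fruits)
print(counts)
```

The key point in a fold like this is that the step function must return the accumulator on every call; otherwise the next step receives nothing to build on.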

8

I'm trying to run a very simple task with MapReduce. mapper.py: #!/usr/bin/env python import sys for line in sys.stdin: print line my txt file: qwerty asdfgh zxc Command line to run the job: ...
Muzzleloader asked 27/3, 2017 at 14:6
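
For reference, a pass-through mapper like the one in this excerpt might look as follows in Python 3. This is a sketch under assumptions, not a diagnosis of the question's problem: the function name `identity_mapper` is hypothetical, and an in-memory buffer stands in for the streams so the example runs on its own.

```python
import io


def identity_mapper(lines, out):
    """Write each input line to `out` unchanged, with exactly one newline.

    `identity_mapper` is a hypothetical name used only for this sketch.
    """
    for line in lines:
        # Strip the existing newline first so no blank lines are emitted.
        out.write(line.rstrip("\n") + "\n")


# Demonstration with the sample lines from the excerpt.
sample = io.StringIO("qwerty\nasdfgh\nzxc\n")
result = io.StringIO()
identity_mapper(sample, result)
print(result.getvalue(), end="")
```

In an actual streaming script, the call would presumably be `identity_mapper(sys.stdin, sys.stdout)` after `import sys`.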

3

What is the assertThat() method? How can it be useful? I have seen this method in a MapReduce program in Hadoop. Can anyone briefly explain it?
Bernetta asked 27/8, 2016 at 8:33

2

I am trying to find an algorithm for finding disjoint sets (connected components/union-find) on a large amount of data with Apache Spark. The problem is the amount of data. Even Raw representation of graph vertex...
Soapberry asked 18/5, 2016 at 10:39

3

Solved

Consider this class: case class Person(val firstName: String, val lastName: String, age: Int) val persons = Person("Jane", "Doe", 42) :: Person("John", "Doe", 45) :: Person("Joe", "Doe", 43) :...

10

Solved

I know how to "transform" a simple Java List from Y -> Z, i.e.: List<String> x; List<Integer> y = x.stream() .map(s -> Integer.parseInt(s)) .collect(Collectors.toList(...
Moresque asked 18/9, 2014 at 2:14
