mapreduce Questions

5

Solved

Hey, I'm fairly new to the world of Big Data. I came across this tutorial at http://musicmachinery.com/2011/09/04/how-to-process-a-million-songs-in-20-minutes/. It describes in detail how to run...
Hsining asked 11/6, 2013 at 5:50

4

Solved

I'm a newbie in Hadoop and am trying out the WordCount program. Now, to try out multiple output files, I use MultipleOutputFormat. This link helped me do it: http://hadoop.apache.org/common/do...
Snowdrop asked 16/8, 2010 at 6:42

10

Solved

I'm trying to run a small Spark application and am getting the following exception: Exception in thread "main" java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.&...
Pinnacle asked 5/4, 2016 at 13:7

11

Solved

I am getting: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask while trying to make a copy of a partitioned table using the commands in the Hive console: CR...
Physics asked 25/6, 2012 at 8:4

9

Solved

I have 3 data nodes running, and while running a job I am getting the error given below: java.io.IOException: File /user/ashsshar/olhcache/loaderMap9b663bd9 could only be replicated to 0 ...
Drinkable asked 22/3, 2013 at 13:29

15

Solved

I commonly work with text files of ~20 GB in size and I find myself counting the number of lines in a given file very often. The way I do it now is just cat fname | wc -l, and it takes very long. I...
Tactile asked 3/10, 2012 at 20:42
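A common first fix is to drop the cat and run wc -l fname directly so the file is read only once. If you would rather stay in Python, a minimal sketch (the file name fname is just a placeholder) that counts newline bytes in large chunks might look like this:

```python
# Count lines by scanning the file in large binary chunks and counting
# newline bytes; avoids spawning cat/wc. "fname" is a placeholder path.
def count_lines(path, chunk_size=1024 * 1024):
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += chunk.count(b"\n")
    return total

if __name__ == "__main__":
    print(count_lines("fname"))
```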

3

Solved

Is there something like sys.minint in Python, similar to sys.maxint?
Undersecretary asked 21/5, 2018 at 10:0
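There is no sys.minint. In Python 2 the symmetric lower bound would be -sys.maxint - 1; in Python 3 integers are unbounded and sys.maxsize is only the largest container index, so a sentinel such as float("-inf") is often what people actually want. A short sketch:

```python
import sys

# Python 3: ints are arbitrary precision, so there is no true minimum int.
# sys.maxsize is only the largest container index; its mirror image is:
smallest_index_like = -sys.maxsize - 1

# Python 2's equivalent of a "minint" would have been: -sys.maxint - 1

# For "smaller than anything" comparisons a float sentinel also works:
neg_inf = float("-inf")

print(smallest_index_like, neg_inf)
```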

5

Solved

Recently I've been trying OpenCV out for my graduation project. I've had some success under the Windows environment, and because the Windows package of OpenCV comes with pre-built libraries, I don't ha...
Apulia asked 30/6, 2013 at 2:26

4

Solved

We are migrating from Redshift to Spark. I have a table in Redshift that I need to export to S3. From S3 this will be fed to Apache Spark (EMR). I found there is only one way to export data from ...

8

Solved

My map tasks need some configuration data, which I would like to distribute via the Distributed Cache. The Hadoop MapReduce Tutorial shows the usage of the DistributedCache class, roughly as follo...
Wimmer asked 20/1, 2014 at 16:53

3

Solved

I have a problem with Hadoop MapReduce in R, and in the logs I found this: log4j:WARN No appenders could be found for logger (org.apache.hadoop.ipc.Server). log4j:WARN Please initialize the lo...
Starstudded asked 29/6, 2015 at 7:34

2

Solved

I want to print each step of my "map" after its execution on the console, something like System.out.println("Completed Step one"); System.out.println("Completed Step two"); and so on. Is there ...
Signatory asked 4/8, 2011 at 13:53

4

I'm using pyMongo 1.11 and MongoDB 1.8.2. I'm trying to do a fairly complex Map/Reduce. I prototyped the functions in Mongo and got it working, but when I tried transferring it to Python, I get: -...
Unsparing asked 5/8, 2011 at 21:45
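The excerpt cuts off before the error, but a frequent stumbling block when porting shell Map/Reduce to PyMongo is that the JavaScript functions must be passed as bson.Code strings, not Python callables. A hedged sketch against an older PyMongo that still exposes Collection.map_reduce (collection and field names here are made up; recent releases drop map_reduce in favour of the aggregation pipeline):

```python
from pymongo import MongoClient   # PyMongo 1.x used Connection() instead
from bson.code import Code

client = MongoClient()
db = client["test"]

# The shell JavaScript goes in verbatim, wrapped in bson.Code.
mapper = Code("""
    function () {
        emit(this.category, this.amount);   // hypothetical fields
    }
""")

reducer = Code("""
    function (key, values) {
        return Array.sum(values);
    }
""")

# Newer PyMongo versions require naming an output collection ("out").
result = db.things.map_reduce(mapper, reducer, "mr_results")
for doc in result.find():
    print(doc)
```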

10

Solved

I am writing a Spark application and want to combine a set of Key-Value pairs (K, V1), (K, V2), ..., (K, Vn) into one Key-Multivalue pair (K, [V1, V2, ..., Vn]). I feel like I should be able to do ...
Egmont asked 18/11, 2014 at 19:15
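In PySpark terms (the question's language isn't shown), groupByKey does exactly this grouping, and mapValues(list) turns the resulting iterable into a list. A minimal sketch with made-up data:

```python
from pyspark import SparkContext

sc = SparkContext(appName="combine-values")   # hypothetical app name
pairs = sc.parallelize([("K", "V1"), ("K", "V2"), ("K", "V3"), ("J", "V1")])

# groupByKey shuffles all values for a key to one place; mapValues(list)
# materialises them as a list.
grouped = pairs.groupByKey().mapValues(list)
print(grouped.collect())   # e.g. [('K', ['V1', 'V2', 'V3']), ('J', ['V1'])]

sc.stop()
```

If the values are later reduced anyway, reduceByKey or aggregateByKey usually scales better than collecting full lists.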

2

Solved

I'm developing a simple financial app for keeping track of income and expenses. For the sake of simplicity, let's suppose these are some of my documents: { description: "test1", amount: ...

4

Solved

Here's the Hadoop word count Java map and reduce source code. In the map function, I've gotten to where I can output all the words that start with the letter "c" and also the total number of times ...
Kalie asked 5/10, 2014 at 23:56
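The original code is Java, but the same filtering idea can be sketched as a Hadoop Streaming job in Python: the mapper only emits words that start with "c", and the reducer sums the counts per word. This is one reading of the truncated requirement, not the asker's actual code:

```python
# Hadoop Streaming sketch: run with "map" or "reduce" as the first argument.
import sys

def mapper():
    for line in sys.stdin:
        for word in line.split():
            if word.lower().startswith("c"):      # keep only "c" words
                print(f"{word.lower()}\t1")

def reducer():
    current, total = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t")
        if word != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = word, 0
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    mapper() if sys.argv[1:] == ["map"] else reducer()
```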

4

Solved

I ran a wordcount example using MapReduce for the first time, and it worked. Then I stopped the cluster, started it back up a while later, and followed the same procedure. It showed this error: 10P:/$ hado...
Emeric asked 4/8, 2015 at 19:2

5

Solved

I want to debug a MapReduce script and, without going into much trouble, tried to put some print statements in my program. But I can't seem to find them in any of the logs.
Crawfish asked 8/7, 2010 at 19:34
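Assuming this is a Hadoop Streaming job with a Python script: stdout is the data channel, so debug output has to go to stderr, where it ends up in the task attempt logs; "reporter:" lines on stderr also bump job counters visible in the web UI. A short sketch under that assumption:

```python
import sys

for line in sys.stdin:
    # Debug output: goes to the task's stderr log, not the job output.
    sys.stderr.write("DEBUG: processing a line\n")
    # Optional: increment a job counter (group "MyJob", counter "LinesSeen").
    sys.stderr.write("reporter:counter:MyJob,LinesSeen,1\n")
    for word in line.split():
        print(f"{word}\t1")   # the actual map output
```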

5

Solved

I am trying to write a function that will produce the factorial array of a provided integer and then reduce that array (by multiplying each array element). For example: factor(5) >>> [1, 2, 3,...
Sewel asked 3/3, 2016 at 19:5
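The excerpt doesn't show the language, but in Python the two steps map onto range and functools.reduce; a minimal sketch:

```python
from functools import reduce
from operator import mul

def factor(n):
    # The array from the question's example: factor(5) -> [1, 2, 3, 4, 5]
    return list(range(1, n + 1))

def factorial(n):
    # reduce(mul, ...) multiplies the elements; 1 is the initial value,
    # so factorial(0) == 1.
    return reduce(mul, factor(n), 1)

print(factor(5))      # [1, 2, 3, 4, 5]
print(factorial(5))   # 120
```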

2

Solved

As the question says: how do I get the taskID or mapperID (something like partitionID in Spark) in a Hive UDF?

3

Solved

Output files generated via the Spark SQL DataFrame.write() method begin with the "part" basename prefix. e.g. DataFrame sample_07 = hiveContext.table("sample_07"); sample_07.write().parquet("sampl...
Vitkun asked 19/3, 2016 at 21:46

2

Solved

I am trying to pass a variable (not a property) using the -D command line option in Hadoop, like -Dmapred.mapper.mystring=somexyz. I am able to set a conf property in the Driver program and read it back in ma...
Cardiomegaly asked 8/7, 2014 at 12:39
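If this were a Hadoop Streaming job with a Python mapper rather than a Java one, the -D job properties are exported to the task as environment variables with the dots replaced by underscores, so the value can be read back like this (a sketch under that assumption, not the asker's Java setup):

```python
import os
import sys

# -Dmapred.mapper.mystring=somexyz should surface (in a streaming task) as
# the environment variable mapred_mapper_mystring.
my_string = os.environ.get("mapred_mapper_mystring", "default-value")

for line in sys.stdin:
    print(f"{my_string}\t{line.rstrip()}")   # tag each record, for illustration
```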

3

Solved

I understand how MRv1 works. Now I am trying to understand MRv2: what's the difference between the Application Manager and the Application Master in YARN?
Kadiyevka asked 21/6, 2015 at 17:19

2

Solved

Question to all Cassandra experts out there. I have a column family with about a million records. I would like to query these records in such a way that I can perform a Not-Equal-To...
Independency asked 21/2, 2014 at 4:49

8

Solved

I'd like to find a good and robust MapReduce framework that can be used from Scala.
Osmo asked 7/6, 2009 at 15:14
