scalding Questions
3
After cloning the code with git clone https://github.com/twitter/scalding.git and running ./sbt update, I get:
::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: UNRESOLVED DEPENDENCIES ::
[w...
2
Solved
I'm working on a DSL for relational (SQL-like) operators. I have a Rep[Table] type with an .apply: ((Symbol, ...)) => Obj method that returns an object Obj which defines .flatMap: T1 => T2 an...
Blowbyblow asked 22/5, 2015 at 16:45
1
Recently we moved from Scalding to Spark. I used Eclipse and the Scala IDE for Eclipse to write code and tests. The tests ran fine with Twitter's JobTest class. Any class using JobTest would ...
Bosco asked 21/5, 2015 at 1:42
5
In a shell, I ran gradle cleanJar in the Impatient/part1 directory. The output is below. The error is "class file for org.apache.hadoop.mapred.JobConf not found". Why did it fail to compile?
:clean...
1
Solved
I see this:
Scalding: How to retain the other field, after a groupBy('field){.size}?
It's a real pain and a mess compared to Apache Pig... What am I doing wrong? Can I do the same as GENERATE(...
4
Solved
Why do Scala and frameworks like Spark and Scalding have both reduce and foldLeft? So then what's the difference between reduce and fold?
Hotpress asked 6/8, 2014 at 11:7
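A minimal sketch of the distinction this question asks about, using plain Scala collections only (not Spark or Scalding). The object name ReduceVsFold and the sample list are made up for illustration. reduce takes no start value and its result type must be a supertype of the element type, while foldLeft takes a start value, may return a different type, and always walks the collection left to right.

```scala
object ReduceVsFold {
  // Hypothetical sample data, not taken from the question.
  val nums: List[Int] = List(1, 2, 3, 4)

  // reduce: no initial value; combines elements of the same type.
  val sum: Int = nums.reduce(_ + _)

  // foldLeft: explicit initial value; the accumulator (String) differs
  // from the element type (Int), which reduce cannot express.
  val joined: String = nums.foldLeft("")((acc, n) => acc + n.toString)

  // On an empty collection foldLeft returns the initial value,
  // whereas reduce would throw; reduceOption is the safe variant.
  val emptyFold: Int = List.empty[Int].foldLeft(0)(_ + _)
  val emptyReduce: Option[Int] = List.empty[Int].reduceOption(_ + _)
}
```

Because reduce does not promise an evaluation order, its operator should be associative, which is what lets a distributed framework combine partial results from separate partitions.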
3
Solved
How can you write to multiple outputs depending on the key using Scalding (or Cascading) in a single MapReduce job? I could of course use .filter for all the possible keys, but that is a horrible hac...
2
Solved
So people, including myself, have been having problems compressing the output of Scalding jobs. After googling I get the odd whiff of an answer in some obscure forum somewhere but nothing suitable f...
Perfuse asked 29/5, 2014 at 17:42
2
Solved
We have many small files that need combining. In Scalding you can use TextLine to read files as text lines. The problem is we get 1 mapper per file, but we want to combine multiple files so that th...
1
Solved
I'm using Scala 2.10 and Gradle 1.11.
My problem is that the compiled jar throws an error when I try to run it on the Hadoop cluster.
I want to run on Hadoop because I am using Scalding.
The exception...
2
Solved
If you want to create a pipe with more than 22 fields from a smaller one in Scalding you are limited by Scala tuples, which cannot have more than 22 items.
Is there a way to use collections instea...
1
Solved
So my input data has two fields/columns: id1 & id2, and my code is the following:
TextLine(args("input"))
.read
.mapTo('line->('id1,'id2)) {line: String =>
val fields = line.split("\t")...
1
Solved
In Scala, how does one uncompress the text contained in file.gz so that it can be processed? I would be happy with either having the contents of the file stored in a variable, or saving it as a loc...
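One possible approach, sketched with the JDK's java.util.zip.GZIPInputStream and scala.io.Source. The object GzipText and its method names are hypothetical, and UTF-8 is an assumed encoding since the question does not state one. It covers the "contents stored in a variable" option; very large files would be better processed line by line.

```scala
import java.io.{FileInputStream, InputStream}
import java.util.zip.GZIPInputStream
import scala.io.Source

object GzipText {
  // Decompress a gzip stream and return its whole content as a String.
  // Assumes the compressed text is UTF-8 encoded.
  def readGzip(in: InputStream): String = {
    val source = Source.fromInputStream(new GZIPInputStream(in), "UTF-8")
    try source.mkString finally source.close()
  }

  // Convenience wrapper for a file on the local file system.
  def readGzipFile(path: String): String = {
    val in = new FileInputStream(path)
    try readGzip(in) finally in.close()
  }
}
```

For example, val text = GzipText.readGzipFile("file.gz") would hold the uncompressed text in memory.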