apache-spark-1.4 Questions
6
I am trying to efficiently join two DataFrames, one of which is large and the other a bit smaller.
Is there a way to avoid all this shuffling? I cannot set autoBroadcastJoinThreshold, because...
Coo asked 7/9, 2015 at 9:26
2
Solved
I am running a Spark streaming application with 2 workers.
The application has a join and a union operation.
All the batches complete successfully, but I noticed that shuffle spill metrics are no...
Amari asked 12/6, 2015 at 7:36
3
I am running Spark Streaming 1.4.0 on YARN (Apache distribution 2.6.0) with Java 1.8.0_45 and a Kafka direct stream. I am also using Spark with Scala 2.11 support.
The issue I am seeing is that...
Coparcenary asked 13/7, 2015 at 18:1
3
I am using Spark 1.4.1.
I can use spark-submit without problem.
But when I ran ~/spark/bin/spark-shell
I got the error below
I have configured SPARK_HOME and JAVA_HOME.
However, it was OK with Spa...
Malposition asked 8/10, 2015 at 2:45
2
Solved
I'm trying to install Spark on my local machine. I have been following this guide. I have installed JDK-7 (also have JDK-8) and Scala 2.11.7. A problem occurs when I try to use sbt to build Spark 1...
Floatplane asked 26/7, 2015 at 13:53
0
I wrote a custom transformer as described here.
When I create a pipeline with my transformer as the first step, I am able to train a (Logistic Regression) model for classification.
However, wh...
Coypu asked 22/9, 2015 at 10:44
3
Solved
I am new to Apache Spark (version 1.4.1). I wrote a small program to read a text file and store its data in an RDD.
Is there a way to get the size of the data in the RDD?
This is my code:
im...
Unwitnessed asked 24/8, 2015 at 9:52
1
Solved
My project has unit tests for different HiveContext configurations (sometimes they are in one file, as they are grouped by feature).
After upgrading to Spark 1.4 I encounter a lot of 'java.sql.SQL...
Countermand asked 24/8, 2015 at 23:49
2
Solved
I have a SparkSQL DataFrame.
Some entries in this data are empty, but they don't behave like NULL or NA. How can I remove them? Any ideas?
In R I can easily remove them, but in SparkR it says tha...
Fetterlock asked 23/7, 2015 at 21:46
1