SparkR Questions

3

Solved

With the release of a new version of Spark (1.4), there is now a nice frontend interface to Spark from R, in a package named SparkR. On the documentation page of R for Spark there is a command that e...
Repentance asked 3/7, 2015 at 10:50
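
A minimal sketch of getting started with that interface, assuming a Spark 1.4-era layout; the local master and the built-in faithful dataset are illustrative choices, not part of the question:

    library(SparkR)
    # Start the backend and get a Spark context (Spark 1.4-era API)
    sc <- sparkR.init(master = "local[2]", appName = "sparkr-demo")
    # A SQL context is needed before DataFrames can be created
    sqlContext <- sparkRSQL.init(sc)
    df <- createDataFrame(sqlContext, faithful)  # faithful is a built-in R dataset
    head(df)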

2

I have a 500K-row Spark DataFrame that lives in a parquet file. I'm using Spark 2.0.0 and the SparkR package inside Spark (RStudio and R 3.3.1), all running on a local machine with 4 cores and 8GB ...
Weakly asked 19/9, 2016 at 15:23
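
For the Spark 2.0.0 setup described, opening the parquet file typically looks like this; the path and memory figure are placeholders:

    library(SparkR)
    # Spark 2.0 replaced sparkR.init with a unified session entry point
    sparkR.session(master = "local[4]",
                   sparkConfig = list(spark.driver.memory = "4g"))
    df <- read.parquet("/path/to/file.parquet")  # hypothetical path
    count(df)  # 500K rows should be cheap to count locally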

1

I have the simple SparkR program below, which creates a SparkR DataFrame and retrieves/collects data from it. Sys.setenv(HADOOP_CONF_DIR = "/etc/hadoop/conf.cloudera.yarn") Sys.setenv(SPARK_HOME = ...
Hollah asked 25/7, 2016 at 21:47
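
A condensed version of such a program, assuming a Spark 1.x cluster reachable through YARN as in the excerpt; the toy data is illustrative:

    library(SparkR)
    sc <- sparkR.init(master = "yarn-client")
    sqlContext <- sparkRSQL.init(sc)
    # Build a distributed DataFrame from a local data.frame, then pull it back
    df <- createDataFrame(sqlContext, data.frame(x = 1:3, y = c("a", "b", "c")))
    collect(df)  # returns an ordinary R data.frame on the driver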

2

Solved

Using SparkR, how can nested arrays be "exploded along"? I've tried using explode like so: dat <- nested_spark_df %>% mutate(a=explode(metadata)) %>% head() but though the above does...
Anstus asked 27/7, 2016 at 0:58
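
A workaround often suggested for this, sketched under the assumption that generator expressions like explode are rejected inside mutate but accepted inside select; the column name a mirrors the question:

    # Route the generator through select rather than mutate
    exploded <- select(nested_spark_df,
                       alias(explode(nested_spark_df$metadata), "a"))
    head(exploded)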

7

Solved

After a long and difficult installation process for SparkR, I am running into new problems launching SparkR. My settings: R 3.2.0, RStudio 0.98.1103, Rtools 3.3, Spark 1.4.0, Java Version 8, SparkR 1.4...
Gooseberry asked 29/6, 2015 at 15:5
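
With Spark 1.4, SparkR was loaded from the Spark distribution rather than from a normal library install; a minimal launch sketch, with a hypothetical install path:

    Sys.setenv(SPARK_HOME = "C:/spark-1.4.0")  # hypothetical path to the unpacked distribution
    # Put the bundled SparkR package on the library search path
    .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
    library(SparkR)
    sc <- sparkR.init(master = "local")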

4

I have installed the SparkR package from the Spark distribution into the R library. I can call the following command and it seems to work properly: library(SparkR) However, when I try to get the Spark...
Intaglio asked 9/7, 2015 at 15:37
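
If library(SparkR) loads but initialization then fails, pointing sparkR.init at the matching Spark installation explicitly is one sketch worth trying; the path is hypothetical:

    library(SparkR)
    # sparkHome must match the Spark distribution this SparkR package shipped with
    sc <- sparkR.init(master = "local", sparkHome = "/opt/spark-1.4.0")
    sqlContext <- sparkRSQL.init(sc)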

1

Solved

I found from JIRA that the 1.6 release of SparkR implements window functions, including lag and rank, but the over function is not implemented yet. How can I use a window function like lag witho...
Abulia asked 19/1, 2016 at 20:6
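
Without an over function in SparkR 1.6, the usual workaround was to drop to SQL, where the OVER clause is available. A sketch with hypothetical columns k, t, and v; window functions in 1.6 may additionally require a Hive-enabled context (sparkRHive.init):

    registerTempTable(df, "tbl")
    lagged <- sql(sqlContext,
      "SELECT k, t, v,
              lag(v) OVER (PARTITION BY k ORDER BY t) AS v_prev
       FROM tbl")
    head(lagged)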

1

Solved

How do I do map and reduce operations using SparkR? All I can find is stuff about SQL queries. Is there a way to do map and reduce using SQL?
Infielder asked 23/6, 2015 at 20:22
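
At that time the RDD API was private, so map and reduce were only reachable through the ::: accessor; a sketch, with the caveat that private APIs can change between releases:

    rdd <- SparkR:::parallelize(sc, 1:100)
    squared <- SparkR:::map(rdd, function(x) x^2)
    # reduce pulls the combined result back to the driver
    total <- SparkR:::reduce(squared, function(a, b) a + b)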

1

Solved

I am using SparkR:::map and my function returns a large-ish R dataframe for each input row, each of the same shape. I would like to write these dataframes as parquet files without 'collect'ing them...
Petrel asked 27/11, 2015 at 16:4
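
The DataFrame side of that goal is straightforward; what the question leaves open is bridging an RDD of R data.frames back to a Spark DataFrame. The write step alone, sketched with a hypothetical output path:

    # Writes happen on the cluster; nothing is collected to the driver
    write.df(df, path = "/data/out.parquet", source = "parquet", mode = "overwrite")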

0

My collected data size is 1.3g and all the driver-memory configurations are set to 3g. Why is the out-of-memory error still happening? Here is my detailed configuration of SparkR and the OOM excepti...
Eavesdrop asked 2/11, 2015 at 10:15
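
One frequently cited cause: the driver JVM is launched before R-side settings apply, so spark.driver.memory passed to sparkR.init has no effect. A sketch of the known workaround of passing it through the submit arguments instead:

    # Must run before library(SparkR) starts the backend
    Sys.setenv("SPARKR_SUBMIT_ARGS" = "--driver-memory 4g sparkr-shell")
    library(SparkR)
    sc <- sparkR.init(master = "local")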

1

I wanted to be able to package DataFrames in a Scala jar file and access them in R. The end goal is to create a way to access specific and often-used database tables in Python, R, and Scala without...
Aftmost asked 23/10, 2015 at 20:55
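
One sketch of the R side, assuming the Scala job has persisted each DataFrame with saveAsTable into a Hive metastore that both sessions can reach; shared_tbl is a hypothetical table name:

    # A Hive-enabled context is needed to see metastore tables
    sqlContext <- sparkRHive.init(sc)
    df <- table(sqlContext, "shared_tbl")
    head(df)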

1

I checked for the SparkR package in the CRAN package list via the following link. https://cran.r-project.org/web/packages/available_packages_by_date.html This list does not include SparkR, and ...
Priestly asked 16/9, 2015 at 6:32

2

Solved

I have a SparkSQL DataFrame. Some entries in this data are empty, but they don't behave like NULL or NA. How could I remove them? Any ideas? In R I can easily remove them, but in SparkR it says tha...
Fetterlock asked 23/7, 2015 at 21:46
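
Since empty strings are neither NULL nor NA, dropna-style helpers won't touch them; filtering on the empty string directly is the usual fix. A sketch with a hypothetical column name:

    # Keep only rows where the column is non-empty
    cleaned <- filter(df, df$col != "")
    head(cleaned)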

1

Solved

With SparkR, I'm trying, for a PoC, to collect an RDD that I created from text files and that contains around 4M lines. My Spark cluster is running in Google Cloud, is bdutil-deployed, and is composed w...
Mcnew asked 4/6, 2015 at 13:45
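
Collecting 4M lines onto the driver is exactly where such jobs tend to fall over; one sketch for the 1.4-era private RDD API is to inspect a slice with take instead of a full collect (the path is hypothetical):

    lines <- SparkR:::textFile(sc, "gs://my-bucket/input/*.txt")
    sample_rows <- SparkR:::take(lines, 1000)  # fetch only the first 1000 lines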
