SparkR from Rstudio - gives Error in invokeJava(isStatic = TRUE, className, methodName, ...) :

I am using RStudio.

After creating the session, if I try to create a DataFrame from local R data, it gives the following error.

Sys.setenv(SPARK_HOME = "E:/spark-2.0.0-bin-hadoop2.7/spark-2.0.0-bin-hadoop2.7")
Sys.setenv(HADOOP_HOME = "E:/winutils")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
Sys.setenv('SPARKR_SUBMIT_ARGS'='"sparkr-shell"')

library(SparkR)

sparkR.session(sparkConfig = list(spark.sql.warehouse.dir="C:/Temp"))

localDF <- data.frame(name=c("John", "Smith", "Sarah"), age=c(19, 23, 18))
df <- createDataFrame(localDF)

ERROR :

Error in invokeJava(isStatic = TRUE, className, methodName, ...) : 
  java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
    at java.lang.reflect.Constructor.newInstance(Unknown Source)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:258)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:359)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:263)
    at org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
    at org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
    at org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:46)
    at org.apache.spark.sql.hive.HiveSharedState.externalCatalog(HiveSharedState.scala:45)
    at org.a
>

TIA.

Dagny answered 10/8, 2016 at 1:47 Comment(7)
I am using prebuilt Spark 2.0.0 on Windows 8. – Dagny
Thanks in advance! @Hack-R – Dagny
Do you have JAVA_HOME set up as explained in this post? nishutayaltech.blogspot.in/2015/04/… – Kotz
Hi @Kotz, thanks. Yes, I have JAVA_HOME configured. I also followed the blog and ran the Spark sample program, which works fine (it gives the correct output, 3.14), so the local standalone Spark setup is fine. I am trying SparkR in RStudio for a POC. Any pointers on this? – Dagny
Did you try to set the Spark session as explained here? people.apache.org/~pwendell/spark-nightly/spark-master-docs/… – Ockeghem
Hi @JaimeCr, yes, I tried that too. In fact the SparkSession does get created, as "Java ref type org.apache.spark.sql.SparkSession id 1". The session is also created from the command prompt, i.e. if I launch SparkR from the Spark home directory; the system hangs while executing the R code, though. – Dagny
My session configuration is:
$hive.metastore.warehouse.dir [1] "file:C:\\Users\\Nineteen\\Documents/spark-warehouse"
$spark.app.name [1] "SparkR"
$spark.driver.memory [1] "1g"
$spark.driver.port [1] "49676"
$spark.executor.id [1] "driver"
$spark.executorEnv.LD_LIBRARY_PATH [1] "$LD_LIBRARY_PATH:"
$spark.home [1] "E:\\spark-2.0.0-bin-hadoop2.7\\spark-2.0.0-bin-hadoop2.7"
$spark.master [1] "local[*]"
$spark.sql.catalogImplementation [1] "hive"
$spark.sql.warehouse.dir [1] "/file:C:\\temp"
$spark.submit.deployMode [1] "client"
– Dagny
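
To check the two things the comments raise (JAVA_HOME and the session configuration), a minimal sketch, assuming SparkR 2.0's sparkR.conf() helper is available; the JDK path below is a placeholder, not the asker's actual install location:

    # Hypothetical JDK path; point this at your own installation.
    Sys.setenv(JAVA_HOME = "C:/Program Files/Java/jdk1.8.0_101")

    library(SparkR)
    sparkR.session()

    # Returns the effective session settings as a named list; the
    # flattened dump in the comment above is output of this form.
    sparkR.conf()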

If you have not used the SparkR library yet but are using Spark, I recommend the 'sparklyr' library made by RStudio.

  1. Install the preview version of RStudio.

  2. Install the library:

    install.packages("devtools")
    devtools::install_github('rstudio/sparklyr')
    
  3. Load the library and install Spark:

    library(sparklyr)
    spark_install('1.6.2')
    

You can see a vignette at http://spark.rstudio.com/
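
Once installed, a minimal usage sketch, assuming a local master; copying iris is only an illustration, mirroring the question's local-data-frame use case:

    library(sparklyr)
    library(dplyr)  # provides the copy_to() generic used below

    # Connect to the locally installed Spark instance.
    sc <- spark_connect(master = "local")

    # Copy a local R data frame into Spark as a remote table.
    iris_tbl <- copy_to(sc, iris)

    spark_disconnect(sc)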

Whine answered 10/8, 2016 at 5:0 Comment(1)
But this is for version 1.6.1, right? Will it work for Spark 2.0.0? Also, I want to make use of the Spark ML component for my POC and subsequently in the actual analysis. In parallel I will check out sparklyr. – Dagny

Many thanks, all, for your help.

  1. All I had to do was add the HADOOP_HOME path (winutils/bin) to the PATH variable. This directory should contain your winutils.exe file, so that when Spark creates the metastore for Hive (Derby by default) it is able to call the Hive classes.
  2. I also set Hive support to FALSE, since I am not using it.

Sys.setenv(SPARK_HOME='E:/spark-2.0.0-bin-hadoop2.7/spark-2.0.0-bin-hadoop2.7',HADOOP_HOME='E:/winutils')

.libPaths(c(file.path(Sys.getenv('SPARK_HOME'), 'R', 'lib'),.libPaths()))

Sys.setenv('SPARKR_SUBMIT_ARGS'='"sparkr-shell"')

library(SparkR)
library(rJava)

sparkR.session(enableHiveSupport = FALSE, master = "local[*]", sparkConfig = list(spark.driver.memory = "1g", spark.sql.warehouse.dir = "E:/winutils/bin/"))

df <- as.DataFrame(iris)
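
To sanity-check the resulting SparkDataFrame, a few standard SparkR calls, purely illustrative:

    head(df)         # first rows of the distributed DataFrame
    count(df)        # row count; should be 150 for iris
    printSchema(df)  # column names and types
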
Dagny answered 10/8, 2016 at 12:2 Comment(2)
I am having the same error, and unfortunately these fixes still don't solve the problem. Any other thoughts? – Relational
For me too, disabling the Hive support switch solved the issue. No other change from the defaults was needed. Not sure how general a solution this is. – Matti

These are the steps I followed in RStudio, and it worked for me:

Sys.setenv(SPARK_HOME="C:\\spark-1.6.1-bin-hadoop2.6")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))

library(SparkR)
sc <- sparkR.init(master="local")
sqlContext <- sparkRSQL.init(sc)

localDF <- data.frame(name=c("John", "Smith", "Sarah"), age=c(19, 23, 18))
df <- createDataFrame(sqlContext, localDF)
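
Note that sparkR.init() and sparkRSQL.init() are the Spark 1.x entry points, deprecated in Spark 2.0. Under the asker's Spark 2.0.0 setup, a rough equivalent would be the single-session form (a sketch, not tested against that build):

    library(SparkR)
    sparkR.session(master = "local")

    localDF <- data.frame(name=c("John", "Smith", "Sarah"), age=c(19, 23, 18))
    df <- createDataFrame(localDF)  # no sqlContext argument in Spark 2.x
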
Fannie answered 9/12, 2016 at 17:31 Comment(0)
