How to write to JDBC source with SparkR 1.6.0?
With SparkR 1.6.0 I can read from a JDBC source with the following code:

library(SparkR)   # SparkR 1.6.0
library(magrittr) # provides %>%

# sqlContext comes from sparkRSQL.init() (or is predefined in the sparkR shell)
jdbc_url <- "jdbc:mysql://localhost:3306/dashboard?user=<username>&password=<password>"

df <- sqlContext %>%
  loadDF(source     = "jdbc",
         url        = jdbc_url,
         driver     = "com.mysql.jdbc.Driver",
         dbtable    = "db.table_name")

But after performing a calculation, I hit a roadblock when trying to write the data back to the database: attempting...

write.df(df      = df,
         path    = "NULL",
         source  = "jdbc",
         url     = jdbc_url, 
         driver  = "com.mysql.jdbc.Driver",
         dbtable = "db.table_name",
         mode    = "append")

...returns...

ERROR RBackendHandler: save on 55 failed
Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) : 
  java.lang.RuntimeException: org.apache.spark.sql.execution.datasources.jdbc.DefaultSource does not allow create table as select.
    at scala.sys.package$.error(package.scala:27)
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:259)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
    at org.apache.spark.sql.DataFrame.save(DataFrame.scala:2066)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:141)
    at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:86)
    at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:38)
    at io.netty.channel.SimpleChannelIn

Looking around the web, I found reports that a patch for this error was included as of Spark 2.0.0, which also added the functions read.jdbc and write.jdbc.

For this question, though, assume I'm stuck with SparkR v1.6.0. Is there a way to write to JDBC sources (i.e. is there a workaround that would allow me to use DataFrameWriter.jdbc() from SparkR)?
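The closest thing I can imagine is dropping down to SparkR's private JVM bridge and calling DataFrameWriter.jdbc() directly. A minimal, untested sketch, assuming the unexported internals SparkR:::callJMethod and SparkR:::newJObject behave the way I expect in 1.6.0:

# Untested sketch: call the JVM-side DataFrameWriter.jdbc() through
# SparkR's private bridge. callJMethod/newJObject are unexported
# internals and may change without notice.

# JDBC connection settings go into a java.util.Properties object
props <- SparkR:::newJObject("java.util.Properties")
SparkR:::callJMethod(props, "setProperty", "user", "<username>")
SparkR:::callJMethod(props, "setProperty", "password", "<password>")
SparkR:::callJMethod(props, "setProperty", "driver", "com.mysql.jdbc.Driver")

# df@sdf is the Java DataFrame backing the SparkR DataFrame
writer <- SparkR:::callJMethod(df@sdf, "write")          # DataFrameWriter
writer <- SparkR:::callJMethod(writer, "mode", "append") # SaveMode
SparkR:::callJMethod(writer, "jdbc",
                     "jdbc:mysql://localhost:3306/dashboard",
                     "db.table_name",
                     props)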

Sofiasofie answered 16/8, 2017 at 14:21 Comment(8)
Sounds like the answer is "no". The version you have clearly cannot write. If you can't add the patch for the fix, you're stuck. – Toothed
I thought that might be the answer, but I wanted to ask about a workaround anyway. I'm still new to SparkR, and there might be something available for this situation that's not in the documentation. I'll accept "no" as the answer here if I get a definitive response. – Sofiasofie
What prevents you from upgrading? Shouldn't it just be an R package upgrade? – Toothed
I'm stuck using Spark v1.6.0 within my company because that's the production environment we're using until we can update it to Spark 2.x.x. But your comment is helpful because it makes me realize I may not be stuck with SparkR 1.6.0 even if I'm using Spark 1.6.0. I'm going to try installing SparkR 2.0.0 via GitHub + devtools and see whether the newer SparkR will work with the older Spark. – Sofiasofie
Right, the writing is being done by the plug-in. You should be free to upgrade that without having to change your server installation. – Toothed
It was a nice thought, but it turns out this doesn't work. There must be breaking changes between Spark 1.6.x and 2.0.0. SparkR 2.0.0 will install on the machine, but none of the read/write functions work. – Sofiasofie
I'm very sorry. Can you export the R results to .csv and write Java to persist that? – Toothed
Let us continue this discussion in chat. – Sofiasofie

The short answer is no: writing over JDBC was not supported by SparkR until version 2.0.0, when write.jdbc() was introduced.
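For anyone who can upgrade, the 2.0.0 call would look roughly like this (a sketch based on the write.jdbc() signature, where extra named arguments are passed through as JDBC connection properties):

# SparkR >= 2.0.0 only
write.jdbc(df,
           url       = "jdbc:mysql://localhost:3306/dashboard",
           tableName = "db.table_name",
           mode      = "append",
           user      = "<username>",
           password  = "<password>",
           driver    = "com.mysql.jdbc.Driver")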

Sofiasofie answered 17/8, 2017 at 20:42 Comment(0)
