Why do I get the error "Unable to find encoder for type stored in a Dataset" when encoding JSON using case classes?

I've written a Spark job:

import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local")
    val sc = new SparkContext(conf)
    val ctx = new org.apache.spark.sql.SQLContext(sc)
    import ctx.implicits._

    case class Person(age: Long, city: String, id: String, lname: String, name: String, sex: String)
    case class Person2(name: String, age: Long, city: String)

    val persons = ctx.read.json("/tmp/persons.json").as[Person]
    persons.printSchema()
  }
}

When I run the main function in the IDE, two errors occur:

Error:(15, 67) Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing sqlContext.implicits._  Support for serializing other types will be added in future releases.
    val persons = ctx.read.json("/tmp/persons.json").as[Person]
                                                                  ^

Error:(15, 67) not enough arguments for method as: (implicit evidence$1: org.apache.spark.sql.Encoder[Person])org.apache.spark.sql.Dataset[Person].
Unspecified value parameter evidence$1.
    val persons = ctx.read.json("/tmp/persons.json").as[Person]
                                                                  ^

but in the Spark shell I can run this job without any error. What is the problem?

Unknowable answered 11/1, 2016 at 6:46 Comment(0)

The error message says that an Encoder cannot be found for the Person case class.

Error:(15, 67) Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing sqlContext.implicits._  Support for serializing other types will be added in future releases.

Move the declaration of the case class outside the scope of SimpleApp.
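
For example, a minimal sketch of the fixed job (the same code as the question, with only the case class moved to the top level of the file, where the implicit Encoder derivation pulled in by ctx.implicits._ can see it):

import org.apache.spark.{SparkConf, SparkContext}

// Declared outside SimpleApp so that an Encoder[Person] can be derived for .as[Person].
case class Person(age: Long, city: String, id: String, lname: String, name: String, sex: String)

object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local")
    val sc = new SparkContext(conf)
    val ctx = new org.apache.spark.sql.SQLContext(sc)
    import ctx.implicits._

    val persons = ctx.read.json("/tmp/persons.json").as[Person]
    persons.printSchema()
  }
}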

Crewelwork answered 11/1, 2016 at 7:2 Comment(3)
Why does scoping make any difference here? I am getting that error while using the REPL. – West
I'm trying to understand why the scope of the case class makes a difference; if you can point me to any resource I can read to understand it, that would be a great help. Pretty new to Scala implicits :( @jacek-laskowski – Phillisphilly
I don't think I'm capable of explaining why the solution works the way it does. I vaguely remember that it has nothing to do with implicits, which are simply a mechanism to plug in code, and I think the code itself is the root cause. – Marthena

You get the same error if you import both sqlContext.implicits._ and spark.implicits._ in SimpleApp (the order doesn't matter).

Removing one or the other is the solution:

import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .getOrCreate()

val sqlContext = spark.sqlContext
import sqlContext.implicits._ // sqlContext OR spark implicits
//import spark.implicits._    // sqlContext OR spark implicits

case class Person(age: Long, city: String)
val persons = spark.read.json("/tmp/persons.json").as[Person]

Tested with Spark 2.1.0

The funny thing is that if you import the same object's implicits twice, you will not have a problem.
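
As a small illustration of that point (assuming the spark and sqlContext values from the snippet above):

import spark.implicits._
import spark.implicits._        // the same implicits object twice: compiles fine
//import sqlContext.implicits._ // a second, different implicits object triggers the encoder error described above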

Ceaseless answered 28/2, 2017 at 14:18 Comment(0)

@Milad Khajavi

Define the Person case classes outside object SimpleApp. Also, add import sqlContext.implicits._ inside the main() function.

Dessiatine answered 30/8, 2018 at 7:23 Comment(0)
