How to convert RDD of Avro's GenericData.Record to DataFrame?
This question may seem a bit abstract, but here it is:

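import org.apache.avro.Schema
import org.apache.avro.generic.GenericData
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame

val originalAvroSchema: Schema = ???     // read from a file
val rdd: RDD[GenericData.Record] = ???   // from some streaming source

// Looking for something handy like:
val df: DataFrame = rdd.toDF(originalAvroSchema)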

I explored spark-avro, but it only supports reading from a file, not converting an existing RDD.

Deci answered 29/3, 2016 at 18:0 Comment(3)
Not sure why the answer was deleted. - Softcover
There is a pull request for what you are looking for: github.com/databricks/spark-avro/pull/113/files - Softcover
I deleted the answer because it had been downvoted. - Otte
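import com.databricks.spark.avro._
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
val rdd: RDD[MyAvroRecord] = ...
// Check that your spark-avro version actually provides toAvroDF
val df = rdd.toAvroDF(sqlContext)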
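
If toAvroDF is not available in your spark-avro version, one fallback is to convert the records by hand. The following is only a minimal sketch under stated assumptions: the Avro schema is a flat record whose fields are all primitive types (no unions, nested records, arrays, or maps), and the names AvroRddToDataFrame, toSqlType, toStructType, and toDataFrame are hypothetical helpers made up for illustration, not part of spark-avro. It builds a Spark StructType from the Avro schema, maps each record to a Row, and passes both to SQLContext.createDataFrame.

```scala
import org.apache.avro.Schema
import org.apache.avro.generic.GenericData
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SQLContext}
import org.apache.spark.sql.types._
import scala.collection.JavaConverters._

// Hypothetical helper object; not part of spark-avro.
object AvroRddToDataFrame {

  // Map a primitive Avro type to a Spark SQL type.
  // Assumption: anything beyond these primitives is out of scope and rejected.
  def toSqlType(avroType: Schema.Type): DataType = avroType match {
    case Schema.Type.STRING  => StringType
    case Schema.Type.INT     => IntegerType
    case Schema.Type.LONG    => LongType
    case Schema.Type.FLOAT   => FloatType
    case Schema.Type.DOUBLE  => DoubleType
    case Schema.Type.BOOLEAN => BooleanType
    case other =>
      throw new IllegalArgumentException(s"Unsupported Avro type in this sketch: $other")
  }

  // Build the Spark schema from the top-level fields of an Avro record schema.
  // Fields are marked non-nullable because non-union Avro fields cannot hold null.
  def toStructType(avroSchema: Schema): StructType =
    StructType(avroSchema.getFields.asScala.map { f =>
      StructField(f.name, toSqlType(f.schema.getType), nullable = false)
    }.toSeq)

  def toDataFrame(rdd: RDD[GenericData.Record],
                  avroSchema: Schema,
                  sqlContext: SQLContext): DataFrame = {
    val structType = toStructType(avroSchema)
    // Capture only the field names in the closure, not the Avro Schema object.
    val fieldNames = structType.fieldNames.toSeq

    val rows = rdd.map { record =>
      Row.fromSeq(fieldNames.map { name =>
        record.get(name) match {
          case s: CharSequence => s.toString // Avro strings usually arrive as Utf8
          case v               => v
        }
      })
    }
    sqlContext.createDataFrame(rows, structType)
  }
}
```

With the values from the question, this would be called as AvroRddToDataFrame.toDataFrame(rdd, originalAvroSchema, sqlContext). Nullable fields (Avro unions with null) and nested types would need extra cases in toSqlType and in the per-field conversion.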
Griselgriselda answered 30/3, 2017 at 8:29 Comment(0)
