I am trying to load a table from an SQLite .db file stored on a local disk. Is there any way to do this in PySpark?
My current solution works, but it is not very elegant: I read the table into Pandas through sqlite3 and then convert the result to a Spark DataFrame. The schema information is not carried over (which may or may not be a problem). Is there a direct way to load the table without going through Pandas?
import sqlite3
import pandas as pd
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

db_path = 'alocalfile.db'
query = 'SELECT * FROM ATableToLoad'

# Read the table into Pandas via sqlite3, then hand it off to Spark.
conn = sqlite3.connect(db_path)
a_pandas_df = pd.read_sql_query(query, conn)
a_spark_df = spark.createDataFrame(a_pandas_df)
I have not figured out how to do this with JDBC in PySpark.
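What I imagine the JDBC route would look like is roughly the sketch below. This is untested on my side; the org.xerial sqlite-jdbc package coordinates, the version number, the org.sqlite.JDBC driver class, and the jdbc:sqlite: URL format are my assumptions, not something I have confirmed working.

from pyspark.sql import SparkSession

# Rough, untested sketch: pull a SQLite JDBC driver onto the classpath via
# spark.jars.packages. The artifact version here is only an example.
spark = (
    SparkSession.builder
    .appName("sqlite-jdbc-load")
    .config("spark.jars.packages", "org.xerial:sqlite-jdbc:3.45.1.0")
    .getOrCreate()
)

# Read a single table over JDBC; the driver class and URL scheme are what
# I expect the xerial driver to use.
a_spark_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlite:/path/to/alocalfile.db")
    .option("dbtable", "ATableToLoad")
    .option("driver", "org.sqlite.JDBC")
    .load()
)

Is this the right approach, and does it preserve the table's schema, or is there a better way to read a local SQLite file into PySpark?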