Dask doesn't have a df.to_sql() like pandas, so I am trying to replicate that functionality and create an SQL table using the map_partitions method. Here is my code:
```python
import dask.dataframe as dd
import pandas as pd
import sqlalchemy as sqla  # create_engine comes from sqlalchemy

db_url = 'my_db_url_connection'
conn = sqla.create_engine(db_url)

ddf = dd.read_csv('data/prod.csv')
meta = dict(ddf.dtypes)
ddf.map_partitions(lambda df: df.to_sql('table_name', db_url, if_exists='append', index=True), meta=meta)
```
This returns my dask dataframe object, but when I go look in my psql server there's no new table... what is going wrong here?
UPDATE: Still can't get it to work, but now due to an independent issue. Follow-up question: duplicate key value violates unique constraint - postgres error when trying to create sql table from dask dataframe
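For reference, as far as I can tell map_partitions only builds a lazy task graph, so nothing is written to the database until the graph is actually computed. Below is a minimal sketch of one way to force the writes using dask.delayed, reusing the db_url and ddf names from the snippet above; write_partition is just a hypothetical helper name, not a Dask API:

```python
import dask

# hypothetical helper: write one pandas partition, report rows written
@dask.delayed
def write_partition(df):
    # pass the URL string rather than the engine object so each task
    # can open its own database connection;
    # index=False avoids writing each partition's local 0-based index,
    # which repeats across partitions and can trigger the
    # "duplicate key value violates unique constraint" error above
    # if that column is a primary key
    df.to_sql('table_name', db_url, if_exists='append', index=False)
    return len(df)

# one delayed write per partition; nothing runs until compute()
tasks = [write_partition(part) for part in ddf.to_delayed()]
rows_written = dask.compute(*tasks)
```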