SQL Alchemy, pymssql, Pandas 0.24.2 to_sql Trying to create table when table already exists
Asked Answered
M

2

0

I am trying to use Pandas and Sql Alchemy. This is basically what I am trying to do. If I drop the table, it will create it but I want it to append and not have to do table renaming. I have tried updating and changing versions of all the libraries. I am at a loss. If I start with no table it creates it, then i run the code again and it crashes. The error message just says the table already exists, which I know, that is why I am telling it to append. Also, before the load i am reading data using PYMSSQL and it reads fine to a dataframe

Python Command

def writeDFtoSSDatabase(tgtDefiniton,df):
try:
    if int(tgtDefiniton.loadBatchSize) > 0:
        batchSize = int(tgtDefiniton.loadBatchSize)
    else:
        batchSize = 1000
    #Domain error using SQL Alchemy
    logging.debug("Writting Dataframe to SQL Server database")
    #hardcoded type beccause that is only type for now
    with createDBConnection(tgtDefiniton.tgtDatabaseServer
                                ,tgtDefiniton.tgtDatabaseDatabase
                                ,tgtDefiniton.tgtDatabaseUser
                                ,tgtDefiniton.tgtDatabasePassword,tgtDefiniton.tgtDataType).connect().execution_options(schema_translate_map={
                                                                                                                        None: tgtDefiniton.tgtDatabaseSchema}) as conn:
        logging.debug("Writting DF to Database table {0}".format(tgtDefiniton.tgtDatabaseTable))
        logging.debug("ifTableExists: {0}.".format(tgtDefiniton.ifTableExists))
        
        if tgtDefiniton.ifTableExists == "append":
            logging.debug('Appending Data')
            df.to_sql(tgtDefiniton.tgtDatabaseTable,con=conn,if_exists='append',chunksize = batchSize,index=False)
        elif tgtDefiniton.ifTableExists == "replace":
            logging.debug('Replacing Table and Data')
            df.to_sql(tgtDefiniton.tgtDatabaseTable,con=conn,if_exists='replace',chunksize = batchSize,index=False)
        else:
            df.to_sql(tgtDefiniton.tgtDatabaseTable,con=conn,if_exists='fail',index=False)
        logging.debug("Data wrote to database")
except Exception as e:
    logging.error(e)
    raise

Error

(Background on this error at: http://sqlalche.me/e/e3q8)
2021-08-30 13:31:42 ERROR    (pymssql.OperationalError) (2714, b"There is already an object 
named 'test' in the database.DB-Lib error message 20018, severity 16:\nGeneral SQL Server 
error: Check messages from the SQL Server\n")

EDIT: Log Entry

  2021-08-30 13:31:36 DEBUG    Writting Dataframe to SQL Server database
  2021-08-30 13:31:36 DEBUG    create_engine(mssql+pymssql://REST OF             CONNECTION INFO
  2021-08-30 13:31:36 DEBUG    DB Engine Created
  2021-08-30 13:31:36 DEBUG    Writting DF to Database table test
  2021-08-30 13:31:36 DEBUG    ifTableExists: append.
  2021-08-30 13:31:36 DEBUG    Appending Data
  2021-08-30 13:31:42 ERROR    (pymssql.OperationalError) (2714, b"There is already an object named 'test' in the database.DB-Lib error message 20018, severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\n")

[SQL:

Maclay answered 30/8, 2021 at 13:42 Comment(5)
what's your query? add your execute line of codeApotheosis
A complete stack trace would be helpful here. Can you provide a minimal reproducible example?Sigismondo
I edited and added my full functionMaclay
Are you certain that tgtDefiniton.ifTableExists is getting set correctly? It looks like you're inadvertently calling .to_sql() with if_exists='fail'.Sigismondo
Yes, i am editing and adding a log entry @GordThompsonMaclay
D
2

I had the same problem and I found two ways to solve it although I lack the insight as to why this solves it:

  1. Either pass the database name in the url when creating a connection
  2. or pass the database name as a schema in pd.to_sql.

Doing both does not hurt.

```
#create connection to MySQL DB via sqlalchemy & pymysql
user = credentials['user']
password = credentials['password']
port = credentials['port']
host = credentials['hostname']
dialect = 'mysql'
driver = 'pymysql'
db_name = 'test_db'

# setup SQLAlchemy   
from sqlalchemy import create_engine 
cnx = f'{dialect}+{driver}://{user}:{password}@{host}:{port}/' 
engine = create_engine(cnx) 

# create database
with engine.begin() as con:
    con.execute(f"CREATE DATABASE {db_name}")

############################################################
# either pass the db_name  vvvv - HERE- vvvv after creating a database
cnx = f'{dialect}+{driver}://{user}:{password}@{host}:{port}/{db_name}'      
############################################################
engine = create_engine(cnx) 

table = 'test_table'
col = 'test_col'
with engine.begin() as con:
    # this would work here instead of creating a new engine with a new link
    # con.execute(f"USE {db_name}")
    con.execute(f"CREATE TABLE {table} ({col} CHAR(1));")

# insert into database
import pandas as pd
df = pd.DataFrame({col : ['a','b','c']})

with engine.begin() as con:
    # this has no effect here
    # con.execute(f"USE {db_name}")
    df.to_sql(
        name= table,
        if_exists='append',
        con=con, 
############################################################
# or pass it as a schema vvvv - HERE - vvvv
        #schema=db_name,
############################################################
        index=False
    )```

Tested with python version 3.8.13 and sqlalchemy 1.4.32. Same problem might have appeared here and here.

Dispirit answered 15/6, 2022 at 20:9 Comment(0)
D
0

If I understood you correctly you are trying to upload pandas dataframe into SQL table that already exists. Then you just need to create a connection with sql alchemy and write your data to the table:

 import pyodbc
 import sqlalchemy
 import urllib
 from sqlalchemy.pool import NullPool
 serverName = 'Server_Name'
 dataBase = 'Database_Name'
 conn_str = urllib.parse.quote_plus(
            r'DRIVER={SQL Server};SERVER=' + serverName + r';DATABASE=' + dataBase + r';TRUSTED_CONNECTION=yes')
 conn = 'mssql+pyodbc:///?odbc_connect={}'.format(conn_str) #IF you are using MS Sql Server Studio
 engine = sqlalchemy.create_engine(conn, poolclass=NullPool)
 connection = engine.connect()
 sql_table.to_sql('Your_Table_Name', engine, schema='Your_Schema_Name', if_exists='append', index=False,
                  chunksize=200)
 connection.close()
 
Deledda answered 30/8, 2021 at 18:25 Comment(5)
I am running on a linux machine with no SQL Server ODBC installed so i cannot use PYODBCMaclay
are you using sqlite?Deledda
sql server using pymssqlMaclay
Then it's engine = create_engine('mssql+pymssql://username@hostname:port/dbname')Deledda
I have a function creating the engine that returns the engine. That part is correct. I think it has to do with my packages or package versionsMaclay

© 2022 - 2025 — McMap. All rights reserved.