Execute SQL file with multiple statements separated by ";" using pyodbc
Asked Answered
B

3

17

I am currently writing a script to run multiple SQL files using Python, a little background before you mention alternative methods; this is to automate the scripts and Python is the only tools I have on our windows 2008 server. I have a script that works for one set but the issue is when the other set has two statements instead of one seperated by a ';' here is my code:

import os
import pyodbc

print ("Connecting via ODBC")

conn = pyodbc.connect('DSN=dsn', autocommit=True)

print ("Connected!\n")

 inputdir = 'C:\\path'
cursor = conn.cursor()

for script in os.listdir(inputdir):

   with open(inputdir+'\\' + script,'r') as inserts:

       sqlScript = inserts.readlines()

       sql = (" ".join(sqlScript))

       cursor.execute(sql)

       print (script)

conn.close()

print ('Run Complete!')

So this code works to show the entire file but it only executes one statement before ";".

Any help would be great!

Thanks.

Babe answered 9/8, 2016 at 16:57 Comment(1)
Many SQL APIs don't allow multiple statements in a single call. Why not just execute them with separate cursor.execute() calls?Avery
I
22

The API in the pyodbc connector (or pymysql) doesn't allow multiple statements in a SQL call. This is an issue of engine parsing; an API would need to completely understand the SQL that it's passing in order for multiple statements to be passed, and then multiple results handled upon return.

A slight modification to your script like the one below should allow you to send each of your statements individually with separate connectors:

import os
import pyodbc

print ("Connecting via ODBC")

conn = pyodbc.connect('DSN=dsn', autocommit=True)

print ("Connected!\n")

inputdir = 'C:\\path'

for script in os.listdir(inputdir):
    with open(inputdir+'\\' + script,'r') as inserts:
        sqlScript = inserts.readlines()
        for statement in sqlScript.split(';'):
            with conn.cursor() as cur:
                cur.execute(statement)
    print(script)

conn.close()

The with conn.cursor() as cur: opens a closes a cursor for each statement, exiting appropriately after each call is completed.

Ibson answered 9/8, 2016 at 18:19 Comment(5)
Minor problem in the example — probably you should do inserts.read() instead of inserts.readlines() as readlines returns a list of strings which cannot be split directly.Mcevoy
IMPORTANT: this example will crash if you have ; within /* ... */ comment, like commented-out statement. More stable approach would be to parse comments and string literals, and split by semicolons only outside of these entities.Mcevoy
I agree this is an incomplete solution. Again, this gets back to needing to be able to fully parse the code in order to split it up intelligently. Definitely take a look at the other answers if you need to try parsing commented code.Ibson
Splitting the script into single statements did the trick for me. If I tried to read the file line by line it was too error-prone. But splitting by the semicolon and then executing each single statement via db.session.execute(sql) worked. Thanks a lot, man. Sometimes the solution is so simple ...Willwilla
One option which avoids issues with stray semicolons (whether in comments or expressions) is to split on multiple characters. For example, you could end all statements with "/**/;" or TAB; Yes you would have to alter existing scripts to follow this convention and make sure to avoid this pattern elsewhere, but for me often times the sql is actually output from other sql so following this convention is easy enough.Prescription
M
9

The more correct approach is to parse comments and quoted strings, and only consider ;s outside of them. Or else your code will be broken immediately after you comment out several SQL statements with a block comment.

Here is a state machine based implementation I made for myself - this code is probably ugly and could be written much better, so please feel free to improve it by editing my answer. It doesn't handle MySQL-style #-starting comments but it is easy to add.

def split_sql_expressions(text):
    current = ''
    state = None
    for c in text:
        if state is None:  # default state, outside of special entity
            current += c
            if c in '"\'':
                # quoted string
                state = c
            elif c == '-':
                # probably "--" comment
                state = '-'
            elif c == '/':
                # probably '/*' comment
                state = '/'
            elif c == ';':
                # remove it from the statement
                current = current[:-1].strip()
                # and save current stmt unless empty
                if current:
                    yield current
                current = ''
        elif state == '-':
            if c != '-':
                # not a comment
                state = None
                current += c
                continue
            # remove first minus
            current = current[:-1]
            # comment until end of line
            state = '--'
        elif state == '--':
            if c == '\n':
                # end of comment
                # and we do include this newline
                current += c
                state = None
            # else just ignore
        elif state == '/':
            if c != '*':
                state = None
                current += c
                continue
            # remove starting slash
            current = current[:-1]
            # multiline comment
            state = '/*'
        elif state == '/*':
            if c == '*':
                # probably end of comment
                state = '/**'
        elif state == '/**':
            if c == '/':
                state = None
            else:
                # not an end
                state = '/*'
        elif state[0] in '"\'':
            current += c
            if state.endswith('\\'):
                # prev was backslash, don't check for ender
                # just revert to regular state
                state = state[0]
                continue
            elif c == '\\':
                # don't check next char
                state += '\\'
                continue
            elif c == state[0]:
                # end of quoted string
                state = None
        else:
            raise Exception('Illegal state %s' % state)

    if current:
        current = current.rstrip(';').strip()
        if current:
            yield current

And use it like this:

with open('myfile.sql', 'r') as sqlfile:
    for stmt in split_sql_expressions(sqlfile.read()):
        cursor.execute(stmt)
Mcevoy answered 9/8, 2016 at 16:57 Comment(0)
W
1

Try this, using dynamic sql (exec()) feature of Sql Server.

cursor.execute("exec('" + stmt.replace("'", "''") + "')")

This works, as pyodbc only executes first statement in a batch of statements. Above is a dynamic query feature as part of Microsoft SQL Server, with which we can trick pyodbc to think that entire batch of statements is one single "exec sql" statement.

Windowshop answered 16/3, 2023 at 3:42 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.