Pandas read_sql with parameters
Asked Answered
C

2

92

Are there any examples of how to pass parameters with an SQL query in Pandas?

In particular I'm using an SQLAlchemy engine to connect to a PostgreSQL database. So far I've found that the following works:

df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
                     'where "Timestamp" BETWEEN %s AND %s'),
                   db,params=[datetime(2014,6,24,16,0),datetime(2014,6,24,17,0)],
                   index_col=['Timestamp'])

The Pandas documentation says that params can also be passed as a dict, but I can't seem to get this to work having tried for instance:

df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
                     'where "Timestamp" BETWEEN :dstart AND :dfinish'),
                   db,params={"dstart":datetime(2014,6,24,16,0),"dfinish":datetime(2014,6,24,17,0)},
                   index_col=['Timestamp'])

What is the recommended way of running these types of queries from Pandas?

Casket answered 25/6, 2014 at 12:21 Comment(0)
S
132

The read_sql docs say this params argument can be a list, tuple or dict (see docs).

To pass the values in the sql query, there are different syntaxes possible: ?, :1, :name, %s, %(name)s (see PEP249).
But not all of these possibilities are supported by all database drivers, which syntax is supported depends on the driver you are using (psycopg2 in your case I suppose).

In your second case, when using a dict, you are using 'named arguments', and according to the psycopg2 documentation, they support the %(name)s style (and so not the :name I suppose), see http://initd.org/psycopg/docs/usage.html#query-parameters.
So using that style should work:

df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
                     'where "Timestamp" BETWEEN %(dstart)s AND %(dfinish)s'),
                   db,params={"dstart":datetime(2014,6,24,16,0),"dfinish":datetime(2014,6,24,17,0)},
                   index_col=['Timestamp'])
Sappy answered 25/6, 2014 at 20:49 Comment(8)
That's very helpful - I am using psycopg2 so the '%(name)s syntax works perfectly.Casket
We should probably mention something about that in the docstring:github.com/pydata/pandas/issues/7573Sappy
This solution no longer works on Postgres - one needs to use the :notation, and then be sure to wrap the SQL string with sqlalchemy.text()Hellcat
This doesn't work on SQLite with SQLAlchemy's engine.Meghan
SQLite uses the :name style, see docs.python.org/3/library/sqlite3.html#sqlite3.Cursor.executeSappy
The correct characters for the parameter style can be looked up dynamically by the way in nearly every database driver via the paramstyle attribute. This is acutally part of the PEP 249 definition.Formulaic
For SQL Server use ?. E.g., to call a stored proc with parameters, having a connection cnn, it is like pd.read_sql("[dbo].[MyProc] ?,?" , cnn, params=[a,b])Cobham
%(name)s still works perfectly fine in 2022 with psycopg2.Algebraist
I
-1

I was having trouble passing a large number of parameters when reading from a SQLite Table. Then it turns out since you pass a string to read_sql, you can just use f-string. Tried the same with MSSQL pyodbc and it works as well.

For SQLite, it would look like this:

# write a sample table into memory
from sqlalchemy import create_engine
df = pd.DataFrame({'Timestamp': pd.date_range('2020-01-17', '2020-04-24', 10), 'Value1': range(10)})
engine = create_engine('sqlite://', echo=False)
df.to_sql('MyTable', engine);

# query the table using a query
tpl = (1, 3, 5, 8, 9)
query = f"""SELECT Timestamp, Value1 FROM MyTable WHERE Value1 IN {tpl}"""
df = pd.read_sql(query, engine)

If the parameters are datetimes, it's a bit more complicated but calling the datetime conversion function of the SQL dialect you're using should do the job.

start, end = '2020-01-01', '2020-04-01'
query = f"""SELECT Timestamp, Value1 FROM MyTable WHERE Timestamp BETWEEN STRFTIME("{start}") AND STRFTIME("{end}")"""
df = pd.read_sql(query, engine)
Inaptitude answered 3/3, 2023 at 7:23 Comment(1)
I would say f-strings for SQL parameters are best avoided owing to the risk of SQL injection attacks, e.g. see psycopg.org/psycopg3/docs/basic/params.html#sql-injectionCasket

© 2022 - 2024 — McMap. All rights reserved.