Insert Python Dictionary using Psycopg2
P

8

46

What is the best way to insert a Python dictionary with many keys into a Postgres database without having to enumerate all keys?

I would like to do something like...

song = dict()
song['title'] = 'song 1'
song['artist'] = 'artist 1'
...

cursor.execute('INSERT INTO song_table (song.keys()) VALUES (song)')
Pulido answered 5/4, 2015 at 20:34 Comment(0)
K
52
from psycopg2.extensions import AsIs

song = {
    'title': 'song 1',
    'artist': 'artist 1'
}

columns = song.keys()
values = [song[column] for column in columns]

insert_statement = 'insert into song_table (%s) values %s'

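# cursor.execute(insert_statement, (AsIs(','.join(columns)), tuple(values)))
print(cursor.mogrify(insert_statement, (AsIs(','.join(columns)), tuple(values))))

Prints the following on Python 2, where dict key order was arbitrary. On Python 3.7+ the columns follow insertion order (title, then artist), and mogrify returns bytes:

insert into song_table (artist,title) values ('artist 1', 'song 1')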

Psycopg adapts a tuple to a record and AsIs does what would be done by Python's string substitution.
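
Because AsIs places its argument into the statement without any quoting, the column names need to come from a trusted source. Below is a minimal sketch of one way to guard this, assuming a hypothetical allowlist `ALLOWED_COLUMNS` and a hypothetical helper `build_insert_args` (neither is part of psycopg2):

```python
# Hypothetical allowlist of columns that callers may write to.
ALLOWED_COLUMNS = {"title", "artist"}


def build_insert_args(row, allowed_columns):
    """Return (column_list, values) for a dict, rejecting unknown keys."""
    unknown = sorted(set(row) - set(allowed_columns))
    if unknown:
        raise ValueError("unexpected column(s): " + ", ".join(unknown))
    columns = list(row)  # insertion order on Python 3.7+
    return ",".join(columns), tuple(row[c] for c in columns)


song = {"title": "song 1", "artist": "artist 1"}
column_list, values = build_insert_args(song, ALLOWED_COLUMNS)

# With a psycopg2 cursor, these would be passed as:
# cursor.execute('insert into song_table (%s) values %s',
#                (AsIs(column_list), values))
```

The values still go through psycopg2's normal adaptation; only the column list is checked by hand here.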

Kandrakandy answered 6/4, 2015 at 12:22 Comment(6)
Didn't know about AsIs. Interesting -- saves needing to deal with the multiplied-out %ss... - Homologous
I was concerned this may leave the user vulnerable to injection attacks due to inadequate escaping, but - at least using a basic example - single apostrophes appear to be properly escaped. I don't have time to test more advanced injection techniques, such as (or similar to) the ones described at the following link, so they may still be a concern vs. more standard parameterization techniques. https://mcmap.net/q/40844/-sql-injection-that-gets-around-mysql_real_escape_string - Paranoid
why not just use song.values() for the values? :) - Taritariff
If I want to get the id of that inserted row, then what to do? - Chalcedony
Getting mogrify requires a psycopg2.extensions.cursor but received a 'str' for insert_statement which is clearly a string. - Caucus
It looks like this was written in python2, before dictionaries were ordered. Nowadays, you can use .keys() and .values() and order will be maintained. - Jaddo
F
35

You can also insert multiple rows using a sequence of dictionaries. If you had the following:

namedict = ({"first_name":"Joshua", "last_name":"Drake"},
            {"first_name":"Steven", "last_name":"Foo"},
            {"first_name":"David", "last_name":"Bar"})

You could insert all three rows in the tuple by using:

cur = conn.cursor()
cur.executemany("""INSERT INTO bar(first_name,last_name) VALUES (%(first_name)s, %(last_name)s)""", namedict)

The cur.executemany call iterates over the tuple of dictionaries and executes the INSERT statement once for each row.
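
The statement above still lists every column by hand. As a rough sketch, the same named-placeholder statement could be derived from the keys of the first row. The helper `named_insert` and its identifier check are assumptions for illustration, not part of psycopg2:

```python
import re

# Accept only plain identifiers, since names are formatted into the SQL text.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def named_insert(table, rows):
    """Build an INSERT with %(name)s placeholders from the keys of rows[0]."""
    columns = list(rows[0])
    for name in [table] + columns:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError("unsafe identifier: %r" % (name,))
    for row in rows:
        if set(row) != set(columns):
            raise ValueError("all rows must have the same keys")
    placeholders = ", ".join("%({})s".format(c) for c in columns)
    return "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), placeholders)


namedict = ({"first_name": "Joshua", "last_name": "Drake"},
            {"first_name": "Steven", "last_name": "Foo"},
            {"first_name": "David", "last_name": "Bar"})

statement = named_insert("bar", namedict)
# With a psycopg2 cursor: cur.executemany(statement, namedict)
```

Checking that every row has the same keys up front avoids a KeyError partway through the batch.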

PS: This example is taken from here

Faris answered 15/6, 2016 at 20:45 Comment(2)
Hi vikas, any comparison on performance against cursor.execute() within loop? - Unaware
Is there a similar solution that leverages mogrify for the performance benefits? - Upwind
H
16

Something along these lines should do it:

song = dict()
song['title'] = 'song 1'
song['artist'] = 'artist 1'

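cols = list(song.keys())

vals = [song[x] for x in cols]
vals_str_list = ["%s"] * len(vals)
vals_str = ", ".join(vals_str_list)

# Join the column names into a comma-separated string; passing the
# collection itself to format() would insert its Python repr instead.
cursor.execute("INSERT INTO song_table ({cols}) VALUES ({vals_str})".format(
               cols=", ".join(cols), vals_str=vals_str), vals)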

The key part is the generated string of %s elements, and using that in format, with the list passed directly to the execute call, so that psycopg2 can interpolate each item in the vals list (thus preventing possible SQL Injection).

Another variation, passing the dict to execute, would be to use these lines instead of vals, vals_str_list and vals_str from above:

vals_str2 = ", ".join(["%({0})s".format(x) for x in cols])

cursor.execute("INSERT INTO song_table ({cols}) VALUES ({vals_str})".format(
               cols=", ".join(cols), vals_str=vals_str2), song)
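
To make the two variants concrete, the sketch below builds both statements for the example dict and shows the SQL text that would be handed to execute. The helper names `insert_with_positional` and `insert_with_named` are made up for this illustration, and the column names are assumed to come from trusted code, since they are formatted into the statement directly:

```python
def insert_with_positional(table, row):
    """Return (statement, values) using one %s placeholder per column."""
    cols = list(row)
    placeholders = ", ".join(["%s"] * len(cols))
    statement = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(cols), placeholders)
    return statement, [row[c] for c in cols]


def insert_with_named(table, row):
    """Return (statement, row) using %(name)s placeholders keyed by column."""
    cols = list(row)
    placeholders = ", ".join("%({})s".format(c) for c in cols)
    statement = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(cols), placeholders)
    return statement, row


song = {"title": "song 1", "artist": "artist 1"}

stmt1, params1 = insert_with_positional("song_table", song)
print(stmt1)  # INSERT INTO song_table (title, artist) VALUES (%s, %s)

stmt2, params2 = insert_with_named("song_table", song)
print(stmt2)  # INSERT INTO song_table (title, artist) VALUES (%(title)s, %(artist)s)

# With a psycopg2 cursor:
# cursor.execute(stmt1, params1)  or  cursor.execute(stmt2, params2)
```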
Homologous answered 5/4, 2015 at 22:20 Comment(4)
I would also replace cols with [cursor.mogrify(x) for x in cols], and the same for vals_str, to thwart SQL injections. - Untruthful
Agreed that that would certainly add extra protection. - Homologous
Reading the psycopg2 doc a bit, mogrify may be unnecessary, as the definition of that method says "The string returned is exactly the one that would be sent to the database running the execute() method or similar", so I think the columns and %s strings will be mogrify'ed during the execute call. - Homologous
I think the main difference would be whether it did it up front, before calling execute, or during the execute itself. If done before, it would essentially just return the same string during the execute run, since it would've already been mogrified. If you wanted to know exactly what was going over the wire, say, for your logs, it may be beneficial to do ahead of time. - Homologous
O
9

The new sql module was created for this purpose and added in psycopg2 version 2.7. According to the documentation:

If you need to generate dynamically an SQL query (for instance choosing dynamically a table name) you can use the facilities provided by the psycopg2.sql module.

Two examples are given in the documentation: https://www.psycopg.org/docs/sql.html

from psycopg2 import sql

names = ['foo', 'bar', 'baz']

q1 = sql.SQL("insert into table ({}) values ({})").format(
    sql.SQL(', ').join(map(sql.Identifier, names)),
    sql.SQL(', ').join(sql.Placeholder() * len(names)))
print(q1.as_string(conn))

insert into table ("foo", "bar", "baz") values (%s, %s, %s)

q2 = sql.SQL("insert into table ({}) values ({})").format(
    sql.SQL(', ').join(map(sql.Identifier, names)),
    sql.SQL(', ').join(map(sql.Placeholder, names)))
print(q2.as_string(conn))

insert into table ("foo", "bar", "baz") values (%(foo)s, %(bar)s, %(baz)s)

Though string concatenation would produce the same result, it should not be used for this purpose, according to psycopg2 documentation:

Warning: Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.

Ohaus answered 30/7, 2017 at 16:49 Comment(0)
C
1

Another approach for building a query for MySQL or PostgreSQL from a dictionary is the %(dic_key)s construct: each placeholder is replaced by the dictionary value stored under dic_key, for example {'dic_key': 'dic value'}. This works well and prevents SQL injection. Tested with Python 2.7; see below:

# Example input; the keys match the %(name)s placeholders below.
in_dict = {u'report_range': None, u'report_description': None, 'user_id': 6, u'rtype': None, u'datapool_id': 1, u'report_name': u'test suka 1', u'category_id': 3, u'report_id': None}

cursor.execute(
    'INSERT INTO report_template (report_id, report_name, report_description, report_range, datapool_id, category_id, rtype, user_id) VALUES '
    '(DEFAULT, %(report_name)s, %(report_description)s, %(report_range)s, %(datapool_id)s, %(category_id)s, %(rtype)s, %(user_id)s) '
    'RETURNING "report_id";', in_dict)


OUT: INSERT INTO report_template (report_id, report_name, report_description, report_range, datapool_id, category_id, rtype, user_id) VALUES (DEFAULT, E'test suka 1', NULL, NULL, 1, 3, NULL, 6) RETURNING "report_id";
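
Since the statement ends with RETURNING "report_id", the generated id can be read back from the same cursor after execute. Here is a minimal sketch, assuming `cursor` is a DB-API cursor that returns rows as tuples (the psycopg2 default) and using a hypothetical helper name `insert_returning_id`:

```python
def insert_returning_id(cursor, statement, params):
    """Execute an INSERT ... RETURNING statement and return the first
    column of the single row it yields."""
    cursor.execute(statement, params)
    row = cursor.fetchone()
    return row[0]
```

With a dict-style cursor such as RealDictCursor, the value would be read by column name (row['report_id']) instead of by index.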
Catling answered 20/4, 2017 at 13:40 Comment(0)
L
1

Clodaldo's answer gets simpler on Python 3.7+, where dicts are guaranteed to preserve insertion order:

from psycopg2.extensions import AsIs

song = dict(title='song 1', artist='artist 1')

insert_statement = 'insert into song_table (%s) values %s'
cursor.execute(insert_statement, (AsIs(','.join(song.keys())), tuple(song.values())))
Leathers answered 18/4, 2023 at 16:27 Comment(0)
M
0

Using execute_values (https://www.psycopg.org/docs/extras.html) is faster than running one INSERT per row, and its fetch argument lets you return results. The code below might help. columns is a string like col_name1, col_name2; template is the string that does the matching, like %(col_name1)s, %(col_name2)s.


from psycopg2.extras import RealDictCursor, execute_values


def insert(cur: RealDictCursor,
        table_name: str,
        values: list[dict],
        returning: str = ''
        ):
    if not values:
        return []

    # Pass table_name as a query parameter instead of formatting it into the SQL.
    query = """SELECT
                    column_name AS c
                FROM
                    information_schema.columns
                WHERE
                    table_name = %s
                AND column_default IS NULL;"""
    cur.execute(query, (table_name,))
    columns_names = cur.fetchall()

    fetch = False
    if returning:
        returning = f'RETURNING {returning}'
        fetch = True

    columns = ''
    template = ''
    for col in columns_names:
        col_name = col['c']
        for val in values:
            if col_name in val:
                continue
            val[col_name] = None

        columns += f'{col_name}, '
        template += f'%({col_name})s, '
    # Drop the trailing ", " left by the loop.
    columns = columns[:-2]
    template = template[:-2]

    query = f"""INSERT INTO {table_name} 
                    ({columns})
                    VALUES %s {returning}"""
    return execute_values(cur, query, values,
                        template=f'({template})', fetch=fetch)
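
To show what the columns and template strings end up looking like, here is a small standalone sketch that builds them with str.join rather than trimming a trailing separator. The function name `columns_and_template` is an assumption for illustration; the resulting template is the kind of string passed as the template argument of execute_values:

```python
def columns_and_template(column_names):
    """Return the column list and the per-row template for execute_values."""
    columns = ", ".join(column_names)
    template = "(" + ", ".join("%({})s".format(n) for n in column_names) + ")"
    return columns, template


columns, template = columns_and_template(["col_name1", "col_name2"])
print(columns)   # col_name1, col_name2
print(template)  # (%(col_name1)s, %(col_name2)s)
```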
Muleteer answered 16/10, 2022 at 13:33 Comment(0)
M
-3

Python has built-in features such as join and list that can be used to generate the query. Also, the Python dictionary offers keys() and values(), which can be used to extract the column names and column values respectively. This is the approach I used, and it should work.

song = dict()
song['title'] = 'song 1'
song['artist'] = 'artist 1'

query = '''insert into song_table (''' +','.join(list(song.keys()))+''') values '''+ str(tuple(song.values()))
cursor.execute(query)
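
It helps to look at the string this produces for a value containing an apostrophe. The sketch below only builds the text and does not touch a database; the song title is made up for the illustration:

```python
song = {"title": "Don't Stop", "artist": "artist 1"}

query = ("insert into song_table (" + ",".join(song.keys()) + ") values "
         + str(tuple(song.values())))
print(query)
# insert into song_table (title,artist) values ("Don't Stop", 'artist 1')
```

Python's repr switches to double quotes for the title, which PostgreSQL reads as an identifier rather than a string literal, so the statement fails. A None value would likewise be rendered as None instead of NULL, and a one-element tuple gains a trailing comma.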
Mourning answered 20/2, 2020 at 9:35 Comment(1)
This is a very, very bad idea. To quote the psycopg documentation: "Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint." - Henna
