Psycopg2: how to insert and update on conflict using psycopg2 with python?

Asked 25/3, 2020 at 9:10 Answered 7/12, 2022 at 19:0

I am using psycopg2 to insert command to postgres database and when there is a confilict i just want to update the other column values.

Here is the query:

        insert_sql = '''
        INSERT INTO tablename (col1, col2, col3,col4)
        VALUES (%s, %s, %s, %s) (val1,val2,val3,val4)
        ON CONFLICT (col1)
        DO UPDATE SET
        (col2, col3, col4)
        = (val2, val3, val4) ; '''

        cur.excecute(insert_sql)

I want to find where I am doing wrong? I am using variables val1 , val2, val3 not actual values.

Chiropractic answered 25/3, 2020 at 9:10 Comment(0)

To quote from psycopg2's documentation:

Warning Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.

Now, for an upsert operation you can do this:

insert_sql = '''
    INSERT INTO tablename (col1, col2, col3, col4)
    VALUES (%s, %s, %s, %s)
    ON CONFLICT (col1) DO UPDATE SET
    (col2, col3, col4) = (EXCLUDED.col2, EXCLUDED.col3, EXCLUDED.col4);
'''
cur.execute(insert_sql, (val1, val2, val3, val4))

Notice that the parameters for the query are being passed as a tuple to the execute statement (this assures psycopg2 will take care of adapting them to SQL while shielding you from injection attacks).

The EXCLUDED bit allows you to reuse the values without the need to specify them twice in the data parameter.

Burgher answered 25/3, 2020 at 13:32 Comment(1)

Very helpful! Small note - it should be (col2, col3, col4) = (EXCLUDED.col2, EXCLUDED.col3, EXCLUDED.col4); – Athelstan 10/11, 2020 at 4:33

Using:

INSERT INTO members (member_id, customer_id, subscribed, customer_member_id, phone, cust_atts) VALUES (%s, %s, %s, %s, %s, %s) ON CONFLICT (customer_member_id) DO UPDATE SET (phone) = (EXCLUDED.phone);

I received the following error:

psycopg2.errors.FeatureNotSupported: source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression
LINE 1: ...ICT (customer_member_id) DO UPDATE SET (phone) = (EXCLUDED.p...

Changing to:

INSERT INTO members (member_id, customer_id, subscribed, customer_member_id, phone, cust_atts) VALUES (%s, %s, %s, %s, %s, %s) ON CONFLICT (customer_member_id) DO UPDATE SET (phone) = ROW(EXCLUDED.phone);

Solved the issue.

Simulated answered 7/2, 2021 at 20:30 Comment(0)

Try:

INSERT INTO tablename (col1, col2, col3,col4)
        VALUES (val1,val2,val3,val4)
        ON CONFLICT (col1)
        DO UPDATE SET
        (col2, col3, col4)
        = (val2, val3, val4) ; '''

Naara answered 25/3, 2020 at 9:16 Comment(6)

it is giving error column val1 doesnot exist, I dont know why it is treating val1 as column – Chiropractic 25/3, 2020 at 9:44

You need to give literals or constants like 10, 'abc' for val1, val2, etc If you want to use a variable you need to use a prepared statement : see psycopg.org/articles/2012/10/01/prepared-statements-psycopg – Naara 25/3, 2020 at 9:48

i am using variable – Chiropractic 25/3, 2020 at 9:49

In place of val1 , val2, i should give {},{} and them cur.exceute( insert_sql.format(val1,val2)) , is this correct. – Chiropractic 25/3, 2020 at 9:51

can you please help ? – Chiropractic 25/3, 2020 at 9:53

Sorry but I am not a python/psycopg2 specialist. <br/> Try to adapt this code: <br/>

cur.execute('''     PREPARE prepared_insert(text, text) AS        INSERT INTO tablename (col1, col2)         VALUES ($1,$2') ''') val1 = '{}' val2 ='{}' cur.execute('EXECUTE prepared_insert(%s, %s)', (val1, val2))

– Naara 25/3, 2020 at 10:3

I haven't seen anyone comment on this, but you can utilize psycopg2.extras.execute_values to insert/update many rows of data at once, which I think is the intended solution to many inserts/updates.

There's a few tutorials on YouTube that illustrate this, one being How to connect to PSQL Database using psycopg2 + Python

In the video they load a dataframe using pandas and insert the data from a CSV source into multiple schemas/tables. The code snippet example in that video is

from psycopg2.extras import execute_values


sql_insert = """    
    INSERT INTO {state}.weather_county(fips_code, county_name, temperature)
    VALUES %s
    ON CONFLICT (fips_code) DO UPDATE
    SET
        temperature=excluded.temperature,
        updated_at=NOW()
    ;
    """



grouped = new_weather_data.groupby(by='state') ## new_weather_data is a dataframe

conn = create_rw_conn(secrets=secrets)


for state, df in grouped:
    # select only the neccessary columns
    df = df[['fips_code', 'county_name', 'temperature']]
    print("[{}] upsert...".format(state))
    # convert dataframe into list of lists for `execute_values`
    data = [tuple(x) for x in df.values.tolist()]
    cur = conn.cursor()
    execute_values(cur, sql_insert.format(state=state), data)
    conn.commit() # <- We MUST commit to reflect the inserted data
    print("[{}] changes were commited...".format(state))
    cur.close()

The Jupyter Notebook is psycopg2-python-tutorial/new-schemas-tables-insert.ipynb

Lindgren answered 7/12, 2022 at 19:0 Comment(0)

-5

Here's the function that takes df, schemaname of the table, name of the table, the column that you want to use as a conflict in the name of conflict, and the engine created by create_engine of sqlalchemy. It updates the table with respect to conflict column. This is extended solution for the solution of @Ionut Ticus . Don't use pandas.to_sql() together. pandas.to_sql() destroys the primary key setting. In this case, one need to set primary key by the ALTER query, which is a suggestion by the function below. Primary key does not necessarily be destroyed by pandas, one might haven't been set it. Error would be in that case: there is no unique constraint matching given keys for referenced table? Function will suggest you to execute below.

engine.execute('ALTER TABLE {schemaname}.{tablename} ADD PRIMARY KEY ({conflictcolumn});

Function:

def update_query(df,schemaname,tablename,conflictcolumn,engine ):
"""
This function takes dataframe as df, name of schema as schemaname,name of the table to append/add/insert as tablename, 
and column name that only  other elements of rows will be changed if it's existed as conflictname,
database engine as engine.
Example to engine : engine_portfolio_pg = create_engine('postgresql://pythonuser:vmqJRZ#[email protected]/cetrm_portfolio')
Example to schemaname,tablename : weatherofcities.sanfrancisco , schemaname = weatherofcities, tablename = sanfrancisco.

"""


excluded = ""
columns = df.columns.tolist()
deleteprimary = columns.copy()
deleteprimary.remove(conflictcolumn)
excluded = ""
replacestring = '%s,'*len(df.columns.tolist())
replacestring = replacestring[:-1]

for column in deleteprimary:
    excluded += "EXCLUDED.{}".format(column)+","
excluded = excluded[:-1]

columns = ','.join(columns)

deleteprimary  = ','.join(deleteprimary)

insert_sql = """ INSERT INTO {schemaname}.{tablename} ({allcolumns})
    VALUES ({replacestring})
    ON CONFLICT ({conflictcolumn}) DO UPDATE SET
    ({deleteprimary}) = ({excluded})""".format( tablename = tablename, schemaname=schemaname,allcolumns = columns, replacestring= replacestring,
                                               conflictcolumn= conflictcolumn,deleteprimary = deleteprimary,  excluded=excluded  )



conn = engine.raw_connection()
conn.autocommit = True

#conn = engine.connect()

cursor = conn.cursor()

i = 0
print("------------------------"*5)

print("If below error happens:")
print("there is no unique constraint matching given keys for referenced table?")    
print("Primary key is not set,you can execute:")
print("engine.execute('ALTER TABLE {}.{} ADD PRIMARY KEY ({});')".format(schemaname,tablename,conflictcolumn))
print("------------------------"*5)
for index, row in df.iterrows():

    cursor.execute(insert_sql, tuple(row.values))
    conn.commit() 
    if i == 0:
        print("Order of Columns in Operated SQL Query for Rows")
        columns = df.columns.tolist()
        print(insert_sql%tuple(columns))
        print("----")
        print("Example of Operated SQL Query for Rows")
        print(insert_sql%tuple(row.values))
        print("---")
        
    i += 1 

conn.close()

Experiment answered 28/12, 2020 at 14:26 Comment(1)

Please DO NOT recommend this style of building dynamic SQL using Python string interpolation. See here Parameters and here Dynamic SQL for reasons why. – Pyrrhic 29/12, 2020 at 0:1

Recommended topics

Hot tags