Postgres UPSERT (INSERT or UPDATE) only if value is different

Asked 12/8, 2010 at 4:22 Answered 6/6, 2024 at 4:27

I'm updating a Postgres 8.4 database (from C# code) and the basic task is simple enough: either UPDATE an existing row or INSERT a new one if one doesn't exist yet. Normally I would do this:

UPDATE my_table
SET value1 = :newvalue1, ..., updated_time = now(), updated_username = 'evgeny'
WHERE criteria1 = :criteria1 AND criteria2 = :criteria2

and if 0 rows were affected then do an INSERT:

INSERT INTO my_table(criteria1, criteria2, value1, ...)
VALUES (:criteria1, :criteria2, :newvalue1, ...)

There is a slight twist, though. I don't want to change the updated_time and updated_username columns unless any of the new values are actually different from the existing values to avoid misleading users about when the data was updated.

If I was only doing an UPDATE then I could add WHERE conditions for the values as well, but that won't work here, because if the DB is already up to date the UPDATE will affect 0 rows and then I would try to INSERT.

Can anyone think of an elegant way to do this, other than SELECT, then either UPDATE or INSERT?

Meagher answered 12/8, 2010 at 4:22 Comment(2)

possible duplicate of Insert, on duplicate update (postgresql) – Kingsly 12/8, 2010 at 4:28

No, not a duplicate. The answer there basically just encapsulates in a function what I've written above. – Meagher 12/8, 2010 at 4:31

Take a look at a BEFORE UPDATE trigger to check and set the correct values:

CREATE OR REPLACE FUNCTION my_trigger() RETURNS TRIGGER LANGUAGE plpgsql AS
$$
BEGIN
    IF OLD.content = NEW.content THEN
        NEW.updated_time= OLD.updated_time; -- use the old value, not a new one.
    ELSE
        NEW.updated_time= NOW();
    END IF;
    RETURN NEW;
END;
$$;

Now you don't even have to mention the field updated_time in your UPDATE query, it will be handled by the trigger.

http://www.postgresql.org/docs/current/interactive/plpgsql-trigger.html

Jae answered 12/8, 2010 at 7:25 Comment(0)

Two things here. Firstly depending on activity levels in your database you may hit a race condition between checking for a record and inserting it where another process may create that record in the interim. The manual contains an example of how to do this link example

To avoid doing an update there is the suppress_redundant_updates_trigger() procedure. To use this as you wish you wold have to have two before update triggers the first will call the suppress_redundant_updates_trigger() to abort the update if no change made and the second to set the timestamp and username if the update is made. Triggers are fired in alphabetical order. Doing this would also mean changing the code in the example above to try the insert first before the update.

Example of how suppress update works:

    DROP TABLE sru_test;

    CREATE TABLE sru_test(id integer not null primary key,
    data text,
    updated timestamp(3));

    CREATE TRIGGER z_min_update
    BEFORE UPDATE ON sru_test
    FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();

    DROP FUNCTION set_updated();

    CREATE FUNCTION set_updated()
    RETURNS TRIGGER
    AS $$
    DECLARE
    BEGIN
        NEW.updated := now();
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER zz_set_updated
    BEFORE INSERT OR UPDATE ON sru_test
    FOR EACH ROW EXECUTE PROCEDURE  set_updated();

insert into sru_test(id,data) VALUES (1,'Data 1');
insert into sru_test(id,data) VALUES (2,'Data 2');

select * from sru_test;

update sru_test set data = 'NEW';

select * from sru_test;

update sru_test set data = 'NEW';

select * from sru_test;

update sru_test set data = 'ALTERED'  where id = 1;

select * from sru_test;

update sru_test set data = 'NEW' where id = 2;

select * from sru_test;

Assembler answered 12/8, 2010 at 10:0 Comment(0)

Postgres is getting UPSERT support . It is currently in the tree since 8 May 2015 (commit):

This feature is often referred to as upsert.

This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted.

A snapshot is available for download. It has not yet made a release.

Topography answered 8/5, 2015 at 6:39 Comment(0)

INSERT INTO table_name(column_list) VALUES(value_list) ON CONFLICT target action;

https://www.postgresqltutorial.com/postgresql-upsert/

Dummy example :

insert into user_profile (user_id, resident_card_no, last_name) values
 (103, '14514367', 'joe_inserted' ) 
on conflict on constraint user_profile_pk do 
update set resident_card_no = '14514367', last_name = 'joe_updated';

Fougere answered 11/3, 2020 at 13:20 Comment(1)

While this works as a general upsert, it does not provide a valid solution the OP's problem because it will update ` updated_time` and updated_username on a conflict without checking the values. – Ezara 27/3, 2024 at 11:5

The RETURNING clause enables you to chain your queries; the second query uses the results from the first. (in this case to avoid re-touching the same rows) (RETURNING is available since postgres 8.4)

Shown here embedded in a a function, but it works for plain SQL, too

DROP SCHEMA tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;

CREATE TABLE my_table
        ( updated_time timestamp NOT NULL DEFAULT now()
        , updated_username varchar DEFAULT '_none_'
        , criteria1 varchar NOT NULL
        , criteria2 varchar NOT NULL
        , value1 varchar
        , value2 varchar
        , PRIMARY KEY (criteria1,criteria2)
        );

INSERT INTO  my_table (criteria1,criteria2,value1,value2)
SELECT 'C1_' || gs::text
        , 'C2_' || gs::text
        , 'V1_' || gs::text
        , 'V2_' || gs::text
FROM generate_series(1,10) gs
        ;

SELECT * FROM my_table ;

CREATE function funky(_criteria1 text,_criteria2 text, _newvalue1 text, _newvalue2 text)
RETURNS VOID
AS $funk$
WITH ins AS (
        INSERT INTO my_table(criteria1, criteria2, value1, value2, updated_username)
        SELECT $1, $2, $3, $4, COALESCE(current_user, 'evgeny' )
        WHERE NOT EXISTS (
                SELECT * FROM my_table nx
                WHERE nx.criteria1 = $1 AND nx.criteria2 = $2
                )
        RETURNING criteria1 AS criteria1, criteria2 AS criteria2
        )
        UPDATE my_table upd
        SET value1 = $3, value2 = $4
        , updated_time = now()
        , updated_username = COALESCE(current_user, 'evgeny')
        WHERE 1=1
        AND criteria1 = $1 AND criteria2 = $2 -- key-condition
        AND (value1 <> $3 OR value2 <> $4 )   -- row must have changed
        AND NOT EXISTS (
                SELECT * FROM ins -- the result from the INSERT
                WHERE ins.criteria1 = upd.criteria1
                AND ins.criteria2 = upd.criteria2
                )
        ;
$funk$ language sql
        ;

SELECT funky('AA', 'BB' , 'CC', 'DD' );            -- INSERT
SELECT funky('C1_3', 'C2_3' , 'V1_3', 'V2_3' );    -- (null) UPDATE 
SELECT funky('C1_7', 'C2_7' , 'V1_7', 'V2_7777' ); -- (real) UPDATE 

SELECT * FROM my_table ;

RESULT:

        updated_time        | updated_username | criteria1 | criteria2 | value1 | value2  
----------------------------+------------------+-----------+-----------+--------+---------
 2013-03-13 16:37:55.405267 | _none_           | C1_1      | C2_1      | V1_1   | V2_1
 2013-03-13 16:37:55.405267 | _none_           | C1_2      | C2_2      | V1_2   | V2_2
 2013-03-13 16:37:55.405267 | _none_           | C1_3      | C2_3      | V1_3   | V2_3
 2013-03-13 16:37:55.405267 | _none_           | C1_4      | C2_4      | V1_4   | V2_4
 2013-03-13 16:37:55.405267 | _none_           | C1_5      | C2_5      | V1_5   | V2_5
 2013-03-13 16:37:55.405267 | _none_           | C1_6      | C2_6      | V1_6   | V2_6
 2013-03-13 16:37:55.405267 | _none_           | C1_8      | C2_8      | V1_8   | V2_8
 2013-03-13 16:37:55.405267 | _none_           | C1_9      | C2_9      | V1_9   | V2_9
 2013-03-13 16:37:55.405267 | _none_           | C1_10     | C2_10     | V1_10  | V2_10
 2013-03-13 16:37:55.463651 | postgres         | AA        | BB        | CC     | DD
 2013-03-13 16:37:55.472783 | postgres         | C1_7      | C2_7      | V1_7   | V2_7777
(11 rows)

Claw answered 13/3, 2013 at 15:13 Comment(0)

Here is a method that doesn't require a stored procedure or trigger at all.

create table my_table (
    id       varchar(32) primary key not null,
    username varchar(32),
    -- ...
    etag     varchar(32), -- use a hash of the fields you want to trigger a version increment
    version  int default 1 not null
);

insert into my_table 
    (id, username, etag) 
values 
    (?, ?, ?) 
on conflict 
    (id) 
do update set 
    username = excluded.username,
    etag     = excluded.etag,
    version  = case when excluded.etag <> my_table.etag then my_table.version + 1 else my_table.version end
;

Catarinacatarrh answered 6/6, 2024 at 4:27 Comment(0)

-1

Start a transaction. Use a select to see if the data you'd be inserting already exists, if it does, do nothing, otherwise update, if it does not exist, then insert. Finally close the transaction.

Unsex answered 12/8, 2010 at 7:17 Comment(1)

This is suboptimal because a common use case could be to bulk insert a bunch of rows in a transaction – Deprecate 23/4, 2013 at 23:43

Recommended topics

Hot tags