How to find out if an upsert was an update with PostgreSQL 9.5+ UPSERT?
Asked Answered
W

4

39

Writable CTEs were considered a solution to UPSERT prior to 9.5 as described in Insert, on duplicate update in PostgreSQL?

It is possible to perform an UPSERT with the information whether it ended up as an UPDATE or an INSERT with the following Writable CTEs idiom:

WITH
    update_cte AS (
        UPDATE t SET v = $1 WHERE id = $2 RETURNING 'updated'::text status
    ),
    insert_cte AS (
        INSERT INTO t(id, v) SELECT $2, $1 WHERE NOT EXISTS
            (SELECT 1 FROM update_cte) RETURNING 'inserted'::text status
    )
 (SELECT status FROM update_cte) UNION (SELECT status FROM insert_cte)

This query will return either "updated" or "inserted", or may (rarely) fail with a constraint violation in as described in https://dba.stackexchange.com/questions/78510/why-is-cte-open-to-lost-updates

Can something similar be achieved using PostgreSQL 9.5+ new "UPSERT" syntax, benefiting from its optimization and avoiding the possible constraint violation?

Wallin answered 13/1, 2016 at 9:23 Comment(0)
D
37

I believe xmax::text::int > 0 would be the easiest trick:

so=# DROP TABLE IF EXISTS tab;
NOTICE:  table "tab" does not exist, skipping
DROP TABLE
so=# CREATE TABLE tab(id INT PRIMARY KEY, col text);
CREATE TABLE
so=# INSERT INTO tab(id, col) VALUES (1,'a');
INSERT 0 2
so=# INSERT INTO tab(id, col)
VALUES (3, 'c'), (4, 'd'), (1,'aaaa')
ON CONFLICT (id) DO UPDATE SET col = EXCLUDED.col
returning *,case when xmax::text::int > 0 then 'updated' else 'inserted' end,ctid;
 id | col  |   case   | ctid
----+------+----------+-------
  3 | c    | inserted | (0,3)
  4 | d    | inserted | (0,4)
  1 | aaaa | updated  | (0,5)
(3 rows)

INSERT 0 3
so=# INSERT INTO tab(id, col)
VALUES (3, 'c'), (4, 'd'), (1,'aaaa')
ON CONFLICT (id) DO UPDATE SET col = EXCLUDED.col
returning *,case when xmax::text::int > 0 then 'updated' else 'inserted' end,ctid;
 id | col  |  case   | ctid
----+------+---------+-------
  3 | c    | updated | (0,6)
  4 | d    | updated | (0,7)
  1 | aaaa | updated | (0,8)
(3 rows)

INSERT 0 3
Detonation answered 29/10, 2017 at 15:17 Comment(2)
You could just use: ``` RETURNING (xmax = 0) AS inserted ```Esra
Is there any way to find out which rows were not affected because of DO NOTHING clause?Infrared
W
10

Drawing from @lad2025's answer, the result can be achieved by abusing settings and customized options with related functions in WHERE clauses to get a required side-effect.

CREATE TABLE t(id INT PRIMARY KEY, v TEXT);

INSERT INTO t (id, v)
    SELECT $1, $2
    WHERE 'inserted' = set_config('upsert.action', 'inserted', true)
    ON CONFLICT (id) DO UPDATE
        SET v = EXCLUDED.v
        WHERE 'updated' = set_config('upsert.action', 'updated', true)
RETURNING current_setting('upsert.action') AS "upsert.action";

The third parameter of set_config is is_local: true means the setting will go away at the end of transaction. More precisely, current_setting('upsert.action') will return NULL (and not throw an error) until the end of the session.

Wallin answered 13/1, 2016 at 13:41 Comment(3)
Oh my, that's ugly! It makes my vision blur. I definitely do not want to remember this, let alone apply it!~Chromium
you can do it with xmax::text::int > 0 - without transaction level setting (which is indeed original hack)Detonation
@VaoTsun is right. See https://mcmap.net/q/409803/-postgres-9-5-upsert-to-return-the-count-of-updated-and-inserted-rows for an example of how to use xmax for this purpose. Not pretty, but definitely prettier than set_config / current_setting.Vendor
S
7

In SQL Server MERGE statement has $action that returns string 'INSERT', 'UPDATE', or 'DELETE'.

For Postgresql I can't find function/variable that does similiar thing for RETURNING.

One way to workaround it is to add column is_updated to your table:

DROP TABLE IF EXISTS tab;

CREATE TABLE tab(id INT PRIMARY KEY, col VARCHAR(100),
                 is_updated BOOLEAN DEFAULT false);
INSERT INTO tab(id, col) VALUES (1,'a'), (2, 'b');


-- main query
INSERT INTO tab(id, col)
VALUES (3, 'c'), (4, 'd'), (1,'aaaa')
ON CONFLICT (id) DO UPDATE SET col = EXCLUDED.col, is_updated = true
RETURNING id,col,
          CASE WHEN is_updated THEN 'UPDATED' ELSE 'INSERTED' END AS action;

Rextester Demo

Output:

╔════╦══════╦══════════╗
║ id ║ col  ║  action  ║
╠════╬══════╬══════════╣
║  3 ║ c    ║ INSERTED ║
║  4 ║ d    ║ INSERTED ║
║  1 ║ aaaa ║ UPDATED  ║
╚════╩══════╩══════════╝
Solicit answered 13/1, 2016 at 10:39 Comment(5)
Where is VALUES (2, 'b') in the output?Flashing
@Flashing There is no need for it. It exists before. Why do you want to return entire table? Only records that are inserted/updatedSolicit
But (2, 'b') is also inserted. And is_updated defaults to false. So I figured it would show up as INSERTED.Flashing
@Flashing The point is this: imagine you have table with 2 milions records. Now you execute the main query. The returning will give back only 3 records (2 inserted, 1 updated). The first insert is to prepopulate data (for demo).Solicit
OK, so it's is relative to that particular INSERT. Very useful, thank you.Flashing
P
4

(xmax::text::bigint > 0) or (NOT xmax = 0). Typecast to integer will break once transaction count will reach the integer overflow.

Presa answered 3/9, 2018 at 16:3 Comment(2)
How likely is this to happen?Hollar
It depends on your transactions volume. It happened for us on a database that was running for years with multiple transactions per second.Presa

© 2022 - 2024 — McMap. All rights reserved.