INSERT or SELECT strategy to always return a row?

Asked 5/10, 2017 at 13:24 Answered 7/10, 2017 at 3:4

Solved sql postgresql concurrency upsert sql-returning

Using Postgres 9.6, I have followed the strategy recommended in https://mcmap.net/q/28920/-return-rows-from-insert-with-on-conflict-without-needing-to-update to do an INSERT or SELECT and return the resulting id:

with ins as (
  insert into prop (prop_type, norm, hash, symbols)
  values (
    $1, $2, $3, $4
  ) on conflict (hash) do
    update set prop_type = 'jargon' where false
  returning id)
select id from ins
union all
select id from prop where hash = $3

However, sometimes this returns nothing. I would have expected it to return a row no matter what. How can I fix it to insure it always returns an id?

NB, despite not returning a row, the row does seem to exist on inspection. I believe the problem may be related to trying to add the same record via two sessions simultaneously.

The table in question is defined as:

create table prop (
  id serial primary key,
  prop_type text not null references prop_type(name),
  norm text not null,
  hash text not null unique,
  symbols jsonb
);

Data:

EDT DETAIL:  parameters: $1 = 'jargon', $2 = 'j2', $3 = 'lXWkZSmoSE0mZ+n4xpWB', $4 = '[]'

If I change prop_type = 'jargon' to prop_type = 'foo' it works! It would seem the lock isn't taken if the expression wouldn't change anything even given the where false clause. Does this really need to depend on my guessing a value that wouldn't be in the row, though? Or is there a better way to ensure you get the lock?

--- UPDATE ---

The overall situation is that the application tried to save a directed acyclic graph using a connection pool (...with autocommit), and was using this query to get id while winnowing out duplications. [Turns out that much smarter is to use a transaction and just serialize to one connection. But the behavior when there is contention here is odd.]

The foreign key constraint doesn't seem to affect the insert -- e.g.:

create table foo(i int unique, prop_id int references prop(id));
insert into foo values (1, 208);
insert into foo values (1, 208) 
on conflict (i) do update set prop_id = 208 where false;
--> INSERT 0 0
insert into foo values (1, 208) 
on conflict (i) do update set prop_id = -208 where false;
--> INSERT 0 0

Note one with valid fk 208, the other with invalid -208. If I connect a select onto either of these with the full pattern, then in situations without contention, they both return i = 1 as expected.

Spurge answered 5/10, 2017 at 13:24 Comment(4)

please post data sample as well, If im not mistaken it always return a row, thus you could even tell from xmin,xmax if it was updated or inserted – Kalman 5/10, 2017 at 13:28

Added data. I think its related to the application trying to save a complex datastructure in which this leaf appears twice, using a connection pool. So the record is added in two different connections. – Spurge 5/10, 2017 at 13:47

with your update I noticed where false which makes update not happen - with that answer will be different :) – Kalman 6/10, 2017 at 8:12

My referenced answer is about SELECT or INSERT, not INSERT or UPDATE. Says so in bold in the first line. And so is this question. I suggest you update the first sentence. – Hintze 7/10, 2017 at 3:8

Your observation seems impossible. The above command should always return an id, either for the newly inserted row or for the pre-existing row. Concurrent writes cannot mess with this since existing conflicting rows are locked. Explanation in this related answer:

How to use RETURNING with ON CONFLICT in PostgreSQL?

Unless an exception is raised, of course. You get an error message instead of a result in that case. Did you check that? Do you have error-handling in place? (In case your app somehow discards error messages: 1) Fix that. 2) There is an additional entry in the DB log with default logging settings.)

I do see a FK constraint in your table definition:

prop_type text not null references prop_type(name),

If you try to insert a row that violates the constraint, that's exactly what happens. If there is no row with name = 'jargon' in table prop_type, that's what you get:

ERROR:  insert or update on table "prop" violates foreign key constraint "prop_prop_type_fkey"
DETAIL:  Key (prop_type)=(jargon) is not present in table "prop_type".

Demo:

dbfiddle here

Your observation would fit the crime:

If I change prop_type = 'jargon' to prop_type = 'foo' it works!

But your explanation is based on misconceptions:

It would seem the lock isn't taken if the expression wouldn't change anything even given the where false clause.

That's not how Postgres works. The lock is taken either way (explanation in above linked answer), and the Postgres locking mechanism never even considers how the new row compares to the old.

Does this really need to depend on my guessing a value that wouldn't be in the row, though? Or is there a better way to ensure you get the lock?

No. And no.

If missing FK values are indeed the problem, you might add missing (distinct) values in a single statement with rCTEs. Simple for single-row inserts like you demonstrate, but works for inserting many rows at once, too. Related:

Hintze answered 7/10, 2017 at 3:4 Comment(7)

thanks for the reply. With your confirmation it does seem unlikely. In fact, jargon is in prop_type while foo is not. I have in the meantime switched to serializing using a single client for the whole graph of objects rather than relying on a pool, which means the error doesn't reproduce, but will try to see if I can go back, as it seemed very odd to me. – Spurge 7/10, 2017 at 6:25

@shaunc: You mean the other way round? foo in prop_type and not jargon? Because if foo was not, this would be starting to get really weird ... – Hintze 7/10, 2017 at 14:31

That is what I mean (!) However, I can't now check on the other possibility you mentioned -- that there was an error I missed somehow (though they are thrown in app unless caught, and my test application log doesn't have any info; I cant find conclusively in pg_log now though). So I'm leaning toward that option as I know its quite difficult to find real bugs. As I said, I can't reproduce with current code but it is weird enough that I do want to go back and try. If there isn't an error, what evidence should I try to look for to [dis/]confirm something strange is happening? – Spurge 8/10, 2017 at 3:50

Sorry -- I mean my original statement was correct. The test fixture has create table prop_type ( name text primary key ); insert into prop_type values ('jargon'), ('code'); – Spurge 8/10, 2017 at 4:9

How can it work with foo while the FK constraint is violated? Impossible. There must be some misunderstanding somewhere ... – Hintze 8/10, 2017 at 4:19

@EriwnBrandstetter - see update above. fk constraint doesn't seem to matter in the absence of contention. One possibility is that the original doesn't always work with "foo" and sometimes not with "jargon" -- rather I was just lucky with "foo". – Spurge 8/10, 2017 at 5:10

Ah, I see. Your update clarified that you used 'foo' only for the UPDATE, which is never executed, not for $1 in the INSERT. In that case fooor jargon or whatever does not make any difference. And your observation was coincidence. In the answer I wrote: If you try to insert a row that violates the constraint which does not apply to the update part. I got confused later, too. – Hintze 8/10, 2017 at 13:5

https://www.postgresql.org/docs/9.5/static/sql-insert.html

ON CONFLICT DO UPDATE guarantees an atomic INSERT or UPDATE outcome; provided there is no independent error, one of those two outcomes is guaranteed, even under high concurrency.

this is regarding the lock youmention in your updated post. Now regarding the initial question with returning row - I read it incarefully first. Now that I saw the where false - with this clause not always you have a row returned. eg:

t=# create table a(i int, e int);
CREATE TABLE
t=# insert into a select 1,1;
INSERT 0 1
t=# create unique index b on a (i);
CREATE INDEX
---now insert on conflict do nothing:
t=# insert into a select 1,1 on conflict do nothing returning *,xmax,xmin;
 i | e | xmax | xmin
---+---+------+------
(0 rows)

INSERT 0 0
-- where false same effect - no rows
t=# insert into a select 1,1 on conflict(i) do update set e=2 where false returning *,xmax,xmin;
 i | e | xmax | xmin
---+---+------+------
(0 rows)
-- now insert without conflict:
t=# insert into a select 2,2 on conflict(i) do update set e=2 where EXCLUDED.e=1 returning *,xmax;
 i | e | xmax
---+---+------
 2 | 2 |    0
(1 row)
-- now insert with update on conflict:
INSERT 0 1
t=# insert into a select 1,1 on conflict(i) do update set e=2 where EXCLUDED.e=1 returning *,xmax;
 i | e |   xmax
---+---+-----------
 1 | 2 | 126943767
(1 row)

Kalman answered 6/10, 2017 at 8:11 Comment(2)

The intro to the question is misleading, it's actually about INSERT or SELECT, not INSERT or UPDATE. And you may have missed the appended UNION ALL SELECT ... in the question. – Hintze 7/10, 2017 at 3:14

@ErwinBrandstetter thank you - Indeed I was answering wrong question :) – Kalman 7/10, 2017 at 8:43

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags