Delete parent if it's not referenced by any other child

P

2

11

I have an example situation: parent table has a column named id, referenced in child table as a foreign key.

When deleting a child row, how to delete the parent as well if it's not referenced by any other child?

Pearly answered 4/4, 2013 at 11:8 Comment(0)

A

17

In PostgreSQL 9.1 or later you can do this with a single statement using a data-modifying CTE. This is generally less error prone. It minimizes the time frame between the two DELETEs in which a race conditions could lead to surprising results with concurrent operations:

WITH del_child AS (
    DELETE FROM child
    WHERE  child_id = 1
    RETURNING parent_id, child_id
    )
DELETE FROM parent p
USING  del_child x
WHERE  p.parent_id = x.parent_id
AND    NOT EXISTS (
   SELECT FROM child c
   WHERE  c.parent_id = x.parent_id
   AND    c.child_id <> x.child_id   -- !
   );

fiddle
_{Old sqlfiddle}

The child is deleted in any case. I quote the manual:

Data-modifying statements in WITH are executed exactly once, and always to completion, independently of whether the primary query reads all (or indeed any) of their output. Notice that this is different from the rule for SELECT in WITH: as stated in the previous section, execution of a SELECT is carried only as far as the primary query demands its output.

The parent is only deleted if it has no other children.
Note the last condition. Contrary to what one might expect, this is necessary, since:

The sub-statements in WITH are executed concurrently with each other and with the main query. Therefore, when using data-modifying statements in WITH, the order in which the specified updates actually happen is unpredictable. All the statements are executed with the same snapshot (see Chapter 13), so they cannot "see" each others' effects on the target tables.

Bold emphasis mine.
I used the column name parent_id in place of the non-descriptive id.

Eliminate race condition

To eliminate the possible race conditions completely, lock the parent row first. All similar operations must follow the same procedure to make it work.

WITH lock_parent AS (
   SELECT p.parent_id, c.child_id
   FROM   child  c
   JOIN   parent p ON p.parent_id = c.parent_id
   WHERE  c.child_id = 12              -- provide child_id here once
   FOR    NO KEY UPDATE                -- locks parent row.
   )
 , del_child AS (
   DELETE FROM child c
   USING  lock_parent l
   WHERE  c.child_id = l.child_id
   )
DELETE FROM parent p
USING  lock_parent l
WHERE  p.parent_id = l.parent_id
AND    NOT EXISTS (
   SELECT FROM child c
   WHERE  c.parent_id = l.parent_id
   AND    c.child_id <> l.child_id   -- !
   );

This way only one transaction at a time can lock the same parent. So it cannot happen that multiple transactions delete children of the same parent, still see other children and spare the parent, while all of the children are gone afterwards. (Updates on non-key columns are still allowed with FOR NO KEY UPDATE.)

If such cases never occur or you can live with it (hardly ever) happening - the first query is cheaper. Else, this is the secure path.

FOR NO KEY UPDATE was introduced with Postgres 9.4. Details in the manual. In older versions use the stronger lock FOR UPDATE instead.

Amaliaamalie answered 4/4, 2013 at 11:44 Comment(7)

It minimizes the time frame between the two DELETEs in which a race conditions could lead to surprising results with concurrent operations Could that happen if both deletes are wrapped in a transaction? – Expedition 4/4, 2013 at 12:11

@ClodoaldoNeto: Yes, it could. Locks are acquired along the way. It would be possible for a concurrent transaction to INSERT a child between the two DELETEs. Update: If the INSERT commits first, the DELETE on the parent would not go through. If the DELETE commits first, I think the INSERT would end up in a foreign key violation and be rolled back. Unlikely, because the time frame is tiny, but possible. – Amaliaamalie 4/4, 2013 at 12:35

Is a transaction failing a surprising result? – Ethical 7/4, 2014 at 16:45

Depends on the use case and the concurrent load. I have never seen it fail, yet (except when I provoked it for a test case). – Amaliaamalie 8/4, 2014 at 11:47

@KirillPetrov: Your suggested edit was rejected, but you have a point: In many setups, race conditions must be ruled out completely. I added an improved query and some explanation. – Amaliaamalie 26/9, 2017 at 13:42

Does this still work, when using cascaded deletes? (On the child table: REFERENCES ... ON DELETE CASCADE) Or could this fail, since they are executed concurrently? – Cormorant 20/12, 2017 at 17:26

@Felix: Since parents are only deleted when there are no children left, a cascading FK never fires in this scenario. But it should work either way. Please start a new question if it's still unclear. You can always reference this one for context. – Amaliaamalie 20/12, 2017 at 18:1

E

2

delete from child
where parent_id = 1

After deleted in the child do it in the parent:

delete from parent
where
    id = 1
    and not exists (
        select 1 from child where parent_id = 1
    )

The not exists condition will make sure it will only be deleted if it does not exist in the child. You can wrap both delete commands in a transaction:

begin;
first_delete;
second_delete;
commit;

Expedition answered 4/4, 2013 at 11:17 Comment(0)

Eliminate race condition

Recommended topics

Hot tags