Deleting hierarchical data in SQL table

M

8

17

I have a table with hierarchical data.
A column "ParentId" that holds the Id ("ID" - key column) of it's parent.

When deleting a row, I want to delete all children (all levels of nesting).

How to do it?

Thanks

Martelle answered 19/5, 2009 at 12:8 Comment(0)

A

4

When the number of rows is not too large, erikkallen's recursive approach works.

Here's an alternative that uses a temporary table to collect all children:

create table #nodes (id int primary key)
insert into #nodes (id) values (@delete_id)
while @@rowcount > 0
    insert into #nodes 
    select distinct child.id 
    from table child
    inner join #nodes parent on child.parentid = parent.id
    where child.id not in (select id from #nodes)

delete
from table
where id in (select id from #nodes)

It starts with the row with @delete_id and descends from there. The where statement is to protect from recursion; if you are sure there is none, you can leave it out.

Accroach answered 19/5, 2009 at 12:51 Comment(2)

I am not so strong in sql. so I ask this: Why do you need: "select id from table where id = @delete_id" why can't just use @delete_id as value? – Martelle 19/5, 2009 at 14:27

I'd be really surprised if this performs better than a recursive CTE – Fremitus 22/12, 2010 at 15:29

F

11

On SQL Server: Use a recursive query. Given CREATE TABLE tmp(Id int, Parent int), use

WITH x(Id) AS (
    SELECT @Id
    UNION ALL
    SELECT tmp.Id
      FROM tmp
      JOIN x ON tmp.Parent = x.Id
)
DELETE tmp
  FROM x
  JOIN tmp ON tmp.Id = x.Id

Fremitus answered 19/5, 2009 at 12:28 Comment(0)

S

6

Add a foreign key constraint. The following example works for MySQL (syntax reference):

ALTER TABLE yourTable
ADD CONSTRAINT makeUpAConstraintName
FOREIGN KEY (ParentID) REFERENCES yourTable (ID)
ON DELETE CASCADE;

This will operate on the database level, the dbms will ensure that once a row is deleted, all referencing rows will be deleted, too.

Szombathely answered 19/5, 2009 at 12:18 Comment(4)

Self-referential cascade deletes are not supported by SQL Server 2005. You'll get an error when you try to delete a row with "child" rows. – Ballew 19/5, 2009 at 12:30

The author hadn't specified a DBMS at the point I wrote this answer. I'll leave it in for reference. – Szombathely 19/5, 2009 at 12:39

ON DELETE CASCADE works in Oracle too. This is the best answer in my opinion (if your DBMS supports it). – Pistoleer 19/5, 2009 at 13:6

Clarification, I've only used this in cases where the master-detail relationship was one level deep and modelled in two tables. – Pistoleer 19/5, 2009 at 13:7

A

4

When the number of rows is not too large, erikkallen's recursive approach works.

Here's an alternative that uses a temporary table to collect all children:

create table #nodes (id int primary key)
insert into #nodes (id) values (@delete_id)
while @@rowcount > 0
    insert into #nodes 
    select distinct child.id 
    from table child
    inner join #nodes parent on child.parentid = parent.id
    where child.id not in (select id from #nodes)

delete
from table
where id in (select id from #nodes)

It starts with the row with @delete_id and descends from there. The where statement is to protect from recursion; if you are sure there is none, you can leave it out.

Accroach answered 19/5, 2009 at 12:51 Comment(2)

I am not so strong in sql. so I ask this: Why do you need: "select id from table where id = @delete_id" why can't just use @delete_id as value? – Martelle 19/5, 2009 at 14:27

I'd be really surprised if this performs better than a recursive CTE – Fremitus 22/12, 2010 at 15:29

V

3

Depends how you store your hierarchy. If you only have ParentID, then it may not be the most effective approach you took. For ease of subtree manipulation you should have an additional column Parents that wouls store all parent IDs like:

/1/20/25/40

This way you'll be able to get all sub-nodes simply by:

where Parents like @NodeParents + '%'

Second approach
Instead of just ParentID you could also have left and right values. Inserts doing it this way are slower, but select operations are extremely fast. Especially when dealing with sub-tree nodes... http://en.wikipedia.org/wiki/Tree_traversal

Third approach
check recursive CTEs if you use SQL 2005+

Fourth approach
If you use SQL 2008, check HierarchyID type. It gives enough possibilities for your case. http://msdn.microsoft.com/en-us/magazine/cc794278.aspx

Vigil answered 19/5, 2009 at 12:12 Comment(4)

NO, I don't want to store whole parents-chain in a column, because there constant parents changing is involved. And it will be hard to keep track of all that. Can't it be done as it is right now? – Martelle 19/5, 2009 at 12:16

What are primary operations over your Hierarchy data? Is it inserts, updates or reads? – Vigil 19/5, 2009 at 12:18

I tend to agree with the first approach - we've got tables of hierarchical data that we're doing the same thing with. It helps with getting rid of children, and it also helps if you need to do path-based processing of the tree (such as having to quickly return all children of a parent for calculation). We originally tried using triggers to maintain this, but really found that the performance implications when adding large amounts of data were prohibitive. – Ballenger 19/5, 2009 at 13:5

If your'e on SQL Server 2005, the third approach is probably something you should definitely look into. That way you won't have to maintain a Parents column that has multiple attributes, which is a pain. – Detroit 19/5, 2009 at 13:31

W

2

Add a trigger to the table like this

create trigger TD_MyTable on myTable for delete as -- Delete one level of children delete M from deleted D inner join myTable M on D.ID = M.ID

Each delete will call a delete on the same table, repeatedly calling the trigger. Check books online for additional rules. There may be a restriction to the number of times a trigger can nest.

ST

Woolf answered 19/5, 2009 at 12:13 Comment(3)

those triggers available in SQL SERVER 2005 Express? – Martelle 19/5, 2009 at 12:21

I believe they are however you have to write them yourself in express. There's no wizard for it. – Woolf 19/5, 2009 at 12:24

I guess trigger will work, but the problem with trigger is that it will be activated on every delete, even in a case when I want to delete only one row... – Martelle 19/5, 2009 at 13:22

S

0

Depends on your database. If you are using Oracle, you could do something like this:

DELETE FROM Table WHERE ID IN (
  SELECT ID FROM Table
  START WITH ID = id_to_delete
  CONNECT BY PRIOR.ID = ParentID
)

ETA:

Without CONNECT BY, it gets a bit trickier. As others have suggested, a trigger or cascading delete constraint would probably be easiest.

Sayyid answered 19/5, 2009 at 12:17 Comment(1)

I am using MS SQL SERVER 2005 express – Martelle 19/5, 2009 at 12:19

S

0

Triggers can only be used for hierarchies 32 levels deep or less:

http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/05/11/defensive-database-programming-fun-with-triggers.aspx

Styles answered 19/5, 2009 at 14:2 Comment(1)

yes. the maximum level depth is about 8,9. Anyway I don't think I am going to use triggers, because I don't need the trigger to be activated on every delete command that is run on the database. – Martelle 19/5, 2009 at 14:11

M

-1

What you want is referential integrity between these tables.

Mistaken answered 19/5, 2009 at 12:28 Comment(0)

Recommended topics

Hot tags