Enforce Referential Integrity on Materialized Path?
Asked Answered
D

4

6

I'm trying to implement a tree like structure using a Materialized Path model described here: http://www.dbazine.com/oracle/or-articles/tropashko4.

Is it possible to enforce referential integrity on the [path] field? I don't see how SQL could do it, do I have to do it manually in the DAL?

Duo answered 7/8, 2009 at 4:38 Comment(0)
R
3

Yes, you have to enforce data integrity yourself in the DAL when you use either Materialized Path or Nested Sets solutions for hierarchical data.

Adjacency List supports referential integrity, and this is true also for a design I call "Closure Table" (Tropashko calls this design "transitive closure relation").

Radicalism answered 7/8, 2009 at 5:36 Comment(4)
Here's an article on a transitive closure solution: codeproject.com/KB/database/Modeling_DAGs_on_SQL_DBs.aspx @Bill Karwin: how big datasets have you deployed using "closure tables" in OLTP/interactive scenarios without hitting performance problems for inserts?Farthest
@nawroth: Only on the order of thousands of nodes in the tree, so good point. If you need to represent very deep trees, you end up with a lot of rows. If you need to represent many shallow trees, it's more modest.Radicalism
New information: I did a proof of concept with a tree of 490,000 nodes and I can fetch the list of ancestors of any node in 0.0006sec, and a whole tree with per-level expansion in 0.3 seconds. slideshare.net/billkarwin/models-for-hierarchical-dataRadicalism
thanks for remembering this thread. I studied the approaches a bit more and realized I like closure table the best. It makes sense in my scenario that users inserting data will not notice the overhead, and it enables large, fast retrieves.Duo
R
3

"Materialized path" as presented by Vadim Tropashko in that article, introduces the notion of order into a relation ("Jones is the second member".).

"Materialized path" is nothing but "some form of materialized view" on the transitive closure, and therefore suffers all and exactly the same problems as any other "materialized view", except that matters are algorithmically worse precisely because of the involvement of a closure.

SQL is almost completely powerless when constraints-on-a-closure are in play. (Meaning : yes, SQL requires you to do everything yourself.) It's one of those areas where the RM shows the maximum of its almost unlimited power, but SQL fails abysmally, and where it is such a shame that the majority of people mistake SQL for being relational.

(@Bill Karwin : I'd like to be able to give you +1 for your remark on the relation between the depth of the trees and the result on performance. There are no known algorithms to compute closures that perform well in the case of trees with "crazy" depths. It's an algorithmic problem, not an SQL nor a relational one.)

EDIT

Yes, RM = Relational Model

Record answered 7/8, 2009 at 20:20 Comment(2)
+1 Yes, the materialized path is an example of denormalization. It has some benefits of efficiency for certain types of queries, but it sacrifices benefits of the RM such as referential integrity.Radicalism
Btw, you can hover over an invisible upvote arrow to the left of a comment, and give the comment a little endorsement. That doesn't give rep points, though.Radicalism
P
1

In the materialized path model you can use arbitrary strings (maybe unicode strings to allow more than 256 children) instead of special strings of form "x.y.z". The id of the parent is then the id of the direct children with the last character removed. You can easily enforce this with a check constraint like (my example works in PostgreSQL)

check(parent_id = substring(id from 1 for char_length(id)-1)),

within your create table command. If you insist on strings of form "x.y.z", you'll have to play around with regular expressions, but I'd guess it's possible to find a corresponding check constraint.

Pantin answered 7/8, 2009 at 4:39 Comment(1)
If you want to enforce that roots have char_length(id)=1, you can additionally add the constraint check((parent_id is null) or (char_length(id)=1)) to the table definition.Pantin
F
-1

Yes, we can "enforce referential integrity on the [path] field". I did that a couple of years ago, described here:

Store your configuration settings as a hierarchy in a database

Fiedling answered 3/3, 2010 at 21:55 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.