Why can you not have a foreign key in a polymorphic association?

Asked 28/5, 2009 at 17:32 Answered 23/2, 2023 at 4:57

Solved ruby-on-rails database foreign-key-relationship polymorphic-associations

Why can you not have a foreign key in a polymorphic association, such as the one represented below as a Rails model?

class Comment < ActiveRecord::Base
  belongs_to :commentable, :polymorphic => true
end

class Article < ActiveRecord::Base
  has_many :comments, :as => :commentable
end

class Photo < ActiveRecord::Base
  has_many :comments, :as => :commentable
  #...
end

class Event < ActiveRecord::Base
  has_many :comments, :as => :commentable
end

Hyperbolic answered 28/5, 2009 at 17:32 Comment(1)

Just for clarity of others, the OP is not talking about the foreign_key option that can be passed to belongs_to. The OP is talking about a "Foreign Key Constraint" of the native database. That confused me for a while. – Nefarious 6/1, 2018 at 14:19

197

A foreign key must reference only one parent table. This is fundamental to both SQL syntax, and relational theory.

A Polymorphic Association is when a given column may reference either of two or more parent tables. There's no way you can declare that constraint in SQL.

The Polymorphic Associations design breaks rules of relational database design. I don't recommend using it.

There are several alternatives:

Exclusive Arcs: Create multiple foreign key columns, each referencing one parent. Enforce that exactly one of these foreign keys can be non-NULL.
Reverse the Relationship: Use three many-to-many tables, each references Comments and a respective parent.

Concrete Supertable: Instead of the implicit "commentable" superclass, create a real table that each of your parent tables references. Then link your Comments to that supertable. Pseudo-rails code would be something like the following (I'm not a Rails user, so treat this as a guideline, not literal code):

 class Commentable < ActiveRecord::Base
   has_many :comments
 end

 class Comment < ActiveRecord::Base
   belongs_to :commentable
 end

 class Article < ActiveRecord::Base
   belongs_to :commentable
 end

 class Photo < ActiveRecord::Base
   belongs_to :commentable
 end

 class Event < ActiveRecord::Base
   belongs_to :commentable
 end

I also cover polymorphic associations in my presentation Practical Object-Oriented Models in SQL, and my book SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.

Re your comment: Yes, I do know that there's another column that notes the name of the table that the foreign key supposedly points to. This design is not supported by foreign keys in SQL.

What happens, for instance, if you insert a Comment and name "Video" as the name of the parent table for that Comment? No table named "Video" exists. Should the insert be aborted with an error? What constraint is being violated? How does the RDBMS know that this column is supposed to name an existing table? How does it handle case-insensitive table names?

Likewise, if you drop the Events table, but you have rows in Comments that indicate Events as their parent, what should be the result? Should the drop table be aborted? Should rows in Comments be orphaned? Should they change to refer to another existing table such as Articles? Do the id values that used to point to Events make any sense when pointing to Articles?

These dilemmas are all due to the fact that Polymorphic Associations depends on using data (i.e. a string value) to refer to metadata (a table name). This is not supported by SQL. Data and metadata are separate.

I'm having a hard time wrapping my head around your "Concrete Supertable" proposal.

Define Commentable as a real SQL table, not just an adjective in your Rails model definition. No other columns are necessary.
```
 CREATE TABLE Commentable (
   id INT AUTO_INCREMENT PRIMARY KEY
 ) TYPE=InnoDB;
```

Define the tables Articles, Photos, and Events as "subclasses" of Commentable, by making their primary key be also a foreign key referencing Commentable.

 CREATE TABLE Articles (
   id INT PRIMARY KEY, -- not auto-increment
   FOREIGN KEY (id) REFERENCES Commentable(id)
 ) TYPE=InnoDB;

 -- similar for Photos and Events.

Define the Comments table with a foreign key to Commentable.

 CREATE TABLE Comments (
   id INT PRIMARY KEY AUTO_INCREMENT,
   commentable_id INT NOT NULL,
   FOREIGN KEY (commentable_id) REFERENCES Commentable(id)
 ) TYPE=InnoDB;

When you want to create an Article (for instance), you must create a new row in Commentable too. So too for Photos and Events.

 INSERT INTO Commentable (id) VALUES (DEFAULT); -- generate a new id 1
 INSERT INTO Articles (id, ...) VALUES ( LAST_INSERT_ID(), ... );

 INSERT INTO Commentable (id) VALUES (DEFAULT); -- generate a new id 2
 INSERT INTO Photos (id, ...) VALUES ( LAST_INSERT_ID(), ... );

 INSERT INTO Commentable (id) VALUES (DEFAULT); -- generate a new id 3
 INSERT INTO Events (id, ...) VALUES ( LAST_INSERT_ID(), ... );

When you want to create a Comment, use a value that exists in Commentable.

 INSERT INTO Comments (id, commentable_id, ...)
 VALUES (DEFAULT, 2, ...);

When you want to query comments of a given Photo, do some joins:

 SELECT * FROM Photos p JOIN Commentable t ON (p.id = t.id)
 LEFT OUTER JOIN Comments c ON (t.id = c.commentable_id)
 WHERE p.id = 2;

When you have only the id of a comment and you want to find what commentable resource it's a comment for. For this, you may find that it's helpful for the Commentable table to designate which resource it references.
```
 SELECT commentable_id, commentable_type FROM Commentable t
 JOIN Comments c ON (t.id = c.commentable_id)
 WHERE c.id = 42;
```

Then you'd need to run a second query to get data from the respective resource table (Photos, Articles, etc.), after discovering from commentable_type which table to join to. You can't do it in the same query, because SQL requires that tables be named explicitly; you can't join to a table determined by data results in the same query.

Admittedly, some of these steps break the conventions used by Rails. But the Rails conventions are wrong with respect to proper relational database design.

Yorke answered 28/5, 2009 at 17:59 Comment(28)

Thanks for following up. Just so we're on the same page, in Rails polymorphic associations use two columns in our Comment for the foreign key. One column holds the id of the target row, and the second column tells Active Record which model that key is in (Article, Photo or Event). Knowing this, would you still recommend the three alternatives you have proposed? I'm having a hard time wrapping my head around your "Concrete Supertable" proposal. What do you mean when you say "link your Comments to that supertable" (Commentable)? – Hyperbolic 28/5, 2009 at 18:47

Thank you for explaining. I think I understand why you say the Rails conventions are wrong with respect to proper relational database design - the pattern in some ways resembles using flat files as a storage mechanism in that it loses out on the ability to enforce various relational constraints. – Hyperbolic 28/5, 2009 at 21:3

Exactly. It should be a strong "code smell" that it's not correct relational database design when the Polymorphic Associations documentation itself says that you can't use foreign key constraints! – Yorke 28/5, 2009 at 22:37

The claim that "this is fundamental to relational theory" is wrong. See Chpt 13 in "Database Explorations" : Inclusion dependencies and foreign keys. – Sessoms 3/8, 2011 at 21:51

@Erwin Smout: Thanks for your comment, I'll look forward to reading that book. I still think referential integrity constraint must involve no more than one referencing relvar and one referenced relvar (and the two may be the same relvar), if we work in the scope of relational theory that SQL is based on. I haven't read about Third Manifesto stuff yet, so I'll be interested if it expands the definition of a foreign key. – Yorke 3/8, 2011 at 22:49

Good description of alternatives for Rails polymorphic associations. One thing to point out is in the example for concrete supertables, the relationship specification for Article (and Photo and Event) is flipped. Child table models specify relationship to parent table by "belongs_to" (not "has_one"), so: class Article < ActiveRecord::Base belongs_to :commentable end – Gormandize 7/2, 2012 at 22:29

@TsenYing, thank you, I've edited the above to reflect your correction. – Yorke 8/2, 2012 at 0:46

One downside to the Concrete Supertable solution is that it does not enforce referential integrity on the children table. For instance, it would be possible for an Events row and a Photos row to have the same commentable_id. Granted, using good procedure to create the commentable_id and assign it to a child table should avoid this situation, but the possibility still exists. – Hesketh 3/6, 2013 at 16:26

@JasonMartens, there's a solution to use referential integrity to enforce that only one child table can reference a given id. See my answer to #988154 – Yorke 3/6, 2013 at 16:58

@BillKarwin Do you implement that by adding A FK on the Answers table? My situation is a little different, so I don't think that would work for me, but it is a good idea. I ended up going with the Exclusive Arcs solution for my design. Thanks for the great answer by the way. – Hesketh 4/6, 2013 at 13:11

@BillKarwin could we not make the same argument against STI then? Or does the foreign key being named after an association solve the problem? – Bastion 24/7, 2013 at 16:57

@Mohamad, STI would work fine. You could still define foreign keys if your parent table used STI. Or even if the child table used STI. – Yorke 24/7, 2013 at 17:3

@BillKarwin - thank you for this great post. In the Concrete Supertable approach, it seems that buggy application code could accidentally create an Article (for example) without creating the corresponding Commentable. Is this something that can be prevented by using constraints? – Azotobacter 22/11, 2013 at 18:58

@antinome, yes, if you create the tables with a real foreign key constraint as I showed, you won't be able to create a row in the Articles table unless the corresponding id exists in the Commentable table. – Yorke 22/11, 2013 at 19:3

@BillKarwin - Thanks, good point. Not sure how I missed that :-) – Azotobacter 22/11, 2013 at 19:11

I've got into a similar situation with a bit more complexity. My table Articles and Comments can contain images. So I decide to create another photo_entity supertable. However, articles id can not be the LAST_INSERT_ID() of photo_entity since it's already been taken from the Commentable one. Could you please give me an advice in this case, @BillKarwin? – Whipstitch 29/6, 2015 at 4:5

@Tresdin, I would add a foreign key in the other direction -- photo_entity should contain an FK that references Commentable(id). – Yorke 29/6, 2015 at 5:21

@BillKarwin After being inspired by your comment, I intend to create a supertable entity for tables Commentable and photo_entity. Articles id and Comments id are now being taken from entity instead of their respective entity tables. But I'm not sure it's a good idea since my db structure would be much more complex. – Whipstitch 29/6, 2015 at 6:53

@Tresdin, I think I misunderstood initially what your photo_entity table is supposed to be. I thought that table was to store photos, with a reference to the article/comment to which the photo belongs. But it sounds like you're using photo_entity as a new kind of superclass. Well, now you can see why many OO languages don't try to implement multiple inheritance! :-) – Yorke 29/6, 2015 at 14:20

Thanks for your replies @BillKarwin. Actually, I've already given up this idea after writing some deeply nested queries. ~~. – Whipstitch 29/6, 2015 at 15:18

While polymorphic associations may break the rules of relational database design, relational databases do not lend themselves to elegance when adding additional columns to HUGE tables. That being said, I believe using polymorphism is completely necessary in relational database design as it prevents having to add columns in the future. – Petree 12/8, 2016 at 17:54

@EricAlford, It's more accurate to say that MySQL doesn't lend itself to adding columns. Some other RDBMS implementations have versioned schema, so you can add columns easily, regardless of the size of the table. – Yorke 12/8, 2016 at 18:25

What are my options if I need to perform a query the other way around? That is, given a comment ID, how can I find the entity that is being commented (the article, the photo or the event)? – Dessert 19/8, 2017 at 17:29

@AlexBrand, I added another example to my answer above. – Yorke 19/8, 2017 at 18:17

So if I have a bunch of comments, I basically have to perform one query for each? – Dessert 20/8, 2017 at 3:13

@AlexBrand, All optimizations depend on how your application need to query the data. If you need to query from comments to resources of different types, then perhaps the data organization I describe in my answer isn't the right choice. Instead of the Commentable table and various dependent tables, you might have to use Single-Table Inheritance. – Yorke 20/8, 2017 at 5:11

Will Postgres table inheritance do exactly what the "Concrete Supertable" does? – Holofernes 10/6, 2018 at 2:20

@Meglio, I think you should read carefully the "Caveats" section of postgresql.org/docs/current/static/ddl-inherit.html I am not a PostgreSQL user, but my understanding is that there are a number of cases of missing functionality that makes inherited tables kind of useless for most applications. – Yorke 10/6, 2018 at 16:50

Bill Karwin is correct that foreign keys cannot be used with polymorphic relationships due to SQL not really having a native concept polymorphic relationships. But if your goal of having a foreign key is to enforce referential integrity you can simulate it via triggers. This gets DB specific but below is some recent triggers I created to simulate the cascading delete behavior of a foreign key on a polymorphic relationship:

CREATE FUNCTION delete_related_brokerage_subscribers() RETURNS trigger AS $$
  BEGIN
    DELETE FROM subscribers
    WHERE referrer_type = 'Brokerage' AND referrer_id = OLD.id;
    RETURN NULL;
  END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER cascade_brokerage_subscriber_delete
AFTER DELETE ON brokerages
FOR EACH ROW EXECUTE PROCEDURE delete_related_brokerage_subscribers();


CREATE FUNCTION delete_related_agent_subscribers() RETURNS trigger AS $$
  BEGIN
    DELETE FROM subscribers
    WHERE referrer_type = 'Agent' AND referrer_id = OLD.id;
    RETURN NULL;
  END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER cascade_agent_subscriber_delete
AFTER DELETE ON agents
FOR EACH ROW EXECUTE PROCEDURE delete_related_agent_subscribers();

In my code a record in the brokerages table or a record in the agents table can relate to a record in the subscribers table.

Democrat answered 8/11, 2016 at 15:22 Comment(1)

This is great. Any ideas on how one might create a similar trigger to make sure that newly-created polymorphic associations point to a valid type and ID? – Progressive 7/3, 2020 at 20:21

Let's say you want to create a like system in SQL. in polymorphic association you create a LIKE table with those columns:

id_of_like    user_id     liked_id   liked_type
  1            12          3          image
  2            3           5          video

liked_id refers to id of image or id of video.

Normally when you create a table, you set a foreign key and tell the SQL database which table will this foreign key reference. Before you insert the row, your database will check if the current foreign key value is valid or not. for example if in a table where foreign key references USER table, your database will check if the current foreign key exists in the USER table. This verification makes sure your database is consistent.

Let's say in the above table, we somehow set the liked_id as a FOREIGN KEY, how the database would know which table (IMAGE or VIDEO) to visit to validate the current key value? liked_type is not for the database manager it is for humans to read it, database manager does not read it to check which table to go.

Imagine I inserted a new row with liked_id=333333

id_of_like    user_id     liked_id   liked_type
  1            12          333333     image

your database manager will not be able to validate if id=333333 exist in IMAGE table or not.

Ski answered 23/2, 2023 at 4:57 Comment(1)

All of this is applicable to existing DB engines, I believe. But I don't see any reason why this cannot be implemented in addition to traditional mechanisms. – Filiate 19/3 at 7:12

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags