Possible to do a MySQL foreign key to one of two possible tables?
Asked Answered
S

6

217

Well here's my problem I have three tables; regions, countries, states. Countries can be inside of regions, states can be inside of regions. Regions are the top of the food chain.

Now I'm adding a popular_areas table with two columns; region_id and popular_place_id. Is it possible to make popular_place_id be a foreign key to either countries OR states. I'm probably going to have to add a popular_place_type column to determine whether the id is describing a country or state either way.

Stimulate answered 13/1, 2009 at 22:2 Comment(1)
@BillKarwin You have forced the Q which is mysql and relational into something that it is not (polymorphic-associations), which is a very narrow & limited OO/ORM mindset (sub-standard, relative to mysql & relational), and then supplied an Answer in that narrow mindset. The seeker's Q did not have that narrow mindset, the Answers do not need to be limited to it. See mine for details.Vituline
C
344

What you're describing is called Polymorphic Associations. That is, the "foreign key" column contains an id value that must exist in one of a set of target tables. Typically the target tables are related in some way, such as being instances of some common superclass of data. You'd also need another column along side the foreign key column, so that on each row, you can designate which target table is referenced.

CREATE TABLE popular_places (
  user_id INT NOT NULL,
  place_id INT NOT NULL,
  place_type VARCHAR(10) -- either 'states' or 'countries'
  -- foreign key is not possible
);

There's no way to model Polymorphic Associations using SQL constraints. A foreign key constraint always references one target table.

Polymorphic Associations are supported by frameworks such as Rails and Hibernate. But they explicitly say that you must disable SQL constraints to use this feature. Instead, the application or framework must do equivalent work to ensure that the reference is satisfied. That is, the value in the foreign key is present in one of the possible target tables.

Polymorphic Associations are weak with respect to enforcing database consistency. The data integrity depends on all clients accessing the database with the same referential integrity logic enforced, and also the enforcement must be bug-free.

Here are some alternative solutions that do take advantage of database-enforced referential integrity:

Create one extra table per target. For example popular_states and popular_countries, which reference states and countries respectively. Each of these "popular" tables also reference the user's profile.

CREATE TABLE popular_states (
  state_id INT NOT NULL,
  user_id  INT NOT NULL,
  PRIMARY KEY(state_id, user_id),
  FOREIGN KEY (state_id) REFERENCES states(state_id),
  FOREIGN KEY (user_id) REFERENCES users(user_id),
);

CREATE TABLE popular_countries (
  country_id INT NOT NULL,
  user_id    INT NOT NULL,
  PRIMARY KEY(country_id, user_id),
  FOREIGN KEY (country_id) REFERENCES countries(country_id),
  FOREIGN KEY (user_id) REFERENCES users(user_id),
);

This does mean that to get all of a user's popular favorite places you need to query both of these tables. But it means you can rely on the database to enforce consistency.

Create a places table as a supertable. As Abie mentions, a second alternative is that your popular places reference a table like places, which is a parent to both states and countries. That is, both states and countries also have a foreign key to places (you can even make this foreign key also be the primary key of states and countries).

CREATE TABLE popular_areas (
  user_id INT NOT NULL,
  place_id INT NOT NULL,
  PRIMARY KEY (user_id, place_id),
  FOREIGN KEY (place_id) REFERENCES places(place_id)
);

CREATE TABLE states (
  state_id INT NOT NULL PRIMARY KEY,
  FOREIGN KEY (state_id) REFERENCES places(place_id)
);

CREATE TABLE countries (
  country_id INT NOT NULL PRIMARY KEY,
  FOREIGN KEY (country_id) REFERENCES places(place_id)
);

Use two columns. Instead of one column that may reference either of two target tables, use two columns. These two columns may be NULL; in fact only one of them should be non-NULL.

CREATE TABLE popular_areas (
  place_id SERIAL PRIMARY KEY,
  user_id INT NOT NULL,
  state_id INT,
  country_id INT,
  CONSTRAINT UNIQUE (user_id, state_id, country_id), -- UNIQUE permits NULLs
  CONSTRAINT CHECK (state_id IS NOT NULL OR country_id IS NOT NULL),
  FOREIGN KEY (state_id) REFERENCES places(place_id),
  FOREIGN KEY (country_id) REFERENCES places(place_id)
);

In terms of relational theory, Polymorphic Associations violates First Normal Form, because the popular_place_id is in effect a column with two meanings: it's either a state or a country. You wouldn't store a person's age and their phone_number in a single column, and for the same reason you shouldn't store both state_id and country_id in a single column. The fact that these two attributes have compatible data types is coincidental; they still signify different logical entities.

Polymorphic Associations also violates Third Normal Form, because the meaning of the column depends on the extra column which names the table to which the foreign key refers. In Third Normal Form, an attribute in a table must depend only on the primary key of that table.


Re comment from @SavasVedova:

I'm not sure I follow your description without seeing the table definitions or an example query, but it sounds like you simply have multiple Filters tables, each containing a foreign key that references a central Products table.

CREATE TABLE Products (
  product_id INT PRIMARY KEY
);

CREATE TABLE FiltersType1 (
  filter_id INT PRIMARY KEY,
  product_id INT NOT NULL,
  FOREIGN KEY (product_id) REFERENCES Products(product_id)
);

CREATE TABLE FiltersType2 (
  filter_id INT  PRIMARY KEY,
  product_id INT NOT NULL,
  FOREIGN KEY (product_id) REFERENCES Products(product_id)
);

...and other filter tables...

Joining the products to a specific type of filter is easy if you know which type you want to join to:

SELECT * FROM Products
INNER JOIN FiltersType2 USING (product_id)

If you want the filter type to be dynamic, you must write application code to construct the SQL query. SQL requires that the table be specified and fixed at the time you write the query. You can't make the joined table be chosen dynamically based on the values found in individual rows of Products.

The only other option is to join to all filter tables using outer joins. Those that have no matching product_id will just be returned as a single row of nulls. But you still have to hardcode all the joined tables, and if you add new filter tables, you have to update your code.

SELECT * FROM Products
LEFT OUTER JOIN FiltersType1 USING (product_id)
LEFT OUTER JOIN FiltersType2 USING (product_id)
LEFT OUTER JOIN FiltersType3 USING (product_id)
...

Another way to join to all filter tables is to do it serially:

SELECT * FROM Product
INNER JOIN FiltersType1 USING (product_id)
UNION ALL
SELECT * FROM Products
INNER JOIN FiltersType2 USING (product_id)
UNION ALL
SELECT * FROM Products
INNER JOIN FiltersType3 USING (product_id)
...

But this format still requires you to write references to all tables. There's no getting around that.

Cheek answered 13/1, 2009 at 22:32 Comment(9)
Which one would you suggest Bill? I am in the middle of designing a database but I'm lost. I basically need to associate filters to a product and the values of the filters will be populated accross different tables. But the problem is that filters will be generated by admins so depending on the filter type the data may vary and hence the join target will change as well...... Am I complicating too much or what? Help!Catharinecatharsis
+1 thank you for an awesome solution. One question I have with the first/second solution is: is there any normalization violation with the fact that multiple tables can refer to the same primary key in that meta-table? I know you can solve this with logic, but I don't see any way for the database to enforce it, unless I'm missing something.Arbitral
@Rob, there is no problem with respect to normalization. If you don't want it to happen, that's a different matter, but it's not because of normalization. To ensure that only one child table can reference a given row, see my example in #988154Cheek
I really like approach with "CONSTRAINT CHECK". But it can be improved if we change "OR" to "XOR". That way we insure that only one column from set is NOT NULLMarcoux
@alex_b, yes, that's good, but logical XOR is not standard SQL and is not supported by all SQL brands. MySQL has it, but PostgreSQL does not. Oracle has it, but Microsoft does not until 2016. And so on.Cheek
"violates First Normal Form, because the popular_place_id is in effect a column with two meanings" - does that mean ISBN "violates 1NF" because it is a subtype of EAN? "also violates Third Normal Form..." Let me cut to the chase: this is a non-issue. If you create a compound key on place_id and place_type then all your 'violations' disappear. Celko has a good example of this in his books.Photochromy
"These two columns may be NULL; in fact only one of them should be non-NULL" -- this would violate 1NF!Photochromy
Have posted my own answer to demonstrate the "supertable" approach using a compound key.Photochromy
I'm always partial to the two-columns solution. If your RDBMS has check constraints, check that COALESCE(col1, col2) IS NOT NULL AND (col1 IS NULL OR col2 IS NULL) if it should be one and only one (omit the COALESCE part if it can be one or neither.Surber
H
13

This isn't the most elegant solution in the world, but you could use concrete table inheritance to make this work.

Conceptually you are proposing a notion of a class of "things that can be popular areas" from which your three types of places inherit. You could represent this as a table called, for example, places where each row has a one-to-one relationship with a row in regions, countries, or states. (Attributes that are shared between regions, countries, or states, if any, could be pushed into this places table.) Your popular_place_id would then be a foreign key reference to a row in the places table which would then lead you to a region, country, or state.

The solution you propose with a second column to describe the type of association happens to be how Rails handles polymorphic associations, but I'm not a fan of that in general. Bill explains in excellent detail why polymorphic associations are not your friends.

Hershelhershell answered 13/1, 2009 at 22:16 Comment(2)
a.k.a. "the supertype-subtype pattern"Support
Also this article expalin well the concept duhallowgreygeek.com/polymorphic-association-bad-sql-smellTyphogenic
P
7

Here is a correction to Bill Karwin's "supertable" approach, using a compound key ( place_type, place_id ) to resolve the perceived normal form violations:

CREATE TABLE places (
  place_id INT NOT NULL UNIQUE,
  place_type VARCHAR(10) NOT NULL
     CHECK ( place_type = 'state', 'country' ),
  UNIQUE ( place_type, place_id )
);

CREATE TABLE states (
  place_id INT NOT NULL UNIQUE,
  place_type VARCHAR(10) DEFAULT 'state' NOT NULL
     CHECK ( place_type = 'state' ),
  FOREIGN KEY ( place_type, place_id ) 
     REFERENCES places ( place_type, place_id )
  -- attributes specific to states go here
);

CREATE TABLE countries (
  place_id INT NOT NULL UNIQUE,
  place_type VARCHAR(10) DEFAULT 'country' NOT NULL
     CHECK ( place_type = 'country' ),
  FOREIGN KEY ( place_type, place_id ) 
     REFERENCES places ( place_type, place_id )
  -- attributes specific to country go here
);

CREATE TABLE popular_areas (
  user_id INT NOT NULL,
  place_id INT NOT NULL,
  UNIQUE ( user_id, place_id ),
  FOREIGN KEY ( place_type, place_id ) 
     REFERENCES places ( place_type, place_id )
);

What this design cannot ensure that for every row in places there exists a row in states or countries (but not both). This is a limitations of foreign keys in SQL. In a full SQL-92 Standards compliant DBMS you could define deferrable inter-table constraints that would allow you to achieve the same but it is clunky, involves transaction and such a DBMS has yet to make it to market.

Photochromy answered 21/10, 2016 at 12:55 Comment(0)
V
6

Relational Answer

Noting the mysql tag, which implies relational, because SQL is the data sub-language defined in Codd's Relational Model.

  • The solution is simple and straight-forward, we had it before the RM, and we have had a Relational solution since 1981.
  • The Relational solution provides both Referential Integrity (physical, at the SQL level) and Relational Integrity (logical).
  • To comply with Open Architecture Standards (sanity), all Constraints; business rules; etc that govern the data, as well as all Transactions, should be deployed in the database, not the framework, not the app GUI, not the app middle-tier.  Note that it is a single recovery unit.

The polymorphic-associations tag is false, it was not requested by OP.  Forcing it into an OO/ORM mindset, and then proving a solution in that mindset, is outside the scope of the question.

  • Further, it requires a framework, and code to enforce constraints; etc, outside the database, which is sub-standard.
  • Further, it does not have the basic integrity, let alone Relational Integrity, of the Relational solution.
  • Further, it violates 1NF and 3NF (as detailed in the Karvan Answer).
  • Nulls are a Normalisation error, they should never be stored.
  • A Nullable FOREIGN KEY is a gross Normalisation error.

Solution

Well here's my problem I have three tables; regions, countries, states. Countries can be inside of regions, states can be inside of regions. Regions are the top of the food chain.

Make it Relational

Let's understand what that is in the Relational context.  It is a typical hierarchy of tables.

  • Do not use ID fields.  Do not declare them as PRIMARY KEY, that will only confuse you, becasue it is not a Key, it does not provide row uniqueness as demanded in the Relational Model
  • A Key must be made up from the data
  • An ID field is not data.  It is always an additional field, and an additional index
  • With ID fields, you might be able to implement Referential Integrity (physical, SQL), but you have no chance of implementing Relational Integrity (logical)
  • For full discussion, including SQL code, refer to:
    Creating a Relational table with 2 different auto_increment, §1 & 2 only.

Base Tables

Foo

Notation

  • All my data models are rendered in IDEF1X, the notation for Relational Data Modelling, which we have had since the early 1980’s, made the Standard for Relational Data Modelling, in 1993, last updated 2016.

  • The IDEF1X Introduction is essential reading for those who are new to the Relational Model, or its modelling method.  Note that IDEF1X models are rich in detail and precision, showing all required details, whereas a home-grown model, being unaware of the imperatives of the Standard, have far less definition.  Which means, the notation needs to be fully understood.  

  • ERD is not a Standard, it does not support the Relational Model, and it is completely inadequate for modelling.

  • That the academics and "textbooks" teach and market anti-Relational as "relational" is criminal.

Subtype

Now I'm adding a popular_areas table with two columns; region_id and popular_place_id. Is it possible to make popular_place_id be a foreign key to either countries OR states.

No problem at all.  The Relational Model is founded in Mathematics; Logic, it it entirely Logical.  An OR or XOR gate is a fundament of Logic.  In the Relational or SQL paradigm, it is called a Subtype cluster.

  • Even in freeware "SQLs", which are not SQL compliant, it is done with full Referential Integrity

    • the notion that it can't be done, or that it requires the horrendous additional fields and indices marketed by the academics, is false.
  • For full implementation details, including links to SQL code, refer to the Subtype document.

  • For examples and discussion, refer to:
    How to Implement Referential Integrity in Subtypes

  • For clarification of issues that confuse this Question, and thus the other Answers:
    Relational schema for a book graph

I'm probably going to have to add a popular_place_type column to determine whether the id is describing a country or state either way.

Correct, you are thinking logically.  Here we need an XOR Gate, which requires a Discriminator.

Add Place Table

Foo

Relational Integrity

Whereas Referential Integrity is the physical feature provided in SQL, Relational Integrity, which is Logical, is on top of that (when modelling properly, the Logical precedes the physical).  

This is a great, simple example of Relational Integrity.  Note the second FOREIGN KEY in the Subtypes.

  • PlaceCountry is constrained to a Country that is in the same Region as Place.Region

  • PlaceState is constrained to a State that is in the same Region as Place.Region

  • Note that this is possible only with Relational Keys (composite)

    • Relational Integrity is not possible in the primitive Record Filing Systems, which are characterised by ID fields as "keys", and heavily marketed by the academics and authors as "relational"
    • In such primitive files (they are not tables), PlaceCountry would allow any Country, it cannot be constrained to a Country that is in the same Region as Place.Region.
Vituline answered 4/1, 2022 at 2:2 Comment(0)
J
0

I realize that this thread is old, but I saw this and a solution came to mind and I thought I'd throw it out there.

Regions, Countries and States are Geographical Locations that live in a hierarchy.

You could avoid your problem altogether by creating a domain table called geographical_location_type which you would populate with three rows ( Region, Country, State).

Next, instead of the three location tables, create a single geographical_location table that has a foreign key of geographical_location_type_id (so you know if the instance is a Region, Country or State).

Model the hierarchy by making this table self-referencing so that a State instance holds the fKey to its parent Country instance which in turn holds the fKey to its parent Region instance. Region instances would hold NULL in that fKey. This is no different than what you would have done with the three tables (you would have 1 - many relationships between region and country and between country and state) except now it's all in one table.

The popular_user_location table would be a scope resolution table between user and georgraphical_location (so many users could like many places).

Soooo …

enter image description here

CREATE TABLE [geographical_location_type] (
    [geographical_location_type_id] INTEGER NOT NULL,
    [name] VARCHAR(25) NOT NULL,
    CONSTRAINT [PK_geographical_location_type] PRIMARY KEY ([geographical_location_type_id])
)

-- Add 'Region', 'Country' and 'State' instances to the above table


CREATE TABLE [geographical_location] (
   [geographical_location_id] BIGINT IDENTITY(0,1) NOT NULL,
    [name] VARCHAR(1024) NOT NULL,
    [geographical_location_type_id] INTEGER NOT NULL,
    [geographical_location_parent] BIGINT,  -- self referencing; can be null for top-level instances
    CONSTRAINT [PK_geographical_location] PRIMARY KEY ([geographical_location_id])
)

CREATE TABLE [user] (
    [user_id] BIGINT NOT NULL,
    [login_id] VARCHAR(30) NOT NULL,
    [password] VARCHAR(512) NOT NULL,
    CONSTRAINT [PK_user] PRIMARY KEY ([user_id])
)


CREATE TABLE [popular_user_location] (
    [popular_user_location_id] BIGINT NOT NULL,
    [user_id] BIGINT NOT NULL,
    [geographical_location_id] BIGINT NOT NULL,
    CONSTRAINT [PK_popular_user_location] PRIMARY KEY ([popular_user_location_id])
)

ALTER TABLE [geographical_location] ADD CONSTRAINT [geographical_location_type_geographical_location] 
    FOREIGN KEY ([geographical_location_type_id]) REFERENCES [geographical_location_type] ([geographical_location_type_id])



ALTER TABLE [geographical_location] ADD CONSTRAINT [geographical_location_geographical_location] 
    FOREIGN KEY ([geographical_location_parent]) REFERENCES [geographical_location] ([geographical_location_id])



ALTER TABLE [popular_user_location] ADD CONSTRAINT [user_popular_user_location] 
    FOREIGN KEY ([user_id]) REFERENCES [user] ([user_id])



ALTER TABLE [popular_user_location] ADD CONSTRAINT [geographical_location_popular_user_location] 
    FOREIGN KEY ([geographical_location_id]) REFERENCES [geographical_location] ([geographical_location_id])

Wasn't sure what the target DB was; the above is MS SQL Server.

Jemimah answered 23/1, 2019 at 6:11 Comment(0)
M
0

Well, i have two tables:

  1. songs

a) Song number b) Song title ....

  1. playlists a) Playlist number b) Playlist title ...

and i have an third

  1. songs_to_playlist_relation

The problem is that some kinds of playlists have link to other playlists. But in mysql we don't have foreign key that is associated with two tables.

My solution: I will put a third column in songs_to_playlist_relation. That column will be boolean. If 1 then song, else will link to the playlist table.

So:

  1. songs_to_playlist_relation

a) Playlist_number (int) b) Is song (boolean) c) Relative number (song number or playlist number) (int) (not foreign key to any table)

 #create table songs
    queries.append("SET SQL_MODE = NO_AUTO_VALUE_ON_ZERO;")
    queries.append("CREATE TABLE songs (NUMBER int(11) NOT NULL,SONG POSITION int(11) NOT NULL,PLAY SONG tinyint(1) NOT NULL DEFAULT '1',SONG TITLE varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,DESCRIPTION varchar(1000) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,ARTIST varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT 'Άγνωστος καλλιτέχνης',AUTHOR varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT 'Άγνωστος στιχουργός',COMPOSER varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT 'Άγνωστος συνθέτης',ALBUM varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT 'Άγνωστο άλμπουμ',YEAR int(11) NOT NULL DEFAULT '33',RATING int(11) NOT NULL DEFAULT '5',IMAGE varchar(600) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,SONG PATH varchar(500) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,SONG REPEAT int(11) NOT NULL DEFAULT '0',VOLUME float NOT NULL DEFAULT '1',SPEED float NOT NULL DEFAULT '1') ENGINE=InnoDB DEFAULT CHARSET=utf8;")
    queries.append("ALTER TABLE songs ADD PRIMARY KEY (NUMBER), ADD UNIQUE KEY POSITION (SONG POSITION), ADD UNIQUE KEY TITLE (SONG TITLE), ADD UNIQUE KEY PATH (SONG PATH);")
    queries.append("ALTER TABLE songs MODIFY NUMBER int(11) NOT NULL AUTO_INCREMENT;")

#create table playlists
queries.append("CREATE TABLE `playlists` (`NUMBER` int(11) NOT NULL,`PLAYLIST POSITION` int(11) NOT NULL,`PLAYLIST TITLE` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,`PLAYLIST PATH` varchar(500) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8;")
queries.append("ALTER TABLE `playlists` ADD PRIMARY KEY (`NUMBER`),ADD UNIQUE KEY `POSITION` (`PLAYLIST POSITION`),ADD UNIQUE KEY `TITLE` (`PLAYLIST TITLE`),ADD UNIQUE KEY `PATH` (`PLAYLIST PATH`);")
queries.append("ALTER TABLE `playlists` MODIFY `NUMBER` int(11) NOT NULL AUTO_INCREMENT;")

#create table for songs to playlist relation
queries.append("CREATE TABLE `songs of playlist` (`PLAYLIST NUMBER` int(11) NOT NULL,`SONG OR PLAYLIST` tinyint(1) NOT NULL DEFAULT '1',`RELATIVE NUMBER` int(11) NOT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8;")
queries.append("ALTER TABLE `songs of playlist` ADD KEY `PLAYLIST NUMBER` (`PLAYLIST NUMBER`) USING BTREE;")
queries.append("ALTER TABLE `songs of playlist` ADD CONSTRAINT `playlist of playlist_ibfk_1` FOREIGN KEY (`PLAYLIST NUMBER`) REFERENCES `playlists` (`NUMBER`) ON DELETE RESTRICT ON UPDATE RESTRICT")

That's all!
playlists_query = "SELECT s1.*, s3.*, s4.* FROM songs as s1 INNER JOIN `songs of playlist` as s2 ON s1.`NUMBER` = s2.`RELATIVE NUMBER` INNER JOIN `playlists` as s3 ON s3.`NUMBER` = s2.`PLAYLIST NUMBER` INNER JOIN `playlists` as s4 ON s4.`NUMBER` = s2.`RELATIVE NUMBER` ORDER BY s3.`PLAYLIST POSITION`,`s1`.`SONG POSITION`"
Madden answered 6/5, 2020 at 15:32 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.