Polymorphism in SQL database tables?
Asked Answered
M

6

26

I currently have multiple tables in my database which consist of the same 'basic fields' like:

name character varying(100),
description text,
url character varying(255)

But I have multiple specializations of that basic table, which is for example that tv_series has the fields season, episode, airing, while the movies table has release_date, budget etc.

Now at first this is not a problem, but I want to create a second table, called linkgroups with a Foreign Key to these specialized tables. That means I would somehow have to normalize it within itself.

One way of solving this I have heard of is to normalize it with a key-value-pair-table, but I do not like that idea since it is kind of a 'database-within-a-database' scheme, I do not have a way to require certain keys/fields nor require a special type, and it would be a huge pain to fetch and order the data later.

So I am looking for a way now to 'share' a Primary Key between multiple tables or even better: a way to normalize it by having a general table and multiple specialized tables.

Metatherian answered 18/2, 2009 at 15:39 Comment(3)
Ok, I'm not 100% sure I'm reading this right. You have a bunch of disparate tables that have some common structure. You want to add a single table that references these tables, but you can't use a FK because each table has its own PK. If you can, I'd create a common master table and hang off that.Kasey
Yeah but my question is to how exactly do this most elegantlyMetatherian
What do you mean, "normalize"? Rearranging a table into multiple subtype tables (possibly plus a common/supertype table) or into a key-value encoding is not DB normalization. Normalization to higher NFs replaces a table by others that natural join back to it. PS Putting words in scare quotes does not clarify the idiosyncratic meaning that you don't make clear by actually saying what you mean.Varico
O
29

Right, the problem is you want only one object of one sub-type to reference any given row of the parent class. Starting from the example given by @Jay S, try this:

create table media_types (
  media_type     int primary key,
  media_name     varchar(20)
);
insert into media_types (media_type, media_name) values
  (2, 'TV series'),
  (3, 'movie');

create table media (
  media_id       int not null,
  media_type     not null,
  name           varchar(100),
  description    text,
  url            varchar(255),
  primary key (media_id),
  unique key (media_id, media_type),
  foreign key (media_type) 
    references media_types (media_type)
);

create table tv_series (
  media_id       int primary key,
  media_type     int check (media_type = 2),
  season         int,
  episode        int,
  airing         date,
  foreign key (media_id, media_type) 
    references media (media_id, media_type)
);

create table movies (
  media_id       int primary key,
  media_type     int check (media_type = 3),
  release_date   date,
  budget         numeric(9,2),
  foreign key (media_id, media_type) 
    references media (media_id, media_type)
);

This is an example of the disjoint subtypes mentioned by @mike g.


Re comments by @Countably Infinite and @Peter:

INSERT to two tables would require two insert statements. But that's also true in SQL any time you have child tables. It's an ordinary thing to do.

UPDATE may require two statements, but some brands of RDBMS support multi-table UPDATE with JOIN syntax, so you can do it in one statement.

When querying data, you can do it simply by querying the media table if you only need information about the common columns:

SELECT name, url FROM media WHERE media_id = ?

If you know you are querying a movie, you can get movie-specific information with a single join:

SELECT m.name, v.release_date
FROM media AS m
INNER JOIN movies AS v USING (media_id)
WHERE m.media_id = ?

If you want information for a given media entry, and you don't know what type it is, you'd have to join to all your subtype tables, knowing that only one such subtype table will match:

SELECT m.name, t.episode, v.release_date
FROM media AS m
LEFT OUTER JOIN tv_series AS t USING (media_id)
LEFT OUTER JOIN movies AS v USING (media_id)
WHERE m.media_id = ?

If the given media is a movie,then all columns in t.* will be NULL.

Obfuscate answered 18/2, 2009 at 17:18 Comment(6)
Anything known about support of it by JDBC? Will it just pass queries and updates to the DBMS, even if special annotations for subtypes are used? Or will there be problems?Frasquito
No, you need to write the correct SQL. JDBC does very little with respect to query rewriting. It does not support any annotations, just a few escape sequences to help vendor compatibility.Obfuscate
So result set will have just null values for attributes not used by the current sub types? And how will a read off the sub type itself, via the checked attribute?Frasquito
In this schema no column is serial or auto_increment, so you manage the values of media.media_id entirely in your code - right? It seems possible to have two entries with, say, media_id: 2, media_type: 2 and media_id: 2, media_type: 3 in the media table. If so, this query is not advicable: SELECT name, url FROM media WHERE media_id = ?Cattycornered
@hallvors, You are right, it was twelve years ago that I wrote the example above, and I did not code the constraints correctly. One would use a primary key on the id column alone, and add a second unique key constraint so that the pair of columns could be referenced by 2-column foreign keys in the other subtype tables. Do you need me to edit the code above to show that?Obfuscate
I have found solutions and was just looking at answers to see if any suggested better solutions :) but in general, I think editing to remove bugs from SO answers is a good idea and will bestowe amazing karma on you ;)Cattycornered
D
8

Consider using a main basic data table with tables extending off of it with specialized information.

Ex.

basic_data
id int,
name character varying(100),
description text,
url character varying(255)


tv_series
id int,
BDID int, --foreign key to basic_data
season,
episode
airing


movies
id int,
BDID int, --foreign key to basic_data
release_data
budget
Diploma answered 18/2, 2009 at 15:47 Comment(6)
Thanks, I have already thought of that before, the only problem I have had with this is that it theoretically gives the possibiltiy to have multiple entries in the tables for one in basic_data (i.e. there could be an entry in tv_series and movies). If nothing better comes, Ill do it this way.Metatherian
And also this is a unidirectional relationship since only the specialized tables point to the general table.Metatherian
Up vote, although I'd suggest that you be careful with the "basic_data" table not to make it too generic. If it's "media" or something like that it makes sense. Don't try to force things into it that don't make sense though.Anaanabaena
Yes, this is what I tried to say below but obviously used far too few words. I'll be more verbose next time.Goldner
Having more than one entry relate to basic_data is the whole IDEA of relational databases. And it is not unidirectional -- it is bidirectional, but with a one-to-many relationship from the basic table to the leaf tables.Kasey
@Kasey - I believe that the idea in this case is to subtype the "basic_data" table. It would be a 1:1 relationship, which should only exist in one other subtable, although I've seen other situations where allowing entries in multiple subtypes to be permitted too.Anaanabaena
G
2

What you are looking for is called 'disjoint subtypes' in the relational world. They are not supported in sql at the language level, but can be more or less implemented on top of sql.

Garber answered 18/2, 2009 at 17:3 Comment(0)
G
1

You could create one table with the main fields plus a uid then extension tables with the same uid for each specific case. To query these like separate tables you could create views.

Goldner answered 18/2, 2009 at 15:44 Comment(0)
F
1

Using the disjoint subtype approach suggested by Bill Karwin, how would you do INSERTs and UPDATEs without having to do it in two steps?

Getting data, I can introduce a View that joins and selects based on specific media_type but AFAIK I cant update or insert into that view because it affects multiple tables (I am talking MS SQL Server here). Can this be done without doing two operations - and without a stored procedure, natually.

Thanks

Farriery answered 13/5, 2010 at 0:51 Comment(1)
Well, I guess, I could use an instead of trigger to make this work, but it means I would have to code these up for each sub-type I have. And also, I think triggers are so "secret" when debugging... Is there a better way?Farriery
R
1

Question is quite old but for modern postresql versions it's also worth considering using json/jsonb/hstore type. For example:

create table some_table (
    name character varying(100),
    description text,
    url character varying(255),
    additional_data json
);
Rombert answered 5/5, 2015 at 13:10 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.