How do I apply subtypes into an SQL Server database?
Asked Answered
E

5

12

I am working on a program in which you can register complaints. There are three types of complaints: internal (errors from employees), external (errors from another company) and supplier (errors made by a supplier). They hold different data which cannot be shared. I currently have 4 tables (complaint, employee, company and supplier). Here's a visualisation of the tables:

I have a basic understanding of subtypes but I cannot seem to translate them from an ERD into an actual SQL Server database, or at least in this scenario. This is roughly how the 4 tables look (irrelevant attributes omitted):

Complaint
ComplaintId PK

Employee
EmployeeId PK
EmployeeName

Company
CompanyId PK
CompanyName

Supplier
SupplierId PK
SupplierName

When registering a complaint, the error is made by either of the 3 types and they all store different information. What is the best way to store information in this case? I have thought of putting 2 discriminators in the Complaint-table: ComplaintType and Id so I can point to which table to check and what Id I need, but that isn't very clean nor efficient.

Please assist.

Emanuelemanuela answered 8/10, 2010 at 20:26 Comment(0)
A
4

I highly recommend you DO NOT use the "2 discriminators" method. You will effectively have a foreign key column that points to one of three tables, depending on the ComplaintType field. If you do this you will be by-passing the referential integrity checks supplied by SQL Server and all of the benefits that come with foreign keys. At my previous job, there was a table called EntityTypeIndexLabel which was a "bridge table" that attached IndexLabels (basically metadata) to various "entities", which were many different potential tables (Document, Binder, Workflow, etc...). This was simply awful. The FK in this table could point to many different tables. Orphaned records could pop-up everywhere. Extra logic had to be implemented to determine which table to join on. Joins were a pain to write in general. It was all kinds of headache.

I think your two options are:

-3 columns in Complaint: EmployeeComplaintID, CompanyComplaintID, SupplierComplaintID. ComplaintIDs should be unique across all of the tables (think GUIDs here instead of IDENTITY columns). Each row in Complaint will have only one of these IDs populated, the other two will be NULL. Then you can simply LEFT OUTER JOIN on these tables in every query to get the data that you need.

-One giant table with all of the possible fields you need for each complaint type, setting unused fields of other complaint types to NULL.

Albano answered 8/10, 2010 at 20:46 Comment(3)
Thank you for your suggestion; I will definitely stay far away from the 2 discriminators method. I have also tried the 3 columns method but not yet with any sort of constraint. Is it possible to force at least one of the fields not to be null? If so, how would you do that?Emanuelemanuela
I've chosen to accept your answer as 'the answer' because it's informative and I've also chosen your first option to make things work.Emanuelemanuela
@Albano I do not see how a ComplaintType column and three separate tables would be a problem. Each of child tables contains a foreign key to table Parent. See this fiddle: sqlfiddle.com/#!3/6118b6/2 The only problem is that one could reference wrong Parent, the parent of incorect type. But this can be solved with an introduction of one more constraint and extension of foreign keys. See this fiddle: sqlfiddle.com/#!3/58cd0/1 But, as SQL Server does waste space storing null columns I would go with ine big table for all types and store null values in columns where not applicable.Sphenic
I
21

See a few really good resources on the topic:

There's basically three well-known approaches:

  • Table per Subclass
  • Table per Hierarchy
  • Table per Concrete Type

Each has pros and cons, shines in some situation and sucks in others - study the resources and see which of the three suits your needs the best.

Ingathering answered 8/10, 2010 at 20:55 Comment(0)
A
4

I highly recommend you DO NOT use the "2 discriminators" method. You will effectively have a foreign key column that points to one of three tables, depending on the ComplaintType field. If you do this you will be by-passing the referential integrity checks supplied by SQL Server and all of the benefits that come with foreign keys. At my previous job, there was a table called EntityTypeIndexLabel which was a "bridge table" that attached IndexLabels (basically metadata) to various "entities", which were many different potential tables (Document, Binder, Workflow, etc...). This was simply awful. The FK in this table could point to many different tables. Orphaned records could pop-up everywhere. Extra logic had to be implemented to determine which table to join on. Joins were a pain to write in general. It was all kinds of headache.

I think your two options are:

-3 columns in Complaint: EmployeeComplaintID, CompanyComplaintID, SupplierComplaintID. ComplaintIDs should be unique across all of the tables (think GUIDs here instead of IDENTITY columns). Each row in Complaint will have only one of these IDs populated, the other two will be NULL. Then you can simply LEFT OUTER JOIN on these tables in every query to get the data that you need.

-One giant table with all of the possible fields you need for each complaint type, setting unused fields of other complaint types to NULL.

Albano answered 8/10, 2010 at 20:46 Comment(3)
Thank you for your suggestion; I will definitely stay far away from the 2 discriminators method. I have also tried the 3 columns method but not yet with any sort of constraint. Is it possible to force at least one of the fields not to be null? If so, how would you do that?Emanuelemanuela
I've chosen to accept your answer as 'the answer' because it's informative and I've also chosen your first option to make things work.Emanuelemanuela
@Albano I do not see how a ComplaintType column and three separate tables would be a problem. Each of child tables contains a foreign key to table Parent. See this fiddle: sqlfiddle.com/#!3/6118b6/2 The only problem is that one could reference wrong Parent, the parent of incorect type. But this can be solved with an introduction of one more constraint and extension of foreign keys. See this fiddle: sqlfiddle.com/#!3/58cd0/1 But, as SQL Server does waste space storing null columns I would go with ine big table for all types and store null values in columns where not applicable.Sphenic
T
2

Is the main issue that you need some sort of "serial number" to uniquely identify a complaint regardless of which type? Basically, then, you need a table for each type of complaint (as you will have, I think), plus the master "Complaint" table with the ComplaintId. Each of the type-specific tables will have a foreign key to Complaint.ComplaintId. You may find it useful to have a "type" field in Complaint, but that isn't really required for the model.

Trainman answered 8/10, 2010 at 20:30 Comment(2)
I have considered having separate tables for each complaint type, but there are simply too many attributes in each table that are redundant.Emanuelemanuela
Then put those attributes in the main "Complaint" table, putting only the unique elements in each of the more specific tables.Trainman
I
1

You can have a complaintSubTypeID with a FK relationship to the PK of all three of your subtype tables- employee, company, and supplier.

Indelicate answered 8/10, 2010 at 20:36 Comment(4)
I've tried this, but doesn't the Id you enter in the Complaint table have to match that of all three the tables?Emanuelemanuela
no, one of the three. the assumption is that the keys are mutually exclusive.Indelicate
I must be doing something wrong then. I created a SubTypeId in my Complaint table and created a relation to all three tables' primary keys. I couldn't enter any Id except that which is the same across all three tables.Emanuelemanuela
you should be able to enter pk 1 in the main table, then pk 1 in one of the subtables. then pk2 in main, pk2 in one of the subs.Indelicate
P
1

In response to you're comment on the accepted answer:

Below is a way to have a check check to ensure only one of the three keys has data:

alter table complaint_master 
    add constraint loc_attribute_has_one_value 
    check ( 
        (case when complaint_employee is null then 0 else 1 end) + 
        (case when complaint_supplier is null then 0 else 1 end) + 
        (case when complaint_external is null then 0 else 1 end)  = 1 
    ); 
Poltroon answered 5/9, 2014 at 1:20 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.