Data modeling for Same tables with same columns

Asked 24/8, 2012 at 23:26 Answered 25/8, 2012 at 19:1

Solved c#sql-server database sql-server-2008 database-design

I have many tables that have same number of columns and names because they are all lookup tables. For example, there are LabelType and TaskType tables. LabelType and TaskType tables have TypeID and TypeName columns. They will be used as a foreign key in other tables such as LabelType table with shippingLog table and TaskType table with EmployeeTask Table.

LabelType Table
TypeID TypeName
1      Fedex
2      UPS
3      USPS

TaskType Table
TypeID TypeName
1      Receiving
2      Pickup
3      Shipping

So far, I have more than 20 tables and I am expecting it is going to be keep increasing. I have no problem with it , but I am just wondering whether there is any better or smarter way of using tables or not. I was even thinking to consolidate all those tables as one lookup Type Table and differentiate them by adding a foreign key from lookup table. The lookup table may have data like Label, Task, and etc. Then I just need one or two tables for all those lookup data.

Please, advise me if you have any better or smarter way of data modeling.

Hachure answered 24/8, 2012 at 23:26 Comment(0)

Just because data has similar structure doesn't mean it has the same meaning or same constraints. Keep your lookup tables separate. This keeps foreign keys separate, so the database can protect itself from referencing the wrong kind of lookup data.¹

I wish relational DBMSes supported inheritance, where you could define the basic structure in the parent table and just add specific FKs in the child tables. As it stands now, you'll need to endure some repetition in your DDL...

NOTE: One exception from "keep lookup tables separate" rule might be when your system needs to be dynamic (i.e. be able to add new kinds of lookup data without actually creating new physical tables in the database), but it doesn't look that way from your question.

¹ With one big lookup table, FKs alone won't stop (for example) the ShippingLog table from referencing a row meant for the EmployeeTask table. By using identifying relationships and migrating PKs, you can protect yourself from this, but not without introducing some redundancies and needing some careful constraining. It's cleaner and probably more performant to simply do the right thing and keep lookup tables separate.

Carper answered 25/8, 2012 at 19:1 Comment(4)

Thanks for clear answer. Creating tables are not bothering at all. I just wondering what the most efficient and right way of creating tables and relationships on database. – Hachure 27/8, 2012 at 13:39

I had an argument with my co-worker who prefers one table for all category because it is convenient for coding as a .Net developer. But I disagreed with him in terms of integrity between tables. I agree it is convenient for coding as I am a .Net developer because it is easier to use one table to make all different kind dropdownbox such as showing Label type, Shipping type, and etc on user interface. But I think I can use dynamic SQL stored procedure to achieve that matter. – Hachure 27/8, 2012 at 13:43

@Hachure You can always "re-merge" the lookup tables through a VIEW if for some reason that makes things easier on the client-side. You can even have the Kind field indicating which table any particular row came from, so you can use a single parametrized query to get any of the tables. See this SQL Fiddle for working example. – Carper 28/8, 2012 at 11:42

Cool! That is even better idea with view tables. – Hachure 28/8, 2012 at 16:31

Keep your lookup tables separate. It's faster at lookup time, and you will do millions of lookups between times when you add a new lookup table.

A lot of tables is not a big problem.

Handspike answered 24/8, 2012 at 23:49 Comment(4)

Why are a ton of tables more efficient than a single, properly-indexed and properly-keyed table? If we're really talking about millions of lookups then separating them physically but not logically (e.g. partitioning) is much easier to manage and to code (tons of similar tables seems to lend itself to lots of dynamic SQL). – Whispering 25/8, 2012 at 0:3

If the pair 1--Receiving and the pair 1--Fedex are stored in separate rows, then a third column will have to be added to disambiguate at lookup time. That third column will make the lookups slower. If the two pairs are stored in the same row, the you need two separate columns to store Fedex and Receiving in. This slows things down, and involves more manual management than lots of tables does. – Handspike 25/8, 2012 at 10:20

If that third column is indexed, lookups should not be slower. Plus it is much better to parameterize a column value than a table name. With your proposal the code has to change every single time you have to add a new lookup. I don't think you've made a strong case here at all, especially from the management perspective. Inserting a new row into an existing table involves more manual management than adding a new table and changing the code? – Whispering 25/8, 2012 at 10:48

+1 but not because it is more 'efficient' just likely to cause less pain later when you realise 'labeltype' needs an extra column for example. – Senarmontite 25/8, 2012 at 18:22

Recommended topics

Hot tags