Should I put optional record properties in a separate table?

Asked 27/8, 2012 at 4:49 Answered 27/8, 2012 at 10:31

Solved database ms-access database-design ms-access-2007 normalization

I have a table of about 1,000 records. Around half of them will utilise a set of fields containing certain characteristics. There's about 10 relevant fields. The other half of the records won't need that information filled in.

This table is central to the database and will be taking the bulk of the operations. Though at only around 1,000 records, it's not much.

The hardware that the database is stored on is old and slow (spinning hard drive not SSD... ) so I want to have a fairly optimised structure to make the most of it. Obviously the increased size of the database alone due to the blank fields isn't a major concern, but if it's slowing down queries then that's not good.

I guess I should describe the setup. Currently Access 2007 client and Access backend, though the backend will soon move to SQL server. Currently the backend is on the main server rack, but when moved to SQL Server it will get its own older server rack.

So should I make a separate table to store the aforementioned set of characteristics, or should I leave it as is?

Incomparable answered 27/8, 2012 at 4:49 Comment(4)

Well - putting them into a separate table is additional overhead and makes querying more difficult.... since you say about half the rows do have these columns, I would probably just put them into the main table and make them nullable. I would only go to a separate table if 75% or more of the columns are "optional" (only valid for a given set of rows). Just a rough "rule of thumb" – Ryon 27/8, 2012 at 4:53

True, the added complexity will be a bit annoying, particularly with forms and such. May just leave it as is then. – Incomparable 27/8, 2012 at 4:54

"The backend will soon move to SQL server" Then what's to stop you building it for SQL Server now? Installing SQL Server Express takes only about 10 minutes. Best practice application design for SQL Server is very different to Jet/ACE. Even basic features like uniqueness constraints and referential integrity work differently. I suggest you could save yourself a whole lot of work if you design it for SQL Server from the beginning. – St 27/8, 2012 at 9:52

I wanted to first change my field and table names, because when I started building I didn't follow good naming practice (standardised, no spaces etc.). I was contemplating doing this table change also I suppose. Reason is that I have barely any idea about SQL Server, and I figure it would be smart to minimise the number of "learning hurdles" I have to jump when I make the transition. The SQL server transition will be next week or the week after's job, though I will try start playing around with it sooner. – Incomparable 27/8, 2012 at 13:2

The querying overhead of putting the optional fields into a separate table and then using a join doesn't provide much benefit to size or data managment. Especially if it's 1-to-1 like in your example. For size, the optional fields will NULL don't affect you much. And yes, 75% is good random threshold for when you should start moving things out but even then, you're not actually normalizing anything by moving out the optional fields (if they are 1-to-1 with the record and you will always be fetching it along with the main record).

Worth noting: With most DBs, getting large rows in single queries is better than several small queries...in case you later have the urge to get the optional data in the 2nd table in a separate query. In Access 2007 this may matter less though.

And regardless of whether or not you move those optional fields out, add indexes for those fields which you may use in a where/having/join.

Steel answered 27/8, 2012 at 7:45 Comment(0)

My impression from what you've said is that you should use separate tables. The dependencies you want to represent and the needs of data integrity ("business rules") should determine which table(s) any attribute goes in.

In your case it sounds like you have two kinds of facts to be represented. Those fact types have distinct sets of attributes and therefore they belong in different tables. If you combine two different fact types into one table and make one set of attributes nullable then you could compromise data integrity: i.e. by permitting values for some attribute when the business rules require no such value and by allowing a value to be absent when business rules in fact require it.

For a more formal way of answering this, see Fifth Normal Form and the Principle of Orthogonal Design. If you aren't already aware of those design principles then you should familiarise yourself with them.

St answered 27/8, 2012 at 9:40 Comment(5)

To get some clarification on what you're saying - the OP hasn't specified the fact types apart from their optional attribute being optional. So a business rule (at a database or program level) that's wrongly implemented would affect the non-NULLable values as well as NULLable values; quote: "by permitting values for some attribute when the business rules require no such value and by allowing a value to be absent when business rules in fact require it." So how does moving it to a separate table solve that problem? And since this info is 1-1 part of the same record, imho, the 5NF doesn't apply. – Steel 27/8, 2012 at 10:37

Aneroid seems to have understood my situation quite well. It's a good point considering business rules, but there's no strict business rules here about not containing data. The data could quite rightly be filled in for the nulls, but it just wouldn't be that useful. – Incomparable 27/8, 2012 at 13:6

@andrewb, I'm in no position to identify dependencies in your data and that's why I began by stating "from what you've said". But if you do need guidance about where attributes belong then dependency theory and normalization should be your guide. Note also that the relationship implied by the "optional" attributes is 1 to 0/1 and not 1 to 1. – St 27/8, 2012 at 13:34

@sqlvogel if the optional fields were say, free text without any standard; as in...not country, language, genre, etc., would you still say that the optional fields should be kept in a separate table? That was my understanding of this question. My impression of your answer is that it applies when the data in those fields could be normalized like country, lang, genre, and the relations maintained in a separate lookup table similar to what branko-dimitrijevic mentioned below (and in that case I would agree). – Steel 27/8, 2012 at 16:21

The table is for clients. "clients" are like the groups for companies. This extra information is always valid, but it's only important (though not absolutely necessary) if we provide certain services. There is no business rule of why the data can't be entered if we don't provide the certain services - it's still valid. The fields are mostly normalised (except for one notes field). – Incomparable 27/8, 2012 at 23:19

Vertical partitioning makes sense for a large data set to make the cache more efficient. 1000 rows doesn't qualify as "large" even on a rather old hardware.

So unless there are other reasons to redesign this table (you didn't merge lookups didn't you?), you are good to go.

Unattended answered 27/8, 2012 at 10:31 Comment(1)

Agreed, 1000 rows is not large. Though with any operation there will be a degree of latency, and it just so happens that I'm already noticing significant latency with some tasks. Probably best I wait till moving to SQL Server before I give it much thought though. I don't think I've merged lookups, I'm not exactly sure what that is but I've made sure that each table and field is for a specific set of data. – Incomparable 27/8, 2012 at 13:11

Recommended topics

Hot tags