Should draft records be kept in a separate table?
Asked Answered
H

4

8

We're building a simple web based system whereby somebody adds a record, a CMS page for example, that gets approved by someone in charge before being shown on website.

If the author then decides to edit that page later, we want to create a draft based on the live copy, on approval it will replace the old live page.

We thought about doing complete version control but believe we can keep this simpler by just having 1. Just a draft, 2. Just a live, or 3. One draft and one live.

This functionality is required across multiple 'things' not just pages.

Finally the question: Do you think it would be better to store these two records in the same table, or would a mirror table be better?

I guess it probably depends but I don't like the ideal of having two tables with the same structure. Is the trade off for slightly slower operations (as we'll have to query out the drafts all the time when displaying the data) worth it?

Hawse answered 11/3, 2010 at 16:9 Comment(3)
"slightly slower operations" Did you measure? Is this a fact? Or are you assuming it would be slower?Scottscotti
How can it be a fact when I haven't even written it yet? Clearly I'm assuming adding a where clause as opposed to just selecting all records from a live table will be slower, but I'm very open to learning why I'm wrong.Hawse
If you haven't written it, then you need to built two spike solutions and measure the difference. Assuming is bad. Measuring is good.Scottscotti
R
5

No. One entity type, one table.

Reasons to reconsider:

  1. Draft records outnumber live records by a factor of thousands to one.

  2. Security conditions require that some users who access the database directly have certain rights over draft or live records at the GRANT/REVOKE level but not over the other type of record.

A second design to consider would be one table for the Items and a second table for LiveItems. The second table contains only the IDs for the Items that are live. That way you're maintaining your single table design, but you can find the LiveItems by joining your one-column table back to the main table.

Reminisce answered 11/3, 2010 at 16:28 Comment(0)
S
10

Moving stuff from table to table when there's a state change is a bad idea.

When you want to add additional states to the workflow, you have to add yet more tables.

It's just a state change -- that's what relational databases are optimized for.

One table, multiple states is the standard approach.

If you find that things are horrifyingly slow -- and you can prove that the state-based query is the entire cause -- you can resort to "materialized views" or similar technology where the state change (and resulting move) is handled by the RDBMS.

Table-per-state is a bad idea.

  1. You can't easily add states. You have to add tables, also, making it painful. Further, you have to update code with the new table name(s) to reflect the new workflow.

    If a state is just a column, adding new states is adding new values and new if-statements in the code. State changes are just updates, not "delete-insert".

    Data lasts forever, workflows come and go every time a user has a clever idea. Don't punish them for wanting to change the workflow.

  2. You can't easily have sub-states. Many state machines are actually multiple, nested, state machines. Adding a sub-state with table-per-state creates yet more tables with yet more rules.

    If a state is just a column, a nested substate is just another column with new if-statements in the code. State changes are just updates, not "delete-insert".

  3. You can't easily have parallel state machines. Many times there are many parallel status code changes. Sometimes there are manual workflows (approvals) and automated worflows (archiving, copying to the data warehouse, etc.) With table-per-state and parallel state machines, there's no way to implement it rationally

    If each state is just a columns, parallel state machines are just parallel updates.

Scottscotti answered 11/3, 2010 at 16:15 Comment(4)
On the whole I agree with what you are saying, but you are aware right that this isn't just one record that goes through a set of states right? It will be initially, but then a draft will need to exist along with the live related record, and possible replace it later after approval.Hawse
@rsdbrown: "adds a record". Sounds singular to me. Please update the question if it not one record. What's important is that updating multiple records or updating a single record doesn't matter. Table-per-state is a bad design and rapidly gets worse when you try to add states. State is just a status code on the row.Scottscotti
Look, we're in agreement with the two tables being a bad design, I even hinted to my thoughts on it in my question! The reason I asked it was because I've seen how others have implemented this in rails plugins such as acts_as_versioned and has_draft. I don't think you understand my question and that's probably my doing. However others seem to grasp the idea and I believe the question is clear enough. Thank you for your input. Yours, others and my own thoughts have helped confirm that I will not have two tables.Hawse
"I don't think you understand my question". That comment didn't clarify your question, nor did you edit your question to clarify it. Simply saying I didn't understand doesn't seem to clarify it either. Feel free to actually update your question to make it more clear.Scottscotti
R
5

No. One entity type, one table.

Reasons to reconsider:

  1. Draft records outnumber live records by a factor of thousands to one.

  2. Security conditions require that some users who access the database directly have certain rights over draft or live records at the GRANT/REVOKE level but not over the other type of record.

A second design to consider would be one table for the Items and a second table for LiveItems. The second table contains only the IDs for the Items that are live. That way you're maintaining your single table design, but you can find the LiveItems by joining your one-column table back to the main table.

Reminisce answered 11/3, 2010 at 16:28 Comment(0)
C
2

Agreed with all the comments given above : only one table.
With the scopes, you can easily get only the published posts or the drafts.

I wouldn't recommand for it.
But if you really wish to have two different models for the drafts and published entries, there's an other solution though : STI.

You'd have two models :

class Post < ActiveRecord::Base
end

class Draft < Post
end

Any Draft object is taken from the Post table.
The Type parameter makes it a post or a draft.

Whenever you want to publish a post you'd then have to do :

@draft = Draft.first
@draft[:type] = 'Post'
Clientele answered 11/3, 2010 at 17:5 Comment(1)
STI was something else I've been considering, having a status column with scopes is almost the same as having a type column. Would you say the scopes/status way would be better then? I like the idea of STI as then the controllers will be like LivePost.all and DraftPost.allHawse
C
0

I just made a gem for such a use case. It stores drafts in a separate table: https://github.com/ledermann/drafting

Cagliostro answered 4/8, 2015 at 9:0 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.