Snapshot History With Entity Framework
Asked Answered
T

4

17

I've been looking at some auditing hooks with Entity Framework. Many of them show old/new value comparisons. This does great for an audit trail but I'm looking to snapshot objects.

For example... Let's say I have an application that manages products. A product has multiple attributes and associated other objects. Let's say I change an object 10 times. Let's also say that's important that I can view screens of those object changes (not an audit trail but what the screen actually looked like in a read only format). What I'm interested in is being able to retrieve the original EF product object (with all of the associated data) for all 10 of those changes (depending on which I want to see)and use that to bind to my screen.

If I'm using SQL Server, what type should I use for the serialized object nowadays (XML, blob, etc)? Does it make sense to do this?

Tudela answered 21/9, 2009 at 14:27 Comment(3)
I will add a bounty to get more detailed answer please :)Basicity
@AnynameDonotcare for each table you want to store history for: 1) add Version column (can be time, can be incrementing counter). 2) add another table with all the same properties as the main table (but better expand foreign keys). 3) In before update trigger, if Version column has changed - copy old values to this version table. 4) Profit.Shovelboard
@Shovelboard : i will be so grateful if you could add an detailed answer with simple example(using EF) could be used as a base for enterprise application (concerning delete action and m-m relationships)Basicity
S
12

Let's see. You have a requirement to take an object graph and serialize it into the database in a format which will allow you to materialize it later on. I think that there are tools which do exactly this. One of them, it strikes me, is the Entity Framework.

What you want to do is a very common thing. Consider a wiki engine. The wiki needs to have a tip revision that everyone sees, plus back revisions of every document. The wiki also needs to be able to display a back revision in just the same way that a tip revision is displayed. Therefore, the same storage format should be used for both of them.

I would propose that you allow all of your entity types to be versioned. When you edit an entity type, you will edit the tip revision and store a back revision containing the previous values. (The reason you edit the tip revision instead of inserting a new tip is because other objects, which are not currently materialized into an ObjectContext, may contain links to the tip which you would like to preserve as links to the tip, rather than links to the back revision.)

If necessary, you can partition your SQL Server tables so that the back revisions are stored in a different file group. This would allow you to backup the tip revisions and back revisions separately.

Sauterne answered 21/9, 2009 at 15:10 Comment(1)
I would propose that you allow all of your entity types to be versioned. Could you clarify more please about what you mean by this?Basicity
S
5

First you need to add a set of properties to your tables:

  • Version - time of last modification (can also be autoincrementing counter instead of time).
  • LastModifiedBy - reference to the user which made last modification (if you store that).

Then you have several options about how to store your version history. You can

  1. Create a new table for each of the main tables you want to store history for. That history tables will have all the same fields as main table, but primary and foreign keys will not be enforced. For each foreign key also store Version of referenced entry at the time version was created.

  2. OR you can serialize everything interesting about your entity and store all that serialized blobs for all entities you want to version in one global history table (I personally prefer first approach).

How do you fill your history tables? Via update and delete triggers.

  • In update trigger for your entity - copy all previous values to the history table. For each foreign key - also copy current Version of referenced entity.
  • In delete trigger - basically do the same.

Note that more and more modern systems do NOT really delete anything. They just mark things as deleted. If you would want to follow this pattern (which has several benefits) - instead of deleting add IsDeleted flag to your entities (of course you then have to filter deleted entities out everywhere).

How do you view your history? Just use history table, since it has all the same properties as main table - should not be a problem. But - when expanding foreign keys - ensure that referenced entity Version is the same as you store in your history table. If it's not - you need to go to History table of that referenced entity and grab values there. This way you will always have a snapshot of how entity looked like at THAT moment, including all references.

In addition to all above - you can also restore state of your entity to any previous version.

Note that this implementation, while easy, can consume some space, because it stores snapshot, not only changes being made. If you want to just store changes - in update trigger you can detect what fields has been changed, serialize them and store in global history table. That way you can at least show in user interface what has been changed and by whom (though you might have troubles to reverting to some previous version).

Shovelboard answered 27/5, 2016 at 9:59 Comment(11)
Why you prefer the first approach over the second one although it includes more redundant data and requires more db storage ? The triggers cause performance issues in addition to that i want to handle all these operations through 'EF'.Could you provide a simple example(two tables)Basicity
I hear first time that "triggers cause perfomance issues". Triggers by itself cannot cause them, only if you do some heavy logic inside. All in all you can do that in EF in exactly the same way - just in your update handler copy all current values first to history table (not sure which example I can provide here - I suppose you know how to copy values from one entity to another). I prefer first approach because it's more simple and clear, plus allows for easy restore to any previous version. In addition to that - it allows to search in history (which might be harder to do if you serialize).Shovelboard
if i want to handle the trigger logic through 'EF' , should it be in transaction ?Basicity
Yes sure (but remember that SaveChanges already runs inside transaction, so if you add your History entry to the context together will applying update there - you doing that in transaction already).Shovelboard
Thanks a lot please .1-what's the benefits of the two properties for each table if i would add those properties in the equivalent table that will be used for versioning.2-what 's about m-m relationships ?3-you have said that the modern systems don't delete anything , but i read a lot of considerations concerning soft-deletes mainly about the constrains .Basicity
1. Not sure what you mean by "two properties for each table". If that's about foreighn keys - I mean in history table to store versions of referenced keys, because referenced entities also can change. Suppose you have Table Person which references PersonInfo, which in turn has Address field. When you save history for Person table, you need (in history table) save both PersonInfoID (foreign key to PersonInfo) and PersonInfo.Version, because later PersonInfo might change and you need to know it's state at the moment you are saving it's state.Shovelboard
Suppose that instead you just serialize all properties from Person table and store that as blob. Then you cannot revert back to that version later, because anything could have changed during that time in any of the related tables. Again - this approach only if you want to be able to restore back, otherwise you want to just serialize changed columns and show that to user if necessary.Shovelboard
I mean Version ,LastModifiedBy in every table ? in addition to an equivalent table for every tableBasicity
2. About m-m relationships - I've only did that with IsDeleted column mentioned. There nothing is deleted and you can restore the state to any point of time. I hope it's obvious how m-n relationships work with IsDeleted column (basically the same way as all other). I won't advice to how to handle them without soft deletes, since I didn't do that myself. 3. That brings complications yes, so you need to have good reasons to do so. If you can do without - of course don't use soft deletes. However I cannot say it's always bad practice.Shovelboard
Version column you will copy to your History table on change. Say you have Version = yesterday. You update your row, now you have to copy all columns (including Version) to history table. So after that you will have a row in History table with Version = yesterday, and your main table would have Version = now.Shovelboard
By the way, if you want something more simple and with EF - look here github.com/loresoft/EntityFramework.Extended, at AuditLog functionality.Shovelboard
S
3

On a project I recently built we used we plugged in to the SaveChanges method in the DbContext class. This gave us access to an instance of the ChangeTracker class. Calling ChangeTracker.Entries() gives you access to a list of DbEntityEntry. DbEntityEntry has the following interesting properties and methods:

  • State - is the object newly created, modified or being deleted
  • Entity - a copy of the object as it stands
  • CurrentValues - an enumeration of the edited valued
  • OriginalValues - an enumeration of the original values

We created a set of POCOs for change sets and changes that we could then access through EF. This allowed our users to view field level changes along with dates and responsible users.

Seaton answered 27/5, 2016 at 8:33 Comment(0)
L
0

Have a look at Temporal tables (system-versioned temporal tables) if you are using SQL Server 2016< or Azure SQL.

https://learn.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables?view=sql-server-ver15

From documentation:

Database feature that brings built-in support for providing information about data stored in the table at any point in time rather than only the data that is correct at the current moment in time. Temporal is a database feature that was introduced in ANSI SQL 2011.

I wrote a complete guide how to implement it with Entity Framework Core without any third party libraries here:

https://mcmap.net/q/139112/-audit-trail-with-entity-framework-core

Lamoureux answered 7/10, 2020 at 13:10 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.