Subscriber-only trigger

Asked 10/10, 2017 at 7:4 Answered 30/10, 2017 at 3:49

sql-server t-sql triggers replication database-trigger

There are two servers. The first is ERP system on production. The second one is the BI server for heavy analytical queries. We update the BI server on a daily basis via backups. However, it's not enough, some users want to see their data changes more often than the next day. I have no access to the ERP server and can't do anything, except asking for backups or replications.

Before starting to ask for the replication. I want to understand if it's possible to use subscriber triggers in order to process not all the data, but changed. There is an ETL process to make some queries faster (indexing, transformation, etc). Triggers should do the trick, but I can't find a way to use them on the subscriber side only. The ERP system doesn't allow to make any changes on the DB level. So, the subscriber database seems to be fine for triggers (they don't affect on the ERP server performance). Nonetheless, I can't find a way to set them up. Processing all the data is an insane overhead.

Use case: Simplified example, say, we have two replicated tables:

+------------+-------------+--------+
|     dt     | customer_id | amount |
+------------+-------------+--------+
| 2017-01-01 |           1 |    234 |
| 2017-01-02 |           1 |    123 |
+------------+-------------+--------+

+------------+-------------+------------+------------+
| manager_id | customer_id | date_from  |  date_to   |
+------------+-------------+------------+------------+
|          1 |           1 | 2017-01-01 | 2017-01-02 |
|          2 |           1 | 2017-01-02 |       null |
+------------+-------------+------------+------------+

I need to transform them into the following indexed table:

+----------+-------------+------------+--------+
|  dt_id   | customer_id | manager_id | amount |
+----------+-------------+------------+--------+
| 20170101 |           1 |          1 |    234 |
| 20170102 |           1 |          2 |    123 |
+----------+-------------+------------+--------+

So, I created yet another database where I store the table above. Now, in order to update the table I have to truncate it and reinsert all the data again. I may join them all the in order to check the diffs, but it's too heavy for big tables as well. The trigger helps to track only changing records. The first input table can use a trigger:

create trigger SaleInsert
on Table1
after insert 
begin 
    insert into NewDB..IndexedTable
    select 
    //some transformation
    from inserted
    left join Table2
    on Table1.customer_id = Table2.customer_id
    and Table1.dt >= Table2.date_from
    and Table1.dt < Table2.date_to
end

The same idea for update, delete, a similar approach for the second table. I could get automatically updated DWH with little lags. Yeah, I expect performance lags for high-loaded databases. Theoretically, it should work smoothly on servers with the same configurations.

But, again, there are no triggers on the subscriber side only. Any ideas, alternatives?

Desmarais answered 10/10, 2017 at 7:4 Comment(4)

Are you asking if it is possible to have triggers on subscriber DB that are not present on publisher? – Frenchify 18/10, 2017 at 1:52

Right. It would be perfect. But I understand it's a sql server limitation, so I am looking for the closer and painless approach. – Desmarais 18/10, 2017 at 5:0

Have a look at this: sqlservercentral.com/Forums/Topic514032-291-1.aspx – Frenchify 20/10, 2017 at 0:27

Divide trigger for all and "not for replication". – Condonation 20/10, 2017 at 15:20

MS SQL Server has "Change Tracking" features that maybe be of use to you. You enable the database for change tracking and configure which tables you wish to track. SQL Server then creates change records on every update, insert, delete on a table and then lets you query for changes to records that have been made since the last time you checked. This is very useful for syncing changes and is more efficient than using triggers. It's also easier to manage than making your own tracking tables. This has been a feature since SQL Server 2005.

How to: Use SQL Server Change Tracking

Change tracking only captures the primary keys of the tables and let's you query which fields might have been modified. Then you can query the tables join on those keys to get the current data. If you want it to capture the data also you can use Change Capture, but it requires more overhead and at least SQL Server 2008 enterprise edition.

Change Data Capture

The general process is:

Get the current sync version
Get the last sync version you used to get changes
Get all the primary keys of the tables that have changed since that last version (inserts, updates, and deletes).
Join the keys with the data and pull down the data (if not using change capture).
Save the data and the current sync version (from the first step).

Then you repeat this process whenever you want to subscribe to the next set of changes. SQL Server does all the magic behind the scenes for you of storing the changes and the versioning. You also might want to look into Snapshot Isolation... it works well with it. The linked article has more information about that.

Australasia answered 20/10, 2017 at 20:52 Comment(6)

Wow, cool. Do you have an example code for my case? I'll take time to test it next week. Can I use it for the replicated db only? – Desmarais 21/10, 2017 at 7:26

As far as I understand, it's not applyable to tables with no primary keys, right? – Desmarais 21/10, 2017 at 8:32

Yep. Unique index or PK needed – Condonation 21/10, 2017 at 8:36

It doesn't solve my problem then. Read my initial post once again. I can't change replicated db structure. I need to monitor changes and respond to them. – Desmarais 21/10, 2017 at 11:20

Your erp didn't use primary keys? That would be very unusual. – Australasia 21/10, 2017 at 20:23

It uses, wherever it makes sense. Some tables have no keys. – Desmarais 22/10, 2017 at 16:51

This answer is somewhat roundabout, but given your tight limitations perhaps you'll consider it.

First, go for replication as you seem to have decided. You mentioned creating yet another database but were stuck for how to create triggers to populate it. The answer lies in the ability to run post-snapshot scripts. When creating the replication publication the DBA can specify a script to run on the Subscriber after the Snapshot.

You can have the script create all the triggers you require.

Also, to prevent replication from overwriting your triggers with "no trigger" (as defined in the ERP database) the DBA will need to verify that for each table on which you have triggers the property Copy user triggers is set to False.

Heronry answered 30/10, 2017 at 3:49 Comment(8)

Wow, looks promising. 1. I'm not sure about snapshots, the script must contain the changed data. 2. Can I use triggers on the replicated db if I set it to False? – Desmarais 30/10, 2017 at 8:21

1. You want to ask the DBA for Transactional Replication. In this configuration the snapshot is just your initial...well...snapshot. :-) Subsequently each transaction is replicated individually. Occasionally there may be a circumstance where another snapshot is required, hence the purpose of having a script (re)define all the triggers thereafter. 2. Yes. (You can in fact define a trigger directly in the Subscriber db but normally it instantly disappears because the replication brings it back in line with the Publication db, i.e. with no trigger. This option prevents that happening.) – Heronry 30/10, 2017 at 9:45

Okay. You mean I have to set up the transaction replication. Synchronize them. Run a create triggers script and it will work as expected. Did I get you right? – Desmarais 30/10, 2017 at 11:47

More or less. The create triggers script will run automatically on the initial and any subsequent snapshots. You'll need to prewrite all your triggers in order to put them in the script. You said in your original post you can "ask for replication" so I assume there's a DBA you'll be talking to. They'll need to set the properties as I've indicated above. – Heronry 31/10, 2017 at 0:30

I'll give a try. My dba has no clue how to set up a replication properly. He has an access. I have to find out how to make it happen myself and don't hurt ERP system in the same time. – Desmarais 31/10, 2017 at 6:32

Replication won't hurt your ERP system - it's designed to be low-impact. You didn't say what version of SQL Server you're running but there should be a wizard you can run in SQL Server Management Studio on the ERP Server from the Replication / Local Publications menu. Choose a Publication type of Transactional - this inherently incorporates the initial Snapshot functionality. Your BI Server will subscribe to the Publication produced by the ERP Server. If you think my post has been useful please vote it up. – Heronry 31/10, 2017 at 7:50

I'm working on Mirroring. Transactional replication requires primary key on every single table, but I need tables with no primary keys as well. – Desmarais 31/10, 2017 at 10:21

Okay. For what it's worth it appears Merge Replication can be done without the primary key constraint however I have no experience with that type. – Heronry 31/10, 2017 at 12:10

You cannot get updates in ERP database until you implement Triggers, or change tracking on ERP database. The best way is to go for replication, If you have no access to the ERP server and can't do anything, except asking for backups or replications.

Nosh answered 24/10, 2017 at 10:57 Comment(1)

I'll go for replication. I will have an access to the replicated db. However, there is no way to track changes with no keys or put triggers with no access to ERP db itself. – Desmarais 24/10, 2017 at 12:25

If you are using full recovery model on the ERP system database you could use Log file shipping to accomplish this. This would still have some level of delay between the production system and the reporting system. If some delay between DML statements being issued on the ERP system and the reporting system this solution would work. If you need nearly instant access to the data in the reporting system, replication and the associated overhead is your only good option. For more information of configuring Log File shipping: https://learn.microsoft.com/en-us/sql/database-engine/log-shipping/about-log-shipping-sql-server

Capillarity answered 27/10, 2017 at 15:7 Comment(3)

Delays are acceptable. My goal is to run my ETL procedures to handle the changed data only. Let's say, I can rerun it every half hour. Unfortunately, there are no unique options like triggers on replicated database. – Desmarais 27/10, 2017 at 20:35

Transaction Log shipping would only contain the changes since the last Tranlog backup. Basically if you had 25 transactions come in since the last Tranlog backup on the ERP system. Those transactions would be the only this in the tranlog file backup. It is a incremental change set. In my current environment we have a system that does this as frequently as every five minutes. – Capillarity 30/10, 2017 at 12:23

Can I catch them? For example, pass a list of changed customers to custom procedures. – Desmarais 30/10, 2017 at 12:49

Recommended topics

Hot tags