Executing a trigger AFTER the completion of a transaction
In PostgreSQL, are DEFERRED triggers executed before (within) the completion of the transaction or just after it?

The documentation says:

DEFERRABLE
NOT DEFERRABLE

This controls whether the constraint can be deferred. A constraint that is not deferrable will be checked immediately after every command. Checking of constraints that are deferrable can be postponed until the end of the transaction (using the SET CONSTRAINTS command).

It doesn't specify whether the check still happens inside the transaction or outside of it. My personal experience says that it is inside the transaction, and I need it to be outside!

Are DEFERRED (or INITIALLY DEFERRED) triggers executed inside of the transaction? And if they are, how can I postpone their execution to the time when the transaction is completed?

To give you a hint of what I'm after: I'm using pg_notify and RabbitMQ (PostgreSQL LISTEN Exchange) to send out messages, and I process those messages in an external application. Right now I have a trigger which notifies the external app of newly inserted records by including the record's id in the message. But non-deterministically, once in a while, when I try to select a record by the id at hand, the record cannot be found. That's because the transaction is not complete yet and the record has not actually been added to the table. If I could only postpone the execution of the trigger until after the completion of the transaction, everything would work out.
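
For context, the current trigger looks roughly like this (table, column, and channel names are placeholders for my real ones):

CREATE OR REPLACE FUNCTION notify_new_record()
  RETURNS trigger
  LANGUAGE plpgsql AS
$$
BEGIN
   PERFORM pg_notify('records_channel', NEW.id::text);  -- payload is the new row's id
   RETURN NEW;
END
$$;

CREATE TRIGGER records_notify
AFTER INSERT ON records
FOR EACH ROW EXECUTE PROCEDURE notify_new_record();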

In order to get better answers, let me explain the situation even closer to the real world. The actual scenario is a little more complicated than what I explained before. The source code can be found here if anyone's interested. Because of reasons that I'm not gonna dig into, I have to send the notification from another database, so the notification is actually sent like:

PERFORM * FROM dblink('hq','SELECT pg_notify(''' || channel || ''', ''' || payload || ''')');

Which I'm sure makes the whole situation much more complicated.

Helyn answered 30/8, 2016 at 4:53 Comment(7)
It would clearly be inside. It says the end of the transaction, not after the transaction ends or after the conclusion of the transaction.Zoller
Why don't you do what everybody else does: Have your application poll the database for stuff to work on. It's time-tested, very widely used, supported out of the box by lots of frameworks and is as close to bulletproof as you can hope to get. Having an inverted/push pattern is not best practice.Griffy
Maybe the title should be something like "How to deliver notifications after transaction commit".Brice
@Griffy Polling is absolutely not the way. It's a terrible design and it only works for added data. I'm also after updated and deleted data. Of course, you can translate updates and deletes into adds (keep a log table), but I'm also trying to have multiple programs work on a single database (to keep the data up-to-date). Long story short, never ever go for polling, it's never the answer.Helyn
@mehran re update, you can have a trigger insert the id into the queue table on update. re "never", I've built exactly this (trigger on update, delete and insert to insert into a "queue" table) at one of the largest and most profitable websites in Australia and it worked flawlessly, had very high performance and was simple to implement, so there :pGriffy
@Griffy Your solution would be my last resort, and I hope it never comes to that. What I'm trying to do here strongly depends on the whole system being as light as possible. Thanks for the advice, I'll keep it in mind.Helyn
@Mehran, polling is always the way to go. Every system you ever worked on that you thought was pushing, was just polling and queueing under the hood so you didn't have to see it.Kellykellyann

Triggers (including all sorts of deferred triggers) fire inside the transaction.

But that is not the problem here, because notifications are delivered between transactions anyway.

The manual on NOTIFY:

NOTIFY interacts with SQL transactions in some important ways. Firstly, if a NOTIFY is executed inside a transaction, the notify events are not delivered until and unless the transaction is committed. This is appropriate, since if the transaction is aborted, all the commands within it have had no effect, including NOTIFY. But it can be disconcerting if one is expecting the notification events to be delivered immediately. Secondly, if a listening session receives a notification signal while it is within a transaction, the notification event will not be delivered to its connected client until just after the transaction is completed (either committed or aborted). Again, the reasoning is that if a notification were delivered within a transaction that was later aborted, one would want the notification to be undone somehow — but the server cannot "take back" a notification once it has sent it to the client. So notification events are only delivered between transactions. The upshot of this is that applications using NOTIFY for real-time signaling should try to keep their transactions short.

Bold emphasis mine.

pg_notify() is just a convenient wrapper function for the SQL NOTIFY command.
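
For illustration, these two are equivalent (channel and payload made up); the function form is handy because it accepts non-constant arguments:

NOTIFY my_channel, 'payload';
SELECT pg_notify('my_channel', 'payload');  -- same effect, works with variables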

If some rows cannot be found after a notification has been received, there must be a different cause! Go find it. Likely candidates:

  • Concurrent transactions interfering.
  • Triggers doing more than, or something different from, what you think they do.
  • All sorts of programming errors.

Either way, as the manual suggests, keep transactions that send notifications short.

dblink

Update: Transaction control in a PROCEDURE or DO statement in Postgres 11 or later makes this a lot simpler. Just COMMIT; to (also) send waiting notifications.
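
A minimal sketch (table and channel names are made up): the NOTIFY is queued inside the procedure's transaction and goes out as soon as the COMMIT completes, while the procedure can continue working in a new transaction:

CREATE OR REPLACE PROCEDURE insert_and_notify(_payload text)
LANGUAGE plpgsql AS
$$
BEGIN
   INSERT INTO records(payload) VALUES (_payload);  -- the actual work
   PERFORM pg_notify('records_channel', _payload);  -- queued until commit
   COMMIT;  -- transaction ends here; the notification is delivered now
   -- anything after this point runs in a new transaction
END
$$;

CALL insert_and_notify('hello');  -- CALL must not run inside an explicit transaction block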


Original answer (mostly for Postgres 10 or older):

PERFORM * FROM dblink('hq','SELECT pg_notify(''' || channel || ''', ''' || payload || ''')');

... which should be rewritten with format() to simplify and make the syntax secure:

PERFORM dblink('hq', format('NOTIFY %I, %L', channel, payload));
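
To see the quoting that format() produces, an illustrative call (values made up):

SELECT format('NOTIFY %I, %L', 'my channel', 'it''s done');
-- returns the string:  NOTIFY "my channel", 'it''s done'

%I quotes the channel as an identifier and %L quotes the payload as a literal, which defends against SQL injection through either value.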

dblink is a game-changer here, because it opens a separate transaction in the other database. This is sometimes used to fake autonomous transactions.

dblink() waits for the remote command to finish. So the remote transaction will most probably commit first. The manual:

The function returns the row(s) produced by the query.

If you can send the notification from the same transaction instead, that would be a clean solution.

Workaround for dblink

If notifications have to be sent from a different transaction, there is a workaround with dblink_send_query():

dblink_send_query sends a query to be executed asynchronously, that is, without immediately waiting for the result.

DO  -- or plpgsql function
$$
BEGIN
   -- do stuff

   PERFORM dblink_connect   ('hq', 'your_connstr_or_foreign_server_here');
   PERFORM dblink_send_query('hq', format('SELECT pg_sleep(3); NOTIFY %I, %L', 'Channel', 'payload'));  -- async: returns immediately
   PERFORM dblink_disconnect('hq');
END
$$;

If you do this right before the end of the transaction, your local transaction gets a 3-second (pg_sleep(3)) head start to commit. Choose an appropriate number of seconds.

There is an inherent uncertainty to this approach, since you get no error message if anything goes wrong. For a secure solution you need a different design. After successfully sending the command, the chances for it to still fail are extremely slim, though. The chance that successful notifications are missed seems much higher, but that's built into your current solution already.

Safe alternative

A safer alternative would be to write to a queue table and poll it, as discussed in @Griffy's answer below. A related answer demonstrates how to poll safely.
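
On Postgres 9.5 or later, a common shape for a safe poll is FOR UPDATE SKIP LOCKED (a generic sketch with a hypothetical queue table, not that answer's exact code):

WITH claimed AS (
   SELECT id
   FROM   queue
   ORDER  BY id
   LIMIT  10
   FOR    UPDATE SKIP LOCKED  -- concurrent workers skip rows another worker holds
   )
DELETE FROM queue q
USING  claimed c
WHERE  q.id = c.id
RETURNING q.*;  -- process the returned rows; a rollback puts them back on the queue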

Brice answered 13/11, 2016 at 5:28 Comment(5)
Thank you for the answer. To be honest, I wasn't totally honest in my scenario. I'll update the question to match the real problem.Helyn
@Mehran: I still got a better idea. Consider the update. BTW, the unlikely timing was pure coincidence. You just happened to accept my answer the minute before I finished my update. :)Brice
I have already tested pg_sleep, but a little differently: I defined a stored procedure with basically what you wrote here. But it made the whole situation much worse! If I was facing the mentioned problem less than 10% of the time, with pg_sleep it was 100%! I don't know why, and I ended up implementing the polling system (for several reasons), so it would be hard for me to give this solution one more try. But thanks anyway.Helyn
@Mehran: Did you use dblink_send_query? That is the important part in the workaround. I tested and it worked for me: notification came seconds after my calling transaction finished. Either way, I think a queuing solution is probably a better design anyway.Brice
You are right. I tested with plain dblink. Unfortunately, I don't have the test set anymore to give dblink_send_query a try. I totally agree with you and Griffy that polling is a better design here (not in general). Thank you so much for your time and effort.Helyn

I'm posting this as an answer, assuming the actual problem you are trying to solve is deferring execution of an external process until after the transaction is completed (rather than the X-Y "problem" you're trying to solve using trigger Kung Fu).

Having the database tell an app to do something is a broken pattern. It's broken because:

  1. There's no fallback if the app doesn't get the message, e.g. because it's down, the network explodes, whatever. Even the app replying with an acknowledgment (which it can't) wouldn't fix this problem (see next point)
  2. There's no sensible way to retry the work if the app gets the message but fails to complete it (for any of lots of reasons)

In contrast, using the database as a persistent queue, having the app poll it for work, and taking the work off the queue when the work is complete, has none of the above problems.

There are lots of ways to achieve this. The one I prefer is to have some process (usually trigger on insert, update and delete) put data into a "queue" table. Have another process poll that table for work to do, and delete from the table when work is complete.
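
A minimal sketch of that shape, with all names hypothetical (a source table records with an id column feeding a work_queue table):

CREATE TABLE work_queue (
   id         bigserial PRIMARY KEY,
   record_id  int NOT NULL,                        -- the row to process
   op         text NOT NULL,                       -- 'INSERT', 'UPDATE' or 'DELETE'
   token      text,                                -- claimed-by marker for concurrent consumers
   created_at timestamptz NOT NULL DEFAULT now()
);

CREATE OR REPLACE FUNCTION enqueue_work()
  RETURNS trigger
  LANGUAGE plpgsql AS
$$
BEGIN
   IF TG_OP = 'DELETE' THEN
      INSERT INTO work_queue(record_id, op) VALUES (OLD.id, TG_OP);
   ELSE
      INSERT INTO work_queue(record_id, op) VALUES (NEW.id, TG_OP);
   END IF;
   RETURN NULL;  -- return value is ignored for AFTER ROW triggers
END
$$;

CREATE TRIGGER records_enqueue
AFTER INSERT OR UPDATE OR DELETE ON records
FOR EACH ROW EXECUTE PROCEDURE enqueue_work();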

It also adds some other benefits:

  • The production and consumption of work is decoupled, which means you can safely kill and restart your app (which must happen from time to time, eg deploying) - the queue table will happily grow while the app is down, and will drain when the app is back up. You can even replace the app with an entirely new one
  • If for whatever reason you want to initiate processing of certain items, you can just manually insert rows into the queue table. I used this technique myself to initiate the processing of all items in a database that needed initialising by being put on the queue once. Importantly, I didn't need to do a perfunctory update to every row just to fire the trigger
  • Getting to your question, a slight delay can be introduced by adding a timestamp column to the queue table and having the poll query only select rows that are older than (say) 1 second, which gives the database time to complete its transaction
  • You can't overload the app. The app will read only as much work as it can handle. If your queue is growing, you need a faster app, or more apps. If multiple consumers are operating, concurrency can be solved by (for example) adding a "token" column to the queue table (see the sketch after this list)
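
Putting the last two bullets together, a sketch of the poll-and-claim step against the hypothetical work_queue table above (the 1-second cutoff and worker token are illustrative):

-- claim up to 10 items that are at least 1 second old
UPDATE work_queue
SET    token = 'worker-1'
WHERE  token IS NULL
AND    id IN (
   SELECT id
   FROM   work_queue
   WHERE  token IS NULL
   AND    created_at < now() - interval '1 second'
   ORDER  BY id
   LIMIT  10
   )
RETURNING id, record_id, op;

-- ... process the claimed rows, then take them off the queue:
DELETE FROM work_queue WHERE token = 'worker-1';

The repeated token IS NULL check means a row already claimed by another worker is skipped rather than stolen.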

Queues backed by database tables are the basis of how persistent queues are implemented in commercial-grade queue-based platforms, so the pattern is well tested, used, and understood.

Leave the database to do what it does best, and the only thing it does well: Manage data. Don't try to make your database server into an app server.

Griffy answered 13/11, 2016 at 13:33 Comment(2)
Thank you so much for sharing your experience. It is really helpful. Even though I am considering your solution for my project, unfortunately it cannot be accepted as the answer to this particular question. In any case, thanks again.Helyn
@Mehran any answer can be accepted. If this has the "best" solution then accept it. If another answer is "best", accept that. It's a subjective thing.Griffy
