Committing transactions while executing a postgreql Function
Asked Answered
N

4

26

I have Postgresql Function which has to INSERT about 1.5 million data into a table. What I want is I want to see the table getting populated with every one records insertion. Currently what is happening when I am trying with say about 1000 records, the table gets populated only after the complete function gets executed. If I stop the function half way through, no data gets populated. How can I make the record committed even if I stop after certain number of records have been inserted?

Namnama answered 12/3, 2014 at 12:7 Comment(1)
Possible duplicate of Are PostgreSQL functions transactional?Halve
P
18

This can be done using dblink. I showed an example with one insert being committed you will need to add your while loop logic and commit every loop. You can http://www.postgresql.org/docs/9.3/static/contrib-dblink-connect.html

CREATE OR REPLACE FUNCTION log_the_dancing(ip_dance_entry text)
RETURNS INT AS
$BODY$
    DECLARE
    BEGIN
        PERFORM dblink_connect('dblink_trans','dbname=sandbox port=5433 user=postgres');
        PERFORM dblink('dblink_trans','INSERT INTO dance_log(dance_entry) SELECT ' || '''' || ip_dance_entry || '''');
        PERFORM dblink('dblink_trans','COMMIT;');
        PERFORM dblink_disconnect('dblink_trans'); 

        RETURN 0;
    END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;

ALTER FUNCTION log_the_dancing(ip_dance_entry text)
  OWNER TO postgres;

BEGIN TRANSACTION;
  select log_the_dancing('The Flamingo');
  select log_the_dancing('Break Dance');
  select log_the_dancing('Cha Cha');
ROLLBACK TRANSACTION;

--Show records committed even though we rolled back outer transaction
select *
from dance_log;
Papyraceous answered 12/3, 2014 at 13:37 Comment(6)
I typically only use this method when logging. Outside of logging it's usually not the correct approach to do batch commits, but I'll leave it to you to decide that for your use case.Papyraceous
Is dblink available for postgresql 9.2?Namnama
Yes, you should be able to execute CREATE EXTENSION dblink; from a SQL session connected as the postgres user and test it out.Papyraceous
@Bob It's also useful for long running load/transform/generation processes when you want to commit at some interval to avoid losing work (and therefore time). It could be argued that this belongs at the application tier (or in a shell script or something), but sometimes I have to do this kind of work manually and would really appreciate a lighter weight way of doing it in DB.Yusem
Opening a new dblink connection every time you want to log some event is OK for tracking batch processes, but not for general log capture. Any other options here?Boxwood
@AndrewWolfe Take a look at this idea github.com/GoatWalker/plog. The idea was to try and reusing a dblink connection if one existed to help reduce creating a new one every single time. It's been years since I played around with that, but maybe there's some value in it.Papyraceous
T
12

If you have the flexibility to change from function to procedure, from PostgreSQL 12 onwards you can do internal commits if you use procedures instead of functions, invoked by CALL command. Therefore your function will be changed to a procedure and invoked with CALL command: e.g:

    CREATE PROCEDURE transaction_test2()
    LANGUAGE plpgsql
    AS $$
    DECLARE
     r RECORD;
    BEGIN
      FOR r IN SELECT * FROM test2 ORDER BY x LOOP
        INSERT INTO test1 (a) VALUES (r.x);
        COMMIT;
      END LOOP;
    END;
    $$;

    CALL transaction_test2();

More details about transaction management regarding Postgres are available here: https://www.postgresql.org/docs/12/plpgsql-transactions.html

Twi answered 10/7, 2020 at 10:33 Comment(2)
Note that the procedure must be called from "top level" or another procedure, with no function call in the chain. As the linked docs state: _Transaction control is only possible in CALL or DO invocations from the top level or nested CALL or DO invocations without any other intervening command. _Prothallus
I think this is available since PG 11? postgresql.org/docs/11/plpgsql-transactions.htmlEcru
Y
9

What you're asking for is generally called an autonomous transaction.

PostgreSQL does not support autonomous transactions at this time (9.4).

To properly support them it really needs stored procedures, not just the user-defined functions it currently supports. It's also very complicated to implement autonomous tx's in PostgreSQL for a variety of internal reasons related to its session and process model.

For now, use dblink as suggested by Bob.

Yellowlegs answered 2/7, 2014 at 5:46 Comment(0)
U
3

For Postgresql 9.5 or newer you can use dynamic background workers provided by pg_background extension. It creates autonomous transaction. Please, refer the github page of the extension. The sollution is better then db_link. There is a complete guide on Autonomous transaction support in PostgreSQL. There is a third way to start autonomous transaction in Postgres, but some patching neede. Please see Peter's Eisentraut patch proposal for OracleDB-style transactions.

Unwished answered 3/3, 2017 at 23:18 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.