Update with function called once for each row in Postgres 8.4
Asked Answered
S

3

6

I have the following UPDATE statement:

update mytable
   set a = first_part(genid()),
       b = second_part(genid()),
       c = third_path(genid())
 where package_id = 10;

In this example the function genid() is called three times for each row, which is wrong - I want it to be called only once for each row of mytable.

I'm using PostgreSQL 8.4 database. How to write the correct update?

I've tried something like this:

update mytable
   set a = first_part(g),
       b = second_part(g),
       c = third_path(g)
 where package_id = 10
  from genid() as g;

But it didn't work, because genid() has been called only once for the whole update statement.

Specie answered 30/12, 2011 at 8:12 Comment(6)
Can't you put that in a stored procedure and first call genid(), store it in a variable and then use it in your update statement?Repertory
you simplified too much the example, I ques you have gen_id(column)Loafer
@Florin No, genid takes no parameters, it generates a unique value on each call, based on an arbitrary sequence and random generator. That's why I want it to be called once for the each row of the mytable.Specie
Looks like there is already a working solution in the answers, but is defining an additional column for the full genid + a row-level trigger that updates a, b and c on insert/update of that column an option?Slate
Additional column is not an option, we have about 2000 such tables, and the schema is rather constant. This column would be too much overhead for millions of records.Specie
@Specie can you try with a cursor? as I suggested in my updated answerLoafer
Q
10

Have you tried Postgres' non-standard UPDATE .. FROM clause? I imagine, this would work

update mytable
   set a = first_part(gen.id),
       b = second_part(gen.id),
       c = third_path(gen.id)
  from (
          select genid() as genid, id
          from mytable 
          where package_id = 10
       ) gen
 where mytable.id = gen.id;
 --and package_id = 10 -- This predicate is no longer necessary as the subquery
                       -- already filters on package_id, as Erwin mentioned

Note that I'm forcing genid() to be called exactly once per record in mytable within the subselect. Then I'm self-joining mytable and gen using a hypothetical id column. See the documentation here:

http://www.postgresql.org/docs/current/interactive/sql-update.html

This seems to have been introduced with Postgres 9.0 only, though. If that seems too complicated (i.e. not very readable), you can still resort to pgplsql as user Florin suggested here.

Quits answered 30/12, 2011 at 8:18 Comment(7)
This works great, thanks. I have some performance concerns, though, because of the hash joins this involves.Specie
The second where package_id = 10 is redundant. The subquery already filters (as it should). If mytable.id is indexed (as it should be), the redundant check is of no use.Repetitive
@Cezariusz: Have you analysed the execution plan? If you self-join on primary keys, the join should be fairly efficient.Quits
@ErwinBrandstetter: You're right. I hadn't thought about it that way. I'll fix itQuits
+1 Now I like your answer. :) You should probably link to the manual for version 8.4 (OP's version) or the current version. Preferably to the /interactive branch. Consider this.Repetitive
@ErwinBrandstetter: Nice link, I wasn't aware of "/current" documentation sub-paths. They're usually not the ones found on google... :) btw: You can edit answers too, if you feel something should be improved (e.g. a link)Quits
@LukasEder: When dealing with experienced users like you I only edit when I am 100% certain and rather default to a comment. The high reputation tells me you generally know what you are doing.Repetitive
L
0

Can you check if this is working in postgres?

update 
  (select 
     a, b, c, genid() as val
  from mytable
  where package_id = 10
  )
set a = first_part(val),
       b = second_part(val),
       c = third_path(val);

Update: just for the ideea: Can you try it with a cursor?

DECLARE c1 CURSOR FOR 
      select 
         a, b, c, genid() as val
      from mytable
      where package_id = 10; 
...
UPDATE table SET ... WHERE CURRENT OF c1;
Loafer answered 30/12, 2011 at 8:19 Comment(1)
Doesn't work with any version of PostgreSQL, a.k.a. "wrong answer".Repetitive
E
0

I would suggest to use following -

DECLARE @genID VARCHAR(100) -- or appropriate datatype

SET @genID = genid() -- Or select @genID = genid()    
update mytable    
   set a = first_part(@genID),
       b = second_part(@genID),
       c = third_path(@genID)
where package_id = 10; 
Emeritaemeritus answered 30/12, 2011 at 8:20 Comment(4)
That's probably the path of least resistance, compared to my rather verbose solution... (if pgplsql is allowed in this case)Quits
In this case genid() will be called only once for the whole table update, am I right? I need it to be called once for each row.Specie
Yes that will be called only once for whole table update. And in your case I think User Lukas has suggested most appropriate way to achieve this.Emeritaemeritus
That won't work with PostgreSQL (you cannot declare variables like that)Trilbie

© 2022 - 2024 — McMap. All rights reserved.