How to set value of composite variable field using dynamic SQL
Asked Answered
B

6

14

Given this type:

-- Just for testing purposes:
CREATE TYPE testType as (name text)

I can get the value of a field dynamically with this function:

CREATE OR REPLACE FUNCTION get_field(object anyelement, field text) RETURNS text as
$BODY$
DECLARE
    value text;
BEGIN
    EXECUTE 'SELECT $1."' || field || '"'
      USING object
       INTO value;

    return value;
END;
$BODY$
LANGUAGE plpgsql

Calling get_field('(david)'::testType, 'name') works as expected returning "david".

But how can I set a value of a field in a composite type? I've tried these functions:

CREATE OR REPLACE FUNCTION set_field_try1(object anyelement, field text, value text)
RETURNS anyelement
as
$BODY$
DECLARE
    value text;
BEGIN
    EXECUTE '$1."' || field || '" := $2'
      USING object, value;

    return object;
END;
$BODY$
LANGUAGE plpgsql

CREATE OR REPLACE FUNCTION set_field_try2(object anyelement, field text, value text)
RETURNS anyelement
as
$BODY$
DECLARE
    value text;
BEGIN
    EXECUTE 'SELECT $1 INTO $2."' || field || '"'
      USING value, object;

    return object;
END;
$BODY$
LANGUAGE plpgsql

CREATE OR REPLACE FUNCTION set_field_try3(object anyelement, field text, value text)
RETURNS anyelement
as
$BODY$
DECLARE
    value text;
BEGIN
    EXECUTE 'BEGIN $1."' || field || '" := $2; SELECT $1; END;'
       INTO object
      USING value, object;

    return object;
END;
$BODY$
LANGUAGE plpgsql

and some variations. Calling set_field_tryX doesn't work. I always get "ERROR: syntax error at or near...". How can I accomplish this?

Notes:

  • The parameter is anyelement and the field can be any field in the composite type. I can't just use object.name.
  • I'm concerned about SQL injection. Any advice in this would be appreciated but it is not my question.
Biff answered 10/10, 2011 at 10:25 Comment(0)
A
19

Faster with hstore

Since Postgres 9.0, with the additional module hstore installed in your database there is a very simple and fast solution with the #= operator that ...

replace[s] fields in record with matching values from hstore.

To install the module:

CREATE EXTENSION hstore;

Examples:

SELECT my_record #= '"field"=>"value"'::hstore;  -- with string literal
SELECT my_record #= hstore(field, value);        -- with values

Values have to be cast to text and back, obviously.

Example plpgsql functions with more details:

Now works with json / jsonb, too!

There are similar solutions with json (pg 9.3+) or jsonb (pg 9.4+)

SELECT json_populate_record (my_record, json_build_object('key', 'new-value');

The functionality was undocumented, but it's official since Postgres 13. The manual:

However, if base isn't NULL then the values it contains will be used for unmatched columns.

So you can take any existing row and fill arbitrary fields (overwriting what's in it).

Major advantages of json vs hstore:

  • works with stock Postgres so you don't need an additional module.
  • also works for nested array and composite types.

Minor disadvantage: a bit slower.

See @Geir's added answer for details.

Without hstore and json

If you are on an older version or cannot install the additional module hstore or cannot assume it's installed, here is an improved version of what I posted previously. Still slower than the hstore operator, though:

CREATE OR REPLACE FUNCTION f_setfield(INOUT _comp_val anyelement
                                          , _field text, _val text)
  RETURNS anyelement
  LANGUAGE plpgsql STABLE AS
$func$
BEGIN

EXECUTE 'SELECT ' || array_to_string(ARRAY(
      SELECT CASE WHEN attname = _field
                THEN '$2'
                ELSE '($1).' || quote_ident(attname)
             END AS fld
      FROM   pg_catalog.pg_attribute
      WHERE  attrelid = pg_typeof(_comp_val)::text::regclass
      AND    attnum > 0
      AND    attisdropped = FALSE
      ORDER  BY attnum
      ), ',')
USING  _comp_val, _val
INTO   _comp_val;

END
$func$;

Call:

CREATE TEMP TABLE t( a int, b text);  -- Composite type for testing
SELECT f_setfield(NULL::t, 'a', '1');

Notes

  • An explicit cast of the value _val to the target data type is not necessary, a string literal in the dynamic query would be coerced automatically, obviating the subquery on pg_type. But I took it one step further:

  • Replace quote_literal(_val) with direct value insertion via the USING clause. Saves one function call and two casts, and is safer anyway. text is coerced to the target type automatically in modern PostgreSQL. (Did not test with versions before 9.1.)

  • array_to_string(ARRAY()) is faster than string_agg().

  • No variables needed, no DECLARE. Fewer assignments.

  • No subquery in the dynamic SQL. ($1).field is faster.

  • pg_typeof(_comp_val)::text::regclass
    does the same as
    (SELECT typrelid FROM pg_catalog.pg_type WHERE oid = pg_typeof($1)::oid)
    for valid composite types, just faster.
    This last modification is built on the assumption that pg_type.typname is always identical to the associated pg_class.relname for registered composite types, and the double cast can replace the subquery. I ran this test in a big database to verify, and it came up empty as expected:

    SELECT *
    FROM   pg_catalog.pg_type t
    JOIN   pg_namespace  n ON n.oid = t.typnamespace
    WHERE  t.typrelid > 0  -- exclude non-composite types
    AND    t.typrelid IS DISTINCT FROM
          (quote_ident(n.nspname ) || '.' || quote_ident(typname))::regclass
  • The use of an INOUT parameter obviates the need for an explicit RETURN. This is just a notational shortcut. Pavel won't like it, he prefers an explicit RETURN statement ...

Everything put together this is twice as fast as the previous version.


Original (outdated) answer:

The result is a version that's ~ 2.25 times faster. But I probably couldn't have done it without building on Pavel's second version.

In addition, this version avoids most of the casting to text and back by doing everything within a single query, so it should be much less error prone.
Tested with PostgreSQL 9.0 and 9.1.

CREATE FUNCTION f_setfield(_comp_val anyelement, _field text, _val text)
  RETURNS anyelement
  LANGUAGE plpgsql STABLE AS
$func$
DECLARE
   _list text;
BEGIN
_list := (
   SELECT string_agg(x.fld, ',')
   FROM  (
      SELECT CASE WHEN a.attname = $2
              THEN quote_literal($3) || '::'|| (SELECT quote_ident(typname)
                                                FROM   pg_catalog.pg_type
                                                WHERE  oid = a.atttypid)
              ELSE quote_ident(a.attname)
             END AS fld
      FROM   pg_catalog.pg_attribute a 
      WHERE  a.attrelid = (SELECT typrelid
                           FROM   pg_catalog.pg_type
                           WHERE  oid = pg_typeof($1)::oid)
      AND    a.attnum > 0
      AND    a.attisdropped = false
      ORDER  BY a.attnum
      ) x
   );

EXECUTE 'SELECT ' || _list || ' FROM  (SELECT $1.*) x'
USING  $1
INTO   $1;

RETURN $1;
END
$func$;
Acey answered 16/10, 2011 at 5:54 Comment(10)
@Pavel: thanks, that's worth like ten upvotes, coming from you. :)Acey
Thank you @Erwin. I'm still on this. I've found your solution fails with this: CREATE TYPE a as (a1 int); CREATE TYPE b as (b1 a); SELECT setfield3(null::b, 'b1', '(2)');. In this case setfield2 from @Pavel works, but setfield2 fails in some cases where setfield3 works :-?Biff
@DavidEG: I believe I have fixed it for the special case of nested composite types. In fact, I believe I have fixed it for any composite type. Please try to break it again. :)Acey
@DavidEG: Actually, had to protect that last bit against SQLi, too. See amended version.Acey
Thank you very much @Erwin, it works in that case. I'll do further tests to break it :) (BTW the whole thing is a procedure to convert a json into a composite type. If you can take a look to the code would be really appreciated. The code is on github.com/DavidEGx)Biff
@DavidEG: What is a.atttypid <> 0 for?Acey
Dropped attributes have atttypid = 0 so basically is redundant with attisdropped = falseBiff
@DavidEG: So I removed the redundant line. Looks good otherwise!Acey
good micro optimization. I have nothing against INOUT variables. Just I like a explicit usage of RETURN statement (like ADA requires). Explicit returning value by usage of RETURN statement is little bit more robust to human errors, but somewhere is not possible (in PL/pgSQL).Checkerberry
json_populate_record is a lifesaver. I had already designed a lot my api around a composite type that needs one value overwritten by trigger. Thanks @ErwinBrandstetter you're a legend. I'm still considering generating the trigger code rather than parametrizing it and doing it dynamically, but at least now I can leave that for a time when the pressure is lifted a bit (i.e., probably, you know, like, never). I'll upvote the original answer too.Witkin
C
8

I wrote a second version of setfield function. It work on postgres 9.1 I didn't test it on older versions. It's not a miracle (from performance view), but it is more robust and about 8 times faster than the previous.

CREATE OR REPLACE FUNCTION public.setfield2(anyelement, text, text)
 RETURNS anyelement
 LANGUAGE plpgsql
AS $function$
DECLARE 
  _name text;
  _values text[];
  _value text;
  _attnum int;
BEGIN
  FOR _name, _attnum
     IN SELECT a.attname, a.attnum
          FROM pg_catalog.pg_attribute a 
         WHERE a.attrelid = (SELECT typrelid
                               FROM pg_type
                              WHERE oid = pg_typeof($1)::oid)
           AND a.attnum > 0 
  LOOP
    IF _name = $2 THEN
      _value := $3;
    ELSE
      EXECUTE 'SELECT (($1).' || quote_ident(_name) || ')::text' INTO _value USING $1;
    END IF;
    _values[_attnum] :=  COALESCE('"' || replace(replace(_value, '"', '""'), '''', '''''') || '"', ''); 
  END LOOP;
  EXECUTE 'SELECT (' || quote_ident(pg_typeof($1)::text) || ' ''(' || array_to_string(_values,',') || ')'').*' INTO $1; 
  RETURN $1;
END;
$function$;
Checkerberry answered 11/10, 2011 at 7:18 Comment(10)
Thank you very much. Maybe not a miracle but good enough for me.Biff
I fixed two bugs. Hope that is ok with you? 1) No quote_ident() for the type! It would break in pg 9.0 with a type in a schema ("myschema.mytype"). 2) Double up single quotes, too! Plus, replaced \' with '' to avoid escape strings.Acey
I realize now, that the type needs a quote_ident() or it is not safe against SQLi. So I undid that edit. However, this will fail with a schema-qualified type. You would have to quote_ident() schema and typename separately: "myschema"."mytype".Acey
I'm still working on this. I've found an error, given the types CREATE TYPE a as (a1 int); CREATE TYPE b as (b1 a); CREATE TYPE c as (c1 b[]); running SELECT setfield2(null::c, 'c1', '{"(\"(2)\")"}'); fails.Biff
I am not able to call this function from a trigger. When I call it using a table as suggested, i.e. NULL::table_name, I get a column "tableoid" not found in data type "table_name" when calling the function. Any idea? I am using postgresql v. 9.1.Barbicel
@Barbicel - can you write the query?Checkerberry
SELECT setfield2(NULL::"MY_TABLE", MY_FIELD, "new_value");Barbicel
@Barbicel - aha - it didn't work on table types - I fixed code. Check it nowCheckerberry
Ok, your change did the trick - the function is now working. However, in my trigger, I do SELECT setfield2(NEW, MY_FIELD, "new_value") INTO _tmp (note the INTO _tmp to avoid error query has no destination), but after the call, NEW.MY_FIELD still has initial value. If I put the result into NEW, I get that the error 42804: returned row structure does not match the structure of the triggering table. What am I doing wrong here?Barbicel
Do you have some dropped columns in your table? in this case try to run VACUUM FULL on this tableCheckerberry
R
6

UPDATE/caution: Erwin points out that this is currently undocumented, and the manual indicates it should not be possible to alter records this way.

Use hstore or Pavel's solution instead.

This simple json based solution is almost as fast as hstore, and requires only Postgres 9.3 or newer. This should be a good option if you can't use the hstore extension, and the performance difference should be negligible. Benchmarks: https://mcmap.net/q/28915/-how-to-set-value-of-composite-variable-field-using-dynamic-sql

a) We can either do it inline by cast/concat. Json function requires Postgres 9.3:

SELECT json_populate_record( 
     record
    , ('{"'||'key'||'":"'||'new-value'||'"}')::json
);

b) or inline by using functions from Postgres 9.4.

SELECT json_populate_record (
      record
     ,json_object(ARRAY['key', 'new-value'])
);

Note: I chose json_object(ARRAY[key,value]) since it was a bit faster than json_build_object(key,value):

To hide the casting details you can use a) in a function, with little overhead.

CREATE FUNCTION x.setfield_json(in_element anyelement, key text, value text)
    RETURNS anyelement AS
$BODY$
    SELECT json_populate_record( in_element, ('{"'||key||'":"'||value||'"}')::json);
$BODY$ LANGUAGE sql;
Registration answered 23/2, 2015 at 11:47 Comment(1)
Note that this is now documented since Postgres 13! :)Acey
C
3

"SELECT INTO" outside plpgsql (in dynamic SQL context) has different sense than you expect - it store a result of query to table.

Modification of any field is possible, but not simple

CREATE OR REPLACE FUNCTION public.setfield(a anyelement, text, text)
RETURNS anyelement
LANGUAGE plpgsql
AS $function$
begin
  create temp table aux as select $1.*;
  execute 'update aux set ' || quote_ident($2) || ' = ' || quote_literal($3);
  select into $1 * from aux;
  drop table aux;
  return $1;
end;
$function$

But this code is not very effective - is not possible to write this well in plpgsql. You can find some C library, that should to do.

Checkerberry answered 10/10, 2011 at 13:28 Comment(4)
to your code - dynamic SQL must to contains only plain SQL - there are not possible to use any PL statement - like :=Checkerberry
Thank you @Pavel. Although maybe it is not the best it is enough for me. But now I have another problem, the column is not always text and when it tries to update i've got column "x" is of type real but expression is of type text. How can I dynamically cast?Biff
it working in my 9.1 - the most simple solution is overloading setfield function for double precission: CREATE OR REPLACE FUNCTION public.setfield(a anyelement, text, doble precision) RETURNS anyelement LANGUAGE plpgsql AS $function$ begin create temp table aux as select $1.*; execute 'update aux set ' || quote_ident($2) || ' = ' || $3; select into $1 * from aux; drop table aux; return $1; end; $function$Checkerberry
In the course of writing another version I ran extensive tests on postgres 9.0. This function maxed out shared memory on a test with a couple of thousand rows. Server had decent resources. So this is not fit for use. WARNING: out of shared memory CONTEXT: SQL statement "create temp table aux as select $1.*" PL/pgSQL function "setfield" line 2 at SQL statementAcey
R
2

Test setup and benchmarks v2

Erwin encouraged to reproduce his benchmark in this thread (https://mcmap.net/q/28915/-how-to-set-value-of-composite-variable-field-using-dynamic-sql), so I modified his code with synthetic test data and added both the hstore solution and the json-solution from my answer (and a json solution by Pavel found in another thread) The benchmark is now run as one query, making it easier to capture the results.

DROP SCHEMA IF EXISTS x CASCADE;
CREATE SCHEMA x;


-- Pavel 1:
--------------------------------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION x.setfield(anyelement, text, text)
RETURNS anyelement
LANGUAGE plpgsql
AS $function$
begin
  create temp table aux as select $1.*;
  execute 'update aux set ' || quote_ident($2) || ' = ' || quote_literal($3);
  select into $1 * from aux;
  drop table aux;
  return $1;
end;
$function$;


-- Pavel 2 (with patches)
--------------------------------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION x.setfield2(anyelement, text, text)
 RETURNS anyelement
 LANGUAGE plpgsql
AS $function$
DECLARE
  _name text;
  _values text[];
  _value text;
  _attnum int;
BEGIN
  FOR _name, _attnum
     IN SELECT a.attname, a.attnum
           FROM pg_catalog.pg_attribute a
          WHERE a.attrelid = (SELECT typrelid
                                 FROM pg_type
                                WHERE oid = pg_typeof($1)::oid)
  LOOP
    IF _name = $2 THEN
      _value := $3;
    ELSE
      EXECUTE 'SELECT (($1).' || quote_ident(_name) || ')::text' INTO _value USING $1;
    END IF;
    _values[_attnum] :=  COALESCE('"' || replace(replace(_value, '"', '""'), '''', '''''') || '"', '');
  END LOOP;
  EXECUTE 'SELECT (' || pg_typeof($1)::text || '''(' || array_to_string(_values,',') || ')'').*' INTO $1;
  RETURN $1;
END;
$function$;


-- Erwin 1
--------------------------------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION x.setfield3(anyelement, text, text)
RETURNS anyelement
AS $body$
DECLARE
 _list text;

BEGIN
_list := (
   SELECT string_agg(x.fld, ',')
   FROM   (
      SELECT CASE WHEN a.attname = $2
              THEN quote_literal($3)
              ELSE quote_ident(a.attname)
             END AS fld
      FROM   pg_catalog.pg_attribute a
      WHERE  a.attrelid = (SELECT typrelid
                           FROM   pg_type
                           WHERE  oid = pg_typeof($1)::oid)
      ORDER BY a.attnum
   ) x
);

EXECUTE '
SELECT ' || _list || '
FROM   (SELECT $1.*) x'
USING  $1
INTO   $1;

RETURN $1;
END;
$body$ LANGUAGE plpgsql;


-- Erwin 2
--------------------------------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION x.setfield4(INOUT _comp_val anyelement
                                       , _field text, _val text)
  RETURNS anyelement AS
$func$
BEGIN

EXECUTE 'SELECT ' || array_to_string(ARRAY(
      SELECT CASE WHEN attname = _field
                THEN '$2'
                ELSE '($1).' || quote_ident(attname)
             END AS fld
      FROM   pg_catalog.pg_attribute
      WHERE  attrelid = pg_typeof(_comp_val)::text::regclass
      AND    attnum > 0
      AND    attisdropped = FALSE
      ORDER  BY attnum
      ), ',')
USING  _comp_val, _val
INTO   _comp_val;

END
$func$ LANGUAGE plpgsql;


-- Pavel 3: json. (Postgres 9.4)
-- Found here: https://mcmap.net/q/28940/-postgresql-modifying-fields-dynamically-in-new-record-in-a-trigger-function
--------------------------------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION x.setfield5(r anyelement, fn text, val text,OUT result anyelement)
 RETURNS anyelement
 LANGUAGE plpgsql
AS $function$
declare jo json;
begin
  jo := (select json_object(array_agg(key), 
                            array_agg(case key when fn then val
                                               else value end)) 
            from json_each_text(row_to_json(r)));
  result := json_populate_record(r, jo);
end;
$function$;


-- Json. Use built-in json functions (Postgres 9.3)
-- This is available from 9.3 since we create json by casting 
-- instead of using json_object/json_build_object only available from 9.4
--------------------------------------------------------------------------------------------------
CREATE FUNCTION x.setfield_json(in_element anyelement, key text, value text)
    RETURNS anyelement AS
$BODY$
    SELECT json_populate_record( in_element, ('{"'||key||'":"'||value||'"}')::json);
$BODY$ LANGUAGE sql;



--------------------------------------------------------------------------------------------------
-- Test setup
--------------------------------------------------------------------------------------------------

-- composite type for tests.
CREATE TYPE x.t_f as (
 id       int
,company  text
,sort     text
,log_up   timestamp
,log_upby smallint
);

-- Create temp table with synthetic test data
DROP TABLE IF EXISTS tmp_f;
CREATE TEMP table tmp_f AS
   SELECT ROW(i, 'company'||i, NULL, NULL, NULL)::x.t_f AS f
   FROM generate_series(1, 5000) S(i);



-- Run the benchmark
DO $$  DECLARE  start_time timestamptz; test_count integer; test_description TEXT; BEGIN

    test_count := 200;
    test_description := 'setfield, Pavel 1: temptable';
    start_time := clock_timestamp();    
    PERFORM x.setfield (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
    RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;

    test_count := 5000;
    test_description := 'setfield2, Pavel 2: reflection';
    start_time := clock_timestamp();
    PERFORM x.setfield2 (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
    RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;

    test_count := 5000;
    test_description := 'setfield3, Erwin 1: reflection';
    start_time := clock_timestamp();
    PERFORM x.setfield3 (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
    RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;

    test_count := 5000;
    test_description := 'setfield4, Erwin 2: reflection';
    start_time := clock_timestamp();
    PERFORM x.setfield4 (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
    RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;

    test_count := 5000;
    test_description := 'setfield5, Pavel 3: json (PG 9.4)';
    start_time := clock_timestamp();
    PERFORM x.setfield5 (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
    RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;
    
    test_count := 5000;
    test_description := 'setfield_json, Geir 1: casting (PG 9.3)';
    start_time := clock_timestamp();
    PERFORM x.setfield_json (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
    RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;

    --json_object(ARRAY(key,value]) is actually faster than json_build_object(key, value)
    test_count := 5000;
    test_description := 'no function/inlined: json_object (PG 9.4)';
    start_time := clock_timestamp();
    PERFORM json_populate_record( f, json_object(ARRAY['company', 'new-value'||md5(random()::text)]  )) FROM tmp_f LIMIT test_count;
    RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;

    test_count := 5000;
    test_description := 'no function/inlined: hstore (PG 9.0)';
    start_time := clock_timestamp();
    PERFORM f #= hstore('company', 'new-value'||md5(random()::text))  FROM tmp_f LIMIT test_count;
    RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;
    
END; $$;

Test results on 9.4.1, win32, i5-4300U

NOTICE:  Test took: 1138 ms (for 200 rows) Name: setfield, Pavel 1: temptable
NOTICE:  Test took: 652 ms (for 5000 rows) Name: setfield2, Pavel 2: reflection
NOTICE:  Test took: 364 ms (for 5000 rows) Name: setfield3, Erwin 1: reflection
NOTICE:  Test took: 275 ms (for 5000 rows) Name: setfield4, Erwin 2: reflection
NOTICE:  Test took: 192 ms (for 5000 rows) Name: setfield5, Pavel 3: json (PG 9.4)
NOTICE:  Test took: 23 ms (for 5000 rows) Name: setfield_json, Geir 1: casting (PG 9.3)
NOTICE:  Test took: 25 ms (for 5000 rows) Name: no function/inlined: json_object (PG 9.4)
NOTICE:  Test took: 14 ms (for 5000 rows) Name: no function/inlined: hstore (PG 9.0)
Registration answered 23/2, 2015 at 12:11 Comment(7)
Interesting! Only saw that now. A couple of notes: 1. hstore solution is available since 9.0 like stated in my answer. 2. hstore function should be even faster taking (anyelement, hstore), saving redundant casts. You don't actually need a function at all, just the expression my_record #= hsore(field, value). 3. You included my first answer (my benchmark is outdated), but not the much improved later version in my answer.Acey
I fixed the 9.1/9.0 typo, and just dropped the hstore function. Whoups! I somehow missed your improved answer in the bench, it is now included:). I also added json as an inline test. It was interesting to see that json_object(ARRAY[key,value]) is faster than json_build_object(key,value)Registration
Excellent post. Very interesting. One more thing: function volatility. Erwin 1, Erwin 2 and Pavel 2 can be STABLE (updated my old answer, too), Pavel3 and Geir 1 can be IMMUTABLE. That could make a difference - if not in this test, then in the context of more complex queries.Acey
Yeah, some complex use-cases should be able to exploit immutable (same input multiple times?) In this synthetic benchmark the function actually got twice as slow with IMMUTABLE.Registration
Slower with IMMUTABLE, that's odd. Which one? Here is a recent question, where IMMUTABLE made all the difference (in a not-so-obvious way): #28899542Acey
Slower indeed! I am getting consistent results, 45ms with IMMUTABLE and 25ms without. This is for the json function "setfield_json". Try it yourself. I have seen this a couple of times before in other scenarios but ignored it, since volatility sometimes feel like black magic;) Looks like some overhead?Registration
Is it plain function call overhead? That Postgres is able to run the sql directly 5000 times instead of running it via the immutable function? Explain: explain.depesz.com/s/RJ3n and explain.depesz.com/s/IdqT . Pavel mentions "sql eval overflow", but i'm not sure what that means exactly. If you also think this is odd, this might be worth exploring a separate question. It probably has more academic value rather than practical - but hey, we're performance geeks;)Registration
A
1

Update March 2015:
Largely outdated now. Consider the new benchmark by @Geir with faster variants.


Test setup and benchmarks

I took the three solutions presented (by Oct. 16th, 2011) and ran a test on PostgreSQL 9.0. You find the complete setup below. Only test data are not included as I used a real life database (not synthetic data). It's all encapsulated in its own schema for non-intrusive use.

I would like to encourage anybody who wants to reproduce the test. Maybe with postgres 9.1? And add your results here? :)

-- DROP SCHEMA x CASCADE;
CREATE SCHEMA x;

-- Pavel 1
CREATE OR REPLACE FUNCTION x.setfield(anyelement, text, text)
RETURNS anyelement
LANGUAGE plpgsql
AS $function$
begin
  create temp table aux as select $1.*;
  execute 'update aux set ' || quote_ident($2) || ' = ' || quote_literal($3);
  select into $1 * from aux;
  drop table aux;
  return $1;
end;
$function$;

-- Pavel 2 (with patches)
CREATE OR REPLACE FUNCTION x.setfield2(anyelement, text, text)
 RETURNS anyelement
 LANGUAGE plpgsql
AS $function$
DECLARE 
  _name text;
  _values text[];
  _value text;
  _attnum int;
BEGIN
  FOR _name, _attnum
     IN SELECT a.attname, a.attnum
           FROM pg_catalog.pg_attribute a 
          WHERE a.attrelid = (SELECT typrelid
                                 FROM pg_type
                                WHERE oid = pg_typeof($1)::oid) 
  LOOP
    IF _name = $2 THEN
      _value := $3;
    ELSE
      EXECUTE 'SELECT (($1).' || quote_ident(_name) || ')::text' INTO _value USING $1;
    END IF;
    _values[_attnum] :=  COALESCE('"' || replace(replace(_value, '"', '""'), '''', '''''') || '"', '');
  END LOOP;
  EXECUTE 'SELECT (' || pg_typeof($1)::text || '''(' || array_to_string(_values,',') || ')'').*' INTO $1; 
  RETURN $1;
END;
$function$;

-- Erwin 1
CREATE OR REPLACE FUNCTION x.setfield3(anyelement, text, text)
RETURNS anyelement
AS $body$
DECLARE
 _list text;

BEGIN
_list := (
   SELECT string_agg(x.fld, ',')
   FROM   (
      SELECT CASE WHEN a.attname = $2
              THEN quote_literal($3)
              ELSE quote_ident(a.attname)
             END AS fld
      FROM   pg_catalog.pg_attribute a 
      WHERE  a.attrelid = (SELECT typrelid
                           FROM   pg_type
                           WHERE  oid = pg_typeof($1)::oid) 
      ORDER BY a.attnum
   ) x
);

EXECUTE '
SELECT ' || _list || '
FROM   (SELECT $1.*) x'
USING  $1
INTO   $1;

RETURN $1;
END;
$body$ LANGUAGE plpgsql;

-- composite type for tests.
CREATE TYPE x.t_f as (
 id       int
,company  text
,sort     text
,log_up   timestamp 
,log_upby smallint
);

-- temp table with real life test data
DROP   TABLE IF EXISTS tmp_f;
CREATE TEMP table tmp_f AS 
   SELECT ROW(firma_id,firma,sort,log_up,log_upby)::x.t_f AS f
   FROM   ef.firma
   WHERE  firma !~~ '"%';

-- SELECT count(*) FROM tmp_f;  -- 5183

-- Quick test: results are identical?
SELECT *,
       x.setfield (f, 'company','test')
      ,x.setfield2(f, 'company','test')
      ,x.setfield3(f, 'company','test')
 FROM tmp_f
LIMIT 10;

Benchmarks

I ran the queries a couple of times to populate the cache. The presented results are the best of five total runtimes with EXPLAIN ANALYZE.

Rirst round with 1000 rows

Pavel's first prototype maxes out shared memory with more rows.

Pavel 1: 2445.112 ms

SELECT x.setfield (f, 'company','test') FROM tmp_f limit 1000;

Pavel 2: 263.753 ms

SELECT x.setfield2(f, 'company','test') FROM tmp_f limit 1000;

Erwin 1: 120.671 ms

SELECT x.setfield3(f, 'company','test') FROM tmp_f limit 1000;

Another test with 5183 rows.

Pavel 2: 1327.429 ms

SELECT x.setfield2(f, 'company','test') FROM tmp_f;

Erwin1: 588.691 ms

SELECT x.setfield3(f, 'company','test') FROM tmp_f;
Acey answered 16/10, 2011 at 6:51 Comment(2)
@Geir: You edit has been rejected, but you can just post your own answer. Seems interesting enough ...Acey
Thanks, I posted a new solution along with a reproducible benchmark based on your code. Should we "merge" the benchmark posts somehow to make it easier to follow for new readers? Any suggestions are appreciated=)Registration

© 2022 - 2024 — McMap. All rights reserved.