Use PostgreSQL plpython3u function to return a table
Asked Answered
E

1

3

I wanna return table. The function get an array(the query is 'select function_name(array_agg(column_name)) from table_name')

I coded below:

create type pddesctype as(
    count float,
    mean float,
    std float,
    min float
);

create function pddesc(x numeric[])
returns pddesctype
as $$
    import pandas as pd
    data=pd.Series(x)
    
    count=data.describe()[0]
    mean=data.describe()[1]
    std=data.describe()[2]
    min=data.describe()[3]
    
    return count, mean, std, min

$$ language plpython3u;

This code results only array on one column. (float, float, float...)

I tried

create function pddesc(x numeric[])
returns table(count float, mean float, std float, min float)
as $$
    import pandas as pd
    data=pd.Series(x)
    
    count=data.describe()[0]
    mean=data.describe()[1]
    std=data.describe()[2]
    min=data.describe()[3]
    
    return count, mean, std, min

$$ language plpython3u;

But there is an error:

ERROR:  key "count" not found in mapping
HINT:  To return null in a column, add the value None to the mapping with the key named after the column.
CONTEXT:  while creating return value.

I want to show the result in columns (like a table) without creating type in advance.

How to change the RETURN / RETURNS syntax?

Eolande answered 19/6, 2019 at 1:29 Comment(5)
if you want to return a table, then use returns table (...)Pensionary
should i create table that a return value updated?Eolande
A function that returns a table must be used like a table select * from pddesc(...)Pensionary
I need to get data from another table, so I tried select * from (select pddesc(array_agg(column_name)) from table_name) as result. However, I get array data on one column. Do you know another way to get table?Eolande
I deal with this question. I used plpy query in the python code. thank youEolande
L
1

Here are the steps I tried to get a table of one row with four columns as the Output. The last step has the solution, the first is another way to reproduce your error.

Check np.array

create or replace function pddesc(x numeric[])
returns table(count float, mean float, std float, min float)
as $$
    import pandas as pd
    import numpy as np
    data=pd.Series(x)

    count=data.describe()[0]
    mean=data.describe()[1]
    std=data.describe()[2]
    min=data.describe()[3]
    
    ## print an INFO of the output:
    plpy.info(np.array([count, mean, std, min]))

    return np.array([count, mean, std, min])

$$ language plpython3u;

Test fails (with the error of the question reproduced):

postgres=# SELECT * FROM pddesc(ARRAY[1,2,3]);
INFO:  [3 3 Decimal('1') 1]
ERROR:  key "count" not found in mapping
HINT:  To return null in a column, add the value None to the mapping with the key named after the column.
CONTEXT:  while creating return value
PL/Python function "pddesc"

Working solution: np.array([...]).reshape(1,-1)

You need to reshape the array so that it is of the dimension that you want to get. In this case, it is of dim (1 row x 4 cols), and the .reshape(1,-1) means 1 row and -1 (= whatever needed) cols

create or replace function pddesc(x numeric[])
returns table(count float, mean float, std float, min float)
as $$
    import pandas as pd
    import numpy as np
    data=pd.Series(x)

    count=data.describe()[0]
    mean=data.describe()[1]
    std=data.describe()[2]
    min=data.describe()[3]

    ## print an INFO of the output:
    plpy.info(np.array([count, mean, std, min]).reshape(1,-1))

    return np.array([count, mean, std, min]).reshape(1,-1)
    ## or with the same result:
    # return np.hstack((count, mean, std, min)).reshape(1,-1)

$$ language plpython3u;

Test:

postgres=# SELECT * FROM pddesc(ARRAY[1,2,3]);
INFO:  [[3 3 Decimal('1') 1]]
 count | mean | std | min
-------+------+-----+-----
     3 |    3 |   1 |   1
(1 row)
Loss answered 4/9, 2021 at 17:4 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.