aggregate function to concatenate strings in Vertica
Asked Answered
S

4

7

have a table in vertica: test like this:

ID     |   name
1      |    AA
2      |    AB
2      |    AC
3      |    AD
3      |    AE
3      |    AF

how could I use an aggregate function or how to write a query to get data like this (vertica syntax)?

ID    |  ag
1     |  AA
2     |  AB, AC
3     |  AD, AE, AF 
Salutary answered 23/9, 2016 at 19:16 Comment(0)
M
2

First, you'll need to compile the udx for agg_concatenate.

-- Shell commands
cd /opt/vertica/sdk/examples/AggregateFunctions/
g++ -D HAVE_LONG_INT_64 -I /opt/vertica/sdk/include -Wall -shared -Wno-unused-value -fPIC -o Concatenate.so Concatenate.cpp /opt/vertica/sdk/include/Vertica.cpp

-- vsql commands
CREATE LIBRARY AggregateFunctionsConcatenate AS '/opt/vertica/sdk/examples/AggregateFunctions/Concatenate.so';
CREATE AGGREGATE FUNCTION agg_concatenate AS LANGUAGE 'C++' NAME 'ConcatenateFactory' LIBRARY AggregateFunctionsConcatenate;

Then you can do a query like:

select id, rtrim(agg_concatenate(name || ', '),', ') ag
from mytable
group by 1
order by 1

Uses rtrim to get rid of the last ', '.

If you need the aggregate to be sorted a certain way, you may need to select/sort in an inline view or with first.

Movement answered 23/9, 2016 at 19:25 Comment(3)
Sorry for the late response. Is there a place where I can learn to add the functions generally? I mean for the command you provide I want to learn so that I can do for other uses next time. Thanks!Salutary
If you look in the /opt/vertica/sdk/examples directory, you'll find a lot of other functions. There is also a general make file and a few sql files that show you how to install and test them.Movement
Could agg_concatenate be used with this problem here? Where can you download it?Competition
C
1
SELECT id,
    MAX(DECODE(row_number, 1, a.name)) ||
    NVL(MAX(DECODE(row_number, 2, ',' || a.name)), '') ||
    NVL(MAX(DECODE(row_number, 3, ',' || a.name)), '') ||
    NVL(MAX(DECODE(row_number, 4, ',' || a.name)), '') ||
    NVL(MAX(DECODE(row_number, 5, ',' || a.name)), '') ||
    NVL(MAX(DECODE(row_number, 6, ',' || a.name)), '') ||
    NVL(MAX(DECODE(row_number, 7, ',' || a.name)), '') ||
    NVL(MAX(DECODE(row_number, 8, ',' || a.name)), '') ||
    NVL(MAX(DECODE(row_number, 9, ',' || a.name)), '') ||
    NVL(MAX(DECODE(row_number, 10, ',' || a.name)), '')||
    NVL(MAX(DECODE(row_number, 11, ',' || a.name)), '') ||
    NVL(MAX(DECODE(row_number, 12, ',' || a.name)), '') ag
FROM
    (SELECT id, name, ROW_NUMBER() OVER(PARTITION BY name ORDER BY id) row_number FROM test) a
GROUP BY a.id
ORDER BY a.id;
Chirrup answered 23/9, 2016 at 19:30 Comment(1)
Not the cleanest, but works for issues where I cannot install new code.Saxecoburggotha
M
1

The other method is to use GROUP_CONCAT from the strings package on github.

select id, group_concat(name) over (partition by id order by name) ag
from mytable

However there are some limitations with this method since analytical udx won't allow you to include other aggregates (and you'll have to inline it or use with to add more data to it).

Movement answered 23/9, 2016 at 19:34 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.