Concat GROUP BY in Vertica SQL
Asked Answered
J

4

5

I need to get a comma separated list of ids as a field for a messy third party api :s This is a simplified version of what I am trying to achieve.

| id | name |
|====|======|
| 01 | greg |
| 02 | paul |
| 03 | greg |
| 04 | greg |
| 05 | paul |

SELECT name, {some concentration function} AS ids
FROM table
GROUP BY name

Returning

| name | ids        |
|======|============|
| greg | 01, 03, 04 |
| paul | 02, 05     |

I know MySQL has the CONCAT_GROUP function and I was hoping to solve this problem without installing more functions because of the environment. Maybe I can solve this problem using an OVER statement?

Jaimie answered 10/1, 2014 at 20:6 Comment(2)
Vertica is a Postgres fork if I'm not mistaken. So maybe string_agg or array_agg is available there.Genarogendarme
@a_horse_with_no_name Unfortunately neither are.Wing
F
1

A really old question, but as an update, you can use listagg() function

id name
01 greg
02 paul
03 greg
04 greg
05 paul

SELECT name, listagg(id) AS ids FROM table GROUP BY name

That will return the desire output:

name ids
greg 01, 03, 04
paul 02, 05
Filamentary answered 1/9, 2023 at 10:17 Comment(0)
W
9

You'll have to use OVER() with NVL() (you'll have to extend the concatenation for more than 10 instances per name):

CREATE TABLE t1 (
  id int,
  name varchar(10)
);

INSERT INTO t1
SELECT 1 AS id, 'greg' AS name
UNION ALL
SELECT 2, 'paul'
UNION ALL
SELECT 3, 'greg'
UNION ALL
SELECT 4, 'greg'
UNION ALL
SELECT 5, 'paul';

COMMIT;

SELECT name,
    MAX(DECODE(row_number, 1, a.id)) ||
    NVL(MAX(DECODE(row_number, 2, ',' || a.id)), '') ||
    NVL(MAX(DECODE(row_number, 3, ',' || a.id)), '') ||
    NVL(MAX(DECODE(row_number, 4, ',' || a.id)), '') ||
    NVL(MAX(DECODE(row_number, 5, ',' || a.id)), '') ||
    NVL(MAX(DECODE(row_number, 6, ',' || a.id)), '') ||
    NVL(MAX(DECODE(row_number, 7, ',' || a.id)), '') ||
    NVL(MAX(DECODE(row_number, 8, ',' || a.id)), '') ||
    NVL(MAX(DECODE(row_number, 9, ',' || a.id)), '') ||
    NVL(MAX(DECODE(row_number, 10, ',' || a.id)), '') id
FROM
    (SELECT name, id, ROW_NUMBER() OVER(PARTITION BY name ORDER BY id) row_number FROM t1) a
GROUP BY a.name
ORDER BY a.name;

Result

 name |  id
------+-------
 greg | 1,3,4
 paul | 2,5
Wing answered 10/1, 2014 at 20:42 Comment(1)
This seems like a good real world solution. @Gregology, is there any chance that you need it to be an arbitrary length concatenation?Paresthesia
I
4

Have look at Concatenate UDAF in vertica examples which comes with vertica installation that's the mysql equivalent. you can just directly install it.

more /opt/vertica/sdk/examples/AggregateFunctions/Concatenate.cpp

-- Shell comppile
cd /opt/vertica/sdk/examples/AggregateFunctions/
g++ -D HAVE_LONG_INT_64 -I /opt/vertica/sdk/include -Wall -shared -Wno-unused-value \
-fPIC -o Concatenate.so Concatenate.cpp /opt/vertica/sdk/include/Vertica.cpp

-- Create LIBRARY
CREATE LIBRARY AggregateFunctionsConcatenate AS '/opt/vertica/sdk/examples/AggregateFunctions/Concatenate.so';
CREATE AGGREGATE FUNCTION agg_group_concat AS LANGUAGE 'C++' NAME 'ConcatenateFactory' LIBRARY AggregateFunctionsConcatenate;


in the Concatenate.cpp
replace : input_len*10
with : 65000

there is two place you have to replace this value in the code.

65000 is the max length you can get with varchar. and since vertica doesnt uses all of 65000 for the values smaller than 65000 character you are fine.

Irritate answered 1/10, 2014 at 12:21 Comment(4)
Concatenate does not have the ability to use a separator, as stated in: community.dev.hp.com/t5/Vertica-Forum/… as well as in Guillaume's answer, you must instead use github.com/vertica/Vertica-Extension-Packages/tree/master/… which is a 3rd part UDF that is capable of doing thisConcerning
@Concerning This works fine, just do rtrim(agg_group_concat(field || ', '),', '). The rtrim will remove the separator at the end and this method does not require analytical functions.Scrimpy
@Scrimpy Interesting workaround, definitely didn't realize you could append within the function call like that. You may also want to add that to the vertica community thread i posted since that has not been proposed there.Concerning
@Concerning OkScrimpy
R
2

The easiest on the long term is to use one of the official Vertica UDFs to be found on github at https://github.com/vertica/Vertica-Extension-Packages/tree/master/strings_package which provides a group_concat function. The installation procedure is to found in the README, and examples are even provided.

Ripieno answered 11/1, 2014 at 18:11 Comment(0)
F
1

A really old question, but as an update, you can use listagg() function

id name
01 greg
02 paul
03 greg
04 greg
05 paul

SELECT name, listagg(id) AS ids FROM table GROUP BY name

That will return the desire output:

name ids
greg 01, 03, 04
paul 02, 05
Filamentary answered 1/9, 2023 at 10:17 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.