Split column into multiple rows in Postgres
S

3

86

Suppose I have a table like this:

 subject        | flag
----------------+------
 this is a test |    2

subject is of type text, and flag is of type int. I would like to transform this table to something like this in Postgres:

 token | flag
-------+------
 this  |    2
 is    |    2
 a     |    2
 test  |    2

Is there an easy way to do this?

Sigismund answered 2/4, 2015 at 18:31 Comment(0)
K
130

Use a LATERAL join, with string_to_table() in Postgres 14 or later.
Minimal form:

SELECT token, flag
FROM   tbl, string_to_table(subject, ' ') token
WHERE  flag = 2;

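The comma in the FROM list is (almost) equivalent to CROSS JOIN, and LATERAL is assumed automatically for set-returning functions (SRF) in the FROM list. Why "almost"? An explicit JOIN binds more tightly than a comma, which matters once more joins are added to the FROM list.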

The alias "token" for the derived table is also assumed as the column alias for its single anonymous column, and the query relies on column names being distinct across all tables involved. Equivalent, more verbose and less error-prone:

SELECT s.token, t.flag
FROM   tbl t
CROSS  JOIN LATERAL string_to_table(t.subject, ' ') AS s(token)
WHERE  t.flag = 2;

Or move the SRF to the SELECT list, which is allowed in Postgres (but not in standard SQL), to (almost) the same effect:

SELECT string_to_table(subject, ' ') AS token, flag
FROM   tbl
WHERE  flag = 2;

The last form is acceptable because the behavior of SRFs in the SELECT list was sanitized in Postgres 10.

If string_to_table() does not return any rows (empty or null subject), the (implicit) join eliminates the row from the result. Use LEFT JOIN ... ON true to keep qualifying rows from tbl.
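
A minimal sketch of the LEFT JOIN variant just described, assuming the same tbl(subject, flag) table as in the question. A row whose subject is null or empty should then show up once, with a null token, instead of disappearing:

```sql
-- Keep every qualifying row of tbl, even when the function returns no rows.
SELECT s.token, t.flag
FROM   tbl t
LEFT   JOIN LATERAL string_to_table(t.subject, ' ') AS s(token) ON true
WHERE  t.flag = 2;
```

The ON true condition is required because LEFT JOIN needs a join condition, while the actual correlation is already expressed by the lateral reference to t.subject.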

We could also use regexp_split_to_table(), but that's slower. Regular expressions are powerful but expensive.
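
One case where the regular expression may be worth its cost is irregular whitespace. As a sketch, assuming subject can contain runs of spaces or tabs (the question does not say so), splitting on '\s+' avoids the empty-string tokens that a single-space delimiter would produce between consecutive spaces:

```sql
-- Split on one or more whitespace characters instead of exactly one space.
-- Assumes standard_conforming_strings = on (the default), so '\s+' needs no extra escaping.
SELECT s.token, t.flag
FROM   tbl t
CROSS  JOIN LATERAL regexp_split_to_table(t.subject, '\s+') AS s(token)
WHERE  t.flag = 2;
```

Leading or trailing whitespace would still yield an empty token, so apply btrim() to subject first if that can occur.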

In Postgres 13 or older use unnest(string_to_array(subject, ' ')) instead of string_to_table(subject, ' ').

Kaoliang answered 2/4, 2015 at 18:34 Comment(4)
I’m not very familiar with either the LATERAL join or with the unnest() function. How would you express this as a lateral join? – Voidance
@Manngo: This is a lateral join, just with short syntax. Verbose equivalent: SELECT * FROM tbl t CROSS JOIN LATERAL unnest(string_to_array(t.subject, ' ')) AS s(token); Ample explanation in the linked answers. – Kaoliang
Thanks. I always understood the … , … syntax to be a simple cross join. – Voidance
For functions, it's cross join lateral automatically. – Kaoliang
B
42

I don't think an explicit join is necessary. The unnest() function in conjunction with string_to_array() should do it:

SELECT unnest(string_to_array(subject, ' ')) as "token", flag FROM test;

 token | flag
-------+------
 this  |    2
 is    |    2
 a     |    2
 test  |    2
Brandwein answered 12/5, 2019 at 18:47 Comment(0)
D
1

Using the regexp_split_to_table() function with an implicit lateral join:

SELECT s.token, flag
FROM   tbl t, regexp_split_to_table(t.subject, ' ') s(token)
WHERE  flag = 2;

Refer to https://www.postgresql.org/docs/9.3/functions-string.html for the function details

Dorcia answered 18/10, 2021 at 12:47 Comment(0)
