NULL emements lost when casting result of unnest()
Asked Answered
C

1

6

I stumbled upon very odd behavior with unnest(), when casting after expanding an array.

Introduction

There are three basic syntax variants to use unnest():

1) SELECT unnest('{1,NULL,4}'::int[]) AS i;
2) SELECT i FROM unnest('{2,NULL,4}'::int[]) AS i;
3) SELECT i FROM (SELECT unnest('{3,NULL,4}'::int[])) AS t(i);

All of them include a row with NULL in the result as expected

 i
---
 1
(null)
 4

To cast the array elements to a different type, one can cast the elements to a basic type right after expanding the array, or cast the array itself to a different array type before expanding. The first variant seemed slightly simpler and shorter to me:

A) SELECT unnest('{4,NULL,1}'::int[])::text;
B) SELECT unnest('{4,NULL,2}'::int[]::text[]);

 i
---
 4
(null)
 1

Odd behaviour

All combinations possible except 2A)

For some reason one cannot combine 2) with A)

SELECT * FROM unnest('{2,NULL,1}'::int[])::text;

ERROR: syntax error at or near "::"

I can accept that. A rare corner case that has not been implemented for some reason.
All other combinations fly, though:

1A) SELECT unnest('{1,NULL,1}'::int[])::text AS i;
2A) SELECT i FROM unnest('{2,NULL,1}'::int[])::text AS i;
3A) SELECT i FROM (SELECT unnest('{3,NULL,1}'::int[])::text) AS t(i);
1B) SELECT unnest('{1,NULL,2}'::int[]::text[]) AS i;
2B) SELECT i FROM unnest('{2,NULL,2}'::int[]::text[]) AS i;
3B) SELECT i FROM (SELECT unnest('{3,NULL,2}'::int[]::text[])) AS t(i);

Same result as above.

Really odd behaviour

The following observations concern A) exclusively. One can avoid the problem by substituting with B).

As expected, we have seen the NULL element in the array resulting in a row with a NULL value in all queries so far. However, this is not the case when casting the results from some array types to some base types.

Here the row with the NULL value suddenly disappears (!):

SELECT unnest('{1,NULL,4}'::int[])::int8;

 i
---
 1
 4

Examples

I went to see how deep the rabbit hole goes. Here are some examples:

NULL disappears:

SELECT unnest('{1,NULL,1}'::int[])::int2;
SELECT unnest('{1,NULL,2}'::int[])::int8;
SELECT unnest('{1,NULL,3}'::int[])::real;
SELECT unnest('{1,NULL,4}'::int[])::float8;
SELECT unnest('{1,NULL,5}'::int[])::numeric;
SELECT unnest('{1,NULL,6}'::numeric[])::int2;
SELECT unnest('{1,NULL,7}'::numeric[])::int8;
SELECT unnest('{1,NULL,8}'::numeric[])::real;
SELECT unnest('{1,NULL,9}'::numeric[])::float8;
SELECT unnest('{1,NULL,a}'::text[])::char;
SELECT unnest('{1,NULL,b}'::text[])::char(1);
SELECT unnest('{1,NULL,c}'::text[])::varchar(10);      -- !!!
SELECT unnest('{1,NULL,d}'::varchar[])::varchar(10);   -- !!!
SELECT unnest('{2013-1-1,NULL,2013-1-1}'::date[])::timestamp;
SELECT unnest('{2013-1-1,NULL,2013-1-1}'::timestamp[])::date;
SELECT unnest('{23:11,NULL,23:11}'::time[])::interval;
SELECT unnest('{23:11,NULL,23:11}'::interval[])::time;

NULL stays:

SELECT unnest('{1,NULL,1}'::int[])::int4;    -- is really from int to int
SELECT unnest('{1,NULL,2}'::int[])::text;
SELECT unnest('{1,NULL,3}'::int8[])::text;
SELECT unnest('{1,NULL,4}'::numeric[])::text;
SELECT unnest('{1,NULL,5}'::text[])::int;
SELECT unnest('{1,NULL,6}'::text[])::int8;
SELECT unnest('{1,NULL,7}'::text[])::numeric;
SELECT unnest('{1,NULL,8}'::text[])::varchar;    -- !!!
SELECT unnest('{1,NULL,9}'::varchar[])::text;    -- !!!
SELECT unnest('{2013-1-1,NULL,2013-1-1}'::date[])::text;
SELECT unnest('{2013-1-1,NULL,2013-1-1}'::text[])::date;
SELECT unnest('{23:11,NULL,23:11}'::time[])::text;
SELECT unnest('{23:11,NULL,23:11}'::text[])::time;

This seems unacceptable.

After testing quite a few combinations, the pattern seems to be:

Cast between related types results in NULL elements being lost.
Cast between unrelated types results in NULL elements being preserved.
Except that varchar[] -> text and vice versa trashes this little hypothesis of mine. Or varchar and text differ more than I thought.

Tested with PostgreSQL 9.1 and 9.2. Identical results.
-> SQLfiddle

Questions

Am I missing something here? Can someone explain this behaviour?
If not, the question becomes: Should I go ahead an file a bug report?

Cypher answered 9/5, 2013 at 0:38 Comment(0)
A
4

Casting SRF function (in FROM clause) is not supported - you cannot use any operator there. Only function call is allowed.

a cast is possible only in column list:

postgres=# SELECT * FROM unnest('{2,NULL,1}'::int[])::text;
ERROR:  syntax error at or near "::"
LINE 1: SELECT * FROM unnest('{2,NULL,1}'::int[])::text;
                                                 ^
postgres=# SELECT v::text FROM unnest('{2,NULL,1}'::int[]) g(v);
   v    
────────
      2
 [null]
      1
(3 rows)

Missing row from NULL is probably bug and should be reported

postgres=# SELECT unnest('{1,NULL,4}'::int[])::text;
 unnest 
────────
      1
 [null]
      4
(3 rows)

postgres=# SELECT unnest('{1,NULL,4}'::int[])::numeric;
 unnest 
────────
      1
      4
(2 rows)

There is not reason, why NULL rows should be dropped, I think

Adlee answered 9/5, 2013 at 3:37 Comment(5)
Thanks, Pavel. I'll file a bug report for the second part.Cypher
I reported the bug and Tom Lane immediately found and fixed the problem. Postgres 9.3 will not exhibit this bug, but they did not back-patch so not to break old code.Cypher
This bug appears to have reemerged in 12.6! I am using unnest to insert XML data into a table using CREATE TABLE tbl AS SELECT unnest(xpath('//Record/@symbol', xml_string)::varchar[24][]) as symbol; @ErwinBrandstetterEzra
@Kostas - can you send example?Adlee
@PavelStehule Thanks! While trying to make a reproducible example, I found out that the issue is not with unnest but with xpath. It is discussed here: #50123013Ezra

© 2022 - 2024 — McMap. All rights reserved.