generate_series function in Amazon Redshift
G

5

9

I tried the below:

SELECT * FROM generate_series(2,4);
generate_series
-----------------
           2
           3
           4
(3 rows)

SELECT * FROM generate_series(5,1,-2);                                                             
generate_series
-----------------
           5
           3
           1
(3 rows)

But when I try,

select * from generate_series('2011-12-31'::timestamp, '2012-12-31'::timestamp, '1 day');

It generated an error:

ERROR:  function generate_series(timestamp without time zone, timestamp without time zone, "unknown") does not exist
HINT:  No function matches the given name and argument types. You may need to add explicit type casts.

I use PostgreSQL 8.0.2 on Redshift 1.0.757.
Any idea why it happens?

UPDATE:

generate_series is working with Redshift now.

SELECT CURRENT_DATE::TIMESTAMP  - (i * interval '1 day') as date_datetime 
FROM generate_series(1,31) i 
ORDER BY 1

This generates the dates of the previous 31 days.

Goods answered 21/3, 2014 at 9:7 Comment(2)
For a workaround see: https://mcmap.net/q/394340/-generate_series-method-fails-in-redshift - Mellins
Because you report that this now works on Redshift, and Redshift is not versioned, so the older version can never be used by anyone, I can't see why this question should stay open. I'm voting to close. - Harney
G
6

generate_series is working with Redshift now.

SELECT CURRENT_DATE::TIMESTAMP  - (i * interval '1 day') as date_datetime 
FROM generate_series(1,31) i 
ORDER BY 1

This generates the dates of the previous 31 days.

Goods answered 5/6, 2018 at 13:29 Comment(1)
It seems that the generate_series() function is only supported on the leader node. A select from generate_series() works, but if I try to insert the result into a table, it fails. Operations on user-defined tables run on the compute nodes. - Lionfish
S
20

The version of generate_series() that supports dates and timestamps was added in Postgres 8.4.

As Redshift is based on Postgres 8.0, you need to use a different way:

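-- i starts at 1, so the first row is 2012-01-01; use generate_series(0, ...) to include 2011-12-31
select timestamp '2011-12-31 00:00:00' + (i * interval '1 day')
from  generate_series(1, (date '2012-12-31' - date '2011-12-31')) i;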

If you "only" need dates, this can be abbreviated to:

select date '2011-12-31' + i
from  generate_series(1, (date '2012-12-31' - date '2011-12-31')) i;
Silvana answered 21/3, 2014 at 9:17 Comment(6)
Your first script gives the same output, 2011-12-31 00:00:00, 366 times. - Goods
@dhanishjose: Interesting. It seems that older Postgres versions don't support the ANSI-style interval notation. See my edit. - Silvana
Works great for me on Redshift. - Local
Are you able to insert the result set from generate_series into a table? I get the following error when I try to save it to a table or use it in a join condition: [Amazon](500310) Invalid operation: Specified types or functions (one per INFO message) not supported on Redshift tables - Further
@Further This is because the generate_series() function is only available on the leader node. Hence you cannot use it when creating a temporary or permanent table. - Lithograph
@BahadirUyarer What does that mean? How can I join this generate_series() query to a table or a subquery? - Kevinkevina
L
3

I found a solution here for my problem of not being able to generate a time dimension table on Redshift using generate_series(). You can generate a temporary sequence by using the following SQL snippet.

with digit as (
    select 0 as d union all 
    select 1 union all select 2 union all select 3 union all
    select 4 union all select 5 union all select 6 union all
    select 7 union all select 8 union all select 9        
),
seq as (
    select a.d + (10 * b.d) + (100 * c.d) + (1000 * d.d) as num
    from digit a
        cross join
        digit b
        cross join
        digit c
        cross join
        digit d
    order by 1        
)
select (getdate()::date - seq.num)::date as "Date"
from seq;

The generate_series() function, it seems, is not yet fully supported on Redshift. The SQL in the answer by DJo works because it runs only on the leader node. If I prepend insert into dim_time to the same SQL, it fails.
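
As a rough sketch of how the digit-based sequence above could be written to a table, the same CTEs can feed a CREATE TABLE AS statement, since nothing in them relies on a leader-node-only function. The table name dim_time comes from the text above; the column name calendar_date and the 365-row cutoff are assumptions made for illustration.

```sql
-- Sketch: build a small date dimension without generate_series().
-- dim_time is the table named above; calendar_date and the 365-day
-- window are illustrative assumptions.
create table dim_time as
with digit as (
    select 0 as d union all
    select 1 union all select 2 union all select 3 union all
    select 4 union all select 5 union all select 6 union all
    select 7 union all select 8 union all select 9
),
seq as (
    -- three digit positions give the integers 0..999
    select a.d + (10 * b.d) + (100 * c.d) as num
    from digit a
        cross join digit b
        cross join digit c
)
select (getdate()::date - seq.num)::date as calendar_date
from seq
where seq.num < 365;  -- keep today and the 364 days before it
```

If dim_time already exists, the analogous form would be an INSERT INTO dim_time followed by the same WITH ... SELECT query.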

Lionfish answered 26/9, 2019 at 10:26 Comment(0)
H
0

There is no generate_series() function in Redshift for date ranges, but you can generate the series with the steps below.

Step 1: Create a table genid and insert the constant value 1 once for each element of the series you need. If you need the series to cover 12 months, insert 12 rows. It is safer to insert more rows than you need, such as 100, so that you do not run short.

create table genid(id int);

-- repeat once for each month in the series
insert into genid values(1);
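
One way to load several rows at once, assuming a multi-row VALUES list is acceptable for your workload, is a single INSERT statement. The count of 12 simply mirrors the 12-month example in Step 1.

```sql
-- Sketch: twelve rows of the constant 1 in one statement
insert into genid values
    (1), (1), (1), (1), (1), (1),
    (1), (1), (1), (1), (1), (1);
```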

Step 2: The table for which you need to generate the series.

create table pat(patid varchar(10),stdt timestamp, enddt timestamp);

insert into pat values('Pat01','2018-03-30 00:00:00.0','2018-04-30 00:00:00.0');

insert into pat values('Pat02','2018-02-28 00:00:00.0','2018-04-30 00:00:00.0');

insert into pat values('Pat03','2017-10-28 00:00:00.0','2018-04-30 00:00:00.0');

Step 3: This query will generate the series for you.

with cte as (
    select max(enddt) as maxdt
    from pat
),
cte2 as (
    select dateadd('month', -1 * row_number() over(order by 1), maxdt::date) as gendt
    from genid, cte
)
select *
from pat, cte2
where gendt between stdt and enddt;
Humic answered 9/5, 2018 at 7:23 Comment(0)
P
0

The version of generate_series() that supports dates and timestamps was added in Postgres 8.4. In addition, the generate_series() function in AWS Redshift can currently be used only in SELECT statements. So my approach is as follows:

WITH RECURSIVE date_series(d) AS (
  SELECT '2023-01-01'::timestamp -- Replace with your desired start date
  UNION ALL
  SELECT DATE_TRUNC('month', d + INTERVAL '31 day') FROM date_series WHERE d < '2023-08-01'::timestamp -- Replace with your desired end date
)
SELECT d FROM date_series;
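
The same recursive pattern can step by day instead of by month. The following is a minimal sketch for the date range in the original question, assuming a Redshift release that supports WITH RECURSIVE. The CTE name day_series is made up for illustration.

```sql
-- Sketch: one row per day from 2011-12-31 through 2012-12-31 inclusive
WITH RECURSIVE day_series(d) AS (
  SELECT '2011-12-31'::date          -- anchor row: first day of the range
  UNION ALL
  SELECT (d + 1)::date               -- date + integer adds whole days
  FROM day_series
  WHERE d < '2012-12-31'::date       -- stop once the last day has been emitted
)
SELECT d FROM day_series ORDER BY d;
```

Because 2012 is a leap year, this range covers 367 days including both endpoints.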
Prier answered 5/8, 2023 at 3:58 Comment(0)
