Is there any way to find table creation date in redshift?
Asked Answered
E

5

21

I am having trouble with finding table creation date in Amazon Redshift. I know svv_table_info will give all the info about the table but the creation date.Can anyone help please?

Epicarp answered 25/4, 2016 at 17:29 Comment(0)
E
17

In Redshift the other ways you can get the create time of your table by searching for the start and stop time of any create table sql run in the svl_qlog. There are other tables you can look at to get similar data but the problem with this way is that it's only kept for a couple of days (3 - 5). Although everyone would like metadata stored along with the table itself to query. Amazon recommends to keep this data to export the data to S3 from the logs you want to retain to S3. Then in my opinion you could import these s3 files back into a permanent table you want called aws_table_history or something so that this special data you keep forever.

select * from svl_qlog where substring ilike 'create table%' order by starttime desc limit 100;

select * from stl_query a, stl_querytext b where a.query = b.query and b.text ilike 'create table%' order by a.starttime desc limit 100; 

Or get just the Table name and date like this:

select split_part(split_part(b.text,'table ', 2), ' ', 1) as tablename, 
starttime as createdate 
from stl_query a, stl_querytext b 
where a.query = b.query and b.text ilike 'create table%' order by a.starttime desc;

Export the Create Table data history you want to your created S3 bucket with your keys. The below select statement will output the table name created and the datetime it was created.

Create a temp table with the data you want to export to S3.

create table temp_history as 
(select split_part(split_part(b.text,'table ', 2), ' ', 1) as tablename, starttime as createdate 
from stl_query a, stl_querytext b 
where a.query = b.query 
and b.text ilike 'create table%' order by a.starttime desc);

Then upload this table to S3.

unload ('select * from temp_history') 
to 's3://tablehistory' credentials 'aws_access_key_id=myaccesskey;aws_secret_access_key=mysecretkey' 
DELIMITER '|' NULL AS '' ESCAPE ALLOWOVERWRITE;

Create a new table in AWS Redshift.

CREATE TABLE aws_table_history
(
tablename VARCHAR(150),
createdate DATETIME
);

Then import it back in to your custom table.

copy aws_table_history from 's3://tablehistory' credentials 'aws_access_key_id=MYKEY;aws_secret_access_key=MYID'
emptyasnull
blanksasnull
removequotes
escape
dateformat 'YYYY-MM-DD'
timeformat 'YYYY-MM-DD HH:MI:SS'
maxerror 20;
delimiter '|';

I tested all this and it works for us. I hope this helps some people. Lastly a simpler method would be to use Talend Big Data Open Studio and create a new job grab the component tRedshiftRow and paste the following SQL into it. Then build the job and you can schedule to run the .bat (windows) or .sh (unix) in any environment you want.

INSERT INTO temp_history 
(select split_part(split_part(b.text,'table ', 2), ' ', 1) as tablename, starttime as createdate 
from stl_query a, stl_querytext b 
where a.query = b.query 
and b.text ilike 'create table%' order by a.starttime desc);
COMMIT;
insert into historytable
select distinct s.* 
from temp_history s;
COMMIT;
--remove  duplicates 
DELETE FROM historytable USING historytable a2 
WHERE historytable.tablename = a2.tablename AND
historytable.createdate < a2.createdate;
COMMIT;
---clear everything from prestage
TRUNCATE temp_history;
COMMIT;
Educational answered 18/7, 2016 at 12:30 Comment(1)
you could create a Talend Big Data job and schedule it's BAT or SH job to run in the environment of your choosing. Create a tRedShiftRow component and paste the following SQL in and it would help retain create table permanently.Educational
S
30

There is a proper way to get table creation date and time in Redshift, that is not based on query log:

SELECT
TRIM(nspname) AS schema_name,
TRIM(relname) AS table_name,
relcreationtime AS creation_time
FROM pg_class_info
LEFT JOIN pg_namespace ON pg_class_info.relnamespace = pg_namespace.oid
WHERE reltype != 0
AND TRIM(nspname) = 'my_schema';

For some reason it does not work for very old tables. The oldest date I could find on a cluster of mine was in November 2018. Maybe the creation date of tables was not recorded in pg_class_info before that date.

Scriptorium answered 24/7, 2020 at 14:10 Comment(2)
pg_class_info was launched as a Redshift-specific addition on top of the standard PostgreSQL system tables/views. It's possible that objects created prior to the public launch of pg_class_info by Redshift will not list relcreationtime values within the view, because creation times weren't being captured at that point in time.Nonstandard
Does this work for views, it looks like mine all have an empty creation_time for the views?Pachyderm
E
17

In Redshift the other ways you can get the create time of your table by searching for the start and stop time of any create table sql run in the svl_qlog. There are other tables you can look at to get similar data but the problem with this way is that it's only kept for a couple of days (3 - 5). Although everyone would like metadata stored along with the table itself to query. Amazon recommends to keep this data to export the data to S3 from the logs you want to retain to S3. Then in my opinion you could import these s3 files back into a permanent table you want called aws_table_history or something so that this special data you keep forever.

select * from svl_qlog where substring ilike 'create table%' order by starttime desc limit 100;

select * from stl_query a, stl_querytext b where a.query = b.query and b.text ilike 'create table%' order by a.starttime desc limit 100; 

Or get just the Table name and date like this:

select split_part(split_part(b.text,'table ', 2), ' ', 1) as tablename, 
starttime as createdate 
from stl_query a, stl_querytext b 
where a.query = b.query and b.text ilike 'create table%' order by a.starttime desc;

Export the Create Table data history you want to your created S3 bucket with your keys. The below select statement will output the table name created and the datetime it was created.

Create a temp table with the data you want to export to S3.

create table temp_history as 
(select split_part(split_part(b.text,'table ', 2), ' ', 1) as tablename, starttime as createdate 
from stl_query a, stl_querytext b 
where a.query = b.query 
and b.text ilike 'create table%' order by a.starttime desc);

Then upload this table to S3.

unload ('select * from temp_history') 
to 's3://tablehistory' credentials 'aws_access_key_id=myaccesskey;aws_secret_access_key=mysecretkey' 
DELIMITER '|' NULL AS '' ESCAPE ALLOWOVERWRITE;

Create a new table in AWS Redshift.

CREATE TABLE aws_table_history
(
tablename VARCHAR(150),
createdate DATETIME
);

Then import it back in to your custom table.

copy aws_table_history from 's3://tablehistory' credentials 'aws_access_key_id=MYKEY;aws_secret_access_key=MYID'
emptyasnull
blanksasnull
removequotes
escape
dateformat 'YYYY-MM-DD'
timeformat 'YYYY-MM-DD HH:MI:SS'
maxerror 20;
delimiter '|';

I tested all this and it works for us. I hope this helps some people. Lastly a simpler method would be to use Talend Big Data Open Studio and create a new job grab the component tRedshiftRow and paste the following SQL into it. Then build the job and you can schedule to run the .bat (windows) or .sh (unix) in any environment you want.

INSERT INTO temp_history 
(select split_part(split_part(b.text,'table ', 2), ' ', 1) as tablename, starttime as createdate 
from stl_query a, stl_querytext b 
where a.query = b.query 
and b.text ilike 'create table%' order by a.starttime desc);
COMMIT;
insert into historytable
select distinct s.* 
from temp_history s;
COMMIT;
--remove  duplicates 
DELETE FROM historytable USING historytable a2 
WHERE historytable.tablename = a2.tablename AND
historytable.createdate < a2.createdate;
COMMIT;
---clear everything from prestage
TRUNCATE temp_history;
COMMIT;
Educational answered 18/7, 2016 at 12:30 Comment(1)
you could create a Talend Big Data job and schedule it's BAT or SH job to run in the environment of your choosing. Create a tRedShiftRow component and paste the following SQL in and it would help retain create table permanently.Educational
C
6

It looks like no way to get the creation timestamp of tables in Redshift. One workaround is using STL_DDLTEXT table which records a history of DDLs including CREATE TABLE.

Here is an example (test_table is a table name):

dev=> select starttime, endtime, trim(text) as ddl from stl_ddltext where text ilike '%create%table%test_table%' order by endtime desc limit 1;
         starttime          |          endtime           |                                                               ddl
----------------------------+----------------------------+----------------------------------------------------------------------------------------------------------------------------------
 2016-04-25 05:38:11.666338 | 2016-04-25 05:38:11.674947 | CREATE TABLE "test_table" (id int primary key, value varchar(24));
(1 row)

In the above case, starttime or endtime will be a timestamp of the test_table table creation.

NOTE:

  • Redshift does not keep STL_DDLTEXT for a long time, so you cannot use this way permanently.
  • You cannot use this way if the table is created through other ways like renaming a table name.
Comeau answered 25/4, 2016 at 19:34 Comment(2)
Thanks for your suggestion. However, redshift stores information in stl.* tables not for a long time (3 to 5 days at max). So it is unlikely that one could get creation timestamp of all the tables from that. Though we can dump these tables into another permeant table on a daily/weekly basis and hence ensure that we have this info at our will anytime. I was looking for something more concrete like one mentioned here.. https://mcmap.net/q/81158/-how-to-find-table-creation-time..But I couldn't find an alternative for pg_ls_dir in redshift.Epicarp
You are right. Redshift does not store stl_* data in a long time, so this was can be used in a limited situation.Comeau
C
2

There is a very simple way to do so:

SELECT relname, relcreationtime 
FROM pg_class_info 
WHERE relname = 'your_table_name';
Conjunctive answered 16/7, 2022 at 10:44 Comment(0)
M
0

you can use svv_table_info table for that:

SELECT database, schema, "table", create_time FROM SVV_TABLE_INFO
Methylnaphthalene answered 31/7 at 12:16 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.