How do I get table and columns information from Redshift?
Asked Answered
E

4

27

pg_tables provides a list of tables. Is there a pg_columns or its equivalent to provide the list of columns?

In DB2, I would query sysibm.systables/columns to get such information. What is the equivalent in redshift?

Equanimous answered 22/1, 2014 at 3:9 Comment(0)
I
7

Pg_table_def can provide some useful information, but it doesn't tell you column order, default, or character field sizes. Here's a query that can show you all that (note that I've updated this query since the original post and it now includes column encoding, diststyle/distkey, sortkey, and primary key as well as printing out the statement that shows the table owner):

select pk.pkey, tm.schemaname||'.'||tm.tablename, 'create table '||tm.schemaname||'.'||tm.tablename
  ||' ('
  ||cp.coldef
  -- primary key
  ||decode(pk.pkey,null,'',pk.pkey)
  -- diststyle and dist key
  ||decode(d.distkey,null,') diststyle '||dist_style||' ',d.distkey)
  --sort key 
  || (select decode(skey,null,'',skey) from  (select 
    ' sortkey(' ||substr(array_to_string(
                     array( select ','||cast(column_name as varchar(100))  as str from
                           (select column_name from information_schema.columns col where  col.table_schema= tm.schemaname and col.table_name=tm.tablename) c2
                            join 
                            (-- gives sort cols
                              select attrelid as tableid, attname as colname, attsortkeyord as sort_col_order from pg_attribute pa where 
                              pa.attnum > 0  AND NOT pa.attisdropped AND pa.attsortkeyord > 0
                            ) st on tm.tableid=st.tableid and c2.column_name=st.colname   order by sort_col_order
                          )
                    ,'')
                  ,2,10000) || ')' as skey
   ))
  ||';'
  -- additional alter table queries here to set owner
  || 'alter table '||tm.schemaname||'.'||tm.tablename||' owner to "'||tm.owner||'";'   
   from 
-- t  master table list
(
SELECT substring(n.nspname,1,100) as schemaname, substring(c.relname,1,100) as tablename, c.oid as tableid ,use2.usename as owner, decode(c.reldiststyle,0,'EVEN',1,'KEY',8,'ALL') as dist_style
FROM pg_namespace n, pg_class c,  pg_user use2 
WHERE n.oid = c.relnamespace 
  AND nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
  AND c.relname <> 'temp_staging_tables_1'
  and c.relowner = use2.usesysid
) tm 
-- cp  creates the col params for the create string
join
(select 
  substr(str,(charindex('QQQ',str)+3),(charindex('ZZZ',str))-(charindex('QQQ',str)+3)) as tableid
  ,substr(replace(replace(str,'ZZZ',''),'QQQ'||substr(str,(charindex('QQQ',str)+3),(charindex('ZZZ',str))-(charindex('QQQ',str)+3)),''),2,10000) as coldef
 from
 ( select array_to_string(array(
  SELECT  'QQQ'||cast(t.tableid as varchar(10))||'ZZZ'|| ','||column_name||' '|| decode(udt_name,'bpchar','char',udt_name) || decode(character_maximum_length,null,'', '('||cast(character_maximum_length as varchar(9))||')'   )
  -- default
  || decode(substr(column_default,2,8),'identity','',null,'',' default '||column_default||' ')
  -- nullable
  || decode(is_nullable,'YES',' NULL ','NO',' NOT NULL ') 
  -- identity 
  || decode(substr(column_default,2,8),'identity',' identity('||substr(column_default,(charindex('''',column_default)+1), (length(column_default)-charindex('''',reverse(column_default))-charindex('''',column_default)   ) )  ||') ', '')
  -- encoding
  || decode(enc,'none','',' encode '||enc)
   as str 
   from  
  -- ci  all the col info
  (
    select cast(t.tableid as int), cast(table_schema as varchar(100)), cast(table_name as varchar(100)), cast(column_name as varchar(100)), 
    cast(ordinal_position as int), cast(column_default as varchar(100)), cast(is_nullable as varchar(20)) , cast(udt_name as varchar(50))  ,cast(character_maximum_length as int),
     sort_col_order  , decode(d.colname,null,0,1) dist_key , e.enc
    from 
    (select * from information_schema.columns c where  c.table_schema= t.schemaname and c.table_name=t.tablename) c
    left join 
    (-- gives sort cols
    select attrelid as tableid, attname as colname, attsortkeyord as sort_col_order from pg_attribute a where 
     a.attnum > 0  AND NOT a.attisdropped AND a.attsortkeyord > 0
    ) s on t.tableid=s.tableid and c.column_name=s.colname
    left join 
    (-- gives encoding
    select attrelid as tableid, attname as colname, format_encoding(a.attencodingtype::integer) AS enc from pg_attribute a where 
     a.attnum > 0  AND NOT a.attisdropped 
    ) e on t.tableid=e.tableid and c.column_name=e.colname
    left join 
    -- gives dist col
    (select attrelid as tableid, attname as colname from pg_attribute a where
     a.attnum > 0 AND NOT a.attisdropped  AND a.attisdistkey = 't'
    ) d on t.tableid=d.tableid and c.column_name=d.colname
    order by ordinal_position
  ) ci 
  -- for the working array funct
  ), '') as str
 from 
 (-- need tableid
 SELECT substring(n.nspname,1,100) as schemaname, substring(c.relname,1,100) as tablename, c.oid as tableid 
 FROM pg_namespace n, pg_class c
 WHERE n.oid = c.relnamespace 
   AND nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
 ) t 
)) cp on tm.tableid=cp.tableid
-- primary key query here
left join 
 (select c.oid as tableid, ', primary key '|| substring(pg_get_indexdef(indexrelid),charindex('(',pg_get_indexdef(indexrelid))-1 ,60) as pkey
  from pg_index i , pg_namespace n, pg_class c 
  where i.indisprimary=true 
  and i.indrelid =c.oid
  and n.oid = c.relnamespace
 )  pk on tm.tableid=pk.tableid
-- dist key
left join
(  select 
  -- close off the col defs after the primary key 
  ')' ||
  ' distkey('|| cast(column_name as varchar(100)) ||')'  as distkey, t.tableid
  from information_schema.columns c
  join 
  (-- need tableid
  SELECT substring(n.nspname,1,100) as schemaname, substring(c.relname,1,100) as tablename, c.oid as tableid 
  FROM pg_namespace n, pg_class c
  WHERE n.oid = c.relnamespace 
    AND nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
  ) t on c.table_schema= t.schemaname and c.table_name=t.tablename
  join 
  -- gives dist col
  (select attrelid as tableid, attname as colname from pg_attribute a where
   a.attnum > 0 AND NOT a.attisdropped  AND a.attisdistkey = 't'
  ) d on t.tableid=d.tableid and c.column_name=d.colname

) d on tm.tableid=d.tableid 
where tm.schemaname||'.'||tm.tablename='myschema.mytable'
Impregnate answered 30/1, 2014 at 2:19 Comment(3)
see alberton.info/postgresql_meta_info.html for a bunch of simpler ways to extract column info, including orderPrime
For a simpler and cleaner way, check the answer here: #21273939Prime
A few months ago, AWS published some views to assist with some of the more esoteric components of Redshift's internal catalog. I'd recommend this one as the definitive answer to obtaining table DDLs: github.com/awslabs/amazon-redshift-utils/blob/master/src/…Impregnate
K
25

Use PG_TABLE_DEF table for getting that information:

It looks like this:

select * from pg_table_def where tablename = 't2';
schemaname|tablename|column|  type   | encoding | distkey |sortkey| notnull 
----------+---------+------+---------+----------+---------+-------+---------
public    | t2      | c1   | bigint  | none     | t       |     0 | f
public    | t2      | c2   | integer | mostly16 | f       |     0 | f
public    | t2      | c3   | integer | none     | f       |     1 | t
public    | t2      | c4   | integer | none     | f       |     2 | f
(4 rows)
Knighterrantry answered 22/1, 2014 at 9:19 Comment(5)
is that returned in column order ??Prime
works only if the user has access to the tables. Does not list all tables in the databaseEulau
make sure the schema you're looking for is in your search path.Graff
What if i have a schema for a table? Doesn't works if provided a schema.Longlimbed
to support @Graff comment, here is the command to run SET search_path TO my_schema. In my case @Tomasz query returned me an empty result because the table I was looking for was not in public. After running this command the query showed me what I wantedInspirit
P
13

From http://www.postgresonline.com/journal/archives/20-The-Anatomy-of-PostgreSQL-Part-2-Database-Objects.html :

The information_schema is a very important schema and is part of the ANSI standard, but is not quite so standard. It would be nice if all relational databases supported it, but they don't all do - MySQL 5, SQL Server (2000+), and PostgreSQL (7.4+) support them. Oracle and DB2 evidentally still don't, but there is hope. For the DBMS that support the information_schema, there are varying levels, but in all you can be pretty much assured to find tables, views, columns with same named fields that contain the full listings of all the tables in a database, listings of views and view definition DDL and all the columns, sizes of columns and datatypes.

The pg_catalog schema is the standard PostgreSQL meta data and core schema. You will find pre-defined global postgres functions in here as well as useful meta data about your database that is very specific to postgres. This is the schema used by postgres to manage things internally. A lot of this information overlaps with information found in the information_schema, but for data present in the information_schema, the information_schema is much easier to query and requires fewer or no joins to arrive at basic information.

So for basic queries, you're better off with information_schema.

This page (http://www.alberton.info/postgresql_meta_info.html) shows a beautiful list of queries to get information about your schema. Here is what is relevant to this question:

SELECT a.attname
  FROM pg_class c, pg_attribute a, pg_type t
 WHERE c.relname = 'test2'
   AND a.attnum > 0
   AND a.attrelid = c.oid
  AND a.atttypid = t.oid

-- with INFORMATION_SCHEMA:

SELECT column_name
  FROM information_schema.columns
 WHERE table_name = 'test2';

For detailed infos, there is:

SELECT a.attnum AS ordinal_position,
         a.attname AS column_name,
         t.typname AS data_type,
         a.attlen AS character_maximum_length,
         a.atttypmod AS modifier,
         a.attnotnull AS notnull,
         a.atthasdef AS hasdefault
    FROM pg_class c,
         pg_attribute a,
         pg_type t
   WHERE c.relname = 'test2'
     AND a.attnum > 0
     AND a.attrelid = c.oid
     AND a.atttypid = t.oid
ORDER BY a.attnum;

-- with INFORMATION_SCHEMA:

  SELECT ordinal_position,
         column_name,
         data_type,
         column_default,
         is_nullable,
         character_maximum_length,
         numeric_precision
    FROM information_schema.columns
   WHERE table_name = 'test2'
ORDER BY ordinal_position;
Prime answered 21/4, 2015 at 14:52 Comment(1)
If your use case is to create a new table or to use SET operations between the table column metadata and another table, you must use pg_class, pg_attribute, and pg_type. Otherwise, you will get Invalid operation: Specified types or functions (one per INFO message) not supported on Redshift tables. when you attempt to do either of these operations on information_schema or the pg_table_def tables.Microseism
I
7

Pg_table_def can provide some useful information, but it doesn't tell you column order, default, or character field sizes. Here's a query that can show you all that (note that I've updated this query since the original post and it now includes column encoding, diststyle/distkey, sortkey, and primary key as well as printing out the statement that shows the table owner):

select pk.pkey, tm.schemaname||'.'||tm.tablename, 'create table '||tm.schemaname||'.'||tm.tablename
  ||' ('
  ||cp.coldef
  -- primary key
  ||decode(pk.pkey,null,'',pk.pkey)
  -- diststyle and dist key
  ||decode(d.distkey,null,') diststyle '||dist_style||' ',d.distkey)
  --sort key 
  || (select decode(skey,null,'',skey) from  (select 
    ' sortkey(' ||substr(array_to_string(
                     array( select ','||cast(column_name as varchar(100))  as str from
                           (select column_name from information_schema.columns col where  col.table_schema= tm.schemaname and col.table_name=tm.tablename) c2
                            join 
                            (-- gives sort cols
                              select attrelid as tableid, attname as colname, attsortkeyord as sort_col_order from pg_attribute pa where 
                              pa.attnum > 0  AND NOT pa.attisdropped AND pa.attsortkeyord > 0
                            ) st on tm.tableid=st.tableid and c2.column_name=st.colname   order by sort_col_order
                          )
                    ,'')
                  ,2,10000) || ')' as skey
   ))
  ||';'
  -- additional alter table queries here to set owner
  || 'alter table '||tm.schemaname||'.'||tm.tablename||' owner to "'||tm.owner||'";'   
   from 
-- t  master table list
(
SELECT substring(n.nspname,1,100) as schemaname, substring(c.relname,1,100) as tablename, c.oid as tableid ,use2.usename as owner, decode(c.reldiststyle,0,'EVEN',1,'KEY',8,'ALL') as dist_style
FROM pg_namespace n, pg_class c,  pg_user use2 
WHERE n.oid = c.relnamespace 
  AND nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
  AND c.relname <> 'temp_staging_tables_1'
  and c.relowner = use2.usesysid
) tm 
-- cp  creates the col params for the create string
join
(select 
  substr(str,(charindex('QQQ',str)+3),(charindex('ZZZ',str))-(charindex('QQQ',str)+3)) as tableid
  ,substr(replace(replace(str,'ZZZ',''),'QQQ'||substr(str,(charindex('QQQ',str)+3),(charindex('ZZZ',str))-(charindex('QQQ',str)+3)),''),2,10000) as coldef
 from
 ( select array_to_string(array(
  SELECT  'QQQ'||cast(t.tableid as varchar(10))||'ZZZ'|| ','||column_name||' '|| decode(udt_name,'bpchar','char',udt_name) || decode(character_maximum_length,null,'', '('||cast(character_maximum_length as varchar(9))||')'   )
  -- default
  || decode(substr(column_default,2,8),'identity','',null,'',' default '||column_default||' ')
  -- nullable
  || decode(is_nullable,'YES',' NULL ','NO',' NOT NULL ') 
  -- identity 
  || decode(substr(column_default,2,8),'identity',' identity('||substr(column_default,(charindex('''',column_default)+1), (length(column_default)-charindex('''',reverse(column_default))-charindex('''',column_default)   ) )  ||') ', '')
  -- encoding
  || decode(enc,'none','',' encode '||enc)
   as str 
   from  
  -- ci  all the col info
  (
    select cast(t.tableid as int), cast(table_schema as varchar(100)), cast(table_name as varchar(100)), cast(column_name as varchar(100)), 
    cast(ordinal_position as int), cast(column_default as varchar(100)), cast(is_nullable as varchar(20)) , cast(udt_name as varchar(50))  ,cast(character_maximum_length as int),
     sort_col_order  , decode(d.colname,null,0,1) dist_key , e.enc
    from 
    (select * from information_schema.columns c where  c.table_schema= t.schemaname and c.table_name=t.tablename) c
    left join 
    (-- gives sort cols
    select attrelid as tableid, attname as colname, attsortkeyord as sort_col_order from pg_attribute a where 
     a.attnum > 0  AND NOT a.attisdropped AND a.attsortkeyord > 0
    ) s on t.tableid=s.tableid and c.column_name=s.colname
    left join 
    (-- gives encoding
    select attrelid as tableid, attname as colname, format_encoding(a.attencodingtype::integer) AS enc from pg_attribute a where 
     a.attnum > 0  AND NOT a.attisdropped 
    ) e on t.tableid=e.tableid and c.column_name=e.colname
    left join 
    -- gives dist col
    (select attrelid as tableid, attname as colname from pg_attribute a where
     a.attnum > 0 AND NOT a.attisdropped  AND a.attisdistkey = 't'
    ) d on t.tableid=d.tableid and c.column_name=d.colname
    order by ordinal_position
  ) ci 
  -- for the working array funct
  ), '') as str
 from 
 (-- need tableid
 SELECT substring(n.nspname,1,100) as schemaname, substring(c.relname,1,100) as tablename, c.oid as tableid 
 FROM pg_namespace n, pg_class c
 WHERE n.oid = c.relnamespace 
   AND nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
 ) t 
)) cp on tm.tableid=cp.tableid
-- primary key query here
left join 
 (select c.oid as tableid, ', primary key '|| substring(pg_get_indexdef(indexrelid),charindex('(',pg_get_indexdef(indexrelid))-1 ,60) as pkey
  from pg_index i , pg_namespace n, pg_class c 
  where i.indisprimary=true 
  and i.indrelid =c.oid
  and n.oid = c.relnamespace
 )  pk on tm.tableid=pk.tableid
-- dist key
left join
(  select 
  -- close off the col defs after the primary key 
  ')' ||
  ' distkey('|| cast(column_name as varchar(100)) ||')'  as distkey, t.tableid
  from information_schema.columns c
  join 
  (-- need tableid
  SELECT substring(n.nspname,1,100) as schemaname, substring(c.relname,1,100) as tablename, c.oid as tableid 
  FROM pg_namespace n, pg_class c
  WHERE n.oid = c.relnamespace 
    AND nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
  ) t on c.table_schema= t.schemaname and c.table_name=t.tablename
  join 
  -- gives dist col
  (select attrelid as tableid, attname as colname from pg_attribute a where
   a.attnum > 0 AND NOT a.attisdropped  AND a.attisdistkey = 't'
  ) d on t.tableid=d.tableid and c.column_name=d.colname

) d on tm.tableid=d.tableid 
where tm.schemaname||'.'||tm.tablename='myschema.mytable'
Impregnate answered 30/1, 2014 at 2:19 Comment(3)
see alberton.info/postgresql_meta_info.html for a bunch of simpler ways to extract column info, including orderPrime
For a simpler and cleaner way, check the answer here: #21273939Prime
A few months ago, AWS published some views to assist with some of the more esoteric components of Redshift's internal catalog. I'd recommend this one as the definitive answer to obtaining table DDLs: github.com/awslabs/amazon-redshift-utils/blob/master/src/…Impregnate
I
2

In case @Tomasz Tybulewicz's query returns an empty result is very likely that the table you are querying is not in public schema.

Hence run first SET search_path TO my_schema and the query

select * 
from pg_table_def 
where tablename = 't2';

will show the desired result.

Inspirit answered 28/1, 2022 at 12:18 Comment(1)
+1 this is useful info for me (I'm used to explicitly specifying schemas on tables and try not to rely on search_path)Transfigure

© 2022 - 2024 — McMap. All rights reserved.