How can i get a count(*) of all the columns in a table? Using PostgreSql
Asked Answered
S

7

10

I have bunch of tables where several of them have hundreds of columns. I need to get a count of non-null values for each column and I've been doing it manually. I would like to figure out a way to get all the counts for all the columns in a table. I looked up stackoverflow and google, but unable to find the answer.

I tried this but it's just returning a value of 1 for each column. I know it's just counting the number of column and not the values in each column. Any suggestions?

select count(COLUMN_NAME)
from information_schema.columns
where table_schema = 'schema_name'
and table_name = 'table_name'
group by COLUMN_NAME
Sikorsky answered 31/10, 2018 at 16:22 Comment(0)
D
8

COUNT(column_name) always gives you the count of NON NULL values.

Create a generic function like this which can take schema name and table name as arguments.

Here I am constructing select statements joined together by UNION ALLs each returning the value of the column_name and it's count for all columns when executed dynamically.

CREATE OR REPLACE FUNCTION public.get_count( TEXT, TEXT )
RETURNS  TABLE(t_column_name  TEXT, t_count BIGINT )
LANGUAGE plpgsql
AS $BODY$
DECLARE
p_schema        TEXT := $1;
p_tabname       TEXT := $2;
v_sql_statement TEXT;

BEGIN

SELECT STRING_AGG( 'SELECT ''' 
       || column_name 
       || ''',' 
       || ' count(' 
       || column_name 
       || ')  FROM ' 
       || table_schema 
       || '.' 
       || table_name 
         ,' UNION ALL ' ) INTO v_sql_statement
FROM   information_schema.columns 
WHERE  table_schema   = p_schema 
       AND table_name = p_tabname; 

    IF v_sql_statement IS NOT NULL THEN
     RETURN QUERY EXECUTE   v_sql_statement;
    END IF;
END
$BODY$;

Execution

knayak=# select c.col, c.count from 
 public.get_count( 'public', 'employees' ) as c(col,count);
      col       | count
----------------+-------
 employee_id    |   107
 first_name     |   107
 last_name      |   107
 email          |   107
 phone_number   |   107
 hire_date      |   107
 job_id         |   107
 salary         |   107
 commission_pct |    35
 manager_id     |   106
 department_id  |   106
(11 rows)
Dispersion answered 31/10, 2018 at 17:27 Comment(5)
This is an amazingly coded answer. It does not explain how it works but I'm impressed.Wholesale
@Wholesale : Thanks!Dispersion
Just because I need to tinker -- RETURN QUERY EXECUTE v_sql_statement||' UNION ALL SELECT ''total'', count(*) FROM '||table_schema||'.'||table_name';Wholesale
@Wholesale : Or even select c.col, c.count from public.get_count( 'public', 'employees' ) as c(col,count) UNION ALL select 'Total' , count(*) as total from employees; ?Dispersion
wow. this is a lot to digest considering I'm fairly new to sql but i'll check this out. thanksSikorsky
X
2

Works like a charm:

select COUNT(*) from information_schema.columns
where table_name = 'your_table_name'

Example:

select COUNT(*) from information_schema.columns
where table_name = 'pg_class'

returns 33 for me

Source: http://www.sqlnosql.com/how-to-determine-number-of-columns-in-a-postgresql-table/

Read about information schema: https://www.postgresql.org/docs/16/information-schema.html

Xylotomy answered 9/12, 2023 at 15:48 Comment(1)
Not sure why this is not the selected answer. Works on Snowflake too!Corrie
W
1

There's not really a magic way to do this. If you need to check each of 100 different columns to see how many non-null values there are, you'll have to specify each of the columns of the table.

About the best you can do is use the system catalogs to help write your queries:

select 'SUM(CASE WHEN ' + column_name + ' IS NULL THEN 1 ELSE 0 END) AS ' + column_name
from information_schema.columns
where table_schema = 'schema_name'
    and table_name = 'table_name'
    and is_nullable = 'YES' 

You may need to add quoted identifiers if you've got spaces or other special characters in your column names.

Then you can copy that output to another query and add the missing parts of the query. I've added and is_nullable = 'YES' because it's a waste of time to check NOT NULL columns. As far as I know, that column is present in PostgreSQL.

Whitver answered 31/10, 2018 at 17:10 Comment(0)
L
1

Much easier and simpler solution:

select table_name, count(*) as column_count 
from information_schema."columns"
where table_schema = 'public'
GROUP by table_name order by column_count desc;
Luing answered 18/5, 2023 at 17:32 Comment(0)
P
0

Try this

SELECT 
sum(case when column1 is not null then 1 else 0 end) as col1_not_null_count
sum(case when column2 is not null then 1 else 0 end) as col2_not_null_count
sum(case when column3 is not null then 1 else 0 end) as col3_not_null_count
FROM information_schema.columns
WHERE table_schema = 'schema_name'
AND table_name = 'table_name'
Pagurian answered 31/10, 2018 at 17:0 Comment(0)
E
0

The best way I've found to do this is to write a case statement to make a non-null value in a column become a 1 and a null become a 0. Then I sum the case to get a count of the non-null values:

SELECT SUM(CASE WHEN COLUMN_NAME1 IS NULL THEN 1 ELSE 0 END) AS COL1_COUNT
     , SUM(CASE WHEN COLUMN_NAME2 IS NULL THEN 1 ELSE 0 END) AS COL2_COUNT
  FROM TABLE_NAME

I see in your select that you are looking at the information_schema.columns table. You can dynamically generate the code above by a select from that table:

SELECT ', SUM(CASE WHEN ' + column_name + ' IS NULL THEN 1 ELSE 0 END) AS ' + column_name + '_COUNT'
FROM information_schema.columns
WHERE table_schema = 'schema_name'
AND table_name = 'table_name'

You can also dynamically create a different select for every column in the table in question:

SELECT 'SELECT SUM(CASE WHEN ' + column_name + ' IS NULL THEN 1 ELSE 0 END) AS ' + column_name + '_COUNT FROM ' + table_schema + '.' + table_name
FROM information_schema.columns
WHERE table_schema = 'schema_name'
AND table_name = 'table_name'
Emad answered 31/10, 2018 at 17:5 Comment(2)
Is this for seeing whether there is a column name or not? Or is this adding up the number of rows within each column that is not null?Sikorsky
The first query is for determining is for adding up the number of rows where the column is not null. The next two queries are provided to aid in creating the first query programatically (their result is SQL which you copy/paste into a SQL window and run again).Emad
P
0

I came here looking to answer the same question, but I wanted to do this without making a function, as I don't always have the ability to do that in the databases I'm working with. But the databases I work with have the tblfunc module installed, which has the crosstab function, that takes an sql string as input. I was able to produce sql that I could use in that function to get all the column counts from any table. Here is the code I used, where I put in the schema_name and table_name of what I wanted:

select * from crosstab('select column_name, ''num'' as category, sum(num) from (' || (
    select
        string_agg(sql, '') as sql_string
    from (
        select
            case when row_number() OVER () = 1 then '' else ' union all ' end ||
            'select ''' || column_name || ''' as column_name, ' || 'count(' || column_name || ') as num from ' || table_schema || '.' || table_name as sql
        from information_schema.columns
        where table_schema = 'schema_name'
            and table_name = 'table_name') as sql_query limit 1 
    ) || ') as column_counts group by column_name, category') 
                        AS t(column_name text, num numeric) order by num asc
Prolocutor answered 22/4, 2021 at 18:47 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.