How do I get a list of column names from a psycopg2 cursor?
Asked Answered
S

11

230

I would like a general way to generate column labels directly from the selected column names, and recall seeing that python's psycopg2 module supports this feature.

Soinski answered 20/4, 2012 at 19:18 Comment(0)
S
365

From "Programming Python" by Mark Lutz:

curs.execute("Select * FROM people LIMIT 0")
colnames = [desc[0] for desc in curs.description]
Soinski answered 20/4, 2012 at 19:20 Comment(6)
If you just want the column names, don't select all of the rows in the table. This is more efficient: curs.execute("SELECT * FROM people LIMIT 0")Catharine
It may be worth adding that this works for views as well as tables, whereas it’s not (easily) possible to get column names for views from information_schema.Briolette
Might be more intuitive to get the name as an attribute: colnames = [desc.name for desc in curs.description]Detergent
Important to note that column names read from the cursor description function come out in lowercase. curs.execute("Select userId FROM people") colnames = [desc[0] for desc in curs.description] assert colnames == ['userid']Kenway
curs.description for me is None, and I'm not sure why... I'm querying against Redshift, if that makes a difference.Ascendant
Or in one line : pd.read_sql("Select * FROM <table_name> LIMIT 1", con=engine).columnsBuddleia
M
95

Another thing you can do is to create a cursor with which you will be able to reference your columns by their names (that's a need which led me to this page in the first place):

import psycopg2
from psycopg2.extras import RealDictCursor

ps_conn = psycopg2.connect(...)
ps_cursor = psql_conn.cursor(cursor_factory=RealDictCursor)

ps_cursor.execute('select 1 as col_a, 2 as col_b')
my_record = ps_cursor.fetchone()
print (my_record['col_a'],my_record['col_b'])

>> 1, 2
Michaud answered 1/9, 2017 at 12:20 Comment(2)
please reference the source of the infoCommunicant
In my case the "as col_a" was unnecessary. I omitted it and the records returned by the cursor used the column names from my table.Peterson
V
35

To get the column names in a separate query, you can query the information_schema.columns table.

#!/usr/bin/env python3

import psycopg2

if __name__ == '__main__':
  DSN = 'host=YOUR_DATABASE_HOST port=YOUR_DATABASE_PORT dbname=YOUR_DATABASE_NAME user=YOUR_DATABASE_USER'

  column_names = []

  with psycopg2.connect(DSN) as connection:
      with connection.cursor() as cursor:
          cursor.execute("select column_name from information_schema.columns where table_schema = 'YOUR_SCHEMA_NAME' and table_name='YOUR_TABLE_NAME'")
          column_names = [row[0] for row in cursor]

  print("Column names: {}\n".format(column_names))

To get column names in the same query as data rows, you can use the description field of the cursor:

#!/usr/bin/env python3

import psycopg2

if __name__ == '__main__':
  DSN = 'host=YOUR_DATABASE_HOST port=YOUR_DATABASE_PORT dbname=YOUR_DATABASE_NAME user=YOUR_DATABASE_USER'

  column_names = []
  data_rows = []

  with psycopg2.connect(DSN) as connection:
    with connection.cursor() as cursor:
      cursor.execute("select field1, field2, fieldn from table1")
      column_names = [desc[0] for desc in cursor.description]
      for row in cursor:
        data_rows.append(row)

  print("Column names: {}\n".format(column_names))
Vassalage answered 4/3, 2014 at 12:48 Comment(0)
L
28

If you want to have a named tuple obj from db query you can use the following snippet:

from collections import namedtuple

def create_record(obj, fields):
    ''' given obj from db returns named tuple with fields mapped to values '''
    Record = namedtuple("Record", fields)
    mappings = dict(zip(fields, obj))
    return Record(**mappings)

cur.execute("Select * FROM people")
colnames = [desc[0] for desc in cur.description]
rows = cur.fetchall()
cur.close()
result = []
for row in rows:
    result.append(create_record(row, colnames))

This allows you to access record values as if they were class properties i.e.

record.id, record.other_table_column_name, etc.

or even shorter

from psycopg2.extras import NamedTupleCursor
with cursor(cursor_factory=NamedTupleCursor) as cur:
   cur.execute("Select * ...")
   return cur.fetchall()
Leslee answered 2/8, 2017 at 8:20 Comment(0)
P
27

If you're looking to get a pandas data frame with column headers already associated, try this:

import psycopg2, pandas

con=psycopg2.connect(
    dbname=DBNAME, 
    host=HOST, 
    port=PORT, 
    user=USER, 
    password=PASSWORD
)

sql = """
select * from x
"""

d = pandas.read_sql_query(sql,con)

con.close()

print(type(d))

print(pandas.DataFrame.head(d))
Psychometry answered 15/9, 2020 at 6:55 Comment(4)
This seems to me the easiest. I'm surprised you would have to do a separate query to get column names. Worked for me.Harass
Totally agreed, @ChuckPedroPsychometry
easiest method. this should have more upvotesUpcast
This is the easiest by far, but is quite slow. The fastest way to do this is using pd.DataFrame(np.array(cur.fetchall())), which comes with a sequence of numbers as column names.Courser
T
8

After executing SQL query write following python script written in 2.7

total_fields = len(cursor.description)    
fields_names = [i[0] for i in cursor.description   
    Print fields_names
Taphouse answered 29/5, 2017 at 8:4 Comment(0)
M
2

I have noticed that you must use cursor.fetchone() after the query to get the list of columns in cursor.description (i.e in [desc[0] for desc in curs.description])

Minx answered 8/10, 2015 at 8:52 Comment(0)
B
1
 # You can use this function
 def getColumns(cursorDescription):
     columnList = []
     for tupla in cursorDescription:
         columnList.append(tupla[0])
     return columnList 
Banns answered 20/11, 2019 at 19:9 Comment(0)
U
1

If you want to turn all your data in a Pandas Dataframe with column names:

cur.execute("select * from tablename")
datapoints = cur.fetchall()
cols = [desc[0] for desc in cur.description]
df = pd.DataFrame((datapoints) , columns=[cols])
Urethra answered 19/5, 2022 at 12:31 Comment(1)
df = pd.DataFrame(datapoints, columns=cols) cols is already a list, so adding square brackets is creating a multi-indexIm
I
0

I also used to face similar issue. I use a simple trick to solve this. Suppose you have column names in a list like

col_name = ['a', 'b', 'c']

Then you can do following

for row in cursor.fetchone():
    print zip(col_name, row)
Ivetteivetts answered 12/4, 2017 at 23:7 Comment(0)
A
-7
#!/usr/bin/python
import psycopg2
#note that we have to import the Psycopg2 extras library!
import psycopg2.extras
import sys

def main():
    conn_string = "host='localhost' dbname='my_database' user='postgres' password='secret'"
    # print the connection string we will use to connect
    print "Connecting to database\n ->%s" % (conn_string)

    # get a connection, if a connect cannot be made an exception will be raised here
    conn = psycopg2.connect(conn_string)

    # conn.cursor will return a cursor object, you can use this query to perform queries
    # note that in this example we pass a cursor_factory argument that will
    # dictionary cursor so COLUMNS will be returned as a dictionary so we
    # can access columns by their name instead of index.
    cursor = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)

    # tell postgres to use more work memory
    work_mem = 2048

    # by passing a tuple as the 2nd argument to the execution function our
    # %s string variable will get replaced with the order of variables in
    # the list. In this case there is only 1 variable.
    # Note that in python you specify a tuple with one item in it by placing
    # a comma after the first variable and surrounding it in parentheses.
    cursor.execute('SET work_mem TO %s', (work_mem,))

    # Then we get the work memory we just set -> we know we only want the
    # first ROW so we call fetchone.
    # then we use bracket access to get the FIRST value.
    # Note that even though we've returned the columns by name we can still
    # access columns by numeric index as well - which is really nice.
    cursor.execute('SHOW work_mem')

    # Call fetchone - which will fetch the first row returned from the
    # database.
    memory = cursor.fetchone()

    # access the column by numeric index:
    # even though we enabled columns by name I'm showing you this to
    # show that you can still access columns by index and iterate over them.
    print "Value: ", memory[0]

    # print the entire row 
    print "Row: ", memory

if __name__ == "__main__":
    main()
Arista answered 28/6, 2016 at 8:52 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.