How to show row numbers in PostgreSQL query?
Asked Answered
B

2

82

I'd like to show the observation number for each record returned by a PostgreSQL query.

I think in 8.4 windowing functions can perform this capability.

Balzac answered 3/8, 2010 at 13:49 Comment(3)
I think I have to favorite my own question so I can come back to this in the future :)Balzac
+1 This is the first question I've seen which consists entirely of questions, answers, and dialog from a single person.Riot
:) Xeon, you just made me laugh. I still come back to this question every once in a while.Balzac
B
117
select   row_number() over (order by <field> nulls last) as rownum, *
from     foo_tbl
order by <field>

If order is not necessary, this answer may also be simplified:

select row_number() over(), *  -- notice: no fields are needed
from   foo_tbl

SQL Fiddle Proof of Concept

Balzac answered 3/8, 2010 at 13:52 Comment(6)
It appears that if you do over() then it always gives rownumbers incrementally, like 1 2 3 4... in order of that particular result (if there are outer queries that rearrange results, then rownum could become out of order ref: https://mcmap.net/q/169475/-sql-server-2005-row_number-without-order-by, so adding an order by might be useful in those cases (or if you want to not count nulls, as in the first example). FWIW.Spinous
This is really cool -- for us newbies, could some dissect how this works?Quinlan
@zthomas.nc its a window function. Think of a familiar glass window. If you wanted to, you could divide that window into smaller panes (frames), all the results are still uniquely there, but divided across the frames. This division is what is known as a partition, which is what the over() does above. If you haven't supplied it any conditions, there'd be one pane for the entire window. Window functions are unique in that they can do calculations across the rows of a frame, instead of a whole result set. So if you wanted to do row_number by sex, you can use your over to partition by sex.Balzac
Seems to substantially slow down a large queryInterpose
It shouldn’t have that much of an impact. Perhaps you’re hitting other limitations with the large query, like reduced working memory. If it is such a factor, and both row numbers and performance are necessary consider creating the numbers in your presentation or post-processing layer.Balzac
@KairatDoshekenov Is that a fact or is that your belief? The imperative is throwing me off and making it seem like you're an expert. If that's the case, please show a working example; but I suspect you might be doing something else wrong. Here's a quick proof of concept sqlfiddle.com/#!17/da7a3/5Balzac
B
5

Postgres < 8.4

For versions prior to 8.4:

SELECT    count(*) rownum, foo.*
FROM      datatable foo
JOIN      datatable bar
          ON (foo.pk_id <= bar.pk_id)
GROUP BY  foo.pk_id, foo.a, foo.b
ORDER BY  rownum
;

-- if there isn't a single unique/primary key field, you can concatenate fields
--    Example: ON (foo.a||foo.b||foo.c <= bar.a||bar.b||bar.c)

Hope this helps someone.

SQL Fiddle Proof of Concept


Another Approach

I suggest avoiding this at all cost, but wanted to include it for posterity. It is somewhat expensive and I imagine does not scale well, but when a primary key does not exist on a table (bad db-design), your options may be limited. In most of those cases the recommendation is to perform numbering at the application layer.

-- Based on basic table w/o primary key
-- CREATE TABLE names ( name as text );

SELECT num, name[num]
FROM (
  select generate_series( 1, (select count(*) from names) ) as num
) _nums,
(
  select array_agg(name) as name from names
) _names

SQL Fiddle Proof of Concept

Reasons it doesn't scale:

  • packing your tuples into an array for each row is a bad idea
Balzac answered 3/8, 2010 at 22:57 Comment(5)
This method should work on any SQL Standard compliant databaseBalzac
I think it's important to note that null values should concatentate to null. Thus, a coalesce() may need to be used.Balzac
Window functions are part of the SQL:2003 standard.Costmary
This wont work, this is rank not row, consider 10,6,6,2 , rows 1,2,3,4,, but rank 1,3,3,4 with your query.Improvident
@Arthur: what is your primary key that's what pk stands for. A PK by definition must be unique unless it's a CK. Also, read the comment below the query; though there are conditions where it won't work (for example, I'm not sure you can concatenate null values)Balzac

© 2022 - 2024 — McMap. All rights reserved.