ROW_NUMBER with partition by returns result with duplicate rows
Asked Answered
S

3

7

When I run this query I get more rows than from a similar query without the ROW_NUMBER () line:

SELECT DISTINCT id, value,  
    ROW_NUMBER ()  OVER (PARTITION BY  (id)
                         ORDER BY  value  DESC NULLS LAST ) max  
FROM TABLE1 
WHERE id like '%1260' ORDER BY id ASC

VS

SELECT DISTINCT id, value
FROM TABLE1 
WHERE id like '%1260' ORDER BY id ASC

Why does it happen and how to fix it?

Shulins answered 11/9, 2012 at 5:15 Comment(0)
D
5

Think of it this way: if you have two rows with the same id and value, the second query gives you one row with the distinct id, value pair. The first gives you two rows, one with row_number() of 1 and the other with row_number() of 2.

For the following data:

ID   VALUE
--   -----
1    XXX
1    XXX

Query 1 would return

ID  VALUE   MAX
--  -----   ---
1   XXX      1
1   XXX      2

Query 2 would return

ID  VALUE
--  -----
1   XXX
Durden answered 11/9, 2012 at 5:19 Comment(2)
I thought that first distinct is evaluated and only after that row_number()Shulins
Distinct is on the whole row. Row_number() is part of the result set, so it is subject to the distinct as well.Durden
J
5

The rows are no longer the same because you added a row number, so DISTINCT doesn't do anything. One way to avoid this is to add your row number after you've used DISTINCT.

SELECT id, value, ROW_NUMBER ()   over (partition by  (id)   ORDER BY  value  desc NULLS LAST  ) max
FROM (
    SELECT DISTINCT id, value
    FROM TABLE1 
    WHERE id like '%1260'
) AS subquery
ORDER BY id ASC

(I'm not sure if the syntax is right for Oracle, it may need minor tweaking.)

Jubbulpore answered 11/9, 2012 at 5:20 Comment(1)
This will will cause a query to get all rows and only then execute ROW_NUMBER(). This will cost a lot time if inner query is big enough.Liquor
B
0

If you are using Oracle 11g R2, try this.

WITH DistinctRow 
AS
(
     SELECT DISTINCT id, value
     FROM TABLE1 
     WHERE id like '%1260'
)
SELECT id, value,  
       ROW_NUMBER() over (partition by (id) ORDER BY value desc NULLS LAST) max  
FROM DistinctRow 
ORDER BY max ASC
Bidentate answered 11/9, 2012 at 5:26 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.