In SQL, how can I delete duplicate rows based on multiple columns?
Asked Answered
K

2

7

I know I can run the following query below to find "duplicate" rows based on multiple columns doing something like this:

SELECT      PosId, OrgId
FROM        PosOrg
GROUP BY    PosId, OrgId
HAVING COUNT(*) > 1

but now I want to delete the duplicate rows so the above query ends of returning zero rows. I don't care which of the rows that I delete (just as long as only one rows remains based on the uniqueness of those two columns.

What is the correct way to delete these duplicates in SQL?

Katabasis answered 24/8, 2014 at 23:42 Comment(1)
What are the other columns on that table? Is there a PK constraint?Lyris
A
12

If you have another unique id column you can do

delete from PosOrg
where id not in
(
  SELECT      min(id)
  FROM        PosOrg
  GROUP BY    PosId, OrgId
)
Aludel answered 24/8, 2014 at 23:45 Comment(3)
wouldn't i need a "where > 1" somewhere on this?Katabasis
No, you delete all records that are not min(id) for every group, meaning all duplicatesAludel
@Katabasis no, not if there is a unique field called ID on the table. You should add to your question whether or not there is such a field.Lyris
C
7
;WITH CTE
AS (
    SELECT PosId
          ,OrgId
          ,ROW_NUMBER() OVER (PARTITION BY PosId , OrgId ORDER BY PosId , OrgId) rn
    FROM   PosOrg
   )
DELETE FROM CTE
WHERE rn > 1
Cerement answered 24/8, 2014 at 23:50 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.