How do I compare columns of records from the same table?

Here is my testing table data:

Testing

ID      Name            Payment_Date   Fee                Amt
1       BankA           2016-04-01     100                20000
2       BankB           2016-04-02     200                10000
3       BankA           2016-04-03     100                20000
4       BankB           2016-04-04     300                20000

I am trying to compare the Name, Fee and Amt fields of each record to see whether any other record has the same values. If one does, I'd like to mark the record with something like 'Y'. Here is the expected result:

ID      Name            Payment_Date   Fee                Amt      SameDataExistYN
1       BankA           2016-04-01     100                20000    Y
2       BankB           2016-04-02     200                10000    N
3       BankA           2016-04-03     100                20000    Y
4       BankB           2016-04-04     300                20000    N

I have tried the two methods below, but I am looking for other solutions so I can pick the best one for my work.

Method 1.

select t.*, iif((select count(*) from testing where name=t.name and fee=t.fee and amt=t.amt)=1,'N','Y') as SameDataExistYN from testing t

Method 2.

select t.*, case when ((b.Name = t.Name)
                        and (b.Fee = t.Fee) and (b.Amt = t.Amt)) then 'Y' else 'N' end as SameDataExistYN
from testing t
left join ( select Name,  Fee, Amt
            from testing
            Group By Name,  Fee, Amt
            Having count(*)>1  ) as b on b.Name = t.Name
                                      and b.Fee = t.Fee
                                      and b.Amt = t.Amt
Eleanor answered 22/4, 2016 at 1:22 Comment(4)
Are you just trying to find duplicate records? – Fetlock
Yeah! I am trying to find duplicate records and mark something on them. – Eleanor
I think your first method is not correct since it's only comparing by name. – Protestation
Oh sorry! I made a mistake in Method 1; it is now edited. – Eleanor

Here is another method, but I think you have to run tests on your data to find out which is best:

SELECT
  t.*,
  CASE WHEN EXISTS(
    SELECT * FROM testing WHERE id <> t.id AND Name = t.Name AND Fee = t.Fee AND Amt = t.Amt
  ) THEN 'Y' ELSE 'N' END SameDataExistYN
FROM
  testing t 
;
Favorable answered 22/4, 2016 at 1:49 Comment(1)
Wow! This one works perfectly and is faster than my previous methods. Thank you. – Eleanor

There are several approaches, with differences in performance characteristics.

One option is to run a correlated subquery. This approach works best when a suitable index exists on the compared columns and you are pulling a relatively small number of rows.

SELECT t.id
     , t.name
     , t.payment_date
     , t.fee
     , t.amt
     , ( SELECT 'Y' 
           FROM testing s
          WHERE s.name = t.name
            AND s.fee  = t.fee
            AND s.amt  = t.amt
            AND s.id  <> t.id
          LIMIT 1
        ) AS SameDataExist
  FROM testing t
 WHERE ...
 LIMIT ...

The correlated subquery in the SELECT list returns 'Y' when at least one "matching" row is found. If no "matching" row is found, the SameDataExist column will be NULL. To convert the NULL to an 'N', you could wrap the subquery in an IFNULL() function.
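As a minimal sketch of that wrapping, assuming MySQL (which provides IFNULL() and the LIMIT syntax used in this answer) and the sample testing table from the question, the query could look like this. The elided WHERE and LIMIT clauses of the outer query are left out here.

```sql
-- Sketch: the same correlated subquery, wrapped in IFNULL() so that
-- "no matching row" produces 'N' instead of NULL.
SELECT t.id
     , t.name
     , t.payment_date
     , t.fee
     , t.amt
     , IFNULL(
         ( SELECT 'Y'
             FROM testing s
            WHERE s.name = t.name
              AND s.fee  = t.fee
              AND s.amt  = t.amt
              AND s.id  <> t.id   -- exclude the row itself
            LIMIT 1               -- one match is enough
         )
       , 'N') AS SameDataExist
  FROM testing t;
```

Against the four sample rows, ids 1 and 3 share the same name, fee and amt, so they would be marked 'Y' and ids 2 and 4 would be marked 'N'.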

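The answer does not say which index it has in mind. One plausible choice, offered here only as an assumption, is a composite index on the three columns compared in the subquery. The index name below is hypothetical.

```sql
-- Hypothetical index name; the columns are the ones compared
-- in the correlated subquery (name, fee, amt).
CREATE INDEX testing_name_fee_amt_ix
    ON testing (name, fee, amt);
```

Whether the optimizer actually uses such an index depends on the data and the rest of the query, so it is worth checking the plan with EXPLAIN before relying on it.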

Your method 2 is a workable approach. The expression in the SELECT list doesn't need to repeat all those comparisons; they have already been done in the join predicates. All you need to know is whether a matching row was found, so testing one of the columns for NULL/NOT NULL is sufficient.

SELECT t.id
     , t.name
     , t.payment_date
     , t.fee
     , t.amt
     , IF(s.name IS NOT NULL,'Y','N') AS SameDataExists
  FROM testing t
  LEFT
  JOIN ( -- tuples that occur in more than one row
         SELECT r.name, r.fee, r.amt
           FROM testing r
          GROUP BY r.name, r.fee, r.amt
         HAVING COUNT(1) > 1
       ) s
    ON s.name = t.name
   AND s.fee  = t.fee
   AND s.amt  = t.amt
 WHERE ...

You could also make use of an EXISTS predicate with a correlated subquery.

Euphrasy answered 22/4, 2016 at 1:59 Comment(0)

Check this out

Select statement to find duplicates on certain fields

Not sure how to mark this as a dupe...

Fetlock answered 22/4, 2016 at 1:33 Comment(0)

SELECT t.name, t.fee, t.amt, IF(COUNT(*) > 1, 'Y', 'N') AS SameDataExistYN
FROM testing t GROUP BY t.name, t.fee, t.amt

Connatural answered 22/4, 2016 at 19:42 Comment(0)
