Find which rows have different values for a given column in Teradata SQL
Asked Answered
U

4

24

I am trying to compare two addresses from the same ID to see whether they match. For example:

Id  Adress Code     Address
1   1               123 Main
1   2               123 Main
2   1               456 Wall
2   2               456 Wall
3   1               789 Right
3   2               100 Left

I'm just trying to figure out whether the address for each ID matches. So in this case I want to return just ID 3 as having a different address for Address Code 1 and 2.

Unfounded answered 4/12, 2012 at 17:7 Comment(6)
Cool! What have you tried?Pneuma
Which RDBMS? If SQL Server, try looking at this question.Volvox
Sorry, I'm using this within Teradata.Unfounded
You have to read about joinsCeramics
I would recommend you look long and hard at your tables. One of the principles behind RDBMS is the idea of uniqueness, which makes your life a lot easier. The idea is that each row represents a distinct entity and no row should be duplicated; uniqueness would have made this issue non-existent. If not too late, I recommend you revisit your table design.Infuriate
As @Infuriate suggests look at the structure of you table. Normalise the data.Iritis
G
31

Join the table with itself and give it two different aliases (A and B in the following example). This allows to compare different rows of the same table.

SELECT DISTINCT A.Id
FROM
    Address A
    INNER JOIN Address B
        ON A.Id = B.Id AND A.[Adress Code] < B.[Adress Code]
WHERE
    A.Address <> B.Address

The "less than" comparison < ensures that you get 2 different addresses and you don't get the same 2 address codes twice. Using "not equal" <> instead, would yield the codes as (1, 2) and (2, 1); each one of them for the A alias and the B alias in turn.

The join clause is responsible for the pairing of the rows where as the where-clause tests additional conditions.


The query above works with any address codes. If you want to compare addresses with specific address codes, you can change the query to

SELECT A.Id
FROM
    Address A
    INNER JOIN Address B
        ON A.Id = B.Id
WHERE                     
    A.[Adress Code] = 1 AND
    B.[Adress Code] = 2 AND
    A.Address <> B.Address

I imagine that this might be useful to find customers having a billing address (Adress Code = 1 as an example) differing from the delivery address (Adress Code = 2) .

Grenadines answered 4/12, 2012 at 17:15 Comment(0)
N
4

This works for PL/SQL:

select count(*), id,address from table group by id,address having count(*)<2
Nahshun answered 4/12, 2012 at 17:18 Comment(0)
V
3

You can do this using a group by:

select id, addressCode
from t
group by id, addressCode
having min(address) <> max(address)

Another way of writing this may seem clearer, but does not perform as well:

select id, addressCode
from t
group by id, addressCode
having count(distinct address) > 1
Valerie answered 4/12, 2012 at 17:17 Comment(0)
F
0

Personally, I would print them to a file using Perl or Python in the format

<COL_NAME>:  <COL_VAL>

for each row so that the file has as many lines as there are columns. Then I'd do a diff between the two files, assuming you are on Unix or compare them using some equivalent utilty on another OS. If you have multiple recordsets (i.e. more than one row), I would prepend to each file row and then the file would have NUM_DB_ROWS * NUM_COLS lines

Follicle answered 4/12, 2012 at 17:14 Comment(7)
-1. Sorry. Dumping SQL data out to an external file to diff with Perl or Python is just wrong; "I know how to use this hammer, so let's treat everything like a nail". The solution doesn't address the question asked at all, which was how to do it in SQL.Ridge
The advantage of writing a script to do it is that it can be made work for any table without you having to specify column names as they can be read from the system tables. I have written a couple of scripts like this for QAs who had to compare recordsets across different databases where only SQL wouldn't suffice and even if it did, they would have to write a custom query for each different table to reflect different columns. The only table-specific data in my scripts, in addition to, of course, the table particulars such as the name, schema, and server, would be the criteria string.Follicle
But compared to having to spell out every column name, that is relatively little custom configuration. SQL is a great tool but has firm limitations that are better overcome outside it (in things like scripting languages) rather than SQL statements that look like a Rube Goldberg machine. I like cleanliness, structure and simplicity. You can keep your downvote, I am not asking you to change this, this elaboration is more of my philosophy to the rest of the worldFollicle
No. SQL is made for working with data, and the data is already there in the database. There is absolutely ZERO advantage to moving it outside the database to find duplicates (as the other answers have shown). Again, use the proper tools for the job, and your answer does not. If you were talking about a "script" meaning a stored procedure, that's fine. The physical export of the data to an external file, and then running an external scripted task in another language (which you create by "reading system tables"), is simply wrong. (I was simply being courteous when I explained my downvote.)Ridge
SQL is made for working with data ... well, everything is data. This comment I am posting now is data. Why bother extract it into Java and display it in HTML when you can give users some sort of R.O. database access so the only tool used to manipulate data online is SQL ? That is what your principle can be translated to. You seem to be in-the-box thinker but that's fine ... the problem is you are confident (read adamant) in your position, where, in fact, you should be less soFollicle
You know quite well what I meant, but to clarify so you're not confused: Of course, I meant to say SQL is made for working with data you already have stored in the database. Perhaps if you were to look outside trying to do everything in Perl or Python (your apparent "one size fits all" solution), you might realize there's more to programming than manipulating text files. IOW, there are more tools in the toolbox than just a hammer. "Take perfectly good data out of the DB, extract it to a separate text file, play with the data there, and then update the database"??? Really???Ridge
@KenWhite Yes, really. SQL is very good at many things, but it is not a good hammer, and not all data problems are nails. One particular weakness (which this solution addresses) is SQL's inability to detect the presence of unspecified change. 'Have any attributes got different values?' is best handled externally to SQL.Dupin

© 2022 - 2024 — McMap. All rights reserved.