How do I merge two tables in Access while removing duplicates?
Asked Answered
I

5

7

I have read through about every possible solution online, and I get a different result every time.

I have two tables: Clients and Patrons. They both have the same structure: LastName, FirstName, Address, City, State, and Zip. Clients has 108,000 entries while Patrons has only 42,000 entries. And some of those entries are duplicated between the two as I don't have 150,000 clients.

I need one coherent list. The problem I am running into is that some of my clients reside at the same address, so I can't simply remove duplicate addresses as that will remove a legitimate client. And I have some clients with very common names, say Jane Doe, where there are a couple of them at different addresses, so I can't just filter out duplicate last or first names.

I am using Microsoft Access 2010.

Simply turning unique values to YES isn't helping.

I have scoured the Microsoft help files, and I have gotten results of 2 to 168,000 and most everything in between.

How can I get a single list without duplicates without having to alphabetize it and go line by line for 150,000 entries??

Illustrious answered 30/9, 2011 at 20:12 Comment(1)
The way our business is set up, some are clients and patrons while most others are just one or the other.Illustrious
C
9

A UNION query returns only distinct rows. (There is also UNION ALL, but that would include duplicate rows, so you don't want it here.)

Try this query. If it doesn't return what you want, please explain why if falls short.

SELECT LastName, FirstName, Address, City, State, Zip
FROM Clients
UNION
SELECT LastName, FirstName, Address, City, State, Zip
FROM Patrons
ORDER BY LastName, FirstName;

You probably want another field or fields in the ORDER BY. I just offered something to start with.

Calesta answered 30/9, 2011 at 21:34 Comment(1)
This actually gave me the same number I got using a query-on-query I did. So I am very happy! I have two different methods that produced the same result now. Thank you very much!Illustrious
M
2

One way to do this is to do a FULL OUTER JOIN and COALESCE the values. This would allow you to know if its in the client table, the patron table or both

Unfortunately AFAIK Access doesn't have FULL OUTER so you'll need to simulate it instead.

SELECT a.LastName, a.FirstName, a.Address, a.City, a.State, a.Zip , "Both" as type
FROM Clients a INNER JOIN Patrons b 
ON a.LastName = b.LastName 
   AND a.Address = b.Address
   AND a.City = b.City
   AND a.State = b.State
   AND a.Zip = b.Zip
UNION ALL
SELECT a.LastName, a.FirstName, a.Address, a.City, a.State, a.Zip , "Client" as type
FROM Clients a LEFT JOIN Patrons b 
ON a.LastName = b.LastName 
   AND a.Address = b.Address
   AND a.City = b.City
   AND a.State = b.State
   AND a.Zip = b.Zip
WHERE 
    b.PatronID is null (Or whatever the PK is)
UNION ALL
SELECT b.LastName, b.FirstName, b.Address, b.City, b.State, b.Zip , "Patron" as type
FROM Clients a RIGHT JOIN Patrons b 
ON a.LastName = b.LastName 
   AND a.Address = b.Address
   AND a.City = b.City
   AND a.State = b.State
   AND a.Zip = b.Zip
WHERE 
    a.ClientID is null (Or whatever the PK is)

If you just need a list though you should just use HansUp's answer

Mayst answered 30/9, 2011 at 22:10 Comment(1)
I wish I had enough rep to bump this. I loved playing with this although in the end I just needed a list ^^Illustrious
C
0

I am not sure that building a fully automated solution is worth the job: you will never be able to build a code that will consider Doe, Jane, 1234 Sunset Boulevard and Doe, Jane, 1234 Sunset Bd as the same person, though these are really the same person!

If I were you, I'd build a 4 steps semi-automated solution:

  1. Merge both tables in one unique table, add an 'isDuplicate' boolean field
  2. Display, through a query, all similar names, and handpick duplicates to be deleted
  3. Display, through a query, all similar (as similar as possible) addresses and handpick dupllicates to be deleted
  4. Delete all records where 'isDuplicate' is set to True

Of course, this method is interesting only if duplicate names\addresses are limited! I guess that your filterings will give you a few hundred records to consider. How long will it take? one hour or two? I think it's worth the job! By automating this process you will never be able to make sure all duplicates are eliminated, neither wil you be sure that no legitimate client was deleted. By doing the job this way, you will be sure of your result.

Cartilage answered 1/10, 2011 at 9:45 Comment(0)
G
0

I am looking for a better way to do this as well, but I was surprised that the answer here is kind of "difficult". Given no simple way to do that join automagically, there is an easy way using Access native functions.

Use the Query wizard to create an "Unmatch" Query. This will create a list of participants who exist on one, but not both tables (you specify which during the wizard). Then you can append those records or create a new table as you please.

I do not know a way of blending record data in this step as that is much more complicated.

Gumboil answered 19/3, 2014 at 14:8 Comment(1)
Welcome to stackoverflow! This is not an answer to the question, would you like to make a new question and refer to this one instead.Publicspirited
N
0

Try using their mobile phone or their email address as their unique id. That should avoid too many accidental merging or duplicates.

And instead of 2 tables, keep them in 1 table but just add 2 columns Yes/No (one for Clients and another for Patrons).

Nittygritty answered 20/8, 2022 at 12:43 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Ascribe

© 2022 - 2024 — McMap. All rights reserved.