Explanation of self-joins
Asked Answered
G

15

99

I don't understand the need for self-joins. Can someone please explain them to me?

A simple example would be very helpful.

Guernsey answered 16/3, 2010 at 22:7 Comment(1)
What is a self join for? (in english)Fourposter
K
109

You can view self-join as two identical tables. But in normalization, you cannot create two copies of the table so you just simulate having two tables with self-join.

Suppose you have two tables:

Table emp1

Id Name Boss_id            
1   ABC   3                   
2   DEF   1                   
3   XYZ   2                   

Table emp2

Id Name Boss_id            
1   ABC   3                   
2   DEF   1                   
3   XYZ   2                   

Now, if you want to get the name of each employee with his or her boss' names:

select c1.Name , c2.Name As Boss
from emp1 c1
    inner join emp2 c2 on c1.Boss_id = c2.Id

Which will output the following table:

Name  Boss
ABC   XYZ
DEF   ABC
XYZ   DEF
Kohinoor answered 16/3, 2010 at 22:39 Comment(3)
In this example, I cannot figure out who is the boss. Though the exp is good and easy to understand.Recent
left join I think would be better as not to leave out the employee (or boss) that does not have a boss; the top dog!Tubbs
Should it be from emp c1 instead of emp1 c1 ? @KohinoorWoodchuck
T
23

It's quite common when you have a table that references itself. Example: an employee table where every employee can have a manager, and you want to list all employees and the name of their manager.

SELECT e.name, m.name
FROM employees e LEFT OUTER JOIN employees m
ON e.manager = m.id
Teufert answered 16/3, 2010 at 22:15 Comment(0)
H
21

A self join is a join of a table with itself.

A common use case is when the table stores entities (records) which have a hierarchical relationship between them. For example a table containing person information (Name, DOB, Address...) and including a column where the ID of the Father (and/or of the mother) is included. Then with a small query like

SELECT Child.ID, Child.Name, Child.PhoneNumber, Father.Name, Father.PhoneNumber
FROM myTableOfPersons As Child
LEFT OUTER JOIN  myTableOfPersons As Father ON Child.FatherId = Father.ID
WHERE Child.City = 'Chicago'  -- Or some other condition or none

we can get info about both child and father (and mother, with a second self join etc. and even grand parents etc...) in the same query.

Hofmannsthal answered 16/3, 2010 at 22:18 Comment(0)
C
5

Let's say you have a table users, set up like so:

  • user ID
  • user name
  • user's manager's ID

In this situation, if you wanted to pull out both the user's information and the manager's information in one query, you might do this:

SELECT users.user_id, users.user_name, managers.user_id AS manager_id, managers.user_name AS manager_name INNER JOIN users AS manager ON users.manager_id=manager.user_id
Carrera answered 16/3, 2010 at 22:15 Comment(0)
G
5

Imagine a table called Employee as described below. All employees have a manager which is also an employee (maybe except for the CEO, whose manager_id would be null)

Table (Employee): 

int id,
varchar name,
int manager_id

You could then use the following select to find all employees and their managers:

select e1.name, e2.name as ManagerName
from Employee e1, Employee e2 where
where e1.manager_id = e2.id
Gesellschaft answered 16/3, 2010 at 22:17 Comment(0)
C
4

They are useful if your table is self-referential. For example, for a table of pages, each page may have a next and previous link. These would be the IDs of other pages in the same table. If at some point you want to get a triple of successive pages, you'd do two self-joins on the next and previous columns with the same table's id column.

Chickaree answered 16/3, 2010 at 22:16 Comment(0)
B
4

Without the ability for a table to reference itself, we'd have to create as many tables for hierarchy levels as the number of layers in the hierarchy. But since that functionality is available, you join the table to itself and sql treats it as two separate tables, so everything is stored nicely in one place.

Billbillabong answered 16/3, 2010 at 22:25 Comment(1)
but now you (hopefully) understand what would happen if the self-reference was not available.Billbillabong
D
4

Apart from the answers mentioned above (which are very well explained), I would like to add one example so that the use of Self Join can be easily shown. Suppose you have a table named CUSTOMERS which has the following attributes: CustomerID, CustomerName, ContactName, City, Country. Now you want to list all those who are from the "same city" . You will have to think of a replica of this table so that we can join them on the basis of CITY. The query below will clearly show what it means:

SELECT A.CustomerName AS CustomerName1, B.CustomerName AS CustomerName2, 
A.City
FROM Customers A, Customers B
WHERE A.CustomerID <> B.CustomerID
AND A.City = B.City 
ORDER BY A.City;
Dictatorial answered 7/10, 2017 at 6:30 Comment(2)
+1 This answer is very important because there are so many SQL questions on SO to which the answer is "use a self join", which people tend to not see when they don't have an explicit (hierarchical) self reference.Mathematician
Even tho this is copy pasta from w3schools, I think the above answer doesn't explain self-join but inner-join which is different.Happen
I
3

There are many correct answers here, but there is a variation that is equally correct. You can place your join conditions in the join statement instead of the WHERE clause.

SELECT e1.emp_id AS 'Emp_ID'
  , e1.emp_name AS 'Emp_Name'
  , e2.emp_id AS 'Manager_ID'
  , e2.emp_name AS 'Manager_Name'
FROM Employee e1 RIGHT JOIN Employee e2 ON e1.emp_id = e2.emp_id

Keep in mind sometimes you want e1.manager_id > e2.id

The advantage to knowing both scenarios is sometimes you have a ton of WHERE or JOIN conditions and you want to place your self join conditions in the other clause to keep your code readable.

No one addressed what happens when an Employee does not have a manager. Huh? They are not included in the result set. What if you want to include employees that do not have managers but you don't want incorrect combinations returned?

Try this puppy;

SELECT e1.emp_id AS 'Emp_ID'
   , e1.emp_name AS 'Emp_Name'
   , e2.emp_id AS 'Manager_ID'
   , e2.emp_name AS 'Manager_Name'
FROM Employee e1 LEFT JOIN Employee e2 
   ON e1.emp_id = e2.emp_id
   AND e1.emp_name = e2.emp_name
   AND e1.every_other_matching_column = e2.every_other_matching_column
Ideograph answered 27/3, 2013 at 17:18 Comment(3)
Hm, in the puppy, why do you join on "greater than" instead of "equals"?Annular
Hi. I've seen that some examples use the "FROM xxx, yyy WHERE" and some other "FROM xxx JOIN yyy WHERE". Could you please explain the difference, please?Overindulge
@Skan That's a really good question. The short answer is that it's the old shorthand method and will be deprecated. I used it in school ten+ years ago and rarely see it in practice. Here's the most concise description I could find: bidn.com/blogs/KathiKellenberger/sql-server/2875/…Ideograph
B
2

Self-join is useful when you have to evaluate the data of the table with itself. Which means it'll correlate the rows from the same table.

Syntax: SELECT * FROM TABLE t1, TABLE t2 WHERE t1.columnName = t2.columnName

For example, we want to find the names of the employees whose Initial Designation equals to current designation. We can solve this using self join in following way.

SELECT NAME FROM Employee e1, Employee e2 WHERE e1.intialDesignationId = e2.currentDesignationId
Beauty answered 21/1, 2019 at 15:41 Comment(0)
O
1

One use case is checking for duplicate records in a database.

SELECT A.Id FROM My_Bookings A, My_Bookings B
WHERE A.Name = B.Name
AND A.Date = B.Date
AND A.Id != B.Id
Orthman answered 14/2, 2017 at 21:1 Comment(2)
It's much faster to use a GROUP BY and the a HAVING clause to find duplicates. SELECT name, email, COUNT() FROM My_Bookings GROUP BY name, date HAVING COUNT() > 1Happen
@GeorgeK True. I suppose this is only necessary for a fuzzy match (beyond grouping by TRIM(LOWER(Name))) and not for strict equality.Orthman
A
0

SELF JOIN:

  • Joining a table by itself is called as Self Join.
  • We can perform operations on a single table.
  • When we use self join we should create alias names on a table otherwise we cannot implement self join.
  • When we create alias name on a table internally system is preparing virtual table on each alias name of a table.
  • We can create any number of alias names on a table but each alias name should be different.

Basic Rules of self join:

CASE-I: Comparing a single column values by itself with in the table

CASE-II: Comparing two different columns values to each other with in the table.

Example:

SELECT * from TEST;

ENAME

  1. RICHARD
  2. JOHN
  3. MATHEW
  4. BENNY

LOC

  1. HYDRABAD
  2. MUMBAI
  3. HYDRABAD
  4. CHENNAI

SELECT T1. ENAME, T1. LOC FROM TEST.T1, TEST T2 WHERE T1.LOC=T2.LOC AND T2.ENAME='RICHARD';

Arnaud answered 30/9, 2022 at 11:30 Comment(1)
Your answer explain the concept but didn't prove anything, because you didn't show the real output of your SQL requests and last, your last SQL request contains errors. Provide something you tried yourself with a real example, so it could be practically used by the user.Sangfroid
P
0

One possible data design is to store related data in rows rather than in columns, like this:

Table person_attributes
-----------------------
id   person_id attr_key attr_value
==== ========= ======== ==========
201  51        Hair     Brown
202  51        Eyes     Blue
203  51        Height   175cm
204  52        Hair     Red
205  52        Eyes     Hazel
206  52        Height   163cm

To retrieve the attribute values for a particular person as a single row, you can join the table to itself. Like this:

SELECT pa1.attr_value as hair
  ,pa2.attr_value as eyes
  ,pa3.attr_value as height
FROM person_attributes pa1
  JOIN person_attributes pa2 on pa1.person_id=pa2.person_id
  JOIN person_attributes pa3 on pa1.person_id=pa3.person_id
WHERE pa1.attr_key='Hair'
  AND pa2.attr_key='Eyes'
  AND pa3.attr_key='Height'
  AND pa1.person_id=51

The result of this query would look like this:

Hair   Eyes   Height
====== ====== ======
Brown  Blue   175cm
Paleobiology answered 18/3 at 0:19 Comment(0)
S
-1

It's the database equivalent of a linked list/tree, where a row contains a reference in some capacity to another row.

Semester answered 16/3, 2010 at 22:20 Comment(2)
Actually, given that more than one row can reference a "parent" it can be a tree, too, such as with the oft cited example of employee->manager.Parnassian
I was just trying for a simple analogy, but yes a tree could work, too.Semester
V
-5

Here is the exaplanation of self join in layman terms. Self join is not a different type of join. If you have understood other types of joins (Inner, Outer, and Cross Joins), then self join should be straight forward. In INNER, OUTER and CROSS JOINS, you join 2 or more different tables. However, in self join you join the same table with itslef. Here, we don't have 2 different tables, but treat the same table as a different table using table aliases. If this is still not clear, I would recomend to watch the following youtube videos.

Self Join with an example

Villager answered 31/8, 2012 at 13:25 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.