Relational Algebra - Cartesian Product vs Natural Join?
Asked Answered
P

3

9

I am studying for exams and am failing to find a solid criteria by which I can determine if the Cartesian Product x is to be used or if Natural Join |X| is to be used.

I had come up with a rough guide that:

"If you need to project an attribute that has the same name as an attribute in the table to be joined you must use x and state the table names to be projected: tableA.colname1 = tableB.colname1"

This however doesn't follow some of the solutions in my notes and my lecturer seems to use x with the above convention or |x| interchangeably.

Does anyone have a rule that can be followed to define use of one over the other?


Take for example this schema (only schema related to the question quoted for brevity):

takes(ID, course_id, sec_id, semester, year, grade)
student(ID, name, dept_name, tot_cred)

Q) Find the name of all students who took courses that were taught in either Spring 2011 or Autumn 2011.

My answer attempt:

π name(σ semester="Spring" ^ year=2011(takes ⋈ student)) ∪ π name(σ semester="Autumn" ^ year=2011(takes ⋈ student))

Actual answer:

π name(σ semester="Spring" ^ year=2011 ^ takes.ID=student.ID(takes x student)) ∪ π name(σ semester="Autumn" ^ year=2011 ^ takes.ID=student.ID(takes x student))

Can anyone provide a reason as to why?

In my mind the Natural Join would take care of the takes.ID=student.ID?

Pretender answered 2/1, 2013 at 18:38 Comment(3)
Real world databases almost never use a cartesian product.Schiro
Joel, I disagree. I've often found cartesian product useful "in the real world".Equanimity
@sqlvogel I've used them, too... but it's rare.Schiro
N
10

A natural join, as I understand it, is a projected, filtered Cartesian product:

  • You take the Cartesian product, then
  • select it, so that the values in columns of the same name have the same value, and
  • project it, so that all columns have distinct names.

Under this assumption, your answer is isomorphic to the actual answer.

To see this, you might want to expand the natural join to the above sequence of operators, and float them around using the laws of relational algebra. You'll see that the projection disappears due to the projection to name, and the selection criterion is fused with the selection above. You'll end up with exactly the same tree as the actual answer, even though you never changed the meaning of your own answer!


I can think of one reason why your lecturer uses these concepts interchangeably: your lecturer wants you to understand that these concepts can be used interchangeably, because "the natural join is just a shortcut" (though that's debatable).

Nonmaterial answered 2/1, 2013 at 18:55 Comment(9)
So you mean like so: r ⋈ s == π r.A, r.B, r.C, r.D, s.E (σ r.B = s.B ^ r.D = s.D (r x s)) for say schema: R(A,B,C,D) and S(B,D,E)?Pretender
Yes! :) At least, that's how I understand it.Nonmaterial
Okay that makes sense to me, just worried as the "answers" only show 1 solution to the problem, when in fact there ARE 2 and I would be marked incorrect.Pretender
There are too many possible answers, and only the smallest would be called correct. Your solution is smaller than the actual solution, so if you were allowed to use ⋈, your answer is "more correct". There are even shorter solutions possible, though. Float the projections over the set union, and you'll save yourself one operator.Nonmaterial
I was thinking this: π name(σ (semester="Spring" ^ year=2011) ∨ (semester="Autumn" ^ year=2011)(takes ⋈ student)) but i'm not sure if that's allowed...Pretender
Good point, but I don't know that either. That really depends on what operators you're allowed to use. If you can't use logical disjunction, you simply have to use set union. But if you can, then that's almost the shortest possible answer.Nonmaterial
What IS the shortest answer by way of interest?Pretender
π name(σ (semester="Spring" ∨ semester="Autumn") ^ year=2011 (takes ⋈ student)), I think. Unless you're also allowed to test for set membership, then it would be π name(σ (semester in {"Spring", "Autumn"}) ^ year=2011 (takes ⋈ student)).Nonmaterial
Membership tests are not allowed, however logical disjunction has not been excluded. Thanks for your help @TinctoriusPretender
E
3

Cartesian product is just a special case of natural join where the joined relations don't have any attribute names in common. In Codd's original algebra renaming was a separate operation altogether. To obtain a true cartesian product of two relations that have some attributes in common you would have to rename those attributes before doing the (natural) join.

In the interests of brevity the renamings are sometimes left out of written examples and the product symbol used instead. Unfortunately this obscures the important point, that there is only one kind of join.

Equanimity answered 27/11, 2016 at 18:59 Comment(0)
L
-1

I think there are two extreme situation:

  1. there are no duplicate rows in inner join: inner join is equal to intersect (i mean only the result). distinct inner join ~ intersect

  2. there are no common feature in the inner join: inner join is equal to Cartesian product.

Lanti answered 23/2, 2020 at 21:29 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.