SQL -> Relational Algebra
Asked Answered
T

1

3

Suppose I have the following relations:

Branch (branchNo(PK), street, city, postcode)

Staff (staffNo(PK), fName, lName, sex, branchNo(FK))

Not that it matters for this question, but PK = primary key & FK = foreign key

How would I write the relational algebra for the following query:

List the names of all female staff that work in Glasgow.

My attempt:

σStaff.sex=F & Branch.city = GlasgowfName, lName, sex, branchNo(Staff) x πcity, branchNo(Branch))

I know that my selection (σ) statement (NOT TO BE CONFUSED WITH SELECT) is syntactically incorrect:

σStaff.sex=F & Branch.city = Glasgow

How do I write two selections on different relations? Or in other words, how do I express an SQL statement with two or more conditions in the WHERE clause in relational algebra? I have used '&' but this cannot be right? Do I have to embed one selection within the other?

NOT HOMEWORK

Tshombe answered 24/4, 2011 at 15:44 Comment(4)
If you are not allowed to use "And" then presumably you can just use intersect?Greegree
@martin - I was not sure whether the '&' was permitted as I cannot find any concrete rules, or even examples with two or more predicates within the selection statements...Tshombe
@Tshombe There are no formal rules I think. It just depends on what author/text book you use.Greegree
@user559142: consider picking a meaningful name. One benefit is other users can address you using the '@' operator for replies, and you'll get a notification that someone has responded to you.Chromatophore
C
6

Formal relational algebra uses logical conjunction and disjunction and (typically) the symbols for same ( and , respectively), though authors are free to pick their own syntax. The query could be written as:

πfName, lName(gender=F ∧ city=Glasgow)(Staff ⋈ Branch))

Note that x (rather, ⨯) is the symbol for Cartesian product. For natural joins, you want ⋈ (bowtie).

If you want the Cartesian product rather than natural join, you basically implement a natural join by adding the appropriate condition to the select. You'll also need to deal with the fact that the attribute branchNo is common to both relations, which you can do by using the rename operator (ρ).

πfName, lName(gender=F ∧ city=Glasgow ∧ branchNo=bNum)(Staff ⨯ ρbNum/branchNo(Branch)))

Formally, you can do this because:

R ⋈ S = πα(R),α(S)-α(R)α(R)∩α(S)=t1..k(R ⨯ ρ t1..k/α(R)∩α(S)(S))))

where α(T) are the attribute names for relation T (making α(R) ∩ α(S) the common attribute names) and t1..k ⊈ α(R) ∪ α(S) are new names for the common attributes.

Chromatophore answered 24/4, 2011 at 19:51 Comment(4)
thanks that makes sense - just out of interest, how would I do the above using cartesian product, selection & projection? I mean surely it is still possible to use cartesian instead of joins right? It is just less efficient...Tshombe
@user559142: you'd use renaming (since cartesian products aren't defined if two relations share any attribute names), selection and projection. See update. Efficiency isn't a factor in relational algebra, since it isn't a computational model (the model doesn't define how operations are performed).Chromatophore
@Chromatophore While relational algebra from a pure mathematical perspective is agnostic to performance, it does encode performance information. If my understanding is correct, relational algebra is used as an intermediate step in query optimization within Database Kernels.Disinterest
RA was specifically designed to be a computational model. The operators have certain associated complexities. However we can use it when we don't happen to care about some calling order or complexity associated with an expression we happen to give. The property "order of implementation operations doesn't matter" is associated with the relational calculi.Superscription

© 2022 - 2024 — McMap. All rights reserved.