Lossless Join Decomposition

Asked 29/11, 2010 at 4:25 Answered 30/4, 2022 at 5:59

database join rdbms database-normalization functional-dependencies

I am studying for a test, and this is on the study guide sheet. This is not homework, and will not be graded.

Relation Schema R = (A,B,C,D,E)

Functional Dependencies = (AB->E, C->AD, D->B, E->C)

Is r1 = (A,C,D) r2 = (B,C,E) OR

x1 = (A,C,D) x2 = (A,B,E) a lossless join decomposition? and why?

Tientiena answered 29/11, 2010 at 4:25 Comment(1)

Your "I have these FDs" doesn't make sense. "These are all the FDs that hold"?--Not possible. "These are all the non-trivial FDs that hold"?--Not possible. "These are some FDs that hold"?--Question can't be answered. Find out what a cover is & what the exact conditions are to apply a particular definition/rule/algorithm. To determine CKs & NFs we must be given FDs that form a cover. Sometimes a minimal/irreducible cover. And the set of all attributes must be given. See this answer. – Astronavigation 30/4, 2022 at 6:35

My relational algebra is horribly rusty, but here is how I remember it to go

If r1 ∩ r2 -> r1 - r2 or r1 ∩ r2 -> r2 - r1 in FDs then you have lossless decomposition.

r1 ∩ r2 = C
r1 - r2 = AD

C->AD is in functional dependencies => lossless

for x1 and x2

x1 ∩ x2 = A
x1 - x2 = CD

A->CD is not in FDs now check x2 - x1

x2 - x1 = BE

A->BE is not in FDs either, therefore lossy

references here, please check for horrible mistakes that I might have committed

Corniculate answered 29/11, 2010 at 15:8 Comment(0)

Here is my understanding, basically you look at your decompositions and determine whether the common attributes between the relations is a key to at least one of the relations.

So with R1 and R2 - the only thing common between them is C. C would be a Key to R1, since you are given C -> A,D. So it's lossless.

For X1 and X2, the only thing common is A, which by itself is neither a key for X1 or X2 from the functional dependencies you are given.

Anthracite answered 22/4, 2013 at 6:24 Comment(0)

Functional Dependencies = (AB->E, C->AD, D->B, E->C)

Is r1 = (A,C,D) r2 = (B,C,E) is lossless when you perform the Chase algorithm. It can be seen that both tables agree on 'C' and the dependency C->AD is preserved in the table ACD.

x1 = (A,C,D) x2 = (A,B,E) is lossy as you will conclude after performing Chase Algorithm. Alternately, it can be seen that both the tables only agree on A and there is no such dependency that is fully functionally dependent on A.

Caterwaul answered 16/10, 2013 at 7:0 Comment(0)

-1

As described here, decomposition of R into R1 and R2 is lossless if

Attributes(R1) U Attributes(R2) = Attributes(R)
Attributes(R1) ∩ Attributes(R2) ≠ Φ
Common attribute must be a key for at least one relation (R1 or R2)

EDIT

with the assumption that only the non-trivial cases are considered here, I think OP intended that too (so that (2) holds under this non-trivial assumption):

e.g., not considering the trivial corner case where all tuples of R1 / R2 are unique, i.e., the empty set {} is a key (as @philipxy pointed out), hence any decomposition will be lossless and hence not interesting (since spurious tuples cant be created upon joining) - the corner cases for which decomposition can be lossless despite Attributes(R1) ∩ Attributes(R2) = Φ are ruled out.

We can check the above conditions with the following python code snippet:

def closure(s, fds):
    c = s
    for f in fds:
        l, r = f[0], f[1]
        if l.issubset(c):
            c = c.union(r)
    if s != c:
        c = closure(c, fds)
    return c

def is_supekey(s, rel, fds):
    c = closure(s, fds)
    print(f'({"".join(sorted(s))})+ = {"".join(sorted(c))}')
    return rel.issubset(c) #c == rel

def is_lossless_decomp(r1, r2, r, fds):
    c = r1.intersection(r2)
    if r1.union(r2) != r:
        print('not lossless: R1 U R2 ≠ R!')
        return False
    if r1.union(r2) != r or len(c) == 0:
        print('not lossless: no common attribute in between R1 and R2!')
        return False
    if not is_supekey(c, r1, fds) and not is_supekey(c, r2, fds):
        print(f'not lossless: common attribute {"".join(c)} not a key in R1 or R2!')
        return False
    print('lossless decomposition!')
    return True

To process the given FDs in standard form, to convert to suitable data structure, we can use the following function:

import re

def process_fds(fds):
    pfds = []
    for fd in fds:
        fd = re.sub('\s+', '', fd)
        l, r = fd.split('->')
        pfds.append([set(list(l)), set(list(r))])
    return pfds

Now let's test with the above decompositions given:

R = {'A','B','C','D','E'}
fds = process_fds(['AB->E', 'C->AD', 'D->B', 'E->C'])

R1, R2 = {'A', 'C', 'D'}, {'B', 'C', 'E'} 
is_lossless_decomp(R1, R2, R, fds)
# (C)+ = ACD
# lossless decomposition!

R1, R2 = {'A', 'C', 'D'}, {'A', 'B', 'E'} 
is_lossless_decomp(R1, R2, R, fds)
# (A)+ = A
# not lossless: common attribute A not a key in R1 or R2!

Ossetia answered 30/4, 2022 at 5:59 Comment(2)

This wrong, 2 does not belong. {} can be a key of one or more of Ri & if so a decomposition with {} as set of common attributes is lossless. {} is a key when/iff a relation has at most one one tuple. Then the question is, why did you think it belonged, and what is your justification for this answer? Please include it. Similarly also see my comment on the question, you are assuming without the question saying so that the set of FDs is a cover. – Astronavigation 30/4, 2022 at 6:35

A join is lossless when/iff the shared attributes form a superkey of at least one input. Some of those lossless joins involve no shared attributes. No need for an exception. It is wrong that a lossless join requires common attributes. And it is a bad presentation to say that wrong thing and then say that you're not saying it. And "trivial cases", "all tuples of R1 / R2 are unique" & "the empty set {} is a key" & their paragraph & your "getting rid of" comment are unclear. Also you seem to be trying to say something wrong in the unclear "unique" part. – Astronavigation 30/4, 2022 at 9:57

Recommended topics

Hot tags