Normalisation: Solutions to Examples
Birding Club
SOLUTION: Identify functional dependencies, use synthesis algorithm to split into 3NF relations
- Functional Dependencies:
- Each club member uses a unique name and a single email address: M → E
- ... unique common names, and there is a one-to-one correspondence to the Latin name: C → L
- Each site is identified by its unique name; it is also of a certain type of land: S → T
- Relations:
- R1 { M, E }
- R2 { C, L }
- R3 { S, T }
- Key in original relation:
- The essential information is that a certain bird is spotted at a certain site on a certain date by a certain member
- R { D, C, S, M, L, T, E }
- The key of R does not appear in any of R1, R2, R3; therefore, add R4 { { D, C, S, M }
- BCNF:
- Each of R1, R2, R3, R4 has only one key
- overlapping cannot occur, therefore all also in BCNF
- Independent:
- Can we violate any functional dependency by any update operation?
- Since each of the 3 FDs is implemented in its own 3NF table we cannot.
- The resulting relations are independent. This is always the case when we use the synthesis algorithm.
Software Projects
SOLUTION:
- FDs:
- only one client per project: P → C
- each client is assigned a project manager: C → M
- developers work on one or more software projects at a time,
The amount of hours worked on each project is important: D, P → H
This is the function of this list: to record the hours worked.
C and M do not belong here.
- Synthesis Algorithm to split into 3NF relations:
- R1 { P → C }
- R2 { C → M }
- R3 { P, D → H }
- Key in the original relation: P, D. Is contained in R3. Done.
- BCNF:
- each split relation has only one candidate key, none of them is composite,
overlapping cannot occur.
- Each of R1, R2, R3 is also in BCNF
- Independent?
- Again, since we used the synthesis algorithm this is always the case.
Hospital Treatment
SOLUTION: this example is very similar to SJT. The only difference aside from the names
is the additional column for the hospital.
- FDs:
- Patients do not have more than one doctor per treatment area: P,T → D
- Each doctor specializes in a single area of treatment: D → T
- ... and always uses the same hospital: D → H
- Split into 3NF relations:
- R1 { P,T, D }
- R2 { D, H }
- Relations are independent: no update in R2 can violate FDs in R1, and vice versa.
- R1 is not in BCNF. Same situation as in SJT:
- overlapping candidate keys P,T and P,D
- left side of FD D→ T is not a candidate key
- Go to BCNF:
- Split R1 into R1a and R1b:
- R1a { P,D }
- R1b { D, T }
- Not independent: FD P,T → D is lost and can be violated by updates in R1a.
Sports Events
SOLUTION:
- FDs:
- Players ... remain in each event for a certain number of minutes and achieve a certain score: P,E → M,S
This is the function of this list: to record the minutes and scores. T and C do not belong here.
- Each player belongs to a team: P → T
- ... each team has a coach: T → C
- Synthesis Algorithm to split into 3NF relations:
- R1 { P,E, M, S }
- R2 { P, T }
- R3 { T, C }
- Key P,E of original relation is in R1, done.
- BCNF:
- no difference here, since we used the Synthesis algorithm we arrive at 3NF, and
- since there are no multiple candidate keys all relations are also in BCNF
- Lossless and DP: The decomposition is always lossless and dependency preserving when we use the Synthesis algorithm.
Example Order Form
SOLUTION:
- FDs:
- Each customer has a unique customer number, a name and an address: CNO → CNA, CAD
- Each item has a unique item number, a description, and a price: INO → IDE, IPR
- Orders are identified by a unique number ... Each order is submitted by exactly one customer on a specific date: ORD → CNO, DAT
- Orders ... contain one or more items and associated quantitities. Total is not mentioned, but we assume it is also associated.
- This list combines the our tables ORDR, ITEM, CUST, and PROD
- For each combination of order and item the quantity and the total are stated: ORD,INO → QUA,TOT
- The attributes for each order, customer, and product are also stated, repeatedly and redundantly.
- Synthesis algorithm:
- R1 { CNO, CNA, CAD }
- R2 { INO, IDE, IPR }
- R3 { ORD, CNO, DAT }
- R4 { ORD,INO, QUA, TOT }
- BCNF: all tables are in 3NF, and there are no multiple keys. All tables are also in BCNF.
- Lossless and DP: since we used the Synthesis algorithm the decomposition is lossless and dependency preserving.