This normal form further guards against redundancy. It also helps make sure that our data does not accidentally go away. A nice feature, wouldn't you agree?
Before discussing 3NF, we must first understand the transitive property of functional dependency. Taller is a property that is transitive. If Nancy is taller than Scott and Scott is taller than Lindsey, then we know that it must be true that Nancy is taller than Lindsey. Functional dependency is also transitive. Suppose the following were true:
- A —> B
- B —> C
Then, it follows (by transitivity) that A —> C must always be true.
A table is in 3NF if 1) it is already in 2NF, and 2) there are no transitive dependencies.
When tables are in 2NF but not in 3NF, the solution is to split up offending tables into two (or more) new tables. Consider the table below.
Product # | Name | Mfr | Mfr HQ |
---|---|---|---|
1001 | Walkman | Sony | Japan |
1002 | Camera | Leica | Germany |
1003 | DVD | Sony | Japan |
The functional dependencies here include
- Product # —> Mfr
- Mfr —> Mfr HQ
So, by transitivity we have
Product # —> Mfr HQ
This simply means that for any product number, there is an associated unique headquarters. This is a transitive dependency within this table. So, the table is not in 3NF.
One problem with this redundancy: We would have to say that Sony's headquarters are in Japan 1000 times if we stocked 1000 Sony items. Another problem is disappearing data (otherwise known as deletion anomalies). Namely, if we no longer stock product #1002, we delete the fact that Leica is headquartered in Germany — which we might like to know even if we don't sell Leicas at this moment. The flip side of this problem (called an insertion anomaly) is that we cannot easily say that Apple Computer is headquartered in the US since we're not stocking any Apple products right now. To reinforce this idea, review the example about “disappearing” customers from a stock brokerage.
The solution to these problems is to split this one relation into multiple relations. Each of these new relations should have no functional dependencies. So what you're trying to do is to build relations that just address one “idea” at a time thereby ensuring that there are no transitive functional dependencies.
Here is an important point that some of my past students have had problems with: If there are no arrows in a relation, then all the attributes comprise the primary identifier, the only candidate identifier, and a single determinant. For example, if there is one relation defined as [x, y, z] (which can be thought of as simply shorthand for [x, y, z] —> none), then the primary identifier is [x, y, z], the set of candidate identifiers is [[x, y, z]], and the single determinant is [x, y, z].
Rules to get to 3NF | |
---|---|
1 | In the 2NF relation, find all attributes that depend on attributes that are not the primary identifier. These relations have the form SI —> A where SI is not part of the primary identifier. |
2 | For each one of these dependencies SI —> A, create another relation containing attributes SI and A, define SI as the primary identifier of this new relation, remove attribute A from the 2NF relation. |
Consider the product and manufacturer table above. We found that Mfr HQ depends on Mfr. We already know that Mfr is not part of the primary identifier … so we have one dependency of the form Mfr —> Mfr HQ. Thus, we should create another relation containing Mfr and Mfr HQ. The primary identifier of this table is Mfr. Finally, we need to remove Mfr HQ from the 2NF relation. Thus, the above single relation is replaced by the following two relations:
- Product # —> [Name, Mfr]
- Mfr —> Mfr HQ
The following is another way of representing the above relations:
- Product(Product #, Name, Mfr)
- Company(Mfr, Mfr HQ)
The form of this is tableName(field1, field2,{…)} and the underlined field is the primary identifier of the relation. These are called the table structures.
Rules to generate 3NF relations
- For each grouped list of attributes, go through the following steps to generate a set of 1NF relations:
- Break up all attributes so that they are single-valued.
- Remove all derived data attributes.
- Define a primary identifier for the relation.
- Remove all repeating groups. Put them in another relation. Define a primary identifier for the new relation.
- For each 1NF relation, go through the following steps to generate a set of 2NF relations:
- Remove to another relation all attributes that are dependent on only part of the primary identifier.
- Define a primary identifier for this new relation.
- Ensure that this new relation is in 2NF.
- Group together all 2NF relations that have the same primary identifier.
- For each 2NF relation, go through the following steps to generate a set of 3NF relations:
- Find all attributes that depend on attributes that are not the primary identifier. These relations have the form SI —> A where SI is not part of the primary identifier.
- For each one of these dependencies SI —> A:
- Create another relation containing attributes SI and A.
- Define SI as the primary identifier of this new relation.
- Remove attribute A from the 2NF relation.
- Ensure that this new relation is in 3NF.
- Group together all 3NF relations that have the same primary identifier.
Problem 2.8
Consider the Invoice table shown here.
Invoice table | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Inv # | Date | Cust ID | Name | Part # | Desc | Price | # used | Ext Price | Tax rate | Tax | Total |
14 | 12/63 | 42 | Lee | A38 | Nut | 0.32 | 10 | 3.20 | 0.10 | 1.22 | 13.42 |
14 | 12/63 | 42 | Lee | A40 | Saw | 4.50 | 2 | 9.00 | 0.10 | 1.22 | 13.42 |
15 | 1/64 | 44 | Pat | A38 | Nut | 0.32 | 20 | 6.40 | 0.10 | 0.64 | 7.04 |
- Is this table in 1NF? 2NF? 3NF?
- Use the procedure defined in Rules to generate 3NF relations to define better relations. After doing so, convince yourself that tables with these attributes would make more sense than the one shown above.
Problem 2.9
Consider this table:
Journal title | Volume | Publisher | Pages |
---|---|---|---|
AI Expert | 172 | Miller Freeman | 106 |
CACM | 146 | Assoc Computing Machinery | 152 |
AI Expert | 173 | Miller Freeman | 132 |
Is this table in 1NF? 2NF? 3NF? Justify your answer and fix any problem.
Problem 2.10
Suppose you are given the form below. Assume that a video cannot be rented twice on one receipt.
Moore Videos | |||||
---|---|---|---|---|---|
Person ID: 226 | Name: Lindsey Moore | Receipt: 68395 | |||
Phone: 734 668-4385 | Date: 9/1/08 | ||||
City, ST: Ann Arbor, MI | |||||
Video Information | |||||
ID | Name | Type | Days | Date Due | Cost |
325 | Shrek | N | 1 | 9/2/08 | 3.00 |
548 | Remember the Titans | H | 2 | 9/3/08 | 2.50 |
6437 | The Replacement Killers | H | 2 | 9/3/08 | 2.50 |
Total | 8.00 |
- Write the 1NF relation.
- Write the 2NF relations. Identify the identifiers of these relations.
- Identify any transitive dependencies.
- Write the 3NF relations.
Try each of these before looking at the answer below. It is very important that you do this.
Go on to 4th normal form document.