### Write the 1NF relation.

The attributes are as follows:

Receipt, Date, PID, PName, Phone, City, State, VID, VName, Type, Days, Date Due, Cost, Total

- Derived data
`Date Due`can be calculated from`Date`and`Days`.`Total`can be calculated as the total of the`Cost`attribute. Remove these from the relation.- Repeating groups
- None.
- Multi-value attributes
`City`and`State`should be separated into separate attributes.- Primary identifier
- This is
`[Receipt, VID]`.

Now the list of attributes is

Receipt, Date, PID, PName, Phone, City, State, VID, VName, Type, Days, Cost

and the 1NF relation is

[Receipt, VID] —> [Date, PID, PName, Phone, City, State, VName, Type, Days, Cost]

### Write the 2NF relations.

Now we're looking for all attributes that are dependent on only part of the primary identifier. I propose that the following are true:

`Receipt —> Date`: If you know the receipt, you know what date the receipt was issued. Or, equivalently, given a certain receipt, there is only one date on which that receipt was issued.`Receipt —> [PID, PName, Phone, City, State]`: If you know the receipt, you know information about the person. Or, alternatively, there is only one person associated with any one receipt.`VID —> [VName, Type, Days, Cost]`: If you know the video id, you know information about the video.

After removing the attributes from the right side of the above relations from the original relation, this leaves the original relation as follows:

Receipt, VID

There aren't any attributes on the right side of the arrow in the above relation, so the primary identifier for this relation consists of both attributes.

Grouping together the first and second relations since they have the same identifier, we now have:

`Receipt —> [Date, PID, PName, Phone, City, State]``VID —> [VName, Type, Days, Cost]``Receipt, VID`

The identifier for the first relation is `Receipt`; for the second is `VID`; for the third is `[Receipt, VID]`. Since the relations are in 1NF and there is nothing on the right side of the only relation that has a multi-attribute identifier (the third relation), we are in 2NF.

### Identify any transitive dependencies.

What you are looking for here are attributes on the right side of the above relations that depend on other attributes on the right side. I propose that the following are true:

`PID —> [PName, Phone, City, State]``Type —> [Days, Cost]`

### Write the 3NF relations.

To get to 3NF, remove the transitive dependencies from the 2NF relations. After doing so, the relations are as follows:

`Receipt —> [Date, PID]``PID —> [PName, Phone, City, State]``VID —> [VName, Type]``Type —> [Days, Cost]``Receipt, VID`

#### Q&A with student

How come `type —> [Days, cost]` is valid as 3NF? Couldn't we use days to determine cost as well?

I defined the determinancy this way because of the way I thought of the situation. I thought of `type` as classifying the videos. Then, once you know the `type` of a video, you know how many days it can be rented and how much it costs to rent. Different types of videos can have different combinations of values for `days` and `cost`. A `D` type video (I'm making this up as I go) might have values of `days=2` and `cost=4.00`. Another type of video might have `days=2` and `cost=1.50`. Given this way of thinking about the problem, you can see that if you know the value of `days`, you do not necessarily know the value of `cost`. It is only by knowing `type` that you are able to determine the value of `days`. So that's how I defined the determinancy for this problem.

What purpose does the last relation in the previous problem fill? In other words, what is it that `[Receipt, VID]` tell you that you can't determine from the other relations? (For answer, go here.)

Return to 3rd normal form.