Fourth Normal Form

Fourth normal form (4NF)

Violations of this normal form are much less frequent than of the others. It occurs when there are repetitions of groups of data. Before you can learn how to put a table into 4NF, you must first understand multi-valued determinancy. Consider the table in this figure.


Table that is 3NF but not in 4NF
Car Color Engine
Mustang Red F3.2L
Mustang Red F4.5L
Mustang White F3.2L
Mustang White F4.5L
Mustang Blue F3.2L
Mustang Blue F4.5L
Cirrus Red C2.1L
Cirrus Red C3.0L
Cirrus Green C2.1L
Cirrus Green C3.0L

Verify that this table is in 3NF: The primary identifier is [Car, Color, Engine]. None of the attributes are dependent on any of the other attributes — thus, there is no partial key dependency or transitive dependency.

However, this table seems to have some sort of repetition. Each color for each car is paired with each engine size for each car (and vice versa of course). And this continues with every other car in the rest of the table. This causes redundancy to rear its ugly head. You can delete a row (e.g., row 1) in the table and still know that the Mustang comes in red. The problem here is not that the pairing of all the colors and all the engine sizes happens sometimes (or even frequently) — the problem is that it happens all the time for this table. For every car all of its possible colors will be paired with all of its possible engines. In the real world the only way to know this is to ask the person who enters data or who maintains the data in the table. You cannot tell just by looking at a small subset of the data — you have to know how the attributes are populated (i.e., filled in). This type of redundancy (that is, the type discussed here and demonstrated in the above table) is caused by multi-valued determinancy. The following defines this term and provides an algorithmic (i.e., procedural, mechanical, step-by-step) method of detecting this condition.

Suppose there is a relation R with a set of attributes S that contains a multivalued dependency (MVD) X —» Y. The remaining attributes of the relation R are called Z. The relation X —» Y is an MVD if, whenever R contains a pair of rows that contain duplicate values of X, then R also contains the pair of rows obtained by interchanging the Y values in the original pair. Note that X and Y may contain either single or composite attributes. (Adapted from Database Modeling & Design by Teorey, 2nd edition, p. 110.)

Let's apply this definition to the above table. In this first series of steps (in the list below), we're not going to try to solve the problem of multi-valued determinancy, we're simply going to try to understand this definition. It's a bear.

  • R is the car relation.
  • S is the set of attributes [car, color, engine].
  • X is [car].
  • Y is [color] (or you could have chosen [engine]).
  • Find any pair of rows that have the same value of [car] (i.e., X). To start, look at rows 1 and 6.
  • Exchange the values of [color] in these two rows. Now you have [mustang, blue, F3.2L] and [mustang, red, F4.5L].
  • Verify that two rows exist with these values. They do (rows 5 and 2).
  • Continue this for all other pairs of rows for which the value of [car] is the same.

Since [car] —» [color] in the car relation passes this test (as does [car] —» [engine]), we have verified that this relation does contain an MVD.

Further, it is always the case that if X —» Y, then X —» Z. Because MVDs always go in pairs this way, you will sometimes see them represented as X —» Y | Z.

Problem 3.1

Apply this observation to the car relation to verify that [car] —» [engine] is, indeed, an MVD.

Answer to Problem 3.1

The MVDs in this case can be fixed by breaking up the original table into two tables (as shown in the following two tables).


Tables that are in 4NF
Car color table
Car Color
Mustang Red
Mustang White
Mustang Blue
Cirrus Red
Cirrus Green
Car engine table
Car Engine
Mustang F3.2L
Mustang F4.5L
Cirrus C2.1L
Cirrus C3.0L

With this table structure you can still determine, for example, that a red Mustang can have either an F3.2L or F4.5L engine. However, when you delete a one piece of information (i.e., one row from one of the tables), you actually do lose that information (as you should) as opposed to when a row is eliminated from the original relation.

Disclaimer: These rules do not ensure that you will get rid of all 4NF violations — just some of the common types. For a more complete discussion of this difficult problem, see either Database Modeling &amp; Design by Teorey (2<SUP>nd</SUP> edition) or An Introduction to Database Systems by Date.

Rules to get to 4NF
In the 3NF relation, look at all primary identifiers with at least two attributes. For each one of these identifiers:
1 Determine if there is a multi-valued determinancy of the form F —» M, where all the attributes in F and all the attributes in M are part of the primary identifier.
2 Find any attributes in the relation that are multi-determined by the combination of F &amp; M. Call this attribute (or set of attributes) K.
3 Move F, M, K to another relation. The representation is [F, M] —» K.
4 Go through the process of separating multi-dependencies for this newly-created relation.
5 Drop K from the original relation.
6 Determine if there are any other attributes in the relation that are determined by the combination of F &amp; M. Call this attribute (or set of attributes) D.
7 Create another relation [F, M] —> D.
8 Delete M, D from the original relation.

Problem 3.2

Consider the table shown in this table.


A simple table with a violation of 4NF
Course Teacher Hire date Text Copyright
BIT340 Moore 1992 340 Coursepack 2007
BIT340 Moore 1992 Access 2007
BIT340 Ravishankar 1986 340 Coursepack 2007
BIT340 Ravishankar 1986 Access 2007
BIT301 Moore 1992 301 Coursepack 2007
BIT301 Moore 1992 Excel 2007
BIT301 Moore 1992 being digital 2005
BIT301 Walls 1993 301 Coursepack 2007
BIT301 Walls 1993 Excel 2007
BIT301 Walls 1993 being digital 2005

What normal form is this in? How do you know this? If it is not already, put it in 3NF. Then put it in 4NF.

Answer to Problem 3.2

Problem 3.3

Consider a slightly more complicated car example shown in this table.


Table that is in 3NF but not in 4NF
Car Color Engine Stripe Roof Manufacturer
Mustang Red F3.2L white black Ford
Mustang Red F3.2L white white Ford
Mustang Red F3.2L blue black Ford
Mustang Red F3.2L blue white Ford
Mustang Red F4.5L white black Ford
Mustang Red F4.5L white white Ford
Mustang Red F4.5L blue black Ford
Mustang Red F4.5L blue white Ford
Mustang White F3.2L red white Ford
Mustang White F3.2L red red Ford
Mustang White F3.2L black white Ford
Mustang White F3.2L black red Ford
Mustang White F4.5L red white Ford
Mustang White F4.5L red red Ford
Mustang White F4.5L black white Ford
Mustang White F4.5L black red Ford
Mustang Blue F3.2L white black Ford
Mustang Blue F3.2L white white Ford
Mustang Blue F3.2L yellow black Ford
Mustang Blue F3.2L yellow white Ford
Mustang Blue F4.5L white black Ford
Mustang Blue F4.5L white white Ford
Mustang Blue F4.5L yellow black Ford
Mustang Blue F4.5L yellow white Ford
Cirrus Red C2.1L white black Chrysler
Cirrus Red C2.1L white blue Chrysler
Cirrus Red C2.1L black black Chrysler
Cirrus Red C2.1L black blue Chrysler
Cirrus Red C3.0L white black Chrysler
Cirrus Red C3.0L white blue Chrysler
Cirrus Red C3.0L black black Chrysler
Cirrus Red C3.0L black blue Chrysler
Cirrus Green C2.1L white brown Chrysler
Cirrus Green C2.1L white white Chrysler
Cirrus Green C2.1L brown brown Chrysler
Cirrus Green C2.1L brown white Chrysler
Cirrus Green C3.0L white brown Chrysler
Cirrus Green C3.0L white white Chrysler
Cirrus Green C3.0L brown brown Chrysler
Cirrus Green C3.0L brown white Chrysler

The difference here is that the car and color together determine what color of stripes are available for the car and what color the roof can be. The manufacturer is also included. Put this into 4NF.

Answer to Problem 3.3

Continue with conditional dependencies.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License