Advanced topics in ER modeling

The concepts in the previous two sections allow you to model many business situations. The following concepts are needed to round out your repetoire so that you will be ready for almost any situation that comes your way.

Relationships

Degree of a relationship

Relationships can be classified by the number of entity types involved. This is referred to as the degree of a relationship. To this point we have concerned ourselves with relationships between two entity types. This is, by far, the most common type of relationships. The most common degrees of relationships are as follows:

binary
This is a relationship between two entity types.
ternary
This is a relationship between three entity types.
recursive
This is a relationship involving only one entity type.

I will not spend any time on binary relationships now because we have discussed them at length already.

Ternary

In the real world there are relationships other than those involving two things. For example, suppose that we want to capture which employees use which skills on which project. We might try to represent this data in a database as three binary relationships between skills and project, project and employee, and employee and skill.

e6.png

The applies relationship indicates which employee applies which skill. The used on relationship indicates which skill is used on which project. The works on relationship indicates which employee works on which project. But this is not enough to specify which employee uses which skill on which project. Suppose you know the following:

works-on
Lindsey and Mackenzie have worked on projects A and B.
applies
Lindsey has used skills interface design and database design while Mackenzie only used her database design skill.
used on
Both skills have been used on both projects.

Given this information, it is impossible to figure out on which projects Lindsey used which skills. She could have used interface design on project B and database design on project A — or the other way around. Or she might have used both skills on both projects. The database simply does not give us enough information.

In order to capture the necessary information the database needs a ternary relationship. In this case the database needs a relationship, called esp, among employee, skill, and project.

e7.png

The esp relationship captures information three pieces at a time. It stores facts such as:

  1. Lindsey used interface design skill on project A.
  2. Mackenzie used database design skill on project A.
  3. Lindsey used interface design skill on project B.
  4. Lindsey used database design skill on project B.
  5. Mackenzie used database design skill on project B.

Notice that this ternary relationship captures the information represented in the three binary relationships:

  • Lindsey worked on project A (first fact).
  • Lindsey worked on project B (third and fourth facts).
  • Mackenzie worked on project A (second fact).
  • Mackenzie worked on project B (fifth fact).
  • Lindsey used her interface design skill (first and third facts).
  • Lindsey used her database design skill (fourth fact).
  • Mackenzie used her database design skill (second and fifth facts).
  • Both skills have been used on both projects (all the facts).

Implementing ternary relationships does not mean that you have to get rid of the binary relationships. You only get rid of the binary relationships if they capture a subset of the information captured by the ternary relationship. If a binary relationship captures information that differs from the ternary relationship, then the binary relationship should be retained if the information is important to your company. For example, consider the following:

e8.png

The esp relationship stays the same as in the previous ER diagram. The binary relationships are different.

has/held by relationship
An employee has a certain skill. This is different than esp because there are some skills that an employee has that an he or she may not have used on a particular project.
needs/needed on relationship
A project needs a particular skill. This is different than esp because there may be some skills for which employees have not been assigned to the project yet.
manages/managed by relationship
An employee manages a project. This is a completely different dimension than skill so it could not be captured by esp.

Problem 3.1

We have three entities: car, customer, and salesperson. Consider the following situations.

#1

There is a ternary relationship called sell relating all three entities. Sells: ``Customer cust bought car car from salesperson SP.

Sells
Car Cust SP
Cobra Sam Don
Mustang Sam Don
Mustang Sam Sharon
Cobra Jenn Sharon
Mustang Jenn Sharon

Question: Using this data, who sold Sam the Cobra?

#2

We have three binary relations as shown below:

Buys
Cust Car
Sam Mustang
Sam Cobra
Jenn Mustang
Jenn Cobra
Buys from
Cust SP
Sam Don
Sam Sharon
Jenn Sharon
Sells
SP Car
Don Cobra
Don Mustang
Sharon Cobra
Sharon Mustang
buys
Customer cust buys car car.
buys from
Customer cust buys from salesperson SP.
sells
Salesperson SP has sold car car.

Question: Using this data, who sold Sam the Cobra?

Answer to Problem 3.1

Thus far in this section the ER diagrams have not represented the cardinality of ternary relationships. There is a different method for determining cardinalities of higher order relationships:

  • Cover up all but one of the lines going from the rectangle representing the relationship (esp in the above case) to the entities. (For example, begin by covering up the lines from esp to employee and from esp to project.)
  • For the one remaining entity, ask yourself: Can there be only one of these items for any single combination of these other entities, or can there be many? (Continuing this example, ask yourself: Can there be only one skill for a certain employee and certain project, or can an employee use many skills on one project?)
  • If the answer is many, then put a crow's foot on the line going to this entity. If the answer is one, then put a dash across it.
  • Repeat this process until each entity involved in the relationship has been examined.

Problem 3.2

For the esp relationship, what is the cardinality for each entity type?

Answer to Problem 3.2

Recursive

The final, and possibly the most difficult, relationship is the recursive relationship. This is a relationship that an entity has with itself. But it really doesn't have to be difficult if you think about it as you would any ordinary binary relationship. Let's look at an example.

Think of an employee who is the manager of other employees.

e9.png

A manager manages many employees and an employee has exactly one direct manager. This is pretty straightforward. But, now, realize that a manager is really just another name for an employee. So, replace managers with employees in this diagram.

e10.png

Now this diagram has the entity type employees represented twice. To remedy this situation, “pull” the relationship diamond down and slide the two employee rectangles so that they are lying on top of each other. Now the diagram looks like the following:

e11.png

This diagram represents what we want:

  • Reading clockwise starting at employee: An employee may not manage any other employees but may manage many.
  • Reading counter-clockwise starting at employee: All employees are managed by exactly one other employee.

Not everyone in the company has a manager. The president will not have a direct manager. This is handled in the data in the table by indicating that the president's manager is the president. A little trick.

Parallel relationships

Two entities can have more than one type of relationship. This is not surprising; further, it is not difficult to represent in a database or in an ER diagram. Consider the entity types person and insurance policy and the relationships between them of pays for and is insured under.

e16.png

Look at these relationships one at a time.

  • A person pays for zero or more insurance policies. An insurance policy is paid for by exactly one person.
  • A person is insured under zero or more insurance policies. An insurance policy covers one or more persons.

These are two distinct relationships. They mean two different things — that is why they are represented as two separate relationships in the ER diagram.

Other entity types

Associative entity

When we examined attributes earlier, the attributes were exclusively attached to entity types. However, it is also possible for a relationship to have attributes. A many-to-many relationship is, in some respects, a special case of the other relationships. It requires that another entity be created to “hold” information related to the relationship. This new, special type of entity is called an associative entity.

An associative entity is an entity that can only exist between two other entities, and that exists to store information about the relationship between or among those entities.

For example, suppose we are considering the following situation. A person can have membership in many clubs while a club can have many members. This is represented by this figure:

e24.png

Now consider a reformulation of the above ER diagram in which the many-to-many relationship is replaced by two one-to-many relationships with an associative entity in the middle:

e12.png

This second representation of the situation is preferred because it more directly maps to the underlying database that will be created as a result of this relationship. A many-to-many relationship must be represented by a separate table, just as separate entity types must be represented by separate tables. In this case we have an associative entity being used to represent a relationship between two entity types.

This new entity type also allows the database to capture data about the relationship itself. A person can be a member of many clubs and a club can have many members. A natural piece of information to want to store is the date the person joined the club. If the attribute is of the person entity type, then this would indicate when the person joined a club but we would not know which club. If the attribute is of the club entity type, then this would indicate (possibly) when the club was founded or (possibly) when the most recent member joined the club but we would not know the dates on which each person joined. The solution is to make join date an attribute of the membership associative entity (the entity representing the relationship).

Mind you, the basic many-to-many diagram (the first one above) is just fine for the initial stages of drawing the ER diagram. It allows you to draw all the relationships without crowding the diagram with additional entity types. However, as the process continues, at some point you should convert the first type of diagram into the second (for the reasons given above).

Attributive (or weak) entity

Attributive entities are entities, but with a difference — they only exist because some other entity exists.

An attributive entity is completely dependent upon another entity for its existence.

For example, if you were to define two entities employee and salary-history, then the second would be an attributive entity because the record of an employee's salary history could only exist if a record of an employee also exists. Joe Smith's salary history wouldn't make much sense if Joe Smith doesn't exist in the data base. An attributive entity is represented by a rounded rectangle within another rectangle as shown below.

e17.png

Entity subtype partitioning

This section describes two different ways in which subtypes of an entity can be related to one another and to the supertype: the optional versus mandatory question, and the disjoint versus overlapping question. Certainly, entity subtypes should be classified along both dimensions — that is, you should identify whether the subtype is mandatory or optional and whether it is disjoint or overlapping. All four combinations are possible and each is appropriate at different times.

Optional versus mandatory

Assume there is an entity type called person, and entity subtypes called customer and employee. When a person is created, the designer of the database has two options:

mandatory
He/she can demand that the person be classified as one of the subtypes.
optional
He/she can allow a person to be created without classifying the person as any subtype.

Neither one is preferable to the other. The proper one to choose depends on the business situation.

Mandatory subtyping is represented by creating a double line from the supertype (person in the following ER diagram) to line joining the subtypes. Optional subtyping is represented by leaving a single line from the supertype to the circle. In both cases you draw the dash across the line going to the supertype.

e13.png

So, what does this figure tell you? Since it is a mandatory subtype partitioning (you know this from the double line), whenever data for a new person is entered into the database, it must be classified as either a customer or an employee. The database user cannot simply add information about a generic person — she must know whether this person is a customer or an employee. If this had been an optional subtype partitioning, then when that user was entering data about an employee, she had the option of classifying the person as an employee or as a customer — but did not have to classify the person as either.

Disjoint versus overlapping

Consider now the company supertype and the subtypes AR_co and AP_co. As a designer you can specify whether or not an entity of subtype AR_co can also be an entity of type AP_co. Certainly it is not abnormal to think that you can do business with companies that do business with you. Think of being a consultant for Ameritech or IBM.

The following are the two possibilities.

If entities are allowed to be no more than one subtype, then the subtypes are said to be disjoint.

If entities can be classified as several subtypes, then the subtypes are said to be overlapping.

e14.png

Disjoint subtypes are represented by putting a “d” below the supertype line. Overlapping subtypes are represented by putting a “o” there.

The above figure tells us that this is a disjoint entity subtyping. This means that whenever data for a new company is entered into the database, the company can be classified as either AR_co or AP_co but not both. If this had been an overlapping entity subtyping, then when that user was entering data about a company, she would have had the option of classifying the company as both AR_co and AP_co.

Problem 3.3

Suppose there is a student entity type with subtypes CIS majors, finance majors, and accounting majors. Is this a mandatory or optional subtyping? Are these subtypes disjoint or overlapping?

Suppose I didn't tell you that this should be an entity subtype problem. Would you represent it this way? What else would you do?

Answer to Problem 3.3

Aggregation of entity types

Subtypes are generally thought of in terms of X is a Y (which is why these are commonly referred to as is-a relationships). Another type of relationship that needs to be represented in a database is the part-of relationship, more formally called aggregation. When an entity is made up of several different types of other entities, an aggregation relationship may be called for.

Consider the relationship between a car and its engine and body. The engine and body are both part of the car. The relationship is represented as follows in an ER diagram:

aggregate.png

Attributes

Types of attributes

Sometimes it is instructive to classify an attribute by the means in which the value is determined. Here are the three possibilities.

basic
These are values provided to the business. These are the types of attributes that we have been discussing so far. Think of name, address, etc. These values cannot be deduced from the values of other attributes.
designed
This is invented and exists only in the database. An example might be a unique identifier for a department. This value is not changed once it is set.
derived
This is a value that can be calculated from the value of other attributes in the database. An example might be the age of an employee when the birth date is in the database. These attributes should, generally, not be stored in the database but should be calculated when needed.

Attribute optionality

Not all entities have a value for every attribute; however, some attributes must have a value for all entities.

optional
An entity need not have a value associated with an optional attribute.
mandatory
An entity must have a value associated with a mandatory attribute.

For example, assume the employee entity type has attributes hire date and termination date. Hire date would certainly be classified as a mandatory attribute; if the employee didn't have a hire date, then the person couldn't very well be an employee.

Termination date is an optional attribute. You would expect that many people in the database would not have a termination date while others, who are obviously ex-employees, do have a value associated with the termination date attribute.

The optionality of an attribute depends highly on the business situation, how the information is gathered, and how the business updates its database. One company might classify an attribute of an entity type as optional while another company might classify the same attribute of the same type as mandatory. Consider the following example:

Consider the attribute sale price of the catalog item entity type for a computer mail order company. Company A has a policy that they do not put an item into the catalog until it has a price; thus, they do not create a catalog item entity until they can assign a value to the attribute sale price. For this company the sale price attribute is mandatory.
On the other hand, Company B has a policy that they put an item into their catalog as soon as they decide to stock it. This way they can make their product line look as broad as possible. They put Call us for latest quote in the catalog instead of a price. Thus, they do create catalog items even before they have assigned a value to the attribute sale price. For this company the sale price attribute is optional.

Again, in order to classify an attribute as optional or mandatory, you must understand the business situation and practices.

Problem 3.4

For each of the following attributes, determine whether it is optional or mandatory:

  1. Delivery date of a customer order
  2. Order dateof a customer order
  3. Title of a book
  4. Serial number of a television

Answer to Problem 3.4

Other information about attributes

The database designer should also determine miscellaneous other information about each attribute:

default
This is the value that an attribute should take if it is not assigned a value. For example, the state field of an employee table might have the default value of MI.
permitted range
These are the values that an attribute is allowed to take. This ensures that a value that gets put into the database is valid. For example, the sale_price field of the inventory table might have a permitted range of sale_price > 0.
composite
A composite attribute is an attribute made up of many other attributes. The reason for creating a composite attribute is that the attribute itself is referred to as a whole. Think of an employee's address that is made up of the house number, street, city, state, and zip. One way of storing this attribute is to create an attribute address that would contain the whole address. For example, a value of address might be

address = 202 Crest Avenue, Ann Arbor, MI 48103

An alternative to this would be to store these pieces of data in separate fields in the table. For example, the above information might be stored as

street = 202 Crest Avenue
city = Ann Arbor
state = MI
zip = 48103

Separating the attributes in this way allows database users to refer to each field independently. For example, under the second scheme a user could easily and quickly determine the employees who live in Michigan. Under the first scheme this would not be nearly as fast.

Interpreting ER diagrams

ER diagram for interpretation exercise
e18.png

The point of this section is to give you some examples of how ER diagrams are interpreted. I try to give you some of the variations but I certainly do not give you all of them. If your “reading” of a relation is not below, then it is not necessarily wrong. Try to determine if they mean the same thing. If they don't and you cannot figure out the problem, then come by and talk with me during office hours.

  • needed
    1. A skill can be needed by many projects but might not be needed by any.
    2. A project can need one or more skills.
  • manages
    1. An employee can manage many projects but might not manage any. (Or: An employee can manage many projects. There are some employees who don't manage any projects.)
    2. A project must be managed by an employee. (Or: A project is managed by exactly one employee.) (Or: A project is managed by one and only one employee.)
  • has-skill
    1. An employee may have many skills but might not have any.
    2. A skill can be possessed by many employees. There are some skills that no employees possess.
  • used-on : The technique for an n-ary (in this case 3-ary, or ternary) relation is different than for binary relations, but still straight-forward. Hold your hand on n-1 entity types (in this case 2) and determine whether a 1 or an m goes on the remaining arm of the relation. Below, in order, are the project, employee, and skill arms.
    1. An employee uses one skill on many projects.
    2. Many employees can use a skill on one project.
    3. An employee can use one skill on a project.

Notice that the other two entity types are held constant; that is, for the project arm (the first one) you are determining how many projects can be associated with any single pairing of employees and skills. You can think of it the following way: “I have an employee named Fred. He is skilled in woodworking. How many projects can Fred be a woodworker on?” If it's “many”, then put an m on the project arm; if it's “one”, then put a 1 on it.

Continue with the exercises for this topic.

er
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License