Database Design ~ Database Design and the E-R Model I

Up to this point in the text, we have assumed a given database schema and studied how queries and updates are expressed. We now consider how to design a database schema in the first place. In this section, we focus on the entity-relationship data model (E-R), which provides a means of identifying entities to be represented in the database and how those entities are related. Ultimately, the database design will be expressed in terms of a relational database design and an associated set of constraints. We show in this section how an E-R design can be transformed into a set of relation schemas and how some of the constraints can be captured in that design.

1、Overview of the Design Process

The task of creating a database application is a complex one, involving design of the database schema, design of the programs that access and update the data, and design of a security scheme to control access to data. The needs of the users play a central role in the design process. In this section, we focus on the design of the database schema, although we briefly outline some of the other design tasks later in the section.

The design of a complete database application environment that meets the needs of the enterprise being modeled requires attention to a broad set of issues. These additional aspects of the expected use of the database influence a variety of design choices at the physical, logical and view levels.

1.1 Design Phases

For small applications, it may be feasible for a database designer who understands the application requirements to decide directly on the relations to be created, their attributes, and constraints on the relations. However, such a direct design process is difficult for real-world applications, since they are often highly complex. Often no one person understands the complete data needs of an application. The database designer must interact with users of the application to understand the needs of the application, represent them in a high-level fashion that can be understood by the users, and then translate the requirements into lower levels of the design. A high-level data model serves the database designer by providing a conceptual framework in which to specify, in a systematic fashion, the data requirements of the database users, and a database structure that fulfills these requirements.

  • The initial phase of database design is to characterize fully the data needs of the prospective database users. The database designer needs to interact extensively with domain experts and users to carry out this task. The outcome of this phase is a specification of user requirements. While there are techniques for diagrammatically representing user requirements, in this section we restrict ourselves to textual descriptions of user requirements.
  • Next, the designer chooses a data model and, by applying the concepts of the chosen data model, translates these requirements into a conceptual schema of the database. The schema developed at this conceptual-design phase provides a detailed overview of the enterprise. The entity-relationship model, which we study in the rest of this section, is typically used to represent the conceptual design. Stated in terms of the entity-relationship model, the conceptual schema specifies the entities that are represented in the database, the attributes of the entities, the relationships among the entities, and constraints on the entities and relationships. Typically, the conceptual-design phrase results in the creation of an entity-relationship diagram that provides a graphic representation of the schema. The designer reviews the schema to confirm that all data requirements are indeed satisfied and are not in conflict with one another. She can also examine the design to remove any redundant features. Her focus at this point is on describing the data and their relationships, rather than on specifying physical storage details.
  • A fully developed conceptual schema also indicates the functional requirements of the enterprise. In a specification of functional requirements, users describe the kinds of operations (or transactions) that will be performed on the data. Example operations include modifying or updating data, searching for and retrieving specific data, and deleting data. At this stage of conceptual design, the designer can review the schema to ensure it meets functional requirements.
  • The process of moving from an abstract data model to the implementation of the database proceeds in two final design phases.
    • In the logical-design phase, the designer maps the high-level conceptual schema onto the implementation data model of the database system that will be used. The implementation data model is typically the relational data model, and this step typically consists of mapping the conceptual schema defined using the entity-relationship model into a relation schema.
    • Finally, the designer uses the resulting system-specific database schema in the subsequent physical-design phase, in which the physical features of the database are specified. These features include the form of file organization and choice of index structures, discussed later.

The physical schema of a database can be changed relatively easily after an application has been built. However, changes to the logical schema are usually harder to carry out, since they may affect a number of queries and updates scattered across application code. It is therefore important to carry out the database design phase with care, before building the rest of the database application.

1.2 Design Alternatives

A major part of the database design process is deciding how to represent in the design the various types of “things” such as people, places, products, and the like. We use the term entity to refer to any such distinctly identifiable item. In a university database, examples of entities would include instructors, students, departments, courses, and course offerings. The various entities are related to each other in a variety of ways, all of which need to be captured in the database design. For example, a student takes a course offering , while an instructor teaches a course offering; teaches and takes are examples of relationships between entities.

In designing a database schema, we must ensure that we avoid two major pitfalls:

Redundancy: A bad design may repeat information. For example, if we store the course identifier and title of a course with each course offering, the title would be stored redundantly (that is , multiple times, unnecessarily) with each course offering. It would suffice to store only the course identifier with each course offering, and to associate the title with the course identifier only once, in a course entity. Redundancy can also occur in a relational schema. In the university example we have used so far, we have a relation with section information and a separate relation with course information. Suppose that instead we have a single relation where we repeat all of the course information (course_id, title, dept_name, credits) once for each section (offering) of the course. Clearly, information about courses would then be stored redundantly.

The biggest problem with such redundant representation of information is that the copies of a piece of information can become inconsistent if the information is updated without taking precautions to update all copies of the information. For example, different offerings of a course may have the same course identifier, but may have different titles. It would then become unclear what the correct title of the course is. Ideally, information should appear in exactly one place.

Incompleteness: A bad design may make certain aspects of the enterprise difficult or impossible to model. For example, suppose that, as in case (1) above, we only had entities corresponding to course offering, without having an entity corresponding to courses. Equivalently, in terms of relations, suppose we have a single relation where we repeat all of the course information once for each section that the course is offered. It would then be impossible to represent information about a new course, unless that course is offered. We might try to make do with the problematic design by storing null values for the section information. Such a work-around is not only unattractive, but may be prevented by primary-key constraints.

Avoiding bad designs is not enough. There may be a large number of good designs from which we must choose. As a simple examine, consider a customer who buys a product. Is the sale of this product a relationship between the customer and the product? Alternatively, is the sale itself an entity that is related both to the customer and to the product? This choice, though simple, may make an important difference in what aspects of the enterprise can be modeled well. Considering the need to make choices such as this for the large number of entities and relationships in a real-world enterprise, it is not hard to see that database design can be a challenging problem. Indeed we shall see that it requires a combination of both science and “good taste.”

2、The Entity-Relationship Model

The entity-relationship (E-R) data model was developed to facilitate database design by allowing specification of an enterprise schema that represents the overall logical structure of a database.

The E-R model is very useful in mapping the meanings and interactions of real-world enterprises onto a conceptual schema. Because of this usefulness, many database-design tools draw on concepts from the E-R model. The E-R data model employs there basic concepts: entity sets, relationship sets, and attributes, which we study first. The E-R model also has an associated diagrammatic representation, the E-R diagram, which we study later in this series.

2.1 Entity Sets

An entity is a ‘thing’ or ‘object’ in the real world that is distinguishable from all other objects. For example, each person in a university is an entity. An entity has a set of properties, and the values for some set of properties may uniquely identify and entity. For instance, a person may have have a person_id property whose value uniquely identifies that person. Thus, the value 677-89-9011 for person_id would uniquely identify one particular person in the university. Similarly, courses can be thought of as entities, and course_id uniquely identifies a course entity in the university. An entity may be concrete, such as a person or a book, or it may be abstract, such as a course, a course offering, or a flight reservation.

A entity set is a set of entities of the same type that share the same properties, or attributes. The set of all people who are instructors at a given university, for example, can be defined as the entity set instructor. Similarly, the entity set student might represent the set of all students in the university.

In the process of modeling, we often use the term entity set in the abstract, without referring o a particular set of individual entities. We use the term extension of the entity set to refer to the actual collection of entities belonging to the entity set. Thus, the set of actual instructors in the university forms the extension of the entity set instructor. The above distinction is similar to the difference between a relation and a relation instance. Entity sets do not need to be disjoint. For example, it is possible to define the entity set of all people in a university (person). A person entity may be an instructor entity, a student entity, both or neither.

An entity is represented by a set of attributes. Attributes are descriptive properties possessed by each member of an entity set. The designation of an attribute for an entity set expresses that the database stores similar information concerning each entity in the entity set; however, each entity may have its own value for each attribute. Possible attributes of the instructor entity set are ID, name, dept_name, and salary. In real life, there would be further attributes, such as street number, apartment number, state, postal code, and country, but we omit them to keep our examples simple. Possible attributes of the course entity set are course_id, title, dept_name, and credits.

Each entity has a value for each of its attributes. For instance, a particular instructor entity may have the value 12121 for ID, the value Wu for name, the value Finance for dept_name, and the value 90000 for salary. The ID attribute is used to identify instructors uniquely, since there may be more than one instructor with the same name. In the United States, many enterprises find it convenient to use the social-security number of a person as an attribute whose value uniquely identifies the person. In general the enterprise would have to create and assign a unique identifier for each instructor.

A database thus includes a collection of entity sets, each of which contains any number of entities of the same type. Figure 1 shows part of a university database that consists of two entity sets: instructor and student. To keep the figure simple, only some of the attributes of the two entity sets are shown.

Figure 1    Entity sets instructor and student.

A database for a university may include a number of other entity sets. For example, in addition to keeping track of instructors and students, the university also has information about courses, which are represented by the entity set course with attributes course_id, title, dept_name and credits. In a real setting, a university database may keep dozens of entity sets.

2.2 Relationship Sets

A relationship is an association among several entities. For example, we can define a relationship advisor that associates instructor Katz with student Shankar. This relationship specifies that Katz is an advisor to student Shankar.

A relationship set is a set of relationships of the same type. Formally, it is a mathematical relation on n ≥ 2 (possibly nondistinct) entity sets. If E1, E2, . . ., En are entity sets, then a relationship set R is a subset of where (e1, e2, . . . , en) is a relationship.

Consider the two entity sets instructor and student in Figure 1. We define the relationship set advisor to denote the association between instructors and students. Figure 2 depicts this association. As another example, consider the two entity sets student and section. We can define the relationship set takes to denote the association between a student and the course sections in which that student is enrolled.

Figure 2    Relationship set advisor.

The association between entity sets is referred to as participation; that is , the entity sets E1, E2, . . ., En participate in relationship set R. A relationship instance in an E-R schema represents an association between the named entities in the real-world enterprise that is being modeled. As an illustration, the individual instructor entity Katz, who has instructor ID 45565, and the student entity Shankar, who has student ID 12345, participate in a relationship instance of advisor. This relationship instance represents that in the university, the instructor Katz is advising student Shankar.

The function that an entity plays in a relationship is called that entity’s role. Since entity sets participating in a relationship set are generally distinct, roles are implicit and are not usually specified. However, they are useful when the meaning of a relationship needs clarification. Such is the case when the entity sets of a relationship set are not distinct; that is, the same entity set participates in a relationship set more than once, in different roles. In this type of relationship set, sometimes called a recursive relationship set, explicit role name are necessary to specify how an entity participates in a relationship instance. For example, consider the entity set course that records information about all the courses offered in the university. To depict the situation where one course (C2) is a prerequisite for another course (C1) we have relationship set prereq that is modeled by ordered pairs of course entities. The first course of a pair takes the role of course C1, whereas the second takes the role of prerequisite course C2. In this way, all relationships of prereq are characterized by (C1, C2) pairs; (C2, C1) pairs are excluded.

A relationship may also have attributes called descriptive attributes. Consider  a relationship set advisor with entity sets instructor and student. We could associate the attribute date with that relationship to specify the date when an instructor became the advisor of a student. The advisor relationship among the entities corresponding to instructor Katz and student Shankar has the value “10 June 2007” for attribute date, which means that Katz became Shankar’s advisor on 10 June 2007. Figure 3 shows the relationship set advisor with a descriptive attribute date. Please note that Katz advises two students with two different advising dates.

Figure 3    date as attribute of the advisor relationship set.

As a more realistic example of descriptive attributes for relationships, consider the entity sets student and section, which participate in a relationship set takes. We may wish to store a descriptive attribute grade with the relationship to record the grade that a student got in the class. We may also store a descriptive attribute for_credit to record whether a student has taken the course for credit, or is auditing (or sitting in on) the course.

A relationship instance in a given relationship set must be uniquely identifiable from its participating entities, without using the descriptive attributes. To understand this point, suppose we want to model all the dates when an instructor became an advisor of a particular student. The single-valued attribute date can store a single date only. We cannot represent multiple dates by multiple relationship instances between the same instructor and a student, since the relationship instances would not be uniquely identifiable using only the participating entities. The right way to handle this case is to create a multivalued attribute date, which can store all the dates.

It is possible to have more than one relationship set involving the same entity sets. In our example, the instructor and student entity sets participate in the relationship set advisor. Additionally, suppose each student must have another instructor who serves as a department advisor (undergraduate or graduate). Then the instructor and student entity sets may participate in another relationship set, dept_advisor. The relationship sets advisor and dept_advisor provide examples of a binary relationship set—that is, one that involves two entity sets. Most of the relationship sets in a database system are binary. Occasionally, however, relationship sets involve more than two entity sets.

As an example, suppose that we have an entity set project that represents all the research projects carried out in the university. Consider the entity sets instructor, student, and project. Each project can have multiple associated students and multiple associated instructors. Furthermore, each student working on a project must have an associated instructor who guides the student on the project. For now, we ignore the first two relationships, between project and instructor, and between project and student. Instead, we focus on the information about which instructor is guiding which student on a particular project. To represent this information, we relate the three entity sets through the relationship set proj_guide, which indicates that a particular student is guided by a particular instructor on a particular project.

Note that a student could have different instructors as guides for different projects, which cannot be captured by a binary relationship between students and instructors. The number of entity sets that participate in a relationship set is the degree of the relationship set. A binary relationship set is of degree 2; a ternary relationship set is of degree 3.

2.3 Attributes

For each attributes, there is a set of permitted values, called the domain, or value set, of that attribute. The domain of attribute course_id might be the set of all text strings of a certain length. Similarly, the domain of attribute semester might be strings from the set {Fall, Winter, Spring, Summer}. Formally, an attribute of an entity set is a function that maps from the entity set into a domain. Since an entity set may have several attributes, each entity can be described by a set of (attribute, data value) pairs, one pair for each attribute of the entity set. For example, a particular instructor entity may be described by the set {(ID, 76766), (name, Crick), (dept_name, Biology), (salary, 72000)}, meaning that the entity describes a person named Crick whose instructor ID is 76766, who is a member of the Biology department with salary of $72000. We can see, at this point, an integration of the abstract schema with the actual enterprise being modeled. The attribute values describing an entity constitute a significant portion of the data stored in the database.

An attribute, as used in the E-R model, can be characterized by the following attribute types.

  • Simple and composite attributes. In our examples thus far, the attributes have been simple; that is, they have not been divided into subparts. Composite attributes, on the other hand, can be divided into subparts (that is, other attributes). For example, an attribute name could be structured as a composite attribute consisting of first_name, middle_initial, and last_name. Using composite attributes in a design schema is a good choice if a user will wish to refer to an entire attribute on some occasions, and to only a component of the attribute on other occasions. Suppose we were to add an address to the student entity-set. The address can be defined as the composite attribute address with the attributes street, city, state, and zip_code. Composite attributes help us to group together related attributes, making the modeling cleaner. Note also that a composite attribute may appear as a hierarchy. In the composite attribute address, its component attribute street can be further divided into street_number, street_name, and apartment_number. Figure 4 depicts these examples of composite attributes for the instructor entity set.
  • Single-valued and multivalued attributes. The attributes in our examples all have a single value for a particular entity. For instance, the student_ID attribute for a specific student entity refers to only one student ID. Such attributes are said to be single valued. There may be instances where an attributes has a set of values for a specific entity. Suppose we add to the instructor entity set a phone_number attribute. An instructor may have zero, one, or several phone numbers, and different instructors may have different numbers of phones. This type of attribute is said to be multivalued. As another example, we could add to the instructor entity set an attribute dependent_name listing all the dependents. This attribute would be multivalued, since any particular instructor may have zero, one, or more dependents. To denote that an attribute is multivalued, we enclose it in braces, for example {phone_number} or {dependent_name}. Where appropriate, upper and lower bounds may be placed on the number of values in a multivalued attribute. For example, a university may limit the number of phone numbers recorded for a single instructor to two. Placing bounds in this case expresses that the phone_number attributes of the instructor entity set may have between zero and two values.
  • Derived attribute. The value for this type of attribute can be derived from the values of other related attributes or entities. For instance, let us say that the instructor entity set has an attribute students_advised, which represents how many students an instructor advises. We can derive the value for this attribute by counting the number of student entities associated with that instructor. As another example, suppose that the instructor entity set has an attribute age that indicates the instructor’s age. If the instructor entity set also has an attribute date_of_birth, we can calculate age from date_of_birth and the current date. Thus, age is a derived attribute. In this case, date_of_birth may be referred to as a base attribute, or a stored attribute. The value of a derived attribute is not stored but is computed when required.

An attribute takes a null value when an entity does not have a value for it. The null value may indicate “not applicable”—that is, that the value does not exist for the entity. For example, one may have no middle name. Null can also designate that an attribute value is unknown. An unknown value may be either missing (the value does exist, but we do not have that information) or not known (we do not know whether or not the value actually exists).

For instance, if the name value for a particular instructor is null, we assume that the value is missing, since every instructor must have a name. A null value for the apartment_number attribute could mean that the address does not include an apartment number (not applicable), that an apartment number exists but we do not know what it is (missing), or that we do not know whether or not an apartment number is part of the instructor’s address (unknown).

3、Constraints

  An E-R enterprise schema may define certain constraints to which the content of a database must conform. In this section, we examine mapping cardinalities and participation constraints.

3.1  Mapping Cardinalities

Mapping cardinalities, or cardinality ratios, express the number of entities to which another entity can be associated via a relationship set.

Mapping cardinalities are most useful in describing binary relationship sets, although they can contribute to the description of relationship sets that involve more than two entity sets. In this section, we shall concentrate on only binary relationship set. For a binary relationship set R between entity sets A and B, the mapping cardinality must be one of the following:

  • One-to-one. An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A. (See Figure 5a.)
  • One-to-many. An entity in A is associated with any number (zero or more) of entities in B. An entity in B, however, can be associated with at most one entity in A. (See Figure 5b.)
  • Many-to-one. An entity in A is associated with at most one entity in B. An entity in B, however, can be associated with any number (zero or more) of entities in A. (See Figure 6a.)
  • Many-to-many. An entity in A is associated with any number (zero or more) of entities in B, and an entity in B is associated with any number (zero or more) of entities in A. (See Figure 6b.)

Figure 5    Mapping cardinalities. (a) One-to-one. (b) One-to-many.Figure 6    Mapping cardinalities. (a) Many-to-one. (b) Many-to-many.

The appropriate mapping cardinality for a particular relationship set obviously depends on the real-world situation that the relationship set is modeling. As an illustration, consider the advisor relationship set. If, in a particular university, a student can be advised by only one instructor, and an instructor can advise several students, then the relationship set from instructor to student is one-to-many. If a student can be advised by several instructors (as in the case of students advised jointly), the relationship set is many-to-many.

3.2 Participation Constraints

The participation of an entity set E in a relationship set R is said to be total if every entity in E participates in at least one relationship in R. If only some entities in E participate in relationship in R, the participation of entity set E in relationship R is said to be partial. In Figure 5a, the participation of B in the relationship set is total while the participation of A in the relationship set is partial. In Figure 5b, the participation of both A and B in the relationship set are total.

For example, we expect every student entity to be related to at least one instructor through the advisor relationship. Therefore the participation of student in the relationship set advisor is total. In contrast, an instructor need not advise any students. Hence, it is possible that only some of the instructor entities are related to the student entity set through the advisor relationship, and the participation of instructor in the advisor relationship set is therefore partial

3.3 Keys

We must have a way to specify how entities within a given entity set are distinguished. Conceptually, individual entities are distinct; from a database perspective, however, the differences among them must be expressed in terms of their attributes. Therefore, the values of the attribute values of an entity must be such that they can uniquely identify the entity. In other words, no two entities in an entity set are allowed to have exactly the same value for all attributes. The notion of a key for a relation schema, as defined in previous section, applies directly to entity sets. That is, a key for an entity is a set of attributes that suffice to distinguish entities from each other. The concepts of superkey, candidate key, and primary key are applicable to entity sets just as they are applicable to relation schemas.

Keys also help to identify relationships uniquely, and thus distinguish relationships from each other. Below, we define the corresponding notions of keys for relationships. The primary key of an entity set allows us to distinguish among the various entities of the set. We need a similar mechanism to distinguish among the various relationships of a relationship set. Let R be a relationship set involving entity sets E1, E2, . . . , En. Let primary-key(Ei) denote the set of attributes that forms the primary key for entity set Ei. Assume for now that the attribute names of all primary keys are unique. The composition of the primary key for a relationship set depends on the set of attributes associated with the relationship set R. If the relationship set R has no attributes associated with it, then the set of attributesdescribes an individual relationship in set R. If the relationship set R has attributes a1, a2, . . . , am associated with it, then the set of attributes describes an individual relationship in set R. In both of the above cases, the set of attributesforms a superkey for the relationship set. If the attribute names of primary keys are not unique across entity sets, the attributes are renamed to distinguish them; the name of the entity set combined with the name of the attribute would form a unique name. If an entity set participates more than once in a relationship set (as in the prereq relationship in previous section), the role name is used instead of the name of the entity set, to form a unique attribute name.

The structure of the primary key for the relationship set depends on the mapping cardinality of the relationship set. As an illustration, consider the entity sets instructor and student, and the relationship set advisor, with attribute date, in previous section. Suppose that the relationship set is many-to-many. Then the primary key of advisor consists of the union of the primary keys of instructor and student. If the relationship is many-to-one from student to instructor—that is, each student can have at most one advisor—then the primary key of advisor is simply the primary key of student. However, if an instructor can advise only one student—that is, if the advisor relationship is many-to-one from instructor to student—then the primary key of advisor is simply the primary key of instructor. For one-to-one relationships either candidate key can be used as the primary key.

For nonbinary relationships, if no cardinality constraints are present then the superkey formed as described earlier in this section is the only candidate key, and it is chosen as the primary key. The choice of the primary key is more complicated if cardinality constraints are present. Since we have not discussed how to specify cardinality constraints on nonbinary relations, we consider the issue in more detail later.

4、Removing Redundant Attributes in Entity Sets

 When we design a database using the E-R model, we usually start by identifying those entity sets that should be included. For example, in the university organization we have discussed thus far, we decided to include such entity sets as student, instructor, etc. Once the entity sets are decided upon, we must choose the appropriate attributes. These attributes are supposed to represent the various values we want to capture in the database. In the university organization, we decided that for the instructor entity set, we will include the attributes ID, name, dept_name, and salary. We could have added the attributes: phone_number, office_number, home_page, etc. The choice of what attributes to include is up to the designer, who has a good understanding of the structure of the enterprise.

Once the entities and their corresponding attributes are chosen, the relationship sets among the various entities are formed. These relationship sets may result in a situation where attributes in the various entity sets are redundant and need to be removed from the original entity sets. To illustrate, consider the entity sets instructor and department:

  • The entity set instructor includes the attributes ID, name, dept_name, and salary, with ID forming the primary key.
  • The entity set department includes the attributes dept_name, building, and budget, with dept_name forming the primary key.

We model the fact that each instructor has an associated department using a relationship set inst_dept relating instructor and department. The attribute dept_name appears in both entity sets. Since it is the primary key for the entity set department, it is redundant in the entity set instructor and needs to be removed.

Removing the attribute dept_name from the instructor entity set may appear rather unintuitive, since the relation instructor that we used in the earlier series had an attribute dept_name. As we shall see later, when we create a relational schema from the E-R diagram, the attribute dept_name in fact gets added to the relation instructor, but only if each instructor has at most one associated department. If an instructor has more than one associated department, the relationship between instructors and departments is recorded in a separate relation inst_dept.

Treating  the connection between  instructors and departments uniformly as a relationship, rather than as an attribute of instructor, makes the logical relationship explicit, and helps avoid a premature assumption that each instructor is associated with only one department. Similarly, the student entity set is related to the department entity set through the relationship set student_dept and thus there is no need for a dept_name attribute in student. As another example, consider course offerings (sections) along with the time slots of the offerings. Each time slot is identified by a time_slot_id, and has associated with it a set of weekly meetings, each identified by a day of the week, start time, and end time. We decide to model the set of weekly meeting times as a multivalued composite attribute. Suppose we model entity sets section and time_slot as follows.

  • The entity set section includes the attributes course_id, sec_id, semester, year, building, room_number, and time_slot_id, with (course_id, sec_id, year, semester) forming the primary key.
  • The entity set time_slot includes the attributes time_slot_id, which is the primary key, and a multivalued composite attribute {(day, start_time, end_time)}.

These entities are related through the relationship set sec_time_slot. The attribute time_slot_id appears in both entity sets. Since it is the primary key for the entity set time_slot, it is redundant in the entity set section and needs to be removed. As a final example, suppose we have an entity set classroom, with attributes building, room_number, and capacity, with building and room_number forming the primary key. Suppose also that we have a relationship set sec_class that relates section to classroom. Then the attributes {building, room_number} are redundant in the entity set section. A good entity-relationship design does not contain redundant attributes. For our university example, we list the entity sets and their attributes below, with bold primary keys:

  • classroom: with attributes (building, room_number, capacity).
  • department: with attributes (dept_name, building, budget).
  • course: with attributes (course_id, title, credits).
  • instructor: with attributes (ID, name, salary).
  • section: with attributes (course_id, sec_id, semester, year).
  • student: with attributes (ID, name, tot_cred).
  • time_slot: with attributes (time_slot_id, {(day, start_time, end_time)}).

The relationship sets in our design are listed below:

  • inst_dept: relating instructors with departments.
  • stud_dept: relating students with departments.
  • teaches: relating instructors with sections.
  • takes: relating students with sections, with a descriptive attribute grade.
  • course_dept: relating courses with departments.
  • sec_course: relating sections with courses.
  • sec_class: relating sections with classrooms.
  • sec_time_slot: relating sections with time slots.
  • advisor: relating students with instructors.
  • prereq: relating courses with prerequisite courses.

You can verify that none of the entity sets has any attribute that is made redundant by one of the relationship sets. Further, you can verify that all the information (other than constraints) in the relational schema for our university database has been captured by the above design, but with several attributes in the relational design replaced by relationships in the E-R design.

发表评论

电子邮件地址不会被公开。 必填项已用*标注