
Basic Concepts - Purpose of Database Systems - Three-Schema Architecture and Data Independence - Components of DBMS - Data Models, Schemas and Instances - Data Modeling using the Entity Relationship Model - Entity Types, Relationship Types, Weak Entity Types.

Elmasri and Navathe, Fundamentals of Database Systems, Pearson Education Asia


Types of Databases and Database Applications


- Numeric and Textual Databases
- Multimedia Databases
- Geographic Information Systems (GIS)
- Data Warehouses
- Real-time and Active Databases

Basic Definitions

- Database: A collection of related data.
- Data: Known facts that can be recorded and have an implicit meaning.
- Mini-world: Some part of the real world about which data is stored in a database. For example, student grades and transcripts at a university.
- Database Management System (DBMS): A software package or system that facilitates the creation and maintenance of a computerized database.
- Database System: The DBMS software together with the data itself. Sometimes the applications are also included.

Typical DBMS Functionality


- Define a database: in terms of data types, structures and constraints.
- Construct or load the database on a secondary storage medium.
- Manipulate the database: querying, generating reports, insertions, deletions and modifications to its content.
- Concurrent processing and sharing by a set of users and programs, while keeping all data valid and consistent.
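The define, construct and manipulate steps can be sketched with a small script. This is only an illustration: it assumes Python's built-in `sqlite3` module with an in-memory database as a stand-in for a DBMS, and the `student` table, its columns and its rows are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for "a DBMS".
conn = sqlite3.connect(":memory:")

# Define: data types, structure and constraints (hypothetical table).
conn.execute(
    """
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        class_year     INTEGER CHECK (class_year BETWEEN 1 AND 5)
    )
    """
)

# Construct (load): populate the database with initial data.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(17, "Smith", 1), (8, "Brown", 2)],
)

# Manipulate: modify the content, then query it.
conn.execute("UPDATE student SET class_year = 2 WHERE student_number = 17")
second_years = conn.execute(
    "SELECT name FROM student WHERE class_year = 2 ORDER BY name"
).fetchall()
print(second_years)

# The constraint declared at definition time is enforced by the DBMS itself.
try:
    conn.execute("INSERT INTO student VALUES (99, 'Invalid', 9)")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True
print(constraint_enforced)
```

The point of the last step is that validity is checked by the database, not by each program that inserts data.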


Typical DBMS Functionality


Other features:

- Protection or security measures to prevent unauthorized access
- Active processing to take internal actions on data
- Presentation and visualization of data

Figure 1.1

A simplified database system environment


Example of a Database (with a Conceptual Data Model)


Mini-world for the example: Part of a UNIVERSITY environment.

Some mini-world entities:

- STUDENTs
- COURSEs
- SECTIONs (of COURSEs)
- (academic) DEPARTMENTs
- INSTRUCTORs

Note: The above could be expressed in the ENTITY-RELATIONSHIP data model.



Example of a Database (with a Conceptual Data Model)

Some mini-world relationships:

- SECTIONs are of specific COURSEs
- STUDENTs take SECTIONs
- COURSEs have prerequisite COURSEs
- INSTRUCTORs teach SECTIONs
- COURSEs are offered by DEPARTMENTs
- STUDENTs major in DEPARTMENTs

Note: The above could be expressed in the ENTITY-RELATIONSHIP data model.



Main Characteristics of the Database Approach

- Self-describing nature of a database system: A DBMS catalog stores the description of the database. The description is called metadata. This allows the DBMS software to work with different databases.
- Insulation between programs and data: Called program-data independence. Allows changing data storage structures and operations without having to change the DBMS access programs.
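The self-describing nature can be seen by asking a database for its own description. The sketch below assumes SQLite (through Python's built-in `sqlite3` module), whose catalog is exposed as the `sqlite_master` table; the `course` and `section` tables are hypothetical, and other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_number TEXT PRIMARY KEY, course_name TEXT)")
conn.execute("CREATE TABLE section (section_id INTEGER PRIMARY KEY, course_number TEXT)")

# sqlite_master is SQLite's catalog: it holds metadata, not user data.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk).
course_columns = [row[1] for row in conn.execute("PRAGMA table_info(course)")]

print(table_names)
print(course_columns)
```

Because the description is stored in the database, the same generic code works against any database file without knowing its structure in advance.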

Main Characteristics of the Database Approach

- Data Abstraction: A data model is used to hide storage details and present the users with a conceptual view of the database.
- Support of multiple views of the data: Each user may see a different view of the database, which describes only the data of interest to that user.

Main Characteristics of the Database Approach

- Sharing of data and multiuser transaction processing: allows a set of concurrent users to retrieve and update the database.
  - Concurrency control within the DBMS guarantees that each transaction is correctly executed or completely aborted.
  - OLTP (Online Transaction Processing) is a major part of database applications.

Database Users
Users may be divided into those who actually use and control the content (called Actors on the Scene) and those who enable the database to be developed and the DBMS software to be designed and implemented (called Workers Behind the Scene).


Database Users
Actors on the scene

- Database administrators: responsible for authorizing access to the database, coordinating and monitoring its use, acquiring software and hardware resources, controlling its use, and monitoring the efficiency of operations.
- Database designers: responsible for defining the content, the structure, the constraints, and the functions or transactions against the database. They must communicate with the end-users and understand their needs.
- End-users: they use the data for queries and reports, and some of them actually update the database content.

Categories of End-users

- Casual: access the database occasionally when needed.
- Naive or Parametric: they make up a large section of the end-user population. They use previously well-defined functions in the form of canned transactions against the database. Examples are bank tellers or reservation clerks who do this activity for an entire shift of operations.

Categories of End-users

- Sophisticated: these include business analysts, scientists, engineers, and others thoroughly familiar with the system capabilities. Many use tools in the form of software packages that work closely with the stored database.
- Stand-alone: mostly maintain personal databases using ready-to-use packaged applications. An example is a tax program user who creates his or her own internal database.

Advantages of Using the Database Approach

- Controlling redundancy in data storage and in development and maintenance efforts.
- Sharing of data among multiple users.
- Restricting unauthorized access to data.
- Providing persistent storage for program objects (in object-oriented DBMSs).
- Providing storage structures for efficient query processing.

Advantages of Using the Database Approach

- Providing backup and recovery services.
- Providing multiple interfaces to different classes of users.
- Representing complex relationships among data.
- Enforcing integrity constraints on the database.
- Drawing inferences and actions using rules.

Additional Implications of Using the Database Approach

- Potential for enforcing standards: this is crucial for the success of database applications in large organizations. Standards refer to data item names, display formats, screens, report structures, meta-data (description of data), etc.
- Reduced application development time: the incremental time to add each new application is reduced.

Additional Implications of Using the Database Approach

- Flexibility to change data structures: the database structure may evolve as new requirements are defined.
- Availability of up-to-date information: very important for on-line transaction systems such as airline, hotel, and car reservations.
- Economies of scale: by consolidating data and applications across departments, wasteful overlap of resources and personnel can be avoided.

Historical Development of Database Technology

- Early Database Applications: The Hierarchical and Network Models were introduced in the mid-1960s and dominated during the seventies. A bulk of the worldwide database processing still occurs using these models.
- Relational Model based Systems: The model that was originally introduced in 1970 was heavily researched and experimented with at IBM and the universities. Relational DBMS products emerged in the 1980s.

Historical Development of Database Technology

- Object-oriented applications: OODBMSs were introduced in the late 1980s and early 1990s to cater to the need for complex data processing in CAD and other applications. Their use has not taken off much.
- Data on the Web and E-commerce Applications: The Web contains data in HTML (Hypertext Markup Language) with links among pages. This has given rise to a new set of applications, and E-commerce is using new standards like XML (eXtensible Markup Language).

Extending Database Capabilities

New functionality is being added to DBMSs in the following areas:


- Scientific applications
- Image storage and management
- Audio and video data management
- Data mining
- Spatial data management
- Time series and historical data management

The above gives rise to new research and development in incorporating new data types, complex data structures, new operations and storage and indexing schemes in database systems.


When not to use a DBMS

Main inhibitors (costs) of using a DBMS:

- High initial investment and possible need for additional hardware.
- Overhead for providing generality, security, concurrency control, recovery, and integrity functions.

When a DBMS may be unnecessary:

- If the database and applications are simple, well defined, and not expected to change.
- If there are stringent real-time requirements that may not be met because of DBMS overhead.
- If access to data by multiple users is not required.

When not to use a DBMS

When no DBMS may suffice:

- If the database system is not able to handle the complexity of data because of modeling limitations.
- If the database users need special operations not supported by the DBMS.

Data Models

- Data Model: A set of concepts to describe the structure of a database, and certain constraints that the database should obey.
- Data Model Operations: Operations for specifying database retrievals and updates by referring to the concepts of the data model. Operations on the data model may include basic operations and user-defined operations.

Categories of data models


- Conceptual (high-level, semantic) data models: Provide concepts that are close to the way many users perceive data. (Also called entity-based or object-based data models.)
  - entity, attribute, relationship
- Physical (low-level, internal) data models: Provide concepts that describe details of how data is stored in the computer.
  - record formats, record ordering, access paths
- Implementation (record-oriented) data models: Provide concepts that fall between the above two, balancing user views with some computer storage details.
  - relational, network, hierarchical

Schemas, Instances and Database State


- Database Schema (meta-data): The description of a database. Includes descriptions of the database structure and the constraints that should hold on the database.
- Schema Diagram: A diagrammatic display of (some aspects of) a database schema (refer to Fig 2.1).
- Database Instance: The actual data stored in a database at a particular moment in time. Also called database state (or occurrence, or snapshot) (refer to Fig 1.2).
- Each schema construct has its own current set of instances.
- The database schema changes very infrequently. The database state changes every time the database is updated.

Schemas, Instances and Database State

Schema is also called intension, whereas state is called extension.

- The extension of a given relation is the set of tuples appearing in that relation at any given instant. The extension thus varies with time: it changes as tuples are created, destroyed, and updated.
- The intension of a given relation is independent of time. It is the permanent part of the relation and corresponds to what is specified in the relational schema. The intension thus defines all permissible extensions.
- The intension is a combination of two things: a structure and a set of integrity constraints.
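The split between intension and extension can be made concrete with a toy relation. This is a minimal sketch, not how any DBMS stores relations; the `Relation` class, the STUDENT attributes and the class-year constraint are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Relation:
    """A toy relation: intension (fixed) plus extension (changes over time)."""

    # Intension, part 1: the structure (attribute names).
    attributes: tuple
    # Intension, part 2: integrity constraints, each a function row -> bool.
    constraints: list = field(default_factory=list)
    # Extension: the tuples present at this instant.
    extension: list = field(default_factory=list)

    def is_permissible(self, row: dict) -> bool:
        has_structure = set(row) == set(self.attributes)
        return has_structure and all(check(row) for check in self.constraints)

    def insert(self, row: dict) -> None:
        if not self.is_permissible(row):
            raise ValueError(f"row violates the intension: {row}")
        self.extension.append(row)


# Hypothetical STUDENT relation with one illustrative constraint.
student = Relation(
    attributes=("name", "student_number", "class_year"),
    constraints=[lambda row: 1 <= row["class_year"] <= 5],
)

student.insert({"name": "Smith", "student_number": 17, "class_year": 1})
student.insert({"name": "Brown", "student_number": 8, "class_year": 2})
print(len(student.extension))  # the extension grew; the intension is unchanged

try:
    student.insert({"name": "Nobody", "student_number": 99, "class_year": 9})
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

Every insert changes the extension, while `attributes` and `constraints` stay fixed and decide which extensions are permissible.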

Figure 2.1

Schema diagram for UNIVERSITY database

(Figure callouts: schema construct; known data: names of record types and data items.)

Figure 1.2 UNIVERSITY Database


- Define the database: empty state.
- Load the database: initial state.
- Update the database: each update produces a new state.
- A valid state satisfies the database schema.

Three-Schema Architecture

Proposed to support DBMS characteristics of:

- Program-data independence: insulation of programs and data, and of programs and operations (program-data and program-operation independence).
- Support of multiple views of the data.
- Use of a catalog (database description).

FIGURE 2.2 The three-schema architecture.

Three-Schema Architecture

Defines DBMS schemas at three levels:

- Internal schema at the internal level to describe physical storage structures and access paths. Typically uses a physical data model.
- Conceptual schema at the conceptual level to describe the structure and constraints for the whole database for a community of users. Uses a conceptual or an implementation data model.
- External schemas at the external level to describe the various user views. Usually uses the same data model as the conceptual level.

Three-Schema Architecture
Mappings among schema levels are needed to transform requests and data. Programs refer to an external schema, and are mapped by the DBMS to the internal schema for execution.


Data Independence

- Logical Data Independence: The capacity to change the conceptual schema without having to change the external schemas and their application programs.
  - For example, adding or removing a record type or data item to expand or reduce the database.
- Physical Data Independence: The capacity to change the internal schema without having to change the conceptual schema.
  - For example, reorganizing physical files to improve performance.

Data Independence

When a schema at a lower level is changed, only the mappings between this schema and higher-level schemas need to be changed in a DBMS that fully supports data independence. The higher-level schemas themselves are unchanged. Hence, the application programs need not be changed since they refer to the external schemas.
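A rough illustration of this idea, assuming SQLite through Python's built-in `sqlite3` module: a view plays the role of an external schema, and only the view definition (the mapping) is rewritten when the underlying tables change. The table names, view name and data are hypothetical, and a real DBMS offers more than this sketch shows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level, version 1: a single STUDENT table.
conn.execute(
    "CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT, major TEXT)"
)
conn.execute("INSERT INTO student VALUES (17, 'Smith', 'CS')")

# External level: the view that application programs refer to.
conn.execute(
    "CREATE VIEW student_directory AS SELECT student_number, name FROM student"
)


def application_query(connection):
    # The application only knows the external schema (the view).
    return connection.execute(
        "SELECT student_number, name FROM student_directory ORDER BY student_number"
    ).fetchall()


before = application_query(conn)

# Conceptual schema change: split STUDENT into two tables.
conn.execute("DROP VIEW student_directory")
conn.execute("CREATE TABLE person (student_number INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE enrollment (student_number INTEGER, major TEXT)")
conn.execute("INSERT INTO person SELECT student_number, name FROM student")
conn.execute("INSERT INTO enrollment SELECT student_number, major FROM student")
conn.execute("DROP TABLE student")

# Only the mapping (the view definition) is rewritten.
conn.execute(
    "CREATE VIEW student_directory AS SELECT student_number, name FROM person"
)

after = application_query(conn)
print(before == after)
```

`application_query` is never edited, which is the sense in which the application program is insulated from the schema change.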

DBMS Languages

Data Definition Language (DDL): Used by the DBA and database designers to specify the conceptual schema of a database.

- In many DBMSs, the DDL is also used to define internal and external schemas (views).
- In some DBMSs, a separate storage definition language (SDL) and view definition language (VDL) are used to define internal and external schemas.

DBMS Languages
Data Manipulation Language (DML): Used to specify database retrievals and updates.
DML commands (data sublanguage) can be embedded in a general-purpose programming language (host language), such as COBOL, C or an Assembly Language. Alternatively, stand-alone DML commands can be applied directly (query language).


DBMS Languages
- High-Level or Non-procedural Languages: e.g., SQL, are set-oriented and specify what data to retrieve rather than how to retrieve it. Also called declarative languages.
- Low-Level or Procedural Languages: record-at-a-time; they specify how to retrieve data and include constructs such as looping.
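The contrast between the two styles can be shown on one small request: which students received a grade of B? The sketch assumes Python's built-in `sqlite3` module; the `grade_report` table and its rows are hypothetical. The first version states what is wanted in SQL, the second walks the records one at a time with an explicit loop.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grade_report (student_number INTEGER, course TEXT, grade TEXT)"
)
conn.executemany(
    "INSERT INTO grade_report VALUES (?, ?, ?)",
    [
        (17, "CS1310", "C"),
        (17, "MATH2410", "B"),
        (8, "CS1310", "A"),
        (8, "CS3320", "B"),
    ],
)

# Declarative, set-oriented: state WHAT is wanted; the DBMS decides how.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT student_number FROM grade_report "
        "WHERE grade = 'B' ORDER BY student_number"
    )
]

# Procedural, record-at-a-time: spell out HOW, with an explicit loop.
found = set()
for student_number, course, grade in conn.execute("SELECT * FROM grade_report"):
    if grade == "B":
        found.add(student_number)
procedural = sorted(found)

print(declarative == procedural)
```

Both produce the same answer; the difference is who decides the access strategy.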


DBMS Interfaces

- Stand-alone query language interfaces.
- Programmer interfaces for embedding DML in programming languages:
  - Pre-compiler approach
  - Procedure (subroutine) call approach
- User-friendly interfaces:
  - Menu-based, popular for browsing on the web
  - Forms-based, designed for naive users
  - Graphics-based (point and click, drag and drop, etc.)
  - Natural language: requests in written English
  - Combinations of the above

Other DBMS Interfaces

- Speech as input (?) and output
- Web browser as an interface
- Parametric interfaces (e.g., bank tellers) using function keys
- Interfaces for the DBA:
  - Creating accounts, granting authorizations
  - Setting system parameters
  - Changing schemas or access paths

The Database System Environment


DBMS component modules:

- Buffer management
- Stored data manager
- DDL compiler
- Interactive query interface
  - Query compiler
  - Query optimizer
- Precompiler

DBMS component modules


- Runtime database processor
- System catalog
- Concurrency control system
- Backup and recovery system

FIGURE 2.3 Component modules of a DBMS and their interactions.



DBMS component modules


The database and the DBMS catalog are usually stored on disk. Access to the disk is controlled primarily by the operating system (OS), which schedules disk input/output.
A higher-level stored data manager module of the DBMS controls access to DBMS information that is stored on disk, whether it is part of the database or the catalog. The stored data manager may use basic OS services for carrying out low-level data transfer between the disk and computer main storage, but it controls other aspects of data transfer, such as handling buffers in main memory. Once the data is in main memory buffers, it can be processed by other DBMS modules, as well as by application programs.


DBMS component modules


The DDL compiler processes schema definitions, specified in the DDL, and stores descriptions of the schemas (meta-data) in the DBMS catalog. The catalog includes information such as the names of files, data items, storage details of each file, mapping information among schemas, and constraints, in addition to many other types of information that are needed by the DBMS modules. DBMS software modules then look up the catalog information as needed. The run-time database processor handles database accesses at run time; it receives retrieval or update operations and carries them out on the database. Access to disk goes through the stored data manager.


DBMS component modules


The query compiler handles high-level queries that are entered interactively. It parses, analyzes, and compiles or interprets a query by creating database access code, and then generates calls to the run-time processor for executing the code.
The pre-compiler extracts DML commands from an application program written in a host programming language. These commands are sent to the DML compiler for compilation into object code for database access. The rest of the program is sent to the host language compiler. The object code for the DML commands and the rest of the program are linked, forming a canned transaction whose executable code includes calls to the runtime database processor.

DBMS component modules


The DBMS interacts with the operating system when disk accesses, to the database or to the catalog, are needed. If the computer system is shared by many users, the OS will schedule DBMS disk access requests and DBMS processing along with other processes.

The DBMS also interfaces with compilers for general-purpose host programming languages.

User-friendly interfaces to the DBMS can be provided to help any of the user types shown in Figure 2.3 to specify their requests.

Database System Utilities

To perform certain functions such as:

- Loading data stored in files into a database. Includes data conversion tools.
- Backing up the database periodically on tape.
- Reorganizing database file structures.
- Report generation utilities.
- Performance monitoring utilities.
- Other functions, such as sorting, user monitoring, data compression, etc.

Centralized and Client-Server Architectures

Centralized DBMS: combines everything into a single system, including DBMS software, hardware, application programs and user interface processing software.

Classification of DBMSs
Based on the data model used:

- Traditional: Relational, Network, Hierarchical.
- Emerging: Object-oriented, Object-relational.

Other classifications:

- Single-user (typically used with microcomputers) vs. multi-user (most DBMSs).
- Centralized (uses a single computer with one database) vs. distributed (uses multiple computers, multiple databases).

Classification of DBMSs
Distributed Database Systems have now come to be known as client-server-based database systems because they do not support a totally distributed environment, but rather a set of database servers supporting a set of clients.

Variations of Distributed Environments:


- Homogeneous DDBMS
- Heterogeneous DDBMS
- Federated or Multidatabase Systems

22/07/13

Yan Huang, ER

The Entity Relationship Model


- The ER diagram is widely used in database design.
  - It represents the conceptual level of a database system.
  - It describes things and their relationships at a high level.
- In the conceptual model, the real world consists of a collection of basic objects called entities and the relationships among these objects.


Basic Concepts
- Entity set: an abstraction of similar things, e.g. cars, students.
  - An entity set contains many entities.
- Attributes: common properties of the entities in an entity set.
- Relationship: specifies the relations among entities from two or more entity sets.

One nice thing about the conceptual model is that you can represent the logical structure of a DB graphically, using an ER diagram.


An Example

Example ER Diagram

[Figure: example ER diagram; image not available]

Relationship
The degree of a relationship is the number of entity sets that participate in the relationship.

- Mostly binary relationships
- Sometimes more

Constraints on Relationships
Maximum Cardinality

- One-to-one (1:1)
- One-to-many (1:N) or many-to-one (N:1)
- Many-to-many (M:N)

Minimum Cardinality (also called participation constraint or existence dependency constraint)

- Zero (optional participation, not existence-dependent)
- One or more (mandatory, existence-dependent)

One-One and One-Many


Many-one and many-many


Many-to-one (N:1) RELATIONSHIP


[Figure: instances of the WORKS_FOR relationship between EMPLOYEE (e1 to e7) and DEPARTMENT (d1 to d3), with relationship instances r1 to r7]

Many-to-many (M:N) RELATIONSHIP


[Figure: a many-to-many relationship with entities e1 to e7 on one side and p1 to p3 on the other, connected by relationship instances r1 to r9]

Total Participation
When we require all entities to participate in the relationship (total participation), we use double lines to specify it.

Example: every loan has to have at least one customer.

Self Relationship
- Sometimes entities in an entity set relate to other entities in the same set; this is a self relationship.
- Here, employees manage some other employees.
- The labels manager and worker are called roles in the self relationship.
- Roles specify how employee entities interact via the works_for relationship set.
- Role labels are optional, and are used to clarify the semantics of the relationship.

Cardinality Constraints
We express cardinality constraints by drawing either a directed line (→), signifying one, or an undirected line (-), signifying many, between the relationship set and the entity set.

One-to-one relationship:

- A customer is associated with at most one loan via the relationship borrower.
- A loan is associated with at most one customer via borrower.

One-To-Many Relationship
In a one-to-many relationship, a loan is associated with at most one customer via borrower, and a customer is associated with several (including 0) loans via borrower.

Many-To-One Relationships
In a many-to-one relationship, a loan is associated with several (including 0) customers via borrower, and a customer is associated with at most one loan via borrower.

Many-To-Many Relationship
- A customer is associated with several (possibly 0) loans via borrower.
- A loan is associated with several (possibly 0) customers via borrower.
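The four cases can be told apart mechanically from a set of borrower pairs. The helper below is a hypothetical sketch: `mapping_cardinality` and the sample identifiers are made up for illustration. It reports the cardinality observed in one instance, which can refute a stricter constraint but cannot prove that a constraint holds for the schema.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify (customer, loan) pairs by the maximum cardinality observed."""
    unique_pairs = set(pairs)
    loans_per_customer = Counter(customer for customer, _ in unique_pairs)
    customers_per_loan = Counter(loan for _, loan in unique_pairs)

    customer_has_many = any(n > 1 for n in loans_per_customer.values())
    loan_has_many = any(n > 1 for n in customers_per_loan.values())

    if customer_has_many and loan_has_many:
        return "many-to-many"
    if customer_has_many:
        return "one-to-many"  # one customer, several loans
    if loan_has_many:
        return "many-to-one"  # several customers, one loan
    return "one-to-one"


print(mapping_cardinality([("c1", "l1"), ("c2", "l2")]))
print(mapping_cardinality([("c1", "l1"), ("c1", "l2")]))
print(mapping_cardinality([("c1", "l1"), ("c2", "l1")]))
print(mapping_cardinality([("c1", "l1"), ("c1", "l2"), ("c2", "l1")]))
```

The direction follows the text above: one-to-many means one customer with several loans, many-to-one means several customers sharing one loan.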


Example COMPANY Database


Requirements of the company (oversimplified for illustrative purposes):

- The company is organized into DEPARTMENTs. Each department has a name, a number and an employee who manages the department. We keep track of the start date of the department manager.
- Each department controls a number of PROJECTs. Each project has a name, a number and is located at a single location.

Example COMPANY Database (Cont.)


- We store each EMPLOYEE's social security number, address, salary, sex, and birthdate.
  - Each employee works for one department but may work on several projects.
  - We keep track of the number of hours per week that an employee currently works on each project.
  - We also keep track of the direct supervisor of each employee.
- Each employee may have a number of DEPENDENTs.
  - For each dependent, we keep track of their name, sex, birthdate, and relationship to the employee.

ER Model Concepts
Entities and Attributes
- Entities are specific objects or things in the mini-world that are represented in the database. For example, the EMPLOYEE John Smith, the Research DEPARTMENT, the ProductX PROJECT.
- Attributes are properties used to describe an entity. For example, an EMPLOYEE entity may have a Name, SSN, Address, Sex, BirthDate.
- A specific entity will have a value for each of its attributes. For example, a specific employee entity may have Name='John Smith', SSN='123456789', Address='731, Fondren, Houston, TX', Sex='M', BirthDate='09-JAN-55'.
- Each attribute has a value set (or data type) associated with it, e.g. integer, string, subrange, enumerated type.

Types of Attributes (1)


Simple
Each entity has a single atomic value for the attribute. For example, SSN or Sex.

Composite
The attribute may be composed of several components. For example, Address (Apt#, House#, Street, City, State, ZipCode, Country) or Name (FirstName, MiddleName, LastName). Composition may form a hierarchy where some components are themselves composite.

Multi-valued
An entity may have multiple values for that attribute. For example, Color of a CAR or PreviousDegrees of a STUDENT. Denoted as {Color} or {PreviousDegrees}.


Types of Attributes (2)


- In general, composite and multi-valued attributes may be nested arbitrarily to any number of levels, although this is rare.
  - For example, PreviousDegrees of a STUDENT is a composite multi-valued attribute denoted by {PreviousDegrees (College, Year, Degree, Field)}.
- Stored and derived attributes
  - Age (derived attribute) is derivable from BirthDate (stored attribute), i.e., Current Date - BirthDate.
- Null values
  - Not applicable, e.g., Apartment Number, College Degree
  - Unknown
    - The attribute value exists but is missing, e.g., Height
    - It is not known whether the value exists, e.g., HomePhone
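A derived attribute is computed on demand rather than stored. The sketch below derives Age from a stored BirthDate and treats a missing birth date as a null; the function name `derive_age` and the dates are illustrative assumptions.

```python
from datetime import date
from typing import Optional


def derive_age(birth_date: Optional[date], today: date) -> Optional[int]:
    """Derive Age (in whole years) from the stored BirthDate."""
    if birth_date is None:
        # Null stored value: the derived value is unknown as well.
        return None
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return today.year - birth_date.year - (0 if had_birthday else 1)


# Illustrative dates only.
age_after_birthday = derive_age(date(1955, 1, 9), date(2013, 7, 22))
age_before_birthday = derive_age(date(1955, 1, 9), date(2013, 1, 8))
age_unknown = derive_age(None, date(2013, 7, 22))
print(age_after_birthday, age_before_birthday, age_unknown)
```

Storing only BirthDate avoids an Age value that silently goes stale as the current date advances.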


Entity Types and Key Attributes


Entities with the same basic attributes are grouped or typed into an entity type. For example, the EMPLOYEE entity type or the PROJECT entity type.
An attribute of an entity type for which each entity must have a unique value is called a key attribute of the entity type. For example, SSN of EMPLOYEE. A key attribute may be composite.
For example, VehicleTagNumber is a key of the CAR entity type with components (Number, State).

An entity type may have more than one key. For example, the CAR entity type may have two keys:
VehicleIdentificationNumber (popularly called VIN) and VehicleTagNumber (Number, State), also known as license_plate number.


ENTITY SET corresponding to the ENTITY TYPE CAR


CAR Registration(RegistrationNumber, State), VehicleID, Make, Model, Year, (Color)

- car1 ((ABC 123, TEXAS), TK629, Ford Mustang, convertible, 1999, (red, black))
- car2 ((ABC 123, NEW YORK), WP9872, Nissan 300ZX, 2-door, 2002, (blue))
- car3 ((VSY 720, TEXAS), TD729, Buick LeSabre, 4-door, 2003, (white, blue))
- ...

ER Diagram Notations

ER DIAGRAM Entity Types are: EMPLOYEE, DEPARTMENT, PROJECT, DEPENDENT

Relationships and Relationship Types


A relationship relates two or more distinct entities with a specific meaning.
For example, EMPLOYEE John Smith works on the ProductX PROJECT or EMPLOYEE Franklin Wong manages the Research DEPARTMENT.

Relationships of the same type are grouped or typed into a relationship type.
For example, the WORKS_ON relationship type in which EMPLOYEEs and PROJECTs participate, or the MANAGES relationship type in which EMPLOYEEs and DEPARTMENTs participate. The degree of a relationship type is the number of participating entity types. Both MANAGES and WORKS_ON are binary relationships.


Relationships and Relationship Types (2)


More than one relationship type can exist with the same participating entity types.

For example, MANAGES and WORKS_FOR are distinct relationships between EMPLOYEE and DEPARTMENT, but with different meanings and different relationship instances.

ER DIAGRAM Relationship Types are: WORKS_FOR, MANAGES, WORKS_ON, CONTROLS, SUPERVISION, DEPENDENTS_OF

Weak Entity Types (vs. Strong Entity types)


- An entity type that does not have a key attribute.
- A weak entity type must participate in an identifying relationship type with an owner or identifying entity type.
- Entities are identified by the combination of:
  - A partial key of the weak entity type: a set of attributes that can uniquely identify weak entities that are related to the same owner entity.
  - The particular entity they are related to in the identifying entity type.

Example: Suppose that a DEPENDENT entity is identified by the dependent's first name and birthdate, and the specific EMPLOYEE that the dependent is related to. DEPENDENT is a weak entity type with EMPLOYEE as its identifying entity type via the identifying relationship type DEPENDENTS_OF.

A weak entity type always has a total participation constraint with respect to its identifying relationship.
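The DEPENDENT example can be sketched as a small data class. The class, field names and sample values are hypothetical; the point is only that the partial key (first name, birth date) is not unique on its own, while the owner's key combined with the partial key is.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Dependent:
    owner_ssn: str      # key of the owning EMPLOYEE (identifying entity)
    first_name: str     # partial key, part 1
    birth_date: date    # partial key, part 2
    relationship: str

    @property
    def partial_key(self):
        return (self.first_name, self.birth_date)

    @property
    def full_key(self):
        # Owner's key plus partial key identifies the weak entity.
        return (self.owner_ssn,) + self.partial_key


# Illustrative data: two employees each have a daughter with the same
# first name and birth date.
a = Dependent("123456789", "Alice", date(1988, 4, 5), "daughter")
b = Dependent("987654321", "Alice", date(1988, 4, 5), "daughter")

same_partial_key = a.partial_key == b.partial_key
distinct_entities = a.full_key != b.full_key
print(same_partial_key, distinct_entities)
```

`owner_ssn` has no default value, so a `Dependent` cannot be created without an owner, which mirrors the total participation constraint.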

Weak Entity Type is: DEPENDENT. Identifying Relationship is: DEPENDENTS_OF.

Keys
- A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity.
- A candidate key of an entity set is a minimal super key.
  - customer_id is the candidate key of customer.
  - account_number is the candidate key of account.

Keys
A candidate key is a minimal set of attributes necessary to identify a tuple; it is also called a minimal superkey. Examples of superkeys are {employeeID, Name}, {employeeID, Name, job}, and {employeeID, Name, job, departmentID}. The last example is known as a trivial superkey, because it uses all attributes of the table to identify the tuple. In a real database we do not need values for all of those attributes to identify a tuple. We only need, per our example, the set {employeeID}. This is a minimal superkey, that is, a minimal set of attributes that can be used to identify a single tuple. So, employeeID is a candidate key.

Although several candidate keys may exist, one of the candidate keys is selected to be the primary key.
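The distinction between superkeys and candidate keys can be checked by brute force on sample data. This is a hedged sketch: `is_superkey`, `candidate_keys` and the employee rows are hypothetical, and a key is really a constraint on the schema, so sample rows can only refute a proposed key, not prove it.

```python
from itertools import combinations


def is_superkey(rows, attributes):
    """True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attributes)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, all_attributes):
    """Return the minimal superkeys, trying smaller attribute sets first."""
    keys = []
    for size in range(1, len(all_attributes) + 1):
        for combo in combinations(all_attributes, size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, combo):
                keys.append(combo)
    return keys


attributes = ("employeeID", "Name", "job", "departmentID")
employees = [
    {"employeeID": 1, "Name": "Ann", "job": "clerk", "departmentID": 10},
    {"employeeID": 2, "Name": "Bob", "job": "clerk", "departmentID": 10},
    {"employeeID": 3, "Name": "Ann", "job": "clerk", "departmentID": 10},
]

print(is_superkey(employees, ("employeeID", "Name")))  # superkey, not minimal
print(is_superkey(employees, ("Name", "job")))         # two rows collide
print(candidate_keys(employees, attributes))
```

With these rows, every attribute set containing employeeID is a superkey, but only {employeeID} is minimal.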

Keys for Relationship Sets


The combination of primary keys of the participating entity sets forms a super key of a relationship set.

(customer_id, account_number) is the super key of depositor

NOTE: this means a pair of entities can have at most one relationship in a particular relationship set.
Example: if we wish to track all access_dates to each account by each customer, we cannot assume a relationship for each access. We can use a multivalued attribute, though.

We must consider the mapping cardinality of the relationship set when deciding what the candidate keys are.

We need to consider the semantics of the relationship set in selecting the primary key in case there is more than one candidate key.

Weak Entity Sets


For most entity sets, a primary key can be easily identified by looking at its attributes.

Such an entity set, which has a primary key, is referred to as a strong entity set.

For some entities, however, a primary key cannot be identified among its attributes.

Such an entity set, which does not have a primary key, is referred to as a weak entity set.

A weak entity set is typically associated with an identifying entity set (which is usually strong) via a total, one-to-many relationship.

Weak Entity Sets, Cont.


In such a case, the (weak) entity typically has a subset of attributes, called a discriminator (or partial key), that distinguishes among all entities of the weak entity set associated with one identifying entity.

In such a case, a primary key for the weak entity set can be constructed with two parts:

- The primary key of the associated identifying entity set
- The weak entity set's discriminator

Weak Entity Sets (Cont.)


A weak entity set is represented by double rectangles.

The discriminator is underlined with a dashed line.

Primary key for payment is (loan-number, payment-number)
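When such a design is turned into tables, the two-part key shows up as a composite primary key. The sketch below assumes SQLite through Python's built-in `sqlite3` module; the column names, amounts and loan numbers are hypothetical. Note that in a table, unlike in the ER diagram, the owner's key has to appear as an explicit column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only if enabled

conn.execute("CREATE TABLE loan (loan_number TEXT PRIMARY KEY, amount REAL)")
conn.execute(
    """
    CREATE TABLE payment (
        loan_number    TEXT NOT NULL REFERENCES loan(loan_number),
        payment_number INTEGER NOT NULL,   -- the discriminator
        payment_amount REAL,
        PRIMARY KEY (loan_number, payment_number)
    )
    """
)

conn.execute("INSERT INTO loan VALUES ('L-17', 1000.0)")
conn.execute("INSERT INTO loan VALUES ('L-23', 2000.0)")

# The same discriminator under different loans is allowed.
conn.execute("INSERT INTO payment VALUES ('L-17', 1, 100.0)")
conn.execute("INSERT INTO payment VALUES ('L-23', 1, 250.0)")


def try_insert(row):
    """Return True if the payment row is accepted, False if it is rejected."""
    try:
        conn.execute("INSERT INTO payment VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_accepted = try_insert(("L-17", 1, 50.0))  # repeats (loan, payment number)
orphan_accepted = try_insert(("L-99", 1, 50.0))     # no owning loan exists
payment_count = conn.execute("SELECT COUNT(*) FROM payment").fetchone()[0]
print(duplicate_accepted, orphan_accepted, payment_count)
```

The foreign key with NOT NULL plays the role of the total participation of payment in its identifying relationship.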

Weak Entity Sets (Cont.)


The primary key of the strong entity set is not explicitly specified in the weak entity set, since it is implicit in the identifying relationship.

If loan-number were explicitly specified:

- Payment would be a strong entity set.
- The relationship between payment and loan would not be as clear.

More Weak Entity Set Examples


In a university, a course is a strong entity and a course-offering can be modeled as a weak entity.

- Course = (course-number, name, description)
- Course-offering = (semester, section-number, instructor)

The discriminator of course-offering would be semester (including year) and section-number (if there is more than one section).
If course-offering were modeled as a strong entity, then it would have course-number as an attribute.
The relationship with course would be implicit in the course-number attribute.

ER diagram for banking enterprise