
Week 3: Introduction to Databases

The database is the underlying framework of the information system. Relational Database Management Systems (RDBMSs) are the predominant systems for business applications at present.

MIST811: Week 3

3.1 Introduction
A database is a collection of related data. The Database Management System (DBMS) is the software that manages and controls access to the database. A database application is a program that interacts with the database at some point of its execution. The database system is a collection of application programs that interact with the database along with the DBMS and the database itself. More accurate definitions will be provided later.

Examples
Purchases from a supermarket
A bar code reader scans each purchase. The reader is linked to an application program that uses the bar code to look up the price of the item in a product database. The program then reduces the recorded number of such items in stock and displays the price on the cash register. If the stock level falls below a specified reorder threshold, the database system may automatically place an order to obtain more stock of that item.
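The supermarket steps can be sketched in a few lines of Python using the standard sqlite3 module. This is only an illustration: the Product table, its column names, and the scan_item function are assumptions made for the sketch, not part of any real point-of-sale system.

```python
import sqlite3

# Hypothetical product database; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (barCode TEXT PRIMARY KEY, price REAL, "
    "inStock INTEGER, reorderLevel INTEGER)"
)
conn.execute("INSERT INTO Product VALUES ('9300001', 2.50, 11, 10)")


def scan_item(conn, bar_code):
    """Look up the price, reduce the stock by one, and report whether to reorder."""
    row = conn.execute(
        "SELECT price FROM Product WHERE barCode = ?", (bar_code,)
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown bar code {bar_code}")
    conn.execute(
        "UPDATE Product SET inStock = inStock - 1 WHERE barCode = ?", (bar_code,)
    )
    in_stock, reorder_level = conn.execute(
        "SELECT inStock, reorderLevel FROM Product WHERE barCode = ?", (bar_code,)
    ).fetchone()
    # Reorder only once the stock has dropped below the reorder level.
    return row[0], in_stock < reorder_level


first = scan_item(conn, "9300001")   # stock goes from 11 to 10
second = scan_item(conn, "9300001")  # stock goes from 10 to 9
```

Each call returns the price to display and a flag saying whether an order for more stock should be placed.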


Examples ctd
Purchases using your credit card
The assistant normally checks that you have sufficient credit left to make the purchase. This can be done by telephone or automatically by a card reader linked to a computer system. There is a database somewhere that contains information about the purchases you have made on your credit card. To check your credit, a database application program uses your credit card number to check that the price of the goods you wish to buy, together with the sum of the purchases you have already made this billing period, is within your credit limit.

Credit card example ctd


After confirmation of the purchase the details of the purchase are added to this database. The application program also accesses the database to check that the credit card is not on a list of stolen credit cards before authorising the purchase. There are other application programs to send out monthly statements to each credit card holder and to credit accounts when payment is received.
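The credit card checks described above could look roughly like the following sketch. The Card and Purchase tables, their columns, and the authorise function are hypothetical, and for brevity the sketch treats every stored purchase as belonging to the current billing period.

```python
import sqlite3

# Hypothetical schema; all names and values are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Card (cardNo TEXT PRIMARY KEY, creditLimit REAL, stolen INTEGER);
CREATE TABLE Purchase (cardNo TEXT, amount REAL);
INSERT INTO Card VALUES ('1111', 1000.0, 0), ('2222', 1000.0, 1);
INSERT INTO Purchase VALUES ('1111', 300.0), ('1111', 450.0);
""")


def authorise(conn, card_no, price):
    """Return True and record the purchase if the card passes both checks."""
    card = conn.execute(
        "SELECT creditLimit, stolen FROM Card WHERE cardNo = ?", (card_no,)
    ).fetchone()
    if card is None or card[1]:
        return False  # unknown card, or card is on the stolen list
    (spent,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM Purchase WHERE cardNo = ?",
        (card_no,),
    ).fetchone()
    if spent + price > card[0]:
        return False  # purchase would exceed the credit limit
    # After confirmation, the purchase details are added to the database.
    conn.execute("INSERT INTO Purchase VALUES (?, ?)", (card_no, price))
    return True
```

A real system would also restrict the sum to the current billing period and run the check and the insert as one transaction.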

Examples ctd
Booking a holiday using a travel agency
When you make enquiries about a holiday, the travel agent may access several databases containing holiday and flight details. When you book the holiday, the database system has to make all the necessary booking arrangements.
The system has to ensure that two different agents do not book the same holiday or overbook the seats on a flight.

The travel agent may have another, usually separate, database for invoicing.


3.2 Traditional File-Based Systems


The file-based system is the predecessor of the database system. This approach is mostly obsolete, but it is still worth studying:
- understanding the problems inherent in file-based systems may prevent us from repeating them in our database systems;
- understanding how the file-based system works is extremely useful when converting a file-based system to a database system.

3.2.1 File-Based Approach


DEFN: The file-based system is a collection of application programs that perform services for the end-users, such as the production of reports. Each program defines and manages its own data. These systems were an early attempt to computerise the manual filing system that most of us are familiar with. The manual filing system works well only when the number of items to be stored is small. It also works adequately with large numbers of items if we only have to store and retrieve them.

When does the manual filing system break down?


The manual filing system breaks down when we have to cross-reference or process information in the files. Example: A typical real estate agent's office might have a separate file for each property for sale or rent, each potential buyer and renter, and each member of staff.

Real Estate Example ctd


How easy would it be to answer these questions using this set-up?
- What three-bedroom properties do you have for sale with a garden and garage?
- What flats do you have for rent within three kilometres of the city centre?
- What is the average rent for a two-bedroom flat?
- What is the total annual salary bill for staff?
- How does last month's turnover compare with the projected figure for this month?
- What is the expected monthly turnover for the next financial year?


Real Estate Example ctd


The file-based system was developed in response to the needs of industry for more efficient data access. A decentralised approach was taken, where each department, with the assistance of Data Processing (DP) staff, stored and controlled its own data. We will illustrate this using the DreamHome example which will be outlined on the following slides.

DreamHome Example
The Sales Department is responsible for selling and renting property. The form that a landlord has to fill out before a property is marketed for rent is given on the next slide.
This gives details of the rental property as well as owner (landlord) details.

Figure 3.1

Sales Department forms: (a) Property for Rent Details form


DreamHome Example ctd


- The Sales Department also handles enquiries from clients (renters).

Figure 3.1 ctd (b) Client Details form

DreamHome Example ctd


With the assistance of the Data Processing (DP) Department, the Sales Department creates an information system to handle the renting of property.
This consists of three files containing property, owner, and client details (see next slide). For simplicity, details relating to staff members, branch offices, and business owners are omitted.


Figure 3.2 Sales files used: PropertyForRent, PrivateOwner, and Client

DreamHome Example ctd


Contracts Department is responsible for rental agreements. Whenever a client agrees to rent a property a form is filled out by one of the Sales staff (see next slide) which is passed to the Contracts Department which allocates a lease number and completes the payment and rental period details.

Figure 3.3

Lease Details form used by Contracts Dept.


DreamHome Example ctd


With assistance from the Data Processing (DP) Department, the Contracts Department creates an information system to handle lease agreements.
This consists of three files containing lease, property, and client details (see next slide). The data is similar to that held by the Sales Department.

Figure 3.4 Lease file, PropertyForRent file, and Client file

DreamHome Example ctd


Figure 3.5 illustrates the situation. Each department accesses its own files through application programs written specially for it. Each set of application programs handles data entry, file maintenance, and the generation of a fixed set of specific reports. The physical structure and storage of the data files and records are defined in the application code.

Figure 3.5 File-based processing

Sales Files
PropertyForRent (propertyNo, street, city, postcode, type, rooms, rent, ownerNo)
PrivateOwner (ownerNo, fName, lName, address, telNo)
Client (clientNo, fName, lName, address, telNo, prefType, maxRent)

Figure 3.5 ctd

Contract Files
Lease (leaseNo, propertyNo, clientNo, rent, paymentMethod, deposit, paid, rentStart, rentFinish, duration)
PropertyForRent (propertyNo, street, city, postcode, rent)
Client (clientNo, fName, lName, address, telNo)

DreamHome Example ctd
There is a significant amount of duplication of data between the two departments. Such duplication is generally true of file-based systems.


Terminology used in file-based systems


A file is simply a collection of records, which contain logically related data. Each record contains a logically connected set of one or more fields, where each field represents some characteristic of the real-world object that is being modelled.

3.2.2 Limitations of the File-Based Approach


Separation and isolation of data
When data is isolated in separate files, it is more difficult to access data that should be available. The difficulty is compounded if we require data from more than two files.

Limitations of file-based systems ctd


Duplication of data

Uncontrolled duplication of data is undesirable:
- It is wasteful. It costs time and money to enter the data more than once.
- It takes up additional storage space, which has costs attached. Often duplication can be avoided by sharing files.
- Duplication can lead to loss of data integrity; the data is no longer consistent.
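A small sketch of how duplicated data drifts out of step. The two dictionaries stand in for the Client files kept separately by Sales and Contracts; the client number, names, and telephone numbers are invented for the example.

```python
# Each department keeps its own copy of the same client record (hypothetical values).
sales_client = {"C001": {"fName": "Ann", "lName": "Lee", "telNo": "555-0100"}}
contracts_client = {"C001": {"fName": "Ann", "lName": "Lee", "telNo": "555-0100"}}

# Sales records a new telephone number; nothing propagates the change to Contracts.
sales_client["C001"]["telNo"] = "555-0199"


def inconsistent_fields(a, b):
    """Return the names of fields whose values differ between two copies."""
    return sorted(k for k in a if a[k] != b.get(k))


diff = inconsistent_fields(sales_client["C001"], contracts_client["C001"])
```

After the update, the two departments hold different telephone numbers for the same client, and neither file can say which one is correct.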

Limitations of file-based systems ctd


Data dependence
The physical structure and storage of the data files and records are defined in the application code. Making changes to an existing structure is difficult.
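To make data dependence concrete, here is a minimal sketch in which the record layout is hard-coded in the program, as it would be in a file-based system. The field widths and the read_rent function are assumptions chosen for the example.

```python
import struct

# Layout fixed in the application code: propertyNo (4 bytes),
# city (10 bytes), rent (unsigned 16-bit integer).
OLD_LAYOUT = struct.Struct("<4s10sH")


def read_rent(record: bytes) -> int:
    """Extract the rent, assuming the record follows OLD_LAYOUT exactly."""
    _, _, rent = OLD_LAYOUT.unpack(record)
    return rent


old_record = OLD_LAYOUT.pack(b"P001", b"Glasgow".ljust(10), 350)

# The file is later rewritten with a wider city field (20 bytes).
NEW_LAYOUT = struct.Struct("<4s20sH")
new_record = NEW_LAYOUT.pack(b"P001", b"Glasgow".ljust(20), 350)

try:
    read_rent(new_record)
    broke = False
except struct.error:
    broke = True  # the unchanged program can no longer read the file
```

Widening one field breaks every program that embeds the old layout, so each of them has to be found, changed, and retested.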


Limitations of file-based systems ctd


Incompatible file formats

The structure of files is embedded in the application programs, so the structures are dependent on the application programming language.
Example: The structure of a file generated by a COBOL program may be different from the structure of a file generated by a C program. The direct incompatibility of such files makes them difficult to process jointly. They need to be converted to some common format to facilitate processing.

Limitations of file-based systems ctd


Fixed queries/proliferation of application programs
File-based systems are very dependent upon the application developer, who has to write any queries or reports that are required. In some organisations, the types of query or report that could be produced were fixed.

Limitations of file-based systems ctd


In other organisations there was a proliferation of files and application programs.
Eventually the DP Department, with its current resources, could not handle all the work. The pressure put on DP staff resulted in programs that were inadequate or inefficient in meeting the demands of the users, had limited documentation, and were difficult to maintain.

Limitations of file-based systems ctd


Certain types of functionality were often sacrificed:
- there was no provision for security or integrity;
- in the event of hardware or software failure, recovery was limited or non-existent;
- there was no provision for shared access: access to files was restricted to one user at a time.


3.3 Database Approach


The limitations of the file-based approach can be attributed to two factors:
1. The definition of the data is embedded in the application programs, rather than being stored separately and independently;
2. There is no control over the access and manipulation of data beyond that imposed by the application programs.


The database
The database and the Database Management System (DBMS) were developed to overcome these limitations.

DEFN: The database is a shared collection of logically related data, and a description of this data, designed to meet the information needs of an organisation.
The database is a single, possibly large, repository of data that can be used simultaneously by many departments and users. All data items are integrated with a minimum amount of duplication.

The database ctd


The database is a shared corporate resource. The database holds not only the organisation's operational data but also a description of the data. For this reason, a database is also defined as a self-describing collection of integrated records. The description of the data is known as the system catalog (or data dictionary, or metadata: the 'data about the data'). It is the self-describing nature of a database that provides program-data independence.
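As one concrete illustration of a self-describing database, SQLite keeps its catalog in a table called sqlite_master and exposes column descriptions through PRAGMA table_info. The sketch below uses Python's sqlite3 module; SQLite is chosen here only as a convenient example DBMS, and the PrivateOwner columns are a shortened version of the DreamHome file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PrivateOwner (ownerNo TEXT PRIMARY KEY, fName TEXT, lName TEXT)"
)

# Ask the database itself which tables it holds ...
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# ... and which columns a given table has (column name is the second field).
columns = [row[1] for row in conn.execute("PRAGMA table_info(PrivateOwner)")]
```

No application program had to be told the structure in advance: the description is stored alongside the data and can be queried like any other data.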


The database ctd


The database approach separates the structure of the data from the application programs and stores it in the database. If new data structures are added or existing structures are modified, then the application programs are unaffected, provided they do not directly depend upon what has been modified.
If we add a new field or record or create a new file, existing applications are unaffected. If we remove a field from a file that an application program uses, then that application program is affected by this change and must be modified accordingly.
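The two cases above can be tried out with a short sqlite3 sketch. The Client table is a cut-down, hypothetical version of the DreamHome file, and client_phone stands in for an existing application program.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Client (clientNo TEXT PRIMARY KEY, fName TEXT, telNo TEXT)")
conn.execute("INSERT INTO Client VALUES ('C001', 'Ann', '555-0100')")


def client_phone(conn, client_no):
    """An 'existing application' that names only the fields it needs."""
    rows = conn.execute(
        "SELECT telNo FROM Client WHERE clientNo = ?", (client_no,)
    ).fetchall()
    return rows[0][0]


before = client_phone(conn, "C001")
conn.execute("ALTER TABLE Client ADD COLUMN maxRent REAL")  # add a new field
after = client_phone(conn, "C001")  # the application still works unchanged

# Remove the telNo field by rebuilding the table without it.
conn.execute("CREATE TABLE ClientNew (clientNo TEXT PRIMARY KEY, fName TEXT, maxRent REAL)")
conn.execute("INSERT INTO ClientNew SELECT clientNo, fName, maxRent FROM Client")
conn.execute("DROP TABLE Client")
conn.execute("ALTER TABLE ClientNew RENAME TO Client")

try:
    client_phone(conn, "C001")
    affected = False
except sqlite3.OperationalError:
    affected = True  # the application depends on the removed field
```

Adding maxRent leaves the program untouched, while removing telNo forces a change to every program that uses it.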

Some more definitions


An entity is a distinct object (a person, place, thing, concept, or event) in the organisation that is to be represented in the database. An attribute is a property that describes some aspect of the object that we wish to record. A relationship is an association between entities.

DreamHome Case Study


Figure 3.6 Example Entity Relationship (ER) Diagram for part of the DreamHome


DreamHome Case Study ctd


Figure 3.6 consists of:
- Six entities (rectangles): Branch, Staff, PropertyForRent, Client, PrivateOwner, and Lease;
- Seven relationships (the names adjacent to the lines): Has, Offers, Oversees, Views, Owns, LeasedBy, and Holds;
- Six attributes, one for each entity: branchNo, staffNo, propertyNo, clientNo, ownerNo, and leaseNo.

DreamHome Case Study ctd


The database represents the entities, the attributes, and the logical relationships between the entities. That is, the database holds data that is logically related.


3.3.2 The Database Management System (DBMS)


DEFN: The Database Management System (DBMS) is a software system that enables users to define, create, maintain, and control access to the database. The DBMS is the software that interacts with the users' application programs and the database.

DBMS ctd
A DBMS usually provides the following facilities. It allows users to define the database, usually through a Data Definition Language (DDL). The DDL allows users to specify the data types and structures, and the constraints on the data to be stored in the database.
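A minimal DDL sketch, run through Python's sqlite3 module. The column list follows the PropertyForRent file from Figure 3.5; the data types and the NOT NULL and CHECK constraints are assumptions added to show what a DDL can express.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: data types, structure, and constraints are declared once, in the database.
conn.execute("""
CREATE TABLE PropertyForRent (
    propertyNo TEXT PRIMARY KEY,
    street     TEXT NOT NULL,
    city       TEXT NOT NULL,
    postcode   TEXT,
    type       TEXT,
    rooms      INTEGER CHECK (rooms > 0),
    rent       REAL CHECK (rent >= 0),
    ownerNo    TEXT
)
""")

try:
    # rooms = 0 violates the CHECK constraint declared above.
    conn.execute(
        "INSERT INTO PropertyForRent VALUES "
        "('P001', '1 High St', 'Glasgow', NULL, 'Flat', 0, 350, 'O001')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

conn.execute(
    "INSERT INTO PropertyForRent VALUES "
    "('P001', '1 High St', 'Glasgow', NULL, 'Flat', 3, 350, 'O001')"
)
stored = conn.execute("SELECT COUNT(*) FROM PropertyForRent").fetchone()[0]
```

The constraint is enforced by the DBMS for every program that inserts data, rather than being re-implemented in each application.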

DBMS ctd
It allows users to insert, update, delete, and retrieve data from the database, usually through a Data Manipulation Language (DML). Having a central repository for all data and data descriptions allows the DML to provide a general inquiry facility to this data, called a query language. The most common query language is the Structured Query Language (SQL).
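The four DML operations can be shown with SQL issued through Python's sqlite3 module. The Client table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Client (clientNo TEXT PRIMARY KEY, fName TEXT, maxRent REAL)")

# INSERT: add new rows.
conn.executemany(
    "INSERT INTO Client VALUES (?, ?, ?)",
    [("C001", "Ann", 400.0), ("C002", "Bob", 600.0), ("C003", "Cy", 750.0)],
)

# UPDATE: change an existing row.
conn.execute("UPDATE Client SET maxRent = 650.0 WHERE clientNo = 'C002'")

# DELETE: remove a row.
conn.execute("DELETE FROM Client WHERE clientNo = 'C003'")

# SELECT: retrieve data with an ad hoc query.
rows = conn.execute(
    "SELECT clientNo, maxRent FROM Client WHERE maxRent >= 500 ORDER BY clientNo"
).fetchall()
```

The final query was not anticipated when the table was created, which is the point of a general inquiry facility: new questions do not need new application programs.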

DBMS ctd
It provides controlled access to the database. It may provide:
- a security system, which prevents unauthorized users from accessing the database;
- an integrity system, which maintains the consistency of stored data;


DBMS ctd
- a concurrency system, which allows shared access to the database;
- a recovery system, which restores the database to a previous consistent state following a hardware or software failure;
- a user-accessible catalogue, which contains descriptions of the data in the database.

DBMS ctd
DEFN: The Application Program is a computer program that interacts with the database by issuing an appropriate request (typically an SQL statement) to the DBMS. Users interact with the database through a number of application programs that are used to create and maintain the database and to generate information. The application programs may be written in some programming language or in some higher-level fourth-generation language.

Figure 3.7

Pearson Education Limited 1995, 2005



Views
DEFN: A view is some subset of the database. It allows users to see the data the way they want to see it.
Benefits of views:
- They provide a level of security: a view can exclude data that some users should not see.
- They provide a mechanism to customize the appearance of the database.
- They can present a consistent, unchanging picture of the structure of the database even if the underlying database is changed.

3.3.4 Components of the DBMS Environment


There are five major components: hardware, software, data, procedures, and people.


Hardware
Hardware is needed for the DBMS and the applications to run on. It can range from a single personal computer, to a single mainframe, to a network of computers, depending on the organization's requirements and the DBMS used. A DBMS requires a minimum amount of main memory and disk space to run, but this minimum may not give acceptable performance. On a network, the backend is the part of the DBMS that manages and controls access to the database, and the frontend is the part of the DBMS that interfaces with the user. This is called a client-server architecture: the backend is the server and the frontends are the clients.


Software
The software is made up of the DBMS software and the application programs, together with the operating system, including network software if the DBMS is being used over a network. Application programs may be written in a third-generation language, or using a fourth-generation language such as SQL embedded in a third-generation language.

Data
From the end-users' point of view, data is the most important component of the DBMS environment. Data is the bridge between the machine components and the human components. The database contains both the operational data and the metadata (data about data). The structure of the database is called the schema.

Data ctd
Figure 3.7 on an earlier slide shows a schema consisting of four files (or tables): PropertyForRent, PrivateOwner, Client, Lease.
The PropertyForRent table has 8 fields (or attributes): propertyNo, street, city, postcode, type (the property type), rooms (the number of rooms), rent (the monthly rent), ownerNo. The ownerNo attribute models the relationship between PropertyForRent and PrivateOwner (the owner Owns the property for rent); see Figure 3.6.


Procedures
Procedures are instructions and rules that govern the design and use of the database. Documented procedures on how to run the system are required by the users of the system and the staff that manage the database. Procedures may consist of instructions on how to:
- log on to the DBMS;
- use a particular DBMS facility or application program;
- start and stop the DBMS;
- change the structure of a table.

People
There are four distinct types of people: data and database administrators, database designers, application developers, and end-users.


Data and Database Administrators


They deal with the management and control of a DBMS and its data. The Data Administrator (DA) is responsible for the management of the data resource including database planning, development and maintenance of standards, policies and procedures, and conceptual/logical database design. This person consults with and advises senior managers so that the database development will support corporate objectives.

Data and Database Administrators ctd


The Database Administrator (DBA) is responsible for the physical realization of the database, including physical database design and implementation, security and integrity control, maintenance of the operational system, and ensuring satisfactory performance of the applications for users. This role is more technical than that of the DA. Some organizations have one person performing both roles.

Database Designers
Large database design projects have two types of designer:
- Logical database designer: identifies the data (the entities and attributes), the relationships between the data, and the constraints on the data that is to be stored in the database.
- Physical database designer: decides how the logical database design is to be physically realized.

Application Developers
People who implement the application programs that provide the required functionality for the end-users. Usually application developers work from a specification produced by systems analysts. Each program contains statements that request the DBMS to perform some operations on the database including retrieving data, inserting, updating, and deleting data.

End-Users
End-users are the clients for the database, which has been designed and implemented, and is being maintained to serve their information needs. End-users can be classified according to the way they use the database system:
- Naïve users: access the database through specially written application programs that attempt to make the operations as simple as possible.
- Sophisticated users: know the structure of the database and the facilities offered by the DBMS. They may use a high-level query language like SQL to perform the required operations, or may even write application programs for their own use.

3.4 History of Database Management Systems


First-generation (mid 1960s)
Hierarchical model (e.g. IMS, the Information Management System) and network model (e.g. IDS, the Integrated Data Store).

Second-generation
Relational model proposed by E.F. Codd in 1970. SQL is the standard language for relational DBMSs.

Third-generation
Object-oriented DBMSs and object-relational DBMSs.

3.5.1 Advantages of DBMSs


- Control of data redundancy
- Data consistency
- More information from the same amount of data
- Sharing of data
- Improved data integrity
- Improved security
- Enforcement of standards
- Economy of scale

3.5.1 Advantages of DBMSs ctd


- Balance of conflicting requirements
- Improved data accessibility and responsiveness
- Increased productivity
- Improved maintenance through data independence
- Increased concurrency
- Improved backup and recovery services

3.5.2 Disadvantages of DBMSs


- Complexity
- Size
- Cost of DBMSs
- Additional hardware costs
- Cost of conversion
- Performance
- High impact of failure

3.6 The Relational Model


The Relational Database Management System (RDBMS) is the dominant data-processing software used today. This software is the second generation of DBMSs and is based on the relational model of E.F. Codd (1970). Codd was trained as a mathematician. Set theory and predicate logic mainly underpin his model. In the relational model, all data is logically structured within relations (tables). Each relation has a name and is made up of named attributes (columns) of data. Each tuple (row) contains one value per attribute.

3.6.1 Terminology
Relation: A relation is a table with columns and rows.
Attribute: An attribute is a named column of a relation.
Domain: A domain is the set of allowable values for one or more attributes.
Tuple: A tuple is a row of a relation.
Degree: The degree of a relation is the number of attributes it contains.
Cardinality: The cardinality of a relation is the number of tuples it contains.
Relational database: A collection of normalized* relations with distinct relation names.
*(More on this in a later lecture.)
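As a quick worked example of degree and cardinality, the sketch below models a relation as a tuple of attribute names plus a set of tuples. The attribute names are taken from PropertyForRent; the three rows are invented.

```python
# Attributes (columns) of a small, hypothetical relation.
attributes = ("propertyNo", "city", "rent")

# Tuples (rows); a set is used because a relation holds no duplicate tuples.
tuples = {
    ("P001", "Glasgow", 350),
    ("P002", "Aberdeen", 400),
    ("P003", "Glasgow", 375),
}

degree = len(attributes)   # number of attributes
cardinality = len(tuples)  # number of tuples

# Every tuple supplies exactly one value per attribute.
well_formed = all(len(t) == degree for t in tuples)
```

Adding a row changes the cardinality but not the degree; adding a column changes the degree but not the cardinality.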

Alternative terminology
Formal terms    Alternative 1    Alternative 2
Relation        Table            File
Tuple           Row              Record
Attribute       Column           Field

3.6.2 Relational Keys


Superkey: An attribute, or set of attributes, that uniquely identifies a tuple within a relation.
Candidate key: A superkey such that no proper subset is a superkey within the relation.
Primary key: The candidate key that is selected to identify tuples uniquely within the relation.
Foreign key: An attribute, or set of attributes, within one relation that matches the candidate key of some (possibly the same) relation.
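The superkey and candidate key definitions can be tested against sample data, as in the sketch below. The Viewing-style relation, its attribute names, and its rows are hypothetical. Note the limitation: sample rows can show that a set of attributes is not a key, but they cannot prove that it is one, because keys are a property of the meaning of the data and not of one particular instance.

```python
from itertools import combinations

# Hypothetical rows: which client viewed which property.
rows = [
    {"clientNo": "C1", "propertyNo": "P1", "status": "seen", "rentQuoted": 350},
    {"clientNo": "C2", "propertyNo": "P1", "status": "booked", "rentQuoted": 400},
    {"clientNo": "C1", "propertyNo": "P2", "status": "seen", "rentQuoted": 350},
    {"clientNo": "C2", "propertyNo": "P2", "status": "seen", "rentQuoted": 350},
]


def is_superkey(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows, attributes):
    """Superkeys of this instance with no proper subset that is also a superkey."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in keys)
            if is_superkey(rows, attrs) and not contains_smaller_key:
                keys.append(attrs)
    return keys


keys = candidate_keys(rows, ("clientNo", "propertyNo", "status", "rentQuoted"))
```

Here (clientNo, propertyNo, status) is a superkey but not a candidate key, because dropping status still leaves a superkey.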

Representing Relational Database Schemas


The common convention is to give the name of the relation followed by the attribute names in parentheses. The primary key is usually underlined. The conceptual model, or conceptual schema, is the set of all such schemas for the database. (Q2 of the tutorial exercises provides an example.)

Integrity Constraints
Null: Represents a value for an attribute that is currently unknown or is not applicable for this tuple.
Base relation: A named relation corresponding to an entity in the conceptual schema, whose tuples are physically stored in the database.
Entity Integrity: In a base relation, no attribute of the primary key can be null.
Referential Integrity: If a foreign key exists in a relation, either the foreign key value must match a candidate key value of some tuple in its home relation or the foreign key must be wholly null.
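Both integrity rules can be observed in a small sqlite3 sketch. The tables are cut-down, hypothetical versions of PrivateOwner and PropertyForRent. Two SQLite-specific details are assumptions of this example: foreign key checking has to be switched on with a PRAGMA, and NOT NULL is stated explicitly on the primary keys because SQLite does not otherwise reject nulls in a TEXT primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
conn.executescript("""
CREATE TABLE PrivateOwner (
    ownerNo TEXT PRIMARY KEY NOT NULL,
    lName   TEXT
);
CREATE TABLE PropertyForRent (
    propertyNo TEXT PRIMARY KEY NOT NULL,
    city       TEXT,
    ownerNo    TEXT REFERENCES PrivateOwner(ownerNo)
);
INSERT INTO PrivateOwner VALUES ('O001', 'Lee');
""")


def try_insert(values):
    """Attempt to insert a property row; report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO PropertyForRent VALUES (?, ?, ?)", values)
        return True
    except sqlite3.IntegrityError:
        return False


matching_fk = try_insert(("P001", "Glasgow", "O001"))  # owner exists
null_fk = try_insert(("P002", "Aberdeen", None))       # foreign key wholly null
dangling_fk = try_insert(("P003", "Leeds", "O999"))    # no such owner
null_pk = try_insert((None, "Leeds", "O001"))          # null primary key
```

The first two rows are accepted; the third breaks referential integrity and the fourth breaks entity integrity, so the DBMS rejects both.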

3.6.3 Views
View: The dynamic result of one or more relational operations operating on the base relations to produce another relation. A view is a virtual relation that does not necessarily exist in the database but can be produced upon request by a particular user, at the time of request. Views provide a powerful and flexible security mechanism by hiding parts of the database from certain users. Views permit users to access data in a way that is customised to their needs. Views can simplify complex operations on the base relations.
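A short sqlite3 sketch of a view as a dynamic, virtual relation. The Staff table, its rows, and the view name StaffB001 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, fName TEXT, branchNo TEXT, salary REAL);
INSERT INTO Staff VALUES ('S001', 'Ann', 'B001', 30000), ('S002', 'Bob', 'B002', 24000);

-- The view hides the salary column and every branch except B001.
CREATE VIEW StaffB001 AS
    SELECT staffNo, fName FROM Staff WHERE branchNo = 'B001';
""")

before = conn.execute("SELECT * FROM StaffB001").fetchall()

# A change to the base relation shows up the next time the view is queried.
conn.execute("INSERT INTO Staff VALUES ('S003', 'Cy', 'B001', 18000)")
after = conn.execute("SELECT * FROM StaffB001 ORDER BY staffNo").fetchall()
```

The view stores no rows of its own: it is evaluated against the base relation at the time of each request, which is why the new member of staff appears without the view being redefined.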

Tutorial Exercises for Week 3


The Wellmeadows Hospital Case Study is from Database Systems: A Practical Approach to Design, Implementation, and Management, Thomas Connolly & Carolyn Begg (2005), pp. 1260-1267, Addison Wesley. This will be handed out in the lecture.

Tutorial Exercises Week 3 ctd


1. Study the Wellmeadows Hospital Case Study handout.

a) How would a file-based approach be implemented?
b) In what ways would a DBMS help this organisation?
c) What data can you identify that needs to be represented in the database?
d) What relationships exist between the data items?



Tutorial Exercises Week 3 ctd


2. The following tables form part of a database held in a relational DBMS:
Hotel (hotelNo, hotelName, city)
Room (roomNo, hotelNo, type, price)
Booking (hotelNo, guestNo, dateFrom, dateTo, roomNo)
Guest (guestNo, guestName, guestAddress)

where Hotel contains hotel details and hotelNo is the primary key; Room contains room details for each hotel and (roomNo,hotelNo) forms the primary key; Booking contains details of bookings and (hotelNo,guestNo,dateFrom) forms the primary key; Guest contains guest details and guestNo is the primary key.

Tutorial Exercises Week 3 ctd


a) Identify the foreign keys in this schema. Explain how the entity and referential integrity rules apply to these relations.
b) Produce some sample tables for these relations that observe the relational integrity rules. Suggest some general constraints that would be appropriate for this schema.

HOMEWORK: Submit answers to 1a) 1b), 1c), 1d), 2a), 2b) to ERIC by 6pm Monday 17 March 2008.

Readings
- C. J. Date, An Introduction to Database Systems (8th edition), Addison-Wesley, 2004. Chapter 1: An Overview of Database Management. (QA76.9.D3 D3659 2004)
- Thomas Connolly and Carolyn Begg, Database Systems: A Practical Approach to Design, Implementation, and Management (4th edition), Addison Wesley, 2005. Chapter 1: Introduction to Databases. (QA76.9.D26 C66 2005)