
DATABASE CONCEPT

Charumatr Pinthong

What is a Database?

A database (DB) is a collection of stored operational data used by the application systems of a particular enterprise.

What is a Database Management System?


A database management system (DBMS) is software that handles all access to the database.

Problems in Using a Conventional File System


- Data redundancy
- Data inconsistency
- Data sharing
- Data dependency
- Standardization
- Security
- Etc.


Benefits of Database Approach


- Controlled data redundancy
- Consistency of data
- Integration of data
- Sharing of data
- Enforcement of standards
- Ease of application development
- Uniform security and integrity
- Data accessibility, ad hoc queries
- Reduced program maintenance


Examples of Pros and Cons in a Database System

Pros
- Reduces data redundancy and inconsistency
- Fast application development
- Enforces standardization
- Data independence
- Security

Examples of Pros and Cons in a Database System (cont.)

Cons
- Expensive
- Resource consuming
- Slow
- Requires an experienced development team

Application Design: Database vs. Conventional

- What are the differences?
- Who can see the difference?
- Who will feel the effect?

- Users
- System analysts
- Designers
- DBAs
- Administrators
- Vendors

Key Database Concepts


- INTEGRATION: data is logically organized to reduce redundancy and facilitate data access
- SHARING: users have access to the same data
- INDEPENDENCE: application programs are resilient to changes in the database structure


Stages of the DBMS



  

- Stage 1: Elementary Data Files, predominant in the early 1960s
- Stage 2: File Access Method, predominant in the late 1960s
- Stage 3: Early Database System, predominant in the early 1970s
- Stage 4: Today's Database System, 1970 to now

Stage 1: Elementary Data Files (Predominant in the early 1960s)


[Figure: logical file; physical data (serially organized); simple input/output software]

- Files are organized in a serial manner.
- The physical data structure is essentially the same as the logical file structure.
- Batch processing, with no real-time access.
- Several copies of the same file are stored, because previous generations of data are kept.
- Software handles only input/output operations.
- The application programmer designs the physical file layouts and embeds them in the application programs.
- If the data structure or storage device is changed, the application program must be rewritten, recompiled, and retested.
- Data is usually designed and optimized for one application, so the same data is rarely used across applications.
- High level of redundancy between data files.
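To make the Stage 1 point about embedded file layouts concrete, here is a minimal Python sketch. The record layout (part number, name, quantity) and the names `RECORD_V1`, `RECORD_V2`, `write_records`, and `read_records` are hypothetical and chosen only for illustration; they do not come from the slides.

```python
import struct

# Hypothetical fixed-width layout hard-coded in the program:
# 4-byte part number, 10-byte name, 4-byte quantity.
RECORD_V1 = struct.Struct("<i10si")


def write_records(records):
    """Serialize records one after another, as in a serial file."""
    return b"".join(
        RECORD_V1.pack(no, name.encode().ljust(10), qty)
        for no, name, qty in records
    )


def read_records(blob):
    """Read the serial file back; this only understands the V1 layout."""
    out = []
    for offset in range(0, len(blob), RECORD_V1.size):
        no, name, qty = RECORD_V1.unpack_from(blob, offset)
        out.append((no, name.rstrip().decode(), qty))
    return out


blob = write_records([(1, "bolt", 40), (2, "nut", 75)])
records = read_records(blob)

# A changed layout (say, a 20-byte name) no longer lines up with the
# stored bytes, so every program that embeds V1 would need rewriting.
RECORD_V2 = struct.Struct("<i20si")
layout_still_fits = len(blob) % RECORD_V2.size == 0
```

Because the layout lives inside the program, any change to it forces the rewrite, recompile, and retest cycle described above.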

Stage 2: File Access Method (Predominant in the late 1960s)


[Figure: one logical file reaches a serial data set through an access method; another logical file reaches direct-access data sets through an access method with an addressing mechanism]

- Serial access or random access is possible to records (not fields).
- Processing is batch, in-line, or real-time.
- Logical and physical file organization are distinguished, but the relationship between them is fairly simple.
- Storage units can be changed without changing the application program.
- Data structures are usually serial, indexed sequential, or simple direct access.
- Multiple-key retrieval is generally not used.
- Data security measures may be used (but are rarely very secure).
- Data still tends to be designed and optimized primarily for one application.
- Much data redundancy still exists.
- If hierarchical files are used, the programmer usually has to construct the father-son relationships.
- The software provides "access" methods but not "data management".
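A rough sketch of what a Stage 2 access method with an addressing mechanism does: it keeps a key-to-address index so a whole record can be fetched directly instead of by scanning. The record layout and the helper names `build_file` and `read_direct` are assumptions made for this example, and an in-memory buffer stands in for a direct-access device.

```python
import io
import struct

# Hypothetical record layout: part number, 10-byte name, quantity.
RECORD = struct.Struct("<i10si")


def build_file(records):
    """Write records serially and remember each record's byte offset."""
    buf = io.BytesIO()
    index = {}
    for no, name, qty in records:
        index[no] = buf.tell()  # addressing mechanism: key -> offset
        buf.write(RECORD.pack(no, name.encode().ljust(10), qty))
    return buf, index


def read_direct(buf, index, key):
    """Fetch one whole record by key without scanning the file."""
    buf.seek(index[key])
    no, name, qty = RECORD.unpack(buf.read(RECORD.size))
    return no, name.rstrip().decode(), qty


buf, index = build_file([(1, "bolt", 40), (2, "nut", 75), (3, "screw", 10)])
record = read_direct(buf, index, 3)
```

The unit of access is still the record, not the field, and only one key is indexed, which matches the limits listed above.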

Stage 3: Early Database System (Predominant in the early 1970s)

- Multiple different logical files can be derived from the same physical data.
- The same data can be accessed in different ways by applications with different requirements.
- Software provides the means to lessen data redundancy.
- Data elements are shared between diverse applications.
- Absence of redundancy facilitates data integrity.
- The physical storage organization is independent of the application programs. It may be changed often in order to improve the data-base performance without application program modification.
- The data is addressable at the field or group level.
- Multiple-key retrieval can be used.
- Complex forms of data organization are used without complicating the application programs.
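As an illustration of several logical files derived from the same stored data, the sketch below uses SQL views in Python's built-in `sqlite3` module. SQLite is a modern stand-in, not the kind of system the slide describes, and the `employee` table, its rows, and the view names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_no INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER
    );
    INSERT INTO employee VALUES
        (1, 'Ann', 'Sales', 3000),
        (2, 'Bob', 'IT', 3500),
        (3, 'Cy', 'Sales', 2800);
    -- Two logical files over the same stored data.
    CREATE VIEW phone_list AS SELECT name, dept FROM employee;
    CREATE VIEW payroll AS SELECT emp_no, salary FROM employee;
""")

# Each application sees only the logical file it needs.
phone_list = conn.execute(
    "SELECT name, dept FROM phone_list ORDER BY name"
).fetchall()
payroll_total = conn.execute("SELECT SUM(salary) FROM payroll").fetchone()[0]

# Multiple-key retrieval: select on two fields at once.
well_paid_sales = conn.execute(
    "SELECT name FROM employee WHERE dept = ? AND salary > ?",
    ("Sales", 2900),
).fetchall()
```

The data is stored once, yet the two views and the two-field lookup each address it at the field level.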

Stage 4: Today's Requirements in a Data-Base System

- Software provides logical as well as physical data independence, allowing a global logical view of the data to exist independently of certain changes in the application programs' views of data or the physical data layouts.
- The data independence capabilities listed in Fig. 3.2 are provided.
- The data base may evolve without incurring high maintenance costs.
- Facilities are provided for a Data-Base Administrator to act as controller and custodian of the data, and ensure that its organization is the best for the users as a whole.
- Effective procedures are provided for controlling the privacy, security, and integrity of the data.
- Inverted files are used on some systems to permit rapid data base searching.
- Data bases are designed to provide answers to unanticipated forms of information request.
- Data migration is facilitated.
- The software provides a data description language for the Data-Base Administrator, a command language for the application programmer, and sometimes a data interrogation language for the user.

Primary Objectives of Data-Base Organization




The data base is the foundation stone of future application development


It should make application development easier, cheaper, faster, and more flexible.

The data can have multiple uses


Different users who perceive the same data differently can employ them in different ways.

Primary Objectives of Data-Base Organization (Con.)




Intellectual investment is protected


Existing programs and logical data structures (representing many man-years) will not have to be redone when changes are made to the data base.

Clarity
Users can easily know and understand what data are available to them.

Ease of use
Users can gain access to data in a simple fashion. Complexity is hidden from the users by the data-base management system.

Primary Objectives of Data-Base Organization (Con.)




Flexible usage
The data can be used or searched in flexible ways with different access paths.

Unanticipated requests for data can be handled quickly


Spontaneous requests for data can be handled without application programs having to be written (a time-consuming bottleneck), by means of high-level query or report generation languages.
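One way to picture an unanticipated request being handled by a high-level query language is an ad hoc SQL query, sketched here with Python's built-in `sqlite3` module. The `orders` table and its contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_no INTEGER PRIMARY KEY, customer TEXT, amount INTEGER
    );
    INSERT INTO orders VALUES
        (1, 'Acme', 120), (2, 'Brill', 80), (3, 'Acme', 200), (4, 'Cole', 50);
""")

# A question nobody planned for when the table was designed:
# which customers have ordered more than 100 in total?
big_customers = conn.execute("""
    SELECT customer, SUM(amount)
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > 100
    ORDER BY SUM(amount) DESC
""").fetchall()
```

The request is stated as a single query rather than as a new application program.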

Primary Objectives of Data-Base Organization (Con.)




Change is easy
The data base can grow and change without interfering with established ways of using the data.

Low cost
Low cost of storing and using data, minimization of the high cost of making changes.

Less data proliferation


New application needs may be met with existing data rather than by creating new files, thus avoiding the excessive data proliferation in today's tape libraries.

Primary Objectives of Data-Base Organization (Con.)




Performance
Data requests can be satisfied with speed suitable to the usage of the data.

Accuracy and consistency


Accuracy controls will be used. The system will avoid having multiple versions of the same data item available to users in different stages of updating.

Privacy
Unauthorized access to the data will be prevented. The same data may be restricted in different ways from different uses.

Primary Objectives of Data-Base Organization (Con.)




Protection from loss or damage


Data will be protected from failures and catastrophes, and from criminals, vandals, incompetents, and persons who might falsely update them.

Availability
Data are quickly available to users at almost all times when they need them.

Secondary objectives (to help achieve the primary objectives)




Physical data independence


Storage hardware and physical storage techniques can be changed without causing application program rewriting (Fig. 4.1).

Logical data independence


New data items can be added, or the overall logical structures expanded, without existing programs having to be rewritten (Fig. 4.2).

Controlled redundancy
Data items will be stored only once except where there are technical or economic reasons for redundant storage.
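The two kinds of data independence above can be sketched with Python's built-in `sqlite3` module. The `part` table, the index name, and the function `parts_in_city` (standing in for an existing application program) are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (part_no INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO part VALUES
        (1, 'bolt', 'London'), (2, 'nut', 'Paris'), (3, 'screw', 'London');
""")


def parts_in_city(city):
    """Stands in for an existing application program; never edited below."""
    rows = conn.execute(
        "SELECT part_no, name FROM part WHERE city = ? ORDER BY part_no",
        (city,),
    )
    return rows.fetchall()


before = parts_in_city("London")

# Physical change: add an index, i.e. a different storage/access technique.
conn.execute("CREATE INDEX idx_part_city ON part (city)")
after_index = parts_in_city("London")

# Logical change: add a new data item to the overall structure.
conn.execute("ALTER TABLE part ADD COLUMN weight REAL")
after_column = parts_in_city("London")
```

The program keeps returning the same rows after both changes, partly because it names the columns it needs instead of relying on the full stored layout.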

Secondary objectives (to help achieve the primary objectives) (Con.)




Suitably fast access


Access mechanisms and addressing methods will be fast enough for the usage in question.

Suitably fast searching


The need for fast spontaneous searching of the data will grow as interactive systems usage spreads.

Data standardization within a corporation


Interdepartmental agreement is needed on data formats and definitions. Standardization is needed between departments that would otherwise have created incompatible data.

Secondary objectives (to help achieve the primary objectives) (Con.)




Data dictionary
A data dictionary, defining all data items used, is needed.

High-level programmer interface


Application programmers should use simple, powerful data requests and be insulated from the complexities of file layout and addressing.

End user language


A high-level query or report-generation language should permit some end users to bypass the application programming step.

Secondary objectives (to help achieve the primary objectives) (Con.)




Integrity controls
Range checks and other controls should detect data inaccuracies where possible.

Tunability
The data base should be tunable, to improve performance without causing application program rewriting.

Design and monitoring aids


Aids should be provided that permit the designer or data administrator to predict and optimize performance.
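The range checks mentioned under integrity controls can be sketched as a declared constraint, here using Python's built-in `sqlite3` module. The `stock` table and the allowed range of 0 to 10000 are hypothetical values picked for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE stock (
        part_no  INTEGER PRIMARY KEY,
        quantity INTEGER NOT NULL CHECK (quantity BETWEEN 0 AND 10000)
    )
""")

conn.execute("INSERT INTO stock VALUES (1, 250)")  # within range: accepted

try:
    conn.execute("INSERT INTO stock VALUES (2, -5)")  # out of range
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM stock").fetchone()[0]
```

The check is enforced by the database software, so every application that writes to the table gets the same control.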

Secondary objectives (to help achieve the primary objectives) (Con.)




Automatic reorganization or migration


The system should support data migration or other automatic physical reorganization designed to improve performance.

Evolution to distributed data base operation


The system should be designed so that distributed processing and computer network operation can evolve.

An Architecture for a Database System




A database system has three levels:
- Internal level
- Conceptual level
- External level

Components of DATABASE SYSTEM Environment


- DBMS
- DATA
- Database Administrator (DBA)
- Application Programmers
- Users


Data Models in DB Systems

- Hierarchical approach
- Network approach
- Relational approach
- What is next?


Fig. 3.1 Sample data in relation form

Fig. 3.3 Sample data in hierarchical form (parts superior to suppliers)

Fig. 3.5 Sample data in network form

Relational Database?

Why a Relational Database?

- Simplicity
  - User
  - Design
- Standards
- Portability
- Distributed processing




Relational Model
- Mathematical foundation
- Basic component: a RELATION
- Relation-at-a-time processing
- Non-procedural languages


Relational Data Model




Data structure
- Relation: a two-dimensional table with ROWS and COLUMNS
- Tuple: the values that form one row
- Domain: the set of all possible values for an attribute
- Primary and candidate keys
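The terms above can be made concrete with a small Python sketch. The supplier data, the attribute names, and the helpers `is_unique` and `candidate_keys` are illustrative assumptions. Strictly, keys are declared as part of the design; the helper only reports which minimal attribute sets happen to be unique in this sample.

```python
from itertools import combinations

# A relation: a heading (column names) plus a set of tuples (rows).
HEADING = ("s_no", "name", "city")
SUPPLIER = {
    ("S1", "Smith", "London"),
    ("S2", "Jones", "Paris"),
    ("S3", "Blake", "Paris"),
}

# A domain: the set of all values the "city" attribute may take.
CITY_DOMAIN = {"London", "Paris", "Athens"}
cities_valid = all(row[HEADING.index("city")] in CITY_DOMAIN for row in SUPPLIER)


def is_unique(relation, heading, attrs):
    """True if no two tuples agree on all of the given attributes."""
    idx = [heading.index(a) for a in attrs]
    projected = {tuple(row[i] for i in idx) for row in relation}
    return len(projected) == len(relation)


def candidate_keys(relation, heading):
    """Minimal attribute sets that are unique in this sample."""
    keys = []
    for size in range(1, len(heading) + 1):
        for attrs in combinations(heading, size):
            minimal = not any(set(k) <= set(attrs) for k in keys)
            if minimal and is_unique(relation, heading, attrs):
                keys.append(attrs)
    return keys


keys = candidate_keys(SUPPLIER, HEADING)
```

In this sample both `s_no` and `name` identify a row, so either could be chosen as the primary key; `city` cannot, because two suppliers share Paris.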

Relational Data Model (con.)




Data Integrity
- Entity integrity
- Referential integrity
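A minimal sketch of both integrity rules using Python's built-in `sqlite3` module. The `supplier` and `shipment` tables and the helper `violates` are assumptions made for the example. Two SQLite specifics matter here: foreign keys are only enforced after `PRAGMA foreign_keys = ON`, and `NOT NULL` is declared explicitly on the key columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE supplier (
        s_no TEXT NOT NULL PRIMARY KEY,
        name TEXT
    );
    CREATE TABLE shipment (
        s_no TEXT NOT NULL REFERENCES supplier (s_no),
        p_no TEXT NOT NULL,
        qty  INTEGER,
        PRIMARY KEY (s_no, p_no)
    );
    INSERT INTO supplier VALUES ('S1', 'Smith');
    INSERT INTO shipment VALUES ('S1', 'P1', 300);
""")


def violates(sql):
    """Return True if the statement is rejected by an integrity rule."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Entity integrity: a primary key may be neither null nor duplicated.
null_key_rejected = violates("INSERT INTO supplier VALUES (NULL, 'Nobody')")
duplicate_key_rejected = violates("INSERT INTO supplier VALUES ('S1', 'Other')")

# Referential integrity: a shipment must refer to an existing supplier.
orphan_rejected = violates("INSERT INTO shipment VALUES ('S9', 'P1', 100)")
valid_accepted = not violates("INSERT INTO shipment VALUES ('S1', 'P2', 50)")
```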

Data manipulation based on relational algebra
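Three of the relational algebra operators (select, project, natural join) can be sketched in plain Python, with each relation held as a list of dictionaries. The sample data and function names are illustrative assumptions, and a real DBMS would not evaluate a join with nested loops like this.

```python
def select(relation, predicate):
    """Restriction: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attrs):
    """Projection: keep only the named columns, dropping duplicate rows."""
    seen, out = set(), []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def natural_join(left, right):
    """Combine rows that agree on every attribute the two relations share."""
    out = []
    for l_row in left:
        for r_row in right:
            common = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in common):
                out.append({**l_row, **r_row})
    return out


suppliers = [
    {"s_no": "S1", "name": "Smith"},
    {"s_no": "S2", "name": "Jones"},
]
shipments = [
    {"s_no": "S1", "p_no": "P1"},
    {"s_no": "S1", "p_no": "P2"},
    {"s_no": "S2", "p_no": "P2"},
]

# Names of suppliers who ship part P2.
joined = natural_join(suppliers, shipments)
result = project(select(joined, lambda r: r["p_no"] == "P2"), ["name"])
```

Every operator takes whole relations and returns a whole relation, which is the relation-at-a-time processing the relational model is built on.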

Are We Ready for Database Management System?


- Yes
- No. How are we going to prepare for a database management system?


Phased S/W Development


- Preliminary study
- Feasibility study
- Implementation

- Requirement definition
- DATA Modeling
- System design
- Program development
- System testing
- Installation & Maintenance

Data Modeling
Data modeling is a technique to build the data model, which is a map showing how data are associated.

- WHEN?
- HOW?


Integration System
- Conceptual Design
- An Example


Integration System


Conceptual Design (from the book Strategies for Data Modeling by M. Vetter)

Integration System


An Example

Data Modeling


A simplified

A system which meets users' requirements

Reduce change

Reduce cost

A system which meets users' requirements

The system should be designed based on business activity

Because the business activity itself is more stable than the organization

Data Modeling

Is a technique to build the data model which is a map to show how data are associated.

DATA is an important resource within the company, and it belongs to the whole organization

DATA needs to be MANAGED and SHARED

Why do we need data modeling?


- Extract and clarify user requirements
- Design the data base
- Plan data integration
- Standardize data elements
- Minimize change


Rules for defining object

1. Unique key
2. Exists independently/meaningful
3. Required by the application

Rules for defining attributes

1. Unique key
2. The whole key
3. Nothing but the key

6. Correction (steps 2, 4)


The DATA MODEL, in this case, need not be corrected, since it satisfies all specified information requirements.