
Database Management

Basic Concepts related to Database


Database: A collection of related data.

Data: Known facts that can be recorded and that have implicit meaning.

Mini-world: The part of the real world about which data is stored in a database.

Database management system (DBMS): A software package used to facilitate the creation and maintenance of a computerized database.

Database system: The DBMS software together with the data itself.

Basics of Database

- A database is a collection of information applicable to a particular subject or purpose.
- It is a shared collection of logically related data (and a description of this data), designed to meet the information needs of an organization.
- Logically related data comprises the entities, attributes, and relationships of an organization's information.
- The data is typically grouped into specific categories of information, which are contained in data storage files called tables.
- It is a collection of non-redundant data that can be shared by different application systems.

Contd.

- This definition stresses the importance of multiple applications and data sharing.
- The spatial database becomes a common resource for an agency.
- A database implies separation of the physical storage from the use of the data by an application program, i.e. program/data independence.
- The user, programmer, or application specialist need not know the details of how the data are stored.
- Changes can be made to the data without affecting other components of the system.

Contd.

- Data in a database is stored under various categories known as fields.
- When the information from each of these fields is combined as one unit, that unit is considered a single record.
- All of the records combined become a table.
- Tables therefore organize data into rows, called records, and columns, called fields.
- Tables store only the raw data. To make use of this data, we need the following six objects:

Contd.

- Tables: Store the data for the database.
- Queries: Allow a user to select or interact with sections of the data of their own choosing.
- Forms: Used in conjunction with tables, they let the user view a single record and make data entry easier.
- Reports: Organize and summarize information so that it can be easily read and printed.
- Macros: Programs within Access that allow users to automate certain tasks.
- Modules: Pieces of Visual Basic programming that can be associated with a database or with particular parts of a database.

Database management: An Overview

- A database is a collection of non-redundant data shareable between different application systems [Howe, D.R. 1989].
- A database management system (DBMS) is a sophisticated software package capable of handling a database stored in computer files.
- In other words, a DBMS is a data storage and retrieval system that permits data to be stored non-redundantly while making it appear to the user as if the data is well integrated.

Contd.
- It is a software system that enables users to define, create, and maintain the database, and that provides controlled access to it.
- The DBMS provides the interface between the application programs and the data.
- The three main features of a DBMS that make it attractive are:
  - Centralized data management
  - Data independence
  - Systems integration

DATABASE MANAGEMENT SYSTEM


[Figure: Applications #1, #2, and #3 reach a database containing centralized shared data through the DBMS.]

The DBMS manages data resources like an operating system manages hardware resources.

File Based System


- It is a collection of application programs that perform services for the end users (e.g. reports).
- Each program defines and manages its own data.
- There is no relationship among these files.

Limitations of file based system

- Separation and isolation of data: Each program maintains its own set of data. Users of one program may be unaware of potentially useful data held by other programs.
- Duplication of data: The same data is held by different programs, which wastes space and can lead to different values or formats for the same item.
- Data dependence: The file structure is defined in the program code.
- Incompatible file formats: Programs are written in different languages and so cannot easily access each other's files.
- Fixed queries and proliferation of application programs: Programs are written to satisfy particular functions. Any new requirement needs a new program.

Problems with the file system


- File systems require extensive programming in a third-generation language (3GL).
- As the number of files expands, system administration becomes difficult.
- Changing existing file structures is often necessary but difficult.
- Security features to safeguard data are difficult to program and are usually omitted.
- Difficulty in pooling data creates islands of information.

Contd.
- Structural and data dependence
- Field definitions and naming conventions
- Data redundancy that leads to data inconsistency and data anomalies

Types of database files


Flat files and spreadsheets:

- All records in this type of database have the same number of fields.
- Individual records have different data in each field, with one field serving as a key to locate a particular record.
- When the number of fields grows large, a flat file is cumbersome to search.
- Although this type of database is simple in structure, expanding the number of fields usually entails reprogramming.
- Adding new records is also time-consuming, particularly when there are numerous fields.

Contd.
Hierarchical files:

- These store data in more than one type of record.
- This method is usually described as a "parent-child, one-to-many" relationship.
- One field is the key to all records, but data in one record does not have to be repeated in another.
- This system allows records with similar attributes to be associated together.
- The records are linked to each other by a key field in a hierarchy of files.

Contd.

Relational files:

- These connect different files or tables without using internal pointers or keys. Instead, a common link of data is used to join or associate records.
- The link is not hierarchical. A "matrix of tables" is used to store the information.
- As long as the tables have a common link, the user may combine them to form new inquiries and data output.
- This is the most flexible system and is particularly suited to SQL (Structured Query Language).

Major characteristics of database systems


- Self-contained nature of a database system: A DBMS catalog stores the description (metadata) of the database. This allows the DBMS software to work with different databases.
- Insulation between programs and data, provided through:
  - Data abstraction: A data model is used to hide storage details and present the user with a conceptual view of the database.
  - Program-data independence: Allows changing data storage structures without having to change the DBMS access programs.
  - Program-operation independence: Allows changing operation implementations without having to change the DBMS access programs.
- Support of multiple views of the data

Other important characteristics of database technology


- Controlling data redundancy.
- Restricting unauthorized access to data.
- Providing persistent storage for program objects and data structures.
- Providing multiple interfaces to different classes of users.
- Representing complex relationships among data.
- Enforcing integrity constraints on the database.
- Providing backup and recovery services.
- Potential for enforcing standards.
- Flexibility to change data structures.
- Reduced application development time.
- Availability of up-to-date information.
- Economies of scale.

COMPONENTS OF DATABASE SYSTEMS


- HARDWARE
  - COMPUTER
  - PERIPHERALS
- SOFTWARE
  - OPERATING SYSTEMS SOFTWARE
  - DBMS SOFTWARE
  - APPLICATIONS PROGRAMS AND UTILITIES SOFTWARE
- PEOPLE
  - SYSTEMS ADMINISTRATORS
  - DATABASE ADMINISTRATORS (DBAS)
  - DATABASE DESIGNERS
  - SYSTEMS ANALYSTS AND PROGRAMMERS
  - END USERS
- PROCEDURES
  - INSTRUCTIONS AND RULES THAT GOVERN THE DESIGN AND USE OF THE DATABASE SYSTEM
- DATA
  - COLLECTION OF FACTS STORED IN THE DATABASE WHICH ARE USED BY THE ORGANIZATION AND A DESCRIPTION OF THIS DATA

TYPES OF DATABASE SYSTEMS


- NUMBER OF USERS
  - SINGLE-USER: DESKTOP DATABASE
  - MULTI-USER: WORKGROUP DATABASE, ENTERPRISE DATABASE
- SCOPE
  - DESKTOP
  - WORKGROUP
  - ENTERPRISE
- LOCATION
  - CENTRALIZED
  - DISTRIBUTED
- USE
  - TRANSACTIONAL (PRODUCTION)
  - DECISION SUPPORT
  - DATA WAREHOUSE

MAJOR FUNCTIONS OF A DBMS


- DATA DICTIONARY MANAGEMENT
- DATA STORAGE MANAGEMENT
- DATA TRANSFORMATION AND PRESENTATION
- SECURITY MANAGEMENT
- MULTI-USER ACCESS CONTROL
- BACKUP AND RECOVERY MANAGEMENT
- DATA INTEGRITY MANAGEMENT
- DATABASE ACCESS LANGUAGES (DDL AND DML) AND APPLICATION PROGRAMMING INTERFACES
- DATABASE COMMUNICATION INTERFACES

Advantages of using DBMS


- The three main features of a database management system that make it attractive are:
  - Centralized data management
  - Data independence
  - Systems integration
- In a DBMS, all files are integrated into one system, which reduces redundancies and makes data management more efficient.
- In addition, a DBMS provides centralized control of the operational data.

Contd.
Some of the advantages of data independence, integration, and centralized control are:

- Redundancies and inconsistencies can be reduced
- Better service to the users
- Flexibility of the system is improved
- Cost of developing and maintaining systems is lower
- Standards can be enforced
- Security can be improved
- Integrity can be improved
- Enterprise requirements can be identified
- Data model must be developed

Disadvantages of using DBMS


- Confidentiality, privacy and security
- Data quality
- Data integrity
- Enterprise vulnerability
- The cost of using a DBMS

EVOLUTION OF DATABASE SYSTEMS


- FLAT FILES: 1960S - 1980S
- HIERARCHICAL: 1970S - 1990S
- NETWORK: 1970S - 1990S
- RELATIONAL: 1980S - PRESENT
- OBJECT-ORIENTED: 1990S - PRESENT
- OBJECT-RELATIONAL: 1990S - PRESENT
- DATA WAREHOUSING: 1980S - PRESENT
- WEB-ENABLED: 1990S - PRESENT

Database Schema, Instance & State


- Schema: A description of a database, but not the database itself. It corresponds to the type in a programming language, or to the abstract data type.
- Instance: An occurrence of a data item described in the schema.
- Database state: The data in the database at a moment in time.
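The distinction between schema and state can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the `employee` table, its columns, and the sample row are hypothetical and do not come from the slides. The schema is read back from SQLite's own catalog table, `sqlite_master`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema: the description of the data (hypothetical EMPLOYEE table).
conn.execute("CREATE TABLE employee (emp_num INTEGER PRIMARY KEY, emp_name TEXT)")

def schema(conn):
    """Return the stored table definitions, i.e. the schema."""
    rows = conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    return [r[0] for r in rows]

def state(conn):
    """Return the data currently in the table, i.e. the database state."""
    query = "SELECT emp_num, emp_name FROM employee ORDER BY emp_num"
    return conn.execute(query).fetchall()

schema_before = schema(conn)
state_before = state(conn)   # the empty state

# Adding one instance changes the state but not the schema.
conn.execute("INSERT INTO employee VALUES (101, 'Example Name')")
schema_after = schema(conn)
state_after = state(conn)
```

Comparing `schema_before` with `schema_after` and `state_before` with `state_after` shows that inserting data moves the database to a new state while the schema stays the same.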

DBMS languages
- DDL (Data Definition Language): Used to define or change the structure of the database. In other words, it is used to define the schema or describe the data (conceptual schema).
- DML (Data Manipulation Language): After the database is built, it is used to query the database and to insert, change, or delete data.
- DCL (Data Control Language): Used to control user access.

Database Models
A database model is a collection of logical constructs used to represent the data structure and the data relationships found within the database. There are two categories of database models:

- Conceptual models: Focus on the logical nature of the data representation. They are concerned with what is represented rather than how it is represented.
- Implementation models: Place the emphasis on how the data are represented in the database, or on how the data structures are implemented.

Types of Relationships used in Database Models

Generally, the following three types of relationships are used:

- One-to-many relationships (1:M): A painter paints many different paintings, but each one of them is painted by only that painter. Painter (1) paints painting (M).
- Many-to-many relationships (M:N): An employee might learn many job skills, and each job skill might be learned by many employees. Employee (M) learns skill (N).
- One-to-one relationships (1:1): Each store is managed by a single employee, and each store manager (employee) manages only a single store. Employee (1) manages store (1).
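One common way to express these three relationship types in a relational schema is sketched below with Python's sqlite3 module. The table names, column names, and sample rows are assumptions made for illustration. The 1:M case uses a foreign key on the "many" side, the M:N case uses a junction table, and the 1:1 case adds a UNIQUE constraint to the foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE painter  (painter_id INTEGER PRIMARY KEY, name TEXT);
-- 1:M: each painting row points at exactly one painter.
CREATE TABLE painting (painting_id INTEGER PRIMARY KEY, title TEXT,
                       painter_id INTEGER NOT NULL REFERENCES painter(painter_id));

CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE skill    (skill_id INTEGER PRIMARY KEY, name TEXT);
-- M:N: a junction table pairs employees with skills.
CREATE TABLE learns   (emp_id INTEGER REFERENCES employee(emp_id),
                       skill_id INTEGER REFERENCES skill(skill_id),
                       PRIMARY KEY (emp_id, skill_id));

-- 1:1: UNIQUE on manager_id lets an employee manage at most one store.
CREATE TABLE store    (store_id INTEGER PRIMARY KEY,
                       manager_id INTEGER UNIQUE REFERENCES employee(emp_id));
""")

conn.execute("INSERT INTO painter VALUES (1, 'Painter A')")
conn.executemany("INSERT INTO painting VALUES (?, ?, ?)",
                 [(1, "Work 1", 1), (2, "Work 2", 1)])
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Emp A"), (2, "Emp B")])
conn.executemany("INSERT INTO skill VALUES (?, ?)", [(1, "Welding"), (2, "Wiring")])
conn.executemany("INSERT INTO learns VALUES (?, ?)", [(1, 1), (1, 2), (2, 1)])
conn.execute("INSERT INTO store VALUES (1, 1)")

def count(sql):
    """Run a COUNT(*) query and return the single number."""
    return conn.execute(sql).fetchone()[0]

paintings_by_painter = count("SELECT COUNT(*) FROM painting WHERE painter_id = 1")
skills_of_emp1 = count("SELECT COUNT(*) FROM learns WHERE emp_id = 1")
emps_with_skill1 = count("SELECT COUNT(*) FROM learns WHERE skill_id = 1")

def second_store_allowed():
    """Try to give employee 1 a second store; the UNIQUE constraint should refuse."""
    try:
        conn.execute("INSERT INTO store VALUES (2, 1)")
        return True
    except sqlite3.IntegrityError:
        return False

second_store_ok = second_store_allowed()
```

Here one painter owns two paintings, employee 1 has two skills while skill 1 is shared by two employees, and the attempt to assign a second store to the same manager is rejected.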

Implementation Data Models


There are three types of implementation database models:

- Hierarchical database model
- Network database model
- Relational database model

A HIERARCHICAL STRUCTURE


Hierarchical Database Model


- Collection of records logically organized to conform to the upside-down tree (hierarchical) structure.
- The top layer is perceived as the parent of the segment directly beneath it.
- The segments below other segments are the children of the segment above them.
- The tree structure is represented as a hierarchical path on the computer's storage media.

Advantages and disadvantages of Hierarchical Database Model


Advantages:

- Conceptual simplicity
- Database security
- Data independence
- Database integrity
- Efficiency dealing with a large database

Disadvantages:

- Complex implementation
- Difficult to manage
- Lacks structural independence
- Applications programming and use complexity
- Implementation limitations
- Lack of standards

A NETWORK DATABASE MODEL


A Network Database Model Basic structure


- Set: A relationship is called a set. Each set is composed of at least two record types: an owner (parent) record and a member (child) record.
- A set represents a 1:M relationship between the owner and the member.

Advantages & Disadvantages of a Network Database Model


Advantages:

- Conceptual simplicity
- Handles more relationship types
- Data access flexibility
- Promotes database integrity
- Data independence
- Conformance to standards

Disadvantages:

- System complexity
- Lack of structural independence

Relational database model


- RDBMS allows operations in a human logical environment.
- The relational database is perceived as a collection of tables.
- Each table consists of a series of row/column intersections.
- Tables (or relations) are related to each other by sharing a common entity characteristic.
- The relationship type is often shown in a relational schema.
- A table yields complete data and structural independence.

LINKING RELATIONAL TABLES


Advantages & Disadvantages of Relational Database Model


Advantages:

- Structural independence
- Improved conceptual simplicity
- Easier database design, implementation, management, and use
- Ad hoc query capability (SQL)
- Powerful database management system

Disadvantages:

- Substantial hardware and system software overhead
- Possibility of poor design and implementation
- Potential islands of information problems

Entity Relationship Modeling


- E-R models are normally represented in an entity relationship diagram (ERD).
- An entity is represented by a rectangle.
- Each entity is described by a set of attributes.
- An attribute describes a particular characteristic of the entity.
- A relationship is represented by a diamond connected to the related entities.

E-R Model Concepts

Entities and attributes:

- Entity: a thing that has independent existence, e.g. employee
- Attribute: describes something about an entity, e.g. age, SSN, gender, name
- Value: taken on by an attribute, e.g. 25, 456-876-788, female, Bart Simpson
- Composite attributes vs. atomic (simple) attributes, e.g. Bart Simpson vs. 45
- Single-valued attributes vs. multivalued attributes, e.g. age vs. college degrees
- Derived attributes vs. stored attributes, e.g. age vs. birth date (age is derived from birth date)

Contd.
Entity types, value sets, and key attributes:

- Entity type: defines the structure of a set of entities that have the same attributes
- Entity: an instance of an entity type
- Entity set (collection): a group of entities
- Key attributes and uniqueness
- Combination of attributes to create a key
- Value sets (domains)

NOTATIONS OF E-R DIAGRAM

[Figure: E-R diagram symbols for ENTITY TYPE, ATTRIBUTE, KEY ATTRIBUTE, MULTIVALUED ATTRIBUTE, COMPOSITE ATTRIBUTE, and DERIVED ATTRIBUTE.]

DEGREE OF A RELATIONSHIP: BINARY, TERNARY, UNARY


[Figure: A UNARY RELATIONSHIP, MANAGES, on EMPLOYEE (SSN), and a TERNARY RELATIONSHIP, SUPPLY, among SUPPLIER (SNAME), PROJECT (PROJNAME), and PART (PARTNO), with the attribute QUANTITY.]

Advantages & Disadvantages of Entity Relationship Data Model


Advantages:

- Exceptional conceptual simplicity
- Visual representation
- Effective communication tool
- Integrated with the relational database model

Disadvantages:

- Limited constraint representation
- Limited relationship representation
- No data manipulation language
- Loss of information content

Normalization
- Normalization is a process for assigning attributes to entities. It reduces data redundancies and helps eliminate data anomalies.
- Normalization works through a series of stages called normal forms:
  - First normal form (1NF)
  - Second normal form (2NF)
  - Third normal form (3NF)
  - Fourth normal form (4NF)
- The highest level of normalization is not always desirable.

Contd.
- It is the process of efficiently organizing data in a database.
- There are two goals of the normalization process:
  - Eliminate redundant data (for example, storing the same data in more than one table).
  - Ensure data dependencies make sense (only storing related data in a table).
- These goals help to reduce the amount of space a database consumes and ensure that data is logically stored.

Contd.
- The database community has developed a series of guidelines for ensuring that databases are normalized. These are referred to as normal forms and are numbered from one (the lowest form of normalization, referred to as first normal form or 1NF) through five (fifth normal form or 5NF).
- In practical applications, we often see 1NF, 2NF, and 3NF, along with the occasional 4NF. Fifth normal form is very rarely seen.
- All these normalization guidelines are cumulative. For a database to be in 2NF, it must first fulfill all the criteria of a 1NF database.

Example for Normalization


Case of a construction company:

- Building project: project number, name, employees assigned to the project.
- Employee: employee number, name, job classification.
- The company charges its clients by billing the hours spent on each project.
- The hourly billing rate depends on the employee's position.
- Periodically, a report is generated.
- The table whose contents correspond to the reporting requirements is shown in Table 5.1.

Scenario
A few employees work on one project.

- Project Num: 15
- Project Name: Evergreen
- Employee Num: 101, 102, 103, 105

TABLE STRUCTURE MATCHES THE REPORT FORMAT


Problems with the report format


- The project number is intended to be a primary key, but it contains nulls.
- The table displays data redundancies.
- The table entries invite data inconsistencies.
- The data redundancies yield the following anomalies:
  - Update anomalies
  - Addition anomalies
  - Deletion anomalies

Solving the problem

Conversion to first normal form:

- A relational table must not contain repeating groups.
- Repeating groups can be eliminated by adding the appropriate entry in at least the primary key column(s).
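The step of filling in the primary key columns can be sketched in plain Python. The project number, project name, and employee numbers come from the scenario slide, but the row layout (blank project columns on continuation lines, shown as `None`) and the helper name `to_1nf` are assumptions made for illustration.

```python
# Report-style rows: the project columns are blank (None) on continuation lines.
report_rows = [
    (15, "Evergreen", 101),
    (None, None, 102),
    (None, None, 103),
    (None, None, 105),
]

def to_1nf(rows):
    """Fill each blank project number and name from the most recent non-blank row."""
    filled = []
    current = (None, None)
    for proj_num, proj_name, emp_num in rows:
        if proj_num is not None:
            current = (proj_num, proj_name)
        filled.append((current[0], current[1], emp_num))
    return filled

rows_1nf = to_1nf(report_rows)
# After the fill, (proj_num, emp_num) has no nulls and can act as the primary key.
keys = {(proj_num, emp_num) for proj_num, _, emp_num in rows_1nf}
```

Every row now carries its own project number, so the repeating group is gone and each row is identified by the combination of project number and employee number.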

TABLE BEFORE NORMALIZATION


TABLE AFTER NORMALIZATION TO 1NF


FIRST NORMAL FORM (1NF)

First normal form (1NF) sets the very basic rules for an organized database.

1NF definition: The term first normal form (1NF) describes the tabular format in which:

- All the key attributes are defined.
- There are no repeating groups in the table.
- All attributes are dependent on the primary key.

To achieve this:

- Eliminate duplicative columns from the same table.
- Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

Contd.
The first rule dictates that we must not duplicate data within the same row of a table. Within the database community, this concept is referred to as the atomicity of a table, and tables that comply with this rule are said to be atomic.

Dependency Diagram for the example


- The primary key components are bold, underlined, and shaded in a different color.
- The arrows above the diagram indicate all desirable dependencies, i.e. dependencies that are based on the primary key (PK).
- The arrows below the diagram indicate less desirable dependencies: partial dependencies and transitive dependencies.

SECOND NORMAL FORM (2NF)


Second normal form (2NF) further addresses the concept of removing duplicative data:

- Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
- Create relationships between these new tables and their predecessors through the use of foreign keys.

These rules can be summarized in a simple statement: 2NF attempts to reduce the amount of redundant data in a table by extracting it, placing it in new table(s), and creating relationships between those tables.

Contd.

Conversion to second normal form:

Starting with the 1NF format, the database can be converted into the 2NF format by:

- Writing each key component on a separate line, and then writing the original key on the last line.
- Writing the dependent attributes after each new key.

PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
ASSIGN (PROJ_NUM, EMP_NUM, HOURS)

Contd.
A table is in 2NF if:

- It is in 1NF, and
- It includes no partial dependencies; that is, no attribute is dependent on only a portion of the primary key.

(It is still possible for a table in 2NF to exhibit transitive dependency; that is, one or more attributes may be functionally dependent on non-key attributes.)
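A partial dependency can be spotted in sample data with a small check, sketched here in plain Python. The helper `determines` and the sample rows (project 16, "Project B", and the hours values) are hypothetical. A check against sample rows can only show that the data is consistent with a dependency; it cannot prove that the dependency holds in general.

```python
def determines(rows, lhs, rhs):
    """True if equal values in the lhs columns always come with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for a key and returns it afterwards.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Sample 1NF rows with the composite primary key (proj_num, emp_num).
rows = [
    {"proj_num": 15, "proj_name": "Evergreen", "emp_num": 101, "hours": 10.0},
    {"proj_num": 15, "proj_name": "Evergreen", "emp_num": 102, "hours": 20.0},
    {"proj_num": 16, "proj_name": "Project B", "emp_num": 101, "hours": 5.0},
]

# proj_name follows from proj_num alone, which is only part of the key.
name_on_part_of_key = determines(rows, ["proj_num"], ["proj_name"])
# hours is not fixed by proj_num alone; it needs the whole key.
hours_on_part_of_key = determines(rows, ["proj_num"], ["hours"])
hours_on_whole_key = determines(rows, ["proj_num", "emp_num"], ["hours"])
```

In this sample, `proj_name` depends on part of the key, which is the kind of attribute 2NF moves into its own table, while `hours` depends on the whole key and stays with it.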

DEPENDENCY DIAGRAM FOR 2NF


THIRD & FOURTH NORMAL FORM (3NF & 4NF)

Third normal form (3NF) goes one large step further:

- Remove columns that are not dependent upon the primary key.

3NF definition: A table is in 3NF if:

- It is in 2NF, and
- It contains no transitive dependencies.

Finally, fourth normal form (4NF) has one additional requirement:

- A relation is in 4NF if it is in 3NF and has no non-trivial multivalued dependencies.

CONVERSION TO 3NF IN THE EXAMPLE


Create a separate table for the attributes in a transitive functional dependence relationship:

PROJECT (PROJ_NUM, PROJ_NAME)
ASSIGN (PROJ_NUM, EMP_NUM, HOURS)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)
JOB (JOB_CLASS, CHG_HOUR)
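The four 3NF tables can be tried out with Python's sqlite3 module, as in the sketch below. The table and column names follow the slide. The column types, employee names, job classes, rates, and hours are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project  (proj_num INTEGER PRIMARY KEY, proj_name TEXT);
CREATE TABLE job      (job_class TEXT PRIMARY KEY, chg_hour REAL);
CREATE TABLE employee (emp_num INTEGER PRIMARY KEY, emp_name TEXT,
                       job_class TEXT REFERENCES job(job_class));
CREATE TABLE assign   (proj_num INTEGER REFERENCES project(proj_num),
                       emp_num INTEGER REFERENCES employee(emp_num),
                       hours REAL,
                       PRIMARY KEY (proj_num, emp_num));
""")
conn.execute("INSERT INTO project VALUES (15, 'Evergreen')")
conn.executemany("INSERT INTO job VALUES (?, ?)",
                 [("Engineer", 80.0), ("Technician", 50.0)])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(101, "Employee A", "Engineer"), (102, "Employee B", "Technician")])
conn.executemany("INSERT INTO assign VALUES (?, ?, ?)",
                 [(15, 101, 10.0), (15, 102, 20.0)])

def project_charge(proj_num):
    """Total billing for a project: hours times the rate of each employee's job class."""
    row = conn.execute("""
        SELECT SUM(a.hours * j.chg_hour)
        FROM assign a
        JOIN employee e ON e.emp_num = a.emp_num
        JOIN job j ON j.job_class = e.job_class
        WHERE a.proj_num = ?""", (proj_num,)).fetchone()
    return row[0]

charge_before = project_charge(15)   # 10 * 80 + 20 * 50
# The rate is stored once per job class, so a single UPDATE changes it everywhere.
conn.execute("UPDATE job SET chg_hour = 60.0 WHERE job_class = 'Technician'")
charge_after = project_charge(15)    # 10 * 80 + 20 * 60
```

Because CHG_HOUR lives only in JOB, changing a rate touches one row, which is how the decomposition avoids the update anomaly of the original report table.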

CONVERSION TO 4NF


Structured Query Language (SQL)

SQL is used to define, manipulate, and control data in relational databases. Its statements fall into three main categories according to function:

- Data Definition Language (DDL): define or change database structure(s)
  - CREATE
  - ALTER
  - DROP
- Data Manipulation Language (DML): select or change data
  - INSERT
  - UPDATE
  - DELETE
  - SELECT
- Data Control Language (DCL): control user access (e.g. GRANT, REVOKE) and transactions (e.g. COMMIT)

Data Definition Language (DDL)


Creating a table:

- Empty tables are constructed using the CREATE TABLE statement. Data must be entered later using INSERT.

    CREATE TABLE S (
        SNO    CHAR(5),
        SNAME  CHAR(20),
        STATUS DECIMAL(3),
        CITY   CHAR(15),
        PRIMARY KEY (SNO)
    )

- Columns that are defined as primary keys will never have two rows with the same key value.
- Primary keys may consist of more than one column (values unique in combination).
- A table name and unique column names must be specified.
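The primary key rule can be checked with Python's sqlite3 module, as in this sketch. The CREATE TABLE statement is the one from the slide. The supplier rows and the helper name `insert_duplicate_key` are assumptions, and other DBMS products report the violation with different error types and messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE s (
        sno    CHAR(5),
        sname  CHAR(20),
        status DECIMAL(3),
        city   CHAR(15),
        PRIMARY KEY (sno)
    )""")
conn.execute("INSERT INTO s VALUES ('S1', 'Supplier A', 20, 'London')")

def insert_duplicate_key():
    """Try to reuse key 'S1'; return the error text if the DBMS rejects the row."""
    try:
        conn.execute("INSERT INTO s VALUES ('S1', 'Supplier B', 10, 'Paris')")
    except sqlite3.IntegrityError as exc:
        return str(exc)
    return None

error = insert_duplicate_key()
row_count = conn.execute("SELECT COUNT(*) FROM s").fetchone()[0]
```

The second insert is refused, so `error` holds the constraint message and the table still contains a single row.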

Contd.
- ALTER TABLE: Used to add or remove columns or constraints.

    ALTER TABLE categoriesnopix DROP COLUMN shortdesc;

- DROP TABLE: Use DROP followed by the object type and name to remove from the database any object that was created.

    DROP TABLE categoriesnopix;

Data Manipulation Language (DML)


There are four basic SQL data manipulation operations:

- SELECT: RETRIEVES DATA
- INSERT: ADDS A NEW ROW
- UPDATE: CHANGES VALUES IN EXISTING RECORDS
- DELETE: REMOVES ROW(S)

Insert Command
Use the INSERT command to enter data into a table. You may insert one row at a time, or select several rows from an existing table and insert them all at once.

    INSERT INTO SP ( SNO, PNO, QTY ) VALUES ( 'S4', 'P1', 1000 )
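Both ways of inserting can be sketched with Python's sqlite3 module. The SP table and its first row follow the slide. The column types, the second row, the `sp_archive` table, and the `qty >= 500` filter are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER, PRIMARY KEY (sno, pno))")
conn.execute("CREATE TABLE sp_archive (sno TEXT, pno TEXT, qty INTEGER)")

# One row at a time, first as a literal statement, then with bound parameters.
conn.execute("INSERT INTO sp (sno, pno, qty) VALUES ('S4', 'P1', 1000)")
conn.execute("INSERT INTO sp (sno, pno, qty) VALUES (?, ?, ?)", ("S4", "P2", 200))

# Several rows at once, selected from an existing table.
conn.execute("""
    INSERT INTO sp_archive (sno, pno, qty)
    SELECT sno, pno, qty FROM sp WHERE qty >= 500""")

sp_rows = conn.execute("SELECT sno, pno, qty FROM sp ORDER BY pno").fetchall()
archive_rows = conn.execute("SELECT sno, pno, qty FROM sp_archive").fetchall()
```

Binding values with `?` placeholders, as in the second insert, keeps the data separate from the SQL text and is the usual choice when the values come from user input.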

Update & Delete Commands


- Use the UPDATE statement to change data values in one or more columns, usually based on specific criteria.

    UPDATE MySuppliers SET Region = 'UK' WHERE City IN ('London', 'Manchester');

- The DELETE command is used to remove whole rows from a table. Use with caution!

    DELETE FROM Personnel WHERE Department = 'Chemistry';
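A cautious way to run these statements is sketched below with Python's sqlite3 module: count the rows that match the WHERE clause before deleting them. The table and column names follow the slide, while the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MySuppliers (Name TEXT, City TEXT, Region TEXT)")
conn.executemany("INSERT INTO MySuppliers VALUES (?, ?, ?)",
                 [("A", "London", None), ("B", "Manchester", None), ("C", "Paris", None)])
conn.execute("CREATE TABLE Personnel (Name TEXT, Department TEXT)")
conn.executemany("INSERT INTO Personnel VALUES (?, ?)",
                 [("X", "Chemistry"), ("Y", "Physics")])

# UPDATE: rowcount reports how many rows matched the WHERE clause and were changed.
updated = conn.execute(
    "UPDATE MySuppliers SET Region = 'UK' WHERE City IN ('London', 'Manchester')"
).rowcount
uk_names = conn.execute(
    "SELECT Name FROM MySuppliers WHERE Region = 'UK' ORDER BY Name").fetchall()

# DELETE: check what the WHERE clause matches before removing anything.
would_delete = conn.execute(
    "SELECT COUNT(*) FROM Personnel WHERE Department = 'Chemistry'").fetchone()[0]
deleted = conn.execute(
    "DELETE FROM Personnel WHERE Department = 'Chemistry'").rowcount
remaining = conn.execute("SELECT Name FROM Personnel").fetchall()
```

Running the SELECT COUNT(*) with the same WHERE clause first is a cheap safeguard, because a DELETE without a WHERE clause removes every row in the table.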

Select Command
- SELECT has the general form SELECT-FROM-WHERE. The result is another (new) table.
- If DISTINCT is used in SELECT, duplicate rows are removed from the result.
- When WHERE is missing from the query, all rows of the FROM table are returned.
- SELECT * selects the entire row (all columns).
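These SELECT behaviours can be seen side by side in a short sqlite3 sketch. The table `s` echoes the supplier table created earlier, but the reduced column list and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno TEXT PRIMARY KEY, sname TEXT, city TEXT)")
conn.executemany("INSERT INTO s VALUES (?, ?, ?)", [
    ("S1", "Supplier A", "London"),
    ("S2", "Supplier B", "Paris"),
    ("S3", "Supplier C", "London"),
])

# No WHERE clause: every row of the FROM table, and * means every column.
all_rows = conn.execute("SELECT * FROM s").fetchall()

# Without DISTINCT, duplicate values are kept in the result.
cities = conn.execute("SELECT city FROM s ORDER BY city").fetchall()

# With DISTINCT, each value appears once.
distinct_cities = conn.execute("SELECT DISTINCT city FROM s ORDER BY city").fetchall()

# WHERE keeps only the rows that satisfy the condition.
london = conn.execute("SELECT sno FROM s WHERE city = 'London' ORDER BY sno").fetchall()
```

Each query returns a list of row tuples, which matches the slide's point that the result of a SELECT is itself a table.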
