Some part of the real world about which data is stored in a database.
Database
A software package used to facilitate the creation and maintenance of a computerized database.
Database system
Basics of Database
A database is a collection of information applicable to a particular subject or purpose. It is a shared collection of logically related data (and a description of this data), designed to meet the information needs of an organization. Logically related data comprises the entities, attributes, and relationships of an organization's information. The data is typically grouped into specific categories of information, which are contained in data storage files called tables. It is a collection of non-redundant data which can be shared by different application systems.
It stresses the importance of multiple applications and data sharing. The spatial database becomes a common resource for an agency. A database implies separation of physical storage from use of the data by an application program, i.e., program/data independence. The user, programmer, or application specialist need not know the details of how the data are stored. Changes can be made to data without affecting other components of the system.
Data in a database is stored under various categories known as fields. When the information from each of these fields is combined as one unit, that unit is considered a single record. All of the records combined become a table. So tables organize data into rows called records and columns called fields. But tables only store the raw data. To make use of these data, we need the following six objects:
Tables: Store the data for the database.
Queries: Allow a user to select or interact with different sections of data in the database of their own choosing.
Forms: Used in conjunction with tables, they allow the user to see a single record or allow for easier data entry.
Reports: Organize and summarize information so that it may be easily read and printed.
Macros: These are programs within Access that allow users to automate certain tasks.
Modules: These are pieces of Visual Basic programming which can be associated with a database or particular parts of a database.
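The idea of tables, records, fields, and queries can be sketched with a small example. This is only an illustration using Python's built-in sqlite3 module rather than Access; the table name student, its columns, and the sample values are assumptions made up for the example.

```python
import sqlite3

# In-memory database; table, column names, and values are illustrative only.
conn = sqlite3.connect(":memory:")

# A table organizes data into columns (fields).
conn.execute("CREATE TABLE student (student_id INTEGER, name TEXT, city TEXT)")

# Each inserted row is one record.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Delhi"), (3, "Meera", "Pune")],
)

# A query selects the section of the data the user is interested in.
pune_students = conn.execute(
    "SELECT name FROM student WHERE city = ? ORDER BY student_id", ("Pune",)
).fetchall()
print(pune_students)
```

Forms, reports, macros, and modules are Access-specific objects and have no direct counterpart in this sketch.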
A database is a collection of non-redundant data shareable between different application systems [Howe, D.R. 1989]. A database management system (DBMS) is a sophisticated software package capable of handling a database stored in computer files. In other words, a DBMS is a data storage and retrieval system which permits data to be stored non-redundantly while making it appear to the user as if the data is well integrated.
It is a software system that enables users to define, create, and maintain the database and that provides controlled access to this database. The DBMS provides the interface between the application programs and the data. The three main features of a DBMS that make it attractive are:
Centralized data management
Data independence
Systems integration
[Figure: applications (Application #2, Application #3) and the DBMS. The DBMS manages data resources like an operating system manages hardware resources.]
A file-based system is a collection of application programs that perform services for the end users (e.g., reports). Each program defines and manages its own data. There is no relationship among these files.
Separation and isolation of data: Each program maintains its own set of data. Users of one program may be unaware of potentially useful data held by other programs.
Duplication of data: The same data is held by different programs. This wastes space and can lead to different values and/or different formats for the same item.
Data dependence: File structure is defined in the program code.
Incompatible file formats: Programs are written in different languages, and so cannot easily access each other's files.
Fixed queries/proliferation of application programs: Programs are written to satisfy particular functions. Any new requirement needs a new program.
Structural and data dependence
Field definitions and naming conventions
Data redundancy that leads to data inconsistency and data anomalies
Flat files and spreadsheets: All records in this database have the same number of "fields". Individual records have different data in each field, with one field serving as a key to locate a particular record. When the number of fields becomes large, a flat file is cumbersome to search. Although this type of database is simple in structure, expanding the number of fields usually entails reprogramming. Additionally, adding new records is time consuming, particularly when there are numerous fields.
Hierarchical files:
These store data in more than one type of record. This method is usually described as a "parent-child, one-to-many" relationship. One field is key to all records, but data in one record does not have to be repeated in another. This system allows records with similar attributes to be associated together.
The records are linked to each other by a key field in a hierarchy of files.
Relational files: These connect different files or tables without using internal pointers or keys. Instead, a common link of data is used to join or associate records. The link is not hierarchical. A "matrix of tables" is used to store the information. As long as the tables have a common link, they may be combined by the user to form new inquiries and data output. This is the most flexible system and is particularly suited to SQL (Structured Query Language).
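Joining tables through a common link can be sketched as follows. This is a minimal illustration with Python's sqlite3 module; the parcel and owner tables, their columns, and the values are hypothetical and chosen only to show the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcel (parcel_id INTEGER, owner_id INTEGER, area_ha REAL);
    CREATE TABLE owner  (owner_id INTEGER, owner_name TEXT);
    INSERT INTO parcel VALUES (10, 1, 2.5), (11, 2, 1.0), (12, 1, 4.0);
    INSERT INTO owner  VALUES (1, 'Khan'), (2, 'Silva');
""")

# The two tables are associated only through the shared owner_id values.
joined = conn.execute("""
    SELECT p.parcel_id, o.owner_name
    FROM parcel AS p JOIN owner AS o ON p.owner_id = o.owner_id
    ORDER BY p.parcel_id
""").fetchall()
print(joined)
```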
Self-describing nature of a database system: A DBMS catalog stores the description (meta-data) of the database. This allows the DBMS software to work with different databases.
Insulation between programs and data: This is provided through:
Data abstraction: A data model is used to hide storage details and present the user with a conceptual view of the database.
Program-data independence: Allows changing data storage structures without having to change the DBMS access programs.
Program-operation independence: Allows changing operation implementation without having to change the DBMS access programs.
Support of multiple views of data
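As one concrete instance of a catalog, SQLite keeps its meta-data in a table called sqlite_master and exposes column descriptions through PRAGMA table_info. The sketch below assumes SQLite via Python's sqlite3 module; other DBMS products store their catalogs differently, and the employee table is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_num INTEGER PRIMARY KEY, emp_name TEXT)")

# The catalog describes the database: here, which tables exist.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# Column meta-data for one table; the second item of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]

print(catalog, columns)
```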
Controlling data redundancy.
Restricting unauthorized access to data.
Providing persistent storage for program objects and data structures.
Providing multiple interfaces to different classes of users.
Representing complex relationships among data.
Enforcing integrity constraints on the database.
Providing backup and recovery services.
Potential for enforcing standards.
Flexibility to change data structures.
Reduced application development time.
Availability of up-to-date information.
Economies of scale.
DATABASE ACCESS LANGUAGES (DDL AND DML) AND APPLICATION PROGRAMMING INTERFACES
DATABASE COMMUNICATION INTERFACES
The three main features of a database management system that make it attractive are centralized data management, data independence, and systems integration. In a DBMS, all files are integrated into one system, thus reducing redundancies and making data management more efficient. In addition, a DBMS provides centralized control of the operational data.
- 1960S - 1980S
HIERARCHICAL: 1970S - 1990S
NETWORK: 1970S - 1990S
RELATIONAL: 1980S - PRESENT
OBJECT-ORIENTED: 1990S - PRESENT
OBJECT-RELATIONAL: 1990S - PRESENT
DATA WAREHOUSING: 1980S - PRESENT
WEB-ENABLED: 1990S - PRESENT
Schema: A description of a database, but not the database itself. It corresponds to the type in a programming language, or the abstract data type.
Instance: An occurrence of a data item described in the schema.
Database state: The data in the database at a moment in time.
DBMS languages
DDL: Data Definition Language. These are used to define or change the structure of the database. In other words, these are used to define the schema or describe the data (conceptual schema).
DML: Data Manipulation Language. After the database is built, these are used to query the database, insert data, change data, or delete data.
DCL: Data Control Language. These are used to control user access.
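The split between DDL and DML can be sketched as below, using Python's sqlite3 module. The skill table and its values are hypothetical. DCL statements such as GRANT and REVOKE are not shown because SQLite does not provide them; they belong to multi-user server DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and then change the structure (schema).
conn.execute("CREATE TABLE skill (skill_id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("ALTER TABLE skill ADD COLUMN level TEXT")

# DML: insert, change, query, and delete data.
conn.execute("INSERT INTO skill (skill_id, title, level) VALUES (1, 'Welding', 'basic')")
conn.execute("UPDATE skill SET level = 'advanced' WHERE skill_id = 1")
rows_after_update = conn.execute("SELECT skill_id, title, level FROM skill").fetchall()
conn.execute("DELETE FROM skill WHERE skill_id = 1")
rows_after_delete = conn.execute("SELECT * FROM skill").fetchall()

# DDL again: remove the structure itself.
conn.execute("DROP TABLE skill")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
```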
Database Models
A database model is a collection of logical constructs used to represent the data structure and the data relationships found within the database. There are two categories of database models:
Conceptual models: Focus on the logical nature of the data representation. They are concerned with what is represented rather than how it is represented.
Implementation models: Place the emphasis on how the data are represented in the database or on how the data structures are implemented.
Generally, the following three types of relationships are used:
One-to-many relationships (1:M): A painter paints many different paintings, but each one of them is painted by only that painter. Painter (1) paints painting (M).
Many-to-many relationships (M:N): An employee might learn many job skills, and each job skill might be learned by many employees. Employee (M) learns skill (N).
One-to-one relationships (1:1): Each store is managed by a single employee, and each store manager (employee) manages only a single store. Employee (1) manages store (1).
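One common way to implement these three relationship types in relational tables is sketched below with Python's sqlite3 module: a foreign key for 1:M, a linking table for M:N, and a UNIQUE foreign key for 1:1. The table layouts, column names, and sample rows are assumptions for illustration, not a design taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    -- 1:M: each painting row points to exactly one painter
    CREATE TABLE painter  (painter_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE painting (painting_id INTEGER PRIMARY KEY, title TEXT,
                           painter_id INTEGER NOT NULL REFERENCES painter(painter_id));
    -- M:N: a linking table pairs employees with skills
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE skill    (skill_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE learns   (emp_id INTEGER REFERENCES employee(emp_id),
                           skill_id INTEGER REFERENCES skill(skill_id),
                           PRIMARY KEY (emp_id, skill_id));
    -- 1:1: UNIQUE stops one employee from managing two stores
    CREATE TABLE store    (store_id INTEGER PRIMARY KEY,
                           manager_id INTEGER NOT NULL UNIQUE REFERENCES employee(emp_id));

    INSERT INTO painter  VALUES (1, 'Painter A');
    INSERT INTO painting VALUES (1, 'Work 1', 1), (2, 'Work 2', 1);
    INSERT INTO employee VALUES (1, 'Employee A'), (2, 'Employee B');
    INSERT INTO skill    VALUES (1, 'Skill X'), (2, 'Skill Y');
    INSERT INTO learns   VALUES (1, 1), (1, 2), (2, 1);
    INSERT INTO store    VALUES (1, 1);
""")

paintings_by_painter = conn.execute(
    "SELECT COUNT(*) FROM painting WHERE painter_id = 1").fetchone()[0]
learners_of_skill_x = conn.execute(
    "SELECT COUNT(*) FROM learns WHERE skill_id = 1").fetchone()[0]

# A second store with the same manager violates the 1:1 rule.
try:
    conn.execute("INSERT INTO store VALUES (2, 1)")
    second_store_allowed = True
except sqlite3.IntegrityError:
    second_store_allowed = False
```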
A HIERARCHICAL STRUCTURE
Records are logically organized to conform to the upside-down tree (hierarchical) structure.
The top layer is perceived as the parent of the segment directly beneath it.
The segments below other segments are the children of the segment above them.
The tree structure is represented as a hierarchical path on the computer's storage media.
A set is composed of at least two record types: an owner (parent) record and a member (child) record.
A set represents a 1:M relationship between the owner and the member.
simplicity
Promotes Data
Conformance
Disadvantages
System complexity
Lack of structural independence
Entity-relationship (E-R) models are normally represented in an entity relationship diagram (ERD). An entity is represented by a rectangle. Each entity is described by a set of attributes. An attribute describes a particular characteristic of the entity. A relationship is represented by a diamond connected to the related entities.
Entities and attributes:
Entity: a thing that has independent existence -> employee
Attribute: describes something -> age, ssn, gender, name
Value: taken on by an attribute -> 25, 456-876-788, female, bart simpson
Composite attributes vs. atomic or simple attributes -> bart simpson vs. 45
Single-valued attributes vs. multivalued attributes -> age vs. college degrees
Derived attributes vs. stored attributes -> age vs. birth date (age is derived from birth date)
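The attribute kinds listed above can be sketched with a small Python data structure. The class names, field names, and sample values are hypothetical; the point is only to show a composite attribute (name), a multivalued attribute (degrees), a stored attribute (birth_date), and a derived attribute (age).

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Name:
    """Composite attribute: made of smaller parts."""
    first: str
    last: str


@dataclass
class Employee:
    ssn: str                       # simple (atomic), single-valued attribute
    name: Name                     # composite attribute
    birth_date: date               # stored attribute
    degrees: List[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from birth_date, not stored."""
        had_birthday = (today.month, today.day) >= (
            self.birth_date.month, self.birth_date.day)
        return today.year - self.birth_date.year - (0 if had_birthday else 1)


emp = Employee("456-876-788", Name("Bart", "Simpson"), date(1999, 4, 1), ["BSc"])
print(emp.age(date(2024, 3, 31)), emp.age(date(2024, 4, 1)))
```

Passing the reference date explicitly keeps the derived value reproducible; a real application would more likely use the current date.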
Entity
Entity type: defines the structure of a set of entities that have the same attributes
Entity: an instance of an entity type
Entity set (collection): a group of entities
Key: uniqueness
Combination of attributes to create a key
Value sets (domains)
[Figure: ER diagram notation for ENTITY TYPE, ATTRIBUTE, KEY ATTRIBUTE, MULTIVALUED ATTRIBUTE, COMPOSITE ATTRIBUTE, and DERIVED ATTRIBUTE]
[Figure: UNARY RELATIONSHIP and TERNARY RELATIONSHIP examples. Diagram labels: EMPLOYEE, SSN, MANAGES, SUPPLIER, PROJECT, PROJNAME, PART, PARTNO, SUPPLY]
representation
with the relational database model
Disadvantages
Limited constraint representation
Limited relationship representation
No
Loss
Normalization
Normalization is a process for assigning attributes to entities. It reduces data redundancies and helps eliminate the data anomalies.
Normalization works through a series of normal forms:
First normal form (1NF)
Second normal form (2NF)
Third normal form (3NF)
Fourth normal form (4NF)
It is the process of efficiently organizing data in a database. There are two goals of the normalization process:
Eliminate redundant data (for example, storing the same data in more than one table), and
Ensure data dependencies make sense (only storing related data in a table).
These goals help to reduce the amount of space a database consumes and ensure that data is logically stored.
The database community has developed a series of guidelines for ensuring that databases are normalized. These are referred to as normal forms and are numbered from one (the lowest form of normalization, referred to as first normal form or 1NF) through five (fifth normal form or 5NF). In practical applications, we often see 1NF, 2NF, and 3NF along with the occasional 4NF. Fifth normal form is very rarely seen. All these normalization guidelines are cumulative. For a database to be in 2NF, it must first fulfill all the criteria of a 1NF database.
Employee
The company charges its clients by billing the hours spent on each project. The hourly billing rate is dependent on the employee's position.
Periodically, a report is generated.
The table whose contents correspond to the reporting requirements is shown in table 5.1.
Scenario
A few employees work for one project. Employee numbers: 101, 102, 103, 105
Addition Deletion
Conversion to first normal form
A relational table must not contain repeating groups. Repeating groups can be eliminated by adding the appropriate entry in at least the primary key column(s).
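Eliminating repeating groups by filling in the key column can be sketched as follows. The rows are hypothetical report data (the project numbers and hours are invented; the employee numbers echo the scenario), and fill_repeating_groups is an illustrative helper, not a standard function.

```python
# Hypothetical report rows: the project number is written only on the
# first row of each group, so the remaining rows form a repeating group.
report_rows = [
    {"proj_num": 15, "emp_num": 101, "hours": 23.8},
    {"proj_num": None, "emp_num": 102, "hours": 19.4},
    {"proj_num": 18, "emp_num": 103, "hours": 12.6},
    {"proj_num": None, "emp_num": 105, "hours": 8.0},
]


def fill_repeating_groups(rows, key="proj_num"):
    """Copy the last seen key value into every row that lacks one."""
    filled, current = [], None
    for row in rows:
        if row[key] is not None:
            current = row[key]
        if current is None:
            raise ValueError("first row has no key value")
        filled.append({**row, key: current})
    return filled


rows_1nf = fill_repeating_groups(report_rows)

# After the fix, (proj_num, emp_num) identifies every row.
primary_keys = [(r["proj_num"], r["emp_num"]) for r in rows_1nf]
print(primary_keys)
```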
FIRST NORMAL FORM (1NF)
First normal form (1NF) sets the very basic rules for an organized database.
1NF definition
The term first normal form (1NF) describes the tabular format in which:
All the key attributes are defined.
There are no repeating groups in the table.
All attributes are dependent on the primary key.
To achieve this, eliminate duplicative columns from the same table. Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).
The first rule dictates that we must not duplicate data within the same row of a table. Within the database community, this concept is referred to as the atomicity of a table, and tables that comply with this rule are said to be atomic.
Second normal form (2NF) further addresses the concept of removing duplicative data:
Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
Create relationships between these new tables and their predecessors through the use of foreign keys.
These rules can be summarized in a simple statement: 2NF attempts to reduce the amount of redundant data in a table by extracting it, placing it in new table(s), and creating relationships between those tables.
Conversion to second normal form
Starting with the 1NF format, the database can be converted into the 2NF format by:
Writing each key component on a separate line, and then writing the original key on the last line, and
Writing the dependent attributes after each new key.
Project (proj_num, proj_name)
Employee (emp_num, emp_name, job_class, chg_hour)
Assign (proj_num, emp_num, hours)
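The three 2NF tables above could be declared as in the sketch below, using Python's sqlite3 module. The column types, the foreign keys, and the sample rows are assumptions added for illustration; only the table and column names come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE project  (proj_num INTEGER PRIMARY KEY, proj_name TEXT);
    CREATE TABLE employee (emp_num INTEGER PRIMARY KEY, emp_name TEXT,
                           job_class TEXT, chg_hour REAL);
    -- The original composite key (proj_num, emp_num) keeps only the
    -- attribute that depends on the whole key: hours.
    CREATE TABLE assign   (proj_num INTEGER REFERENCES project(proj_num),
                           emp_num  INTEGER REFERENCES employee(emp_num),
                           hours REAL,
                           PRIMARY KEY (proj_num, emp_num));

    -- Hypothetical sample rows
    INSERT INTO project  VALUES (15, 'Evergreen');
    INSERT INTO employee VALUES (101, 'A. Lee', 'Engineer', 80.0),
                                (102, 'B. Roy', 'Analyst', 60.0);
    INSERT INTO assign   VALUES (15, 101, 10.0), (15, 102, 5.0);
""")

# The billing report is rebuilt by joining the three tables.
billing = conn.execute("""
    SELECT p.proj_name, SUM(a.hours * e.chg_hour)
    FROM assign AS a
    JOIN project  AS p ON p.proj_num = a.proj_num
    JOIN employee AS e ON e.emp_num  = a.emp_num
    GROUP BY p.proj_num
""").fetchall()
print(billing)
```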
A table is in 2NF if:
It is in 1NF, and
It includes no partial dependencies; that is, no attribute is dependent on only a portion of the primary key.
(It is still possible for a table in 2NF to exhibit transitive dependency; that is, one or more attributes may be functionally dependent on non-key attributes.)
Third normal form (3NF) goes one large step further:
Remove columns that are not dependent upon the primary key.
3NF definition
A table is in 3NF if:
It is in 2NF, and
It contains no transitive dependencies.
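Removing a transitive dependency can be sketched in plain Python. The example assumes, in line with the billing scenario, that chg_hour depends on job_class rather than directly on emp_num; the rows and the to_3nf helper are hypothetical.

```python
# Hypothetical 2NF employee rows: chg_hour is determined by job_class,
# which is itself a non-key attribute (a transitive dependency).
employee_2nf = [
    {"emp_num": 101, "emp_name": "A. Lee", "job_class": "Engineer", "chg_hour": 80.0},
    {"emp_num": 102, "emp_name": "B. Roy", "job_class": "Analyst", "chg_hour": 60.0},
    {"emp_num": 103, "emp_name": "C. Diaz", "job_class": "Engineer", "chg_hour": 80.0},
]


def to_3nf(rows):
    """Split rows into employee(emp_num, emp_name, job_class) and job(job_class -> chg_hour)."""
    job, employee = {}, []
    for row in rows:
        rate = job.setdefault(row["job_class"], row["chg_hour"])
        if rate != row["chg_hour"]:
            raise ValueError("inconsistent rate for " + row["job_class"])
        employee.append({k: row[k] for k in ("emp_num", "emp_name", "job_class")})
    return employee, job


employee_3nf, job_3nf = to_3nf(employee_2nf)
print(job_3nf)
```

Each billing rate is now stored once per job class, so changing a rate touches a single entry.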
CONVERSION TO 4NF
SQL is used to define, manipulate, and control data in relational databases. SQL statements fall into the following three main categories according to their function:
- Data Definition Language (DDL) - define or change database structure(s): CREATE, ALTER, DROP
- Data Manipulation Language (DML) - select or change data: INSERT, UPDATE, DELETE, SELECT
- Data Control Language (DCL) - control user access (e.g., GRANT, REVOKE) and transactions (e.g., COMMIT)
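Transaction control can be sketched with Python's sqlite3 module, where commit() and rollback() on the connection correspond to SQL COMMIT and ROLLBACK. The account table and its values are hypothetical. GRANT and REVOKE are not shown because SQLite has no user accounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acct_id INTEGER PRIMARY KEY, balance REAL)")

# The insert becomes permanent once it is committed.
conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.commit()

# An uncommitted change can still be undone.
conn.execute("UPDATE account SET balance = balance - 500 WHERE acct_id = 1")
conn.rollback()

balance_after_rollback = conn.execute(
    "SELECT balance FROM account WHERE acct_id = 1"
).fetchone()[0]
print(balance_after_rollback)
```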
Columns which are defined as primary keys will never have two rows with the same key value. Primary keys may consist of more than one column (values unique in combination). A table name and unique column names must be specified.
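A multi-column primary key and its uniqueness rule can be sketched as below with Python's sqlite3 module. The enrollment table and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE enrollment (
        student_id INTEGER,
        course_id  INTEGER,
        grade      TEXT,
        PRIMARY KEY (student_id, course_id)  -- unique in combination
    )
""")
conn.execute("INSERT INTO enrollment VALUES (1, 10, 'A')")
conn.execute("INSERT INTO enrollment VALUES (1, 11, 'B')")  # same student, other course: allowed

# Repeating an existing (student_id, course_id) pair is rejected.
try:
    conn.execute("INSERT INTO enrollment VALUES (1, 10, 'C')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
```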
Alter table: This is used to add or remove columns or constraints.
ALTER TABLE categoriesnopix DROP COLUMN shortdesc;
Drop table: Use DROP objectname to remove from the database any object that was created.
DROP TABLE categoriesnopix;
Insert Command
Use the INSERT command to enter data into a table. You may insert one row at a time, or select several rows from an existing table and insert them all at once.
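Both ways of inserting can be sketched with Python's sqlite3 module. The Suppliers and MySuppliers tables, their columns, and the rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Suppliers (Name TEXT, City TEXT)")
conn.execute("CREATE TABLE MySuppliers (Name TEXT, City TEXT)")

# Insert one row at a time.
conn.execute("INSERT INTO Suppliers (Name, City) VALUES (?, ?)", ("Supplier A", "London"))
conn.executemany(
    "INSERT INTO Suppliers (Name, City) VALUES (?, ?)",
    [("Supplier B", "Paris"), ("Supplier C", "Manchester")],
)

# Select several rows from an existing table and insert them all at once.
conn.execute("""
    INSERT INTO MySuppliers (Name, City)
    SELECT Name, City FROM Suppliers WHERE City IN ('London', 'Manchester')
""")

copied = conn.execute("SELECT Name FROM MySuppliers ORDER BY Name").fetchall()
print(copied)
```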
Use the UPDATE statement to change data values in one or more columns, usually based on specific criteria.
UPDATE MySuppliers SET Region = 'UK' WHERE City IN ('London', 'Manchester');
The DELETE command is used to remove whole rows from a table. Use with caution!
DELETE FROM Personnel WHERE Department = 'Chemistry';
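The effect of the two statements can be checked on small sample tables. This sketch uses Python's sqlite3 module; the table layouts and rows are hypothetical, and the cursor's rowcount attribute reports how many rows each statement changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MySuppliers (Name TEXT, City TEXT, Region TEXT);
    CREATE TABLE Personnel   (Name TEXT, Department TEXT);
    INSERT INTO MySuppliers VALUES ('Supplier A', 'London', NULL),
                                   ('Supplier B', 'Paris', NULL),
                                   ('Supplier C', 'Manchester', NULL);
    INSERT INTO Personnel VALUES ('Person 1', 'Chemistry'),
                                 ('Person 2', 'Physics'),
                                 ('Person 3', 'Chemistry');
""")

# UPDATE changes only the rows that match the WHERE criteria.
cur = conn.execute(
    "UPDATE MySuppliers SET Region = 'UK' WHERE City IN ('London', 'Manchester')")
updated = cur.rowcount

# DELETE removes whole rows; without WHERE it would empty the table.
cur = conn.execute("DELETE FROM Personnel WHERE Department = 'Chemistry'")
deleted = cur.rowcount

remaining = conn.execute("SELECT Name, Department FROM Personnel").fetchall()
print(updated, deleted, remaining)
```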
Select Command
SELECT has the general form SELECT-FROM-WHERE. The result is another (new) table. If DISTINCT is used in SELECT, duplicate rows are removed from the result. When WHERE is missing from the query, all rows of the FROM table are returned. SELECT * is used to select the entire row (all columns).
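The variants of SELECT described above can be sketched with Python's sqlite3 module. The Personnel table and its rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Personnel (Name TEXT, Department TEXT);
    INSERT INTO Personnel VALUES ('Person 1', 'Chemistry'),
                                 ('Person 2', 'Physics'),
                                 ('Person 3', 'Chemistry');
""")

# No WHERE clause: every row is returned; * selects all columns.
all_rows = conn.execute("SELECT * FROM Personnel ORDER BY Name").fetchall()

# DISTINCT removes duplicate rows from the result.
departments = conn.execute(
    "SELECT DISTINCT Department FROM Personnel ORDER BY Department").fetchall()

# SELECT-FROM-WHERE: only the rows that satisfy the condition.
chemists = conn.execute(
    "SELECT Name FROM Personnel WHERE Department = 'Chemistry' ORDER BY Name").fetchall()

print(len(all_rows), departments, chemists)
```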