Entity: A person, place, thing or event about which information must be kept.
Attribute: A piece of information describing a particular entity, mainly a
characteristic of that individual entity. Attributes help to identify and
distinguish one entity from another.
Entity      Attributes
Personnel   Name, Age, Address, Father’s Name
Academic    Name, Roll No., Course, Dept. Name
E.g. Student (database name), containing two tables. The column headings are the
field names or attribute names, and each row is a record.

Personnel (table name)
Name   Father Name   Age
John   Albert        24

Academic (table name)
Name   Roll No.   Course   Dept. Name
John   12         MSC      Computer
1. Defining: Specifying data types and structures, and constraints for data to be
stored.
2. Constructing: Storing data in a storage medium.
3. Manipulating: Involves querying, updating and generating reports.
4. Sharing: Allowing multiple users and programs to access data simultaneously.
Examples of DBMSs: Access, dBase, FileMaker Pro, FoxBASE, Oracle, etc.
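The first three functions listed above (defining, constructing, manipulating) can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration, not how any particular DBMS must be used: the table name personnel, the column names, and the age constraint are assumptions chosen to match the Personnel example, and the in-memory database does not demonstrate sharing between users.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# 1. Defining: specify structure, data types, and constraints.
#    (Table and column names are assumptions for illustration.)
conn.execute(
    """
    CREATE TABLE personnel (
        name        TEXT    NOT NULL,
        father_name TEXT    NOT NULL,
        age         INTEGER NOT NULL CHECK (age >= 0)
    )
    """
)

# 2. Constructing: store data in the database.
conn.execute(
    "INSERT INTO personnel (name, father_name, age) VALUES (?, ?, ?)",
    ("John", "Albert", 24),
)

# 3. Manipulating: update the data, then query it to produce a small report.
conn.execute("UPDATE personnel SET age = age + 1 WHERE name = ?", ("John",))
report = conn.execute(
    "SELECT name, father_name, age FROM personnel ORDER BY name"
).fetchall()
conn.commit()

print(report)
```

The same connection object is used for all three steps, which reflects the point that one system handles definition, storage, and manipulation.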
1. To provide a way to store and retrieve database information that is both convenient and
efficient.
2. To manage large and small bodies of information. It involves defining structures for
storage of information and providing mechanism for manipulation of information.
3. It should ensure safety of information stored, despite system crashes or attempts at
unauthorized access.
4. If data are to be shared among several users, then system should avoid possible
anomalous results.
1.2 Database Management Systems:
Advantages of DBMS:
• Data independence: The DBMS provides an abstract view of the data that hides the
details of data representation and storage.
• Efficient data access: The DBMS uses a variety of techniques to store and retrieve
data efficiently.
• Data integrity and security: If data is always accessed through the DBMS, the
DBMS can enforce integrity constraints on it.
• Data administration: "Data" administration deals with the modeling of the data
and treats data as an organizational resource, while "database" administration
deals with the implementation of the types of databases that are in use.
• Concurrent access and crash recovery: The DBMS schedules concurrent accesses so
that each user can think of the data as being accessed by only one user at a time.
It also protects users from the effects of system crashes.
• Reduced Application Development time: It supports all the important functions
that are common to many applications.
Disadvantages of a DBMS:
Advantages:
• Greater flexibility
• Good for larger databases
• Greater processing power
• Fits the needs of many medium to large-sized organizations
• Storage for all relevant data
• Provides user views relevant to tasks performed
• Ensures data integrity by managing transactions (ACID test = atomicity,
consistency, isolation, durability)
• Supports simultaneous access
• Enforces design criteria in relation to data format and structure
Disadvantages:
• Difficult to learn
• Packaged separately from the operating system (e.g. Oracle, Microsoft Access,
Lotus/IBM Approach, Borland Paradox, Claris FileMaker Pro)
• Slower processing speeds
• Requires skilled administrators
• Expensive
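The atomicity part of the ACID test mentioned among the advantages can be sketched as follows. This is a minimal illustration using Python's sqlite3 module, not a description of how any specific DBMS implements transactions. The account table, the transfer function, and the choice of which accounts take part are assumptions; the account numbers and balances reuse the sample values that appear later in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (number TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A-217", 350), ("A-102", 400)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates are kept or neither is."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE number = ?",
                (amount, dst),
            )
            (available,) = conn.execute(
                "SELECT balance FROM account WHERE number = ?", (src,)
            ).fetchone()
            if available < amount:
                # Failure in the middle of the transaction: the credit above is undone.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE number = ?",
                (amount, src),
            )
        return True
    except ValueError:
        return False


first = transfer(conn, "A-217", "A-102", 100)    # succeeds
second = transfer(conn, "A-217", "A-102", 1000)  # fails part-way and is rolled back
balances = dict(conn.execute("SELECT number, balance FROM account"))
print(first, second, balances)
```

The second transfer credits A-102 before it detects the shortfall, yet the final balances show no trace of that credit, which is the behaviour atomicity requires.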
1. Banking – For customer information, accounts, loans, and banking transactions.
[all transactions]
2. Airlines – For reservation and schedule information. [reservations, schedules]
3. Universities – For student information, course registrations, and grades. [registration,
grades]
4. Credit Card Transactions – For purchases on credit card and generation of monthly
statements.
5. Telecommunication – For keeping records of calls made, generating monthly bills,
maintaining balances on prepaid calling cards, and storing information about
communication networks.
6. Finance – For storing information about holdings, sales, and purchases of financial
instruments such as stocks and bonds.
7. Sales – For customer, product, and purchase information. [customers, products,
purchases]
8. Manufacturing – For management of supply chain and for tracking production of
items in factories, inventories of items in warehouses/stores, and orders for items.
[production, inventory, orders, supply chain]
9. Human Resources – For information about employees, salaries, payroll taxes and
benefits, and generation of paychecks. [employee records, salaries, tax deductions]
3. View level: Application programs hide details of data types. Views can also hide
information (e.g., salary) for security purposes.
Features:
a) It is the highest level of abstraction.
b) It describes only a part of the whole database for a particular group of users.
c) This view hides all complexity.
d) It exists only to simplify user interaction with system.
e) The system may provide many views for the whole system.
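The idea of a view that hides information such as salary can be sketched with Python's sqlite3 module. The employee table, its columns, the sample row, and the view name employee_directory are all assumptions made for illustration. SQLite has no per-user permissions, so this shows only how a view narrows what is visible, not how access to the underlying table would be restricted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full table, including the sensitive salary column.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES (1, 'John', 'Computer', 50000)")

# View level: one group of users sees only the non-sensitive columns.
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_id, name, dept FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_directory")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

Several such views can be defined over the same table, one for each group of users.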
Object-Based Logical Models: They are used in describing data at the logical and view
levels. They are characterized by the fact that they provide fairly flexible structuring
capabilities and allow data constraints to be specified explicitly. There are many different
models and more are likely to come. Several of the more widely known ones are:
• The E-R model
• The object-oriented model
• The semantic data model
• The functional data model
The E-R Model
The entity-relationship (E-R) data model is based on a perception of a real world that
consists of a collection of basic objects, called entities, and of relationships among
these objects. The overall logical structure of a database can be expressed graphically
by an E-R diagram, which is built up from the following components:
• Rectangles, which represent entity sets
• Ellipses, which represent attributes
• Diamonds, which represent relationships among entity sets
• Lines, which link attributes to entity sets and entity sets to relationships.
E.g. suppose we have two entities, customer and account. These two entities
can be modeled as follows:
Figure: E-R diagram for the entity sets Customer and Account, with the attributes
customer name, customer city, account number and balance
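The customer and account example can also be written down as data structures. The following is a rough sketch using Python dataclasses, not a standard E-R notation. The relationship name Depositor, the function accounts_of, and all sample values (including the city) are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:   # entity set (a rectangle in the diagram)
    name: str     # attribute (an ellipse in the diagram)
    city: str


@dataclass(frozen=True)
class Account:    # entity set
    number: str
    balance: int


@dataclass(frozen=True)
class Depositor:  # relationship set (a diamond); the name is an assumption
    customer: Customer
    account: Account


# Sample values are hypothetical.
john = Customer(name="John", city="Pune")
a217 = Account(number="A-217", balance=350)
a102 = Account(number="A-102", balance=400)
depositors = [Depositor(john, a217), Depositor(john, a102)]


def accounts_of(customer, relationships):
    """Return the account numbers linked to a customer through the relationship set."""
    return [r.account.number for r in relationships if r.customer == customer]


john_accounts = accounts_of(john, depositors)
print(john_accounts)
```

Keeping the relationship as its own class mirrors the diagram, where the diamond is a separate component linked by lines to both entity sets.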
Semantic Models
These include the extended relational, the semantic network and the functional models.
They are characterized by their provision of richer facilities for capturing the meaning of
data objects and hence of maintaining database integrity. Systems based on these models
exist in prototype form at the time of writing and will begin to filter through over the
next decade.
Hierarchical Model
The hierarchical model is similar to the network model in the sense that data and
relationships among data are represented by records and links, respectively. It differs
from the network model in that records are organized as collections of trees rather than
arbitrary graphs.
Figure: Sample hierarchical database (CUSTOMER records with account records)
Account   Balance
A-217     350
A-102     400
A-222     700
A-305     350
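The tree organization of the hierarchical model can be sketched in a few lines of Python. This is an illustration only: the Record class, the walk function, the customer name, and the assignment of accounts to that customer are assumptions, since the sample data above does not say which customer owns which account.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A record with links to its child records, so each child has exactly one parent."""
    fields: dict
    children: list = field(default_factory=list)


def walk(record, depth=0):
    """Yield (depth, record) pairs, visiting each parent before its children."""
    yield depth, record
    for child in record.children:
        yield from walk(child, depth + 1)


# The root is a customer record; the owner of these accounts is hypothetical.
customer = Record({"type": "CUSTOMER", "name": "John"})
customer.children.append(Record({"type": "ACCOUNT", "number": "A-217", "balance": 350}))
customer.children.append(Record({"type": "ACCOUNT", "number": "A-102", "balance": 400}))

visited = [(depth, record.fields["type"]) for depth, record in walk(customer)]
total_balance = sum(child.fields["balance"] for child in customer.children)
print(visited, total_balance)
```

A database in this model would hold a collection of such trees, one per root record.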
1. Schema:
The overall design of a database is called the database schema. E.g., the database consists
of information about a set of customers and accounts and the relationship between them.
The schema is analogous to a variable declaration, along with its type information, in a
program.
We have a Data Definition Language (DDL) to specify database schemas and a Data
Manipulation Language (DML) to express database updates and queries. In practice,
these are not two separate languages but are parts of a single database language, like SQL.
Features of DDL:
Features of DML:
o Procedural: Here the user specifies what data is needed and how to get it.
o Non-procedural: Here the user specifies only what data is needed.
- Easier for the user
- May not generate code as efficient as that produced by
procedural languages.
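The difference between the two styles of DML can be sketched as follows. The first half spells out how to find the data, record by record, in plain Python. The second half states only what is wanted, in SQL through the sqlite3 module, and leaves the how to the system. The account table and the threshold of 350 are assumptions for illustration; the account values reuse the sample data from the hierarchical model.

```python
import sqlite3

accounts = [("A-217", 350), ("A-102", 400), ("A-222", 700), ("A-305", 350)]

# Procedural style: the user says what is needed AND how to get it
# (scan every record, test it, collect matches, then sort).
procedural = []
for number, balance in accounts:
    if balance > 350:
        procedural.append(number)
procedural.sort()

# Non-procedural style: the user only states what is needed;
# the system decides how to evaluate the request.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (number TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", accounts)
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT number FROM account WHERE balance > 350 ORDER BY number"
    )
]

print(procedural, declarative)
```

Both halves return the same accounts; only the amount of access detail the user must supply differs.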
We can explain the overall structure of DBMS/System structure and its components by
the diagram given below:
Figure: System Structure
1. Database systems are partitioned into modules for different functions. Some functions
(e.g. file systems) may be provided by the operating system.
2. Broadly the functional components of a database system are:
Advantages of a file management system (FMS):
• Simpler to use
• Less expensive
• Fits the needs of many small businesses and home users
• Popular FMSs are packaged along with the operating systems of personal
computers (e.g. Microsoft Cardfile and Microsoft Works)
• Good for database solutions for hand held devices such as Palm Pilot
Disadvantages of a file management system (FMS):
• Typically does not support multi-user access
• Limited to smaller databases
• Limited functionality (e.g. no support for complicated transactions, recovery, etc.)
• Decentralization of data
• Redundancy and integrity issues
1. Data Redundancy – Since different programmers create the files and application
programs over a long period, the various files are likely to have different formats and the
programs may be written in several programming languages. Moreover, the same
information may be duplicated in several files. This duplication of data over several files
is known as data redundancy. E.g. the address and telephone number of a particular
customer may appear in a file that consists of savings-account records and in a file that
consists of checking-account records. This redundancy leads to higher storage and access
costs, and it also leads to the inconsistency discussed next.
2. Data Inconsistency – The various copies of the same data may no longer agree, i.e.
they may contain different information. E.g. a changed customer address may be
reflected in the savings-account records but not elsewhere in the system.
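Redundancy and the inconsistency it causes can be shown with two separate "files" held as Python dictionaries. This is a toy sketch under stated assumptions: the file layouts, the addresses, and the helper inconsistent_customers are hypothetical and stand in for the independent application programs described above.

```python
# Two files maintained by different programs, each holding its own copy
# of the customer's address (the redundancy).
savings_file = {"John": {"address": "12 Park Road", "balance": 350}}
checking_file = {"John": {"address": "12 Park Road", "balance": 400}}

# The address change is applied by the savings program only.
savings_file["John"]["address"] = "7 Lake View"


def inconsistent_customers(file_a, file_b):
    """Return the customers whose address differs between the two files."""
    shared = file_a.keys() & file_b.keys()
    return sorted(
        name for name in shared if file_a[name]["address"] != file_b[name]["address"]
    )


mismatches = inconsistent_customers(savings_file, checking_file)
print(mismatches)
```

Nothing in either file signals which copy is current, so the disagreement has to be found by an extra check like this one.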
5. Integrity problems – The data stored in the database must satisfy certain types of
consistency constraints. E.g. the balance of a bank account may never fall below a
prescribed amount (say, 2500/- at ICICI). Developers enforce these constraints in the
system by adding appropriate code in the various application programs. However, when
new constraints are added, it is difficult to change the programs to enforce them. The
problem is compounded when constraints involve several data items from different files.
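By contrast, a DBMS lets the minimum-balance rule be declared once with the data instead of being repeated in every program. The sketch below uses a CHECK constraint in SQLite through Python's sqlite3 module. The table layout, the withdraw function, and the starting balance are assumptions; the figure 2500 is the example amount from the text.

```python
import sqlite3

MIN_BALANCE = 2500  # the prescribed minimum from the example above

conn = sqlite3.connect(":memory:")
# The constraint is stated once, in the schema, not in each application program.
conn.execute(
    "CREATE TABLE account ("
    " number TEXT PRIMARY KEY,"
    f" balance INTEGER NOT NULL CHECK (balance >= {MIN_BALANCE}))"
)
conn.execute("INSERT INTO account VALUES ('A-217', 3000)")
conn.commit()


def withdraw(conn, number, amount):
    """Try to withdraw; return False if the database rejects the update."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE number = ?",
                (amount, number),
            )
        return True
    except sqlite3.IntegrityError:
        return False


allowed = withdraw(conn, "A-217", 400)   # 3000 -> 2600, still above the minimum
rejected = withdraw(conn, "A-217", 400)  # would leave 2200, so it is refused
final_balance = conn.execute(
    "SELECT balance FROM account WHERE number = 'A-217'"
).fetchone()[0]
print(allowed, rejected, final_balance)
```

Note that withdraw contains no balance check of its own; any other program updating the table is held to the same rule.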
8. Security Problems – Not every user of the database system should be able to access
all the data. Eg. In a bank system, payroll personnel need to see only that part of the
database that has information about the various bank employees. They do not need access
to information about customer accounts. But, since application programs are added to the
system in an ad hoc manner, enforcing such security constraints is difficult.
1.7.1 Difference between DBMS and File-processing system:
2. DBMS: Data is easily accessed due to standard query procedures.
   File-processing system: Data cannot be easily accessed because special application
   programs are needed to access it.
6. DBMS: Several users can access data at the same time, i.e. concurrently, without
   problems.
   File-processing system: Concurrent accesses may cause problems such as
   inconsistencies.
• Minimal Data Redundancy - Since the whole data resides in one central database, the
various programs in the application can access data in different data files. Hence data
present in one file need not be duplicated in another.
This reduces data redundancy. However, this does not mean all redundancy can be
eliminated. There could be business or technical reasons for having some amount of
redundancy. Any such redundancy should be carefully controlled and the DBMS should
be aware of it.
• Data Integration - Since related data is stored in one single database, enforcing data
integrity is much easier. Moreover, the functions in the DBMS can be used to enforce the
integrity rules with minimum programming in the application programs.
• Data Sharing - Related data can be shared across programs since the data is stored in a
centralized manner. Even new applications can be developed to operate against the same
data.
• Application Development Ease - The application programmer need not build the
functions for handling issues like concurrent access, security, data integrity, etc. The
programmer only needs to implement the application business rules. This brings in
application development ease. Adding additional functional modules is also easier than in
file based systems.
• Better Controls - Better controls can be achieved due to the centralized nature of the
system.
• Data Independence - The architecture of the DBMS can be viewed as a 3-level system
comprising the following:
- The internal or the physical level where the data resides.
- The conceptual level which is the level of the DBMS functions
- The external level which is the level of the application programs or the end user.
Data Independence is isolating an upper level from the changes in the organization or
structure of a lower level. For example, if changes in the file organization of a data file do
not demand for changes in the functions in the DBMS or in the application programs,
data independence is achieved. Thus Data Independence can be defined as immunity of
applications to change in physical representation and access technique. The provision of
data independence is a major objective for database systems.
• Reduced Maintenance - Maintenance is reduced and easier, again due to the centralized
nature of the system.
Functions of a DBA:
A DBMS can be considered as a buffer between application programs, end users and a
database designed to fulfill features of data independence. In 1975 the American National
Standards Institute Standards Planning and Requirements Committee (ANSI-SPARC)
proposed a three-level architecture that identifies three levels of abstraction.
1. The External or User Level: This level describes the user’s or application program’s
view of the database. Several programs or users may share the same view.
2. The Conceptual Level: This level describes the organization’s view of all the data in
the database, the relationships between the data and the constraints applicable to the
database. This level describes a logical view of the database, i.e. a view lacking
implementation detail.
3. The Internal or Physical Level: This level describes the way in which data is stored
and the way in which data may be accessed. This level describes a physical view of the
database.
Each level is described in terms of schema – a map of the database. The three-
level architecture is used to implement data independence through two levels of mapping:
that between the external schema and the conceptual schema and that between the
conceptual schema and the physical schema.
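The effect of the two mappings can be sketched with SQLite through Python's sqlite3 module. In this rough illustration a table plays the role of the conceptual schema, a view plays the role of an external schema, and an index stands in for a change at the physical level. All table, view, and index names, and the sample rows, are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the logical structure of the data.
conn.execute(
    "CREATE TABLE account (number TEXT PRIMARY KEY, customer TEXT, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("A-217", "John", 350), ("A-102", "John", 400), ("A-222", "Albert", 700)],
)

# External level: one user group's view, defined by a mapping onto the table.
conn.execute("CREATE VIEW customer_accounts AS SELECT customer, number FROM account")

QUERY = "SELECT number FROM customer_accounts WHERE customer = ? ORDER BY number"
before = conn.execute(QUERY, ("John",)).fetchall()

# Physical level: change how the data can be accessed by adding an index.
conn.execute("CREATE INDEX idx_account_customer ON account (customer)")

# The application's query text and its result are unchanged.
after = conn.execute(QUERY, ("John",)).fetchall()
print(before, after)
```

The query was written once against the view and keeps working after the storage-level change, which is the immunity to physical change that data independence refers to.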
a. Simplicity – The network data model is also conceptually simple and easy to design.
b. Ability to handle more relationship types – The network model can handle the one-
to-many and many-to-many relationships.
c. Ease of data access – In the network database terminology, a relationship is a set.
Each set comprises two types of records – an owner record and a member record. In a
network model an application can access an owner record and all the member records
within a set.
d. Data Integrity – In a network model, no member can exist without an owner. A user
must therefore first define the owner record and then the member record. This ensures the
data integrity.
e. Data Independence – The network model draws a clear line of demarcation between
the programs and the complex physical storage details. The application programs work
independently of the data. Any changes made in the data characteristics do not affect the
application program.
f. Database standards – The standards devised by the DBTG (Database Task Group of
CODASYL Committee) form the basis of the network model. These standards were
further enhanced by ANSI/SPARC (American National Standards Institute/Standards
Planning and Requirements Committee) in the 1970s. All the network database
management systems adhere to these standards. The standards comprise a DDL and
a DML that augment database administration and portability.
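Points c and d above (a set with an owner record and member records, and the rule that no member can exist without an owner) can be sketched in plain Python. This is a toy model, not the DBTG DDL or DML: the class NetworkSet, its methods, the set name DEPOSITOR, and the sample records are all hypothetical.

```python
class NetworkSet:
    """A named set type that links each owner record to its member records."""

    def __init__(self, name):
        self.name = name
        self._members = {}  # owner key -> list of member records

    def add_owner(self, owner_key):
        """Define an owner record; it starts with no members."""
        self._members.setdefault(owner_key, [])

    def add_member(self, owner_key, member):
        """Attach a member record; the owner must already be defined."""
        if owner_key not in self._members:
            raise KeyError(f"owner {owner_key!r} must be defined before its members")
        self._members[owner_key].append(member)

    def members_of(self, owner_key):
        """Access all member records within the set for one owner."""
        return list(self._members[owner_key])


depositor = NetworkSet("DEPOSITOR")
depositor.add_owner("John")
depositor.add_member("John", ("A-217", 350))

try:
    depositor.add_member("Albert", ("A-102", 400))  # no such owner yet
    orphan_rejected = False
except KeyError:
    orphan_rejected = True

print(depositor.members_of("John"), orphan_rejected)
```

Access always starts from an owner and follows the set to its members, which is also the navigational style discussed among the disadvantages.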
a. System complexity – In a network model, data are accessed one record at a time. This
makes it essential for the database designers, administrators, and programmers to be
familiar with the internal data structures to gain access to the data. Therefore, a user-
friendly database management system cannot be created using the network model.
b. Lack of structural independence – Making structural modifications to the database is
very difficult in the network database model as the data access method is navigational.
Any changes made to the database structure require the application programs to be
modified before they can access data. Though the network database model achieves data
independence, it still fails to achieve structural independence.
3. Hierarchical Model:
In object-based models, the database is organized into real-world objects of several types.
A number of fields, or attributes, are defined in each object type, and each field is usually
of a variable length.
The two most popular object-based data models are
a. Object oriented model
b. E - R Model
a. Difficult to maintain – In the real world, the data model is not static. It changes as
organizational information needs change and as missing information is identified.
Consequently, the definition of objects must be changed periodically and existing
databases migrated to conform to the new object definitions. Object-oriented databases
are semantically rich, which introduces a number of challenges when changing object
definitions and migrating databases.
Object-oriented databases have a greater challenge handling schema migration because it
is not sufficient to simply migrate the data representation to conform to the changes in
class specifications. One must also update the behavioral code associated with each
object.
b. Not suited for all applications – Object-oriented database systems are not suited for
all applications. If one is used in a situation where it is not required, the result is
performance degradation and high processing requirements. OODBMSs are popular in areas
such as e-commerce, engineering product data management, and special purpose
databases in securities and medicine. The strength of the object model is in applications
where there is an underlying need to manage complex relationships among data
objects.