Anda di halaman 1dari 11

in/

MASTER OF COMPUTER
APPLICATIONS
(MCA)
MCA/ASSIGN/SEMESTER-IV
ASSIGNMENTS
(July - 2016 & January - 2017)

## SCHOOL OF COMPUTER AND INFORMATION SCIENCES

INDIRA GANDHI NATIONAL OPEN UNIVERSITY
MAIDAN GARHI, NEW DELHI 110 068
Course
Code
MCS-043

Course Title
Management Systems

Assignment No.
MCA(4)/043/Assignment/16-17

Submission-Schedule
July- December
Session
15th October,
2016

January-June
Session
15th April,
2017

Q3. Explain MVD (Multi Valued Dependencies) and join dependency with the help of an
Classification of Dependencies in DBMS:
Classification of
Dependencies
Partial Dependencies
Transitive
Dependencies
Multivalued
Dependencies
Join Dependencies
Inclusion Dependency

## Which Normal Form Remove these

Dependencies
Second Normal Form (2NF)
Third Normal Form (3NF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)
(Dependencies among the Relations/Tables
or Databases)

Multivalued Dependency: Consider a relation Faculty (FID, Course, Book) which consists of two
multivalued attributes (Course and Book). The two multivalued attributes are independent of each
other.
FID
1
2

Course
C1/C2
C1

Book
B1/B2
B1

FID
1
1
1
1
2

Course
C1
C1
C2
C2
C1

Book
B1
B2
B1
B2
B1

It is clear that there are multiple copies of the information about Course and Book. This is an
example of a multivalued dependency which occurs when a relation has more than one
independent, multivalued attribute.
A multivalued dependency occurs when a relation R has attributes A(FID), B(Course), and
C(Book) such that
A determines a set of values for B
A determines a set of values for C and
B and C are independent of each other. (No relation between Course and Book)
These multivalued dependencies can be indicated as follows :
(FID Course)
(FID Book)
Join Dependency: Let R be a relation. Let A, B, , Z be arbitrary subsets of Rs attributes. R
satisfies the JD
* ( A, B, , Z )
If and only if R is equal to the join of its projections on A, B, , Z.
A join dependency JD(R1, R2, , Rn) specified on relation schema R, is a trivial JD, if one of the
relation schemas Ri in JD(R1, R2, .,Rn) is equal to R.
Q4. Define a simple hash-join and explain the process and cost calculations of hash-join
with the help of an example.

Hash-Join: This is applicable to both the equi-joins and natural joins. A hash function h is used to
partition tuples of both relations, where h maps joining attribute (enroll no in our xample) values to
{0, 1, ..., n-1}.
The join attribute is hashed to the join-hash partitions. In the example of Figure 4 we have used
mod 100 function to hashing, and n = 100.

Once the partition tables of STUDENT and MARKS are made on the enrolment number,
then only the corresponding partitions will participate in the join as:
A STUDENT tuple and a MARKS tuple that satisfy the join condition will have the same value for
the join attributes. Therefore, they will be hashed to equivalent
partition and thus can be joined easily.
Cost calculation for Simple Hash-Join
(i) Cost of partitioning r and s: all the blocks of r and s are read once and after partitioning written
back, so cost 1 = 2 (blocks of r + blocks of s).
(ii) Cost of performing the hash-join using build and probe will require at least one block transfer
Cost 2 = (blocks of r + blocks of s)
(iii) There are a few more blocks in the main memory that may be used for evaluation, they may
be read or written back. We ignore this cost as it will be too less in comparison to cost 1 and cost
2.
Thus, the total cost = cost 1 + cost 2 = 3 (blocks of r + blocks of s)
Q5. The following are the relational schemes of Employee, Project and Assigned-to
Employee (Emp#, Emp_name, Profession), Project (Proj#, Proj_name, Chief_Architect),
Assigned-to (Proj#, Emp#). Create appropriate samples of each relation according to the
question. Write the following queries in SQL.
(i) Get Emp# of employees working on Project numbered MCS-043.
(ii) Get details of employees working on database projects.

(iii) Finally create an optimal query tree for each query.
Lets put some dummy values into the tables of database as below:
Table Name-

Employee

Table Name-

Project
Proj_name

Table
Name-

Assignedto

Chief_Architect

Proj#

Emp#

Emp#

Emp_name

Profession

Proj#

Emp1

Vinay

DBA

MCS-043

Database

Vinay

MCS-043

Emp1

Akhand

MCS-043

Emp2

Emp2

Harish

DBA

MCS-041

Operating
System

Emp3

Ramesh

Web Developer

MCS-042

Networking

Kishor

MCS-042

Emp3

Emp4

SatyaRanjan

Web Designer

MCS-012

C Language

Abhinav

MCS-041

Emp4

Gagan

Software
Engineer

MCS-022

Java

Amit

MCS-012

Emp5

Emp5

1. To get Emp# of employees working on Project numbered MCS-043 below query will be
used:
Select Emp# from Asssigned-to where Proj#=MCS-043;
And the output will be Emp1, Emp2
2. Select A.Emp#, Emp_name from A.Assigned-to, Employee where project#=MCS-043;

Q6. Given the following semi-structure data in XML, create the DTD (Document Type
Declaration) for it
<document>
<student>
<NAME>
</student>
<student>
<NAME>
</student>
</document>
What are the different options available for storing XML data?
Answer: In semi-structured data, the schema or format information is mixed with the data values,
since each data object can have different attributes that are not known earlier. Thus, this type of
data is sometimes referred to as selfdescribing data.
The basic object in XML is the XML document. There are two main structuring concepts that
construct an XML document:
Elements and attributes: Attributes in XML describe elements. Elements are identified in a
document by their start tag and end tag. The tag names are enclosed between angular brackets
<>, and end tags are further identified by a backslash </>. Complex elements are constructed
from other elements hierarchically, whereas simple elements contain data values. Thus, there is a
correspondence between the XML textual representation and the tree structure. In the tree
representation of XML, internal nodes represent complex elements, whereas leaf nodes represent
simple elements. That is why the XML model is called a tree model or a hierarchical model.

Document Type Declaration (DTD): DTD is one of the component of XML document. A DTD is
used to define the syntax and grammar of a document, that is, it defines the meaning of the
document elements. XML defines a set of key words, rules, data types, etc to define the
permissible structure of XML documents. In other words, we can say that you use the DTD
grammar to define the grammar of your XML documents. Below is form of DTD for the given semistructured XML data:
<document> Student_Data
<student>
<NAME> Vishal </NAME>
</student>
<student>
<NAME> Vinod </NAME>
</student>
</document>
Q7. What is data mining? How is it different from OLTP? What is classification in context of
data mining?
Data is growing at a phenomenal rate today and the users expect more sophisticated information
from this data. There is need for new techniques and tools that can automatically generate useful
information and knowledge from large volumes of data. Data mining is one such technique of
generating hidden information from the data. Data mining can be defined as: an automatic
process of extraction of non-trivial or implicit or previously unknown but potentially useful
information or patterns from
data in large databases, data warehouses or in flat files. Data mining is related to data warehouse
in this respect that, a data warehouse is well equipped for providing data as input for the data
mining process. The advantages of using the data of data warehouse for data mining are or many
some of them are listed below:
Data quality and consistency are essential for data mining, to ensure, the accuracy of the
predictive models. In data warehouses, before loading the data, it is first extracted, cleaned and
transformed. We will get good results only if we have good quality data.
Data warehouse consists of data from multiple sources. The data in data warehouses is
integrated and subject oriented data. The data mining process performed on this data.
In data mining, it may be the case that, the required data may be aggregated or summarised
data. This is already there in the data warehouse.
Data warehouse provides the capability of analysing data by using OLAP operations. Thus, the
results of a data mining study can be analysed for hirtherto, uncovered patterns
.
As defined earlier, data mining generates potentially useful information or patterns from data. In
fact, the information generated through data mining can be used to create knowledge. So let us,
first, define the three terms data, information and knowledge.
OLTP vs. OLA: We can divide IT systems into transactional (OLTP) and analytical (OLAP). In
general we can assume that OLTP systems provide source data to data warehouses, whereas
OLAP systems help to analyze it.

## OLTP (On-line Transaction Processing) is characterized by a large number of short on-line

transactions (INSERT, UPDATE, DELETE). The main emphasis for OLTP systems is put on very
fast query processing, maintaining data integrity in multi-access environments and an
effectiveness measured by number of transactions per second. In OLTP database there is detailed
and current data, and schema used to store transactional databases is the entity model (usually
3NF).
OLAP (On-line Analytical Processing) is characterized by relatively low volume of transactions.
Queries are often very complex and involve aggregations. For OLAP systems a response time is
an effectiveness measure. OLAP applications are widely used by Data Mining techniques. In
OLAP database there is aggregated, historical data, stored in multi-dimensional schemas (usually
star schema).
The following table summarizes the major differences between OLTP and OLAP system design.

Specification

OLTP System

OLAP System

## Online Analytical Processing

(Operational System)

(Data Warehouse)

Source of data

Databases

Purpose of data

Inserts and

Queries

## Relatively standardized and simple queries Returning

relatively few records

Processing
Speed

## Depends on the amount of data involved; batch data refreshes

and complex queries may take many hours; query speed can
be improved by creating indexes

Space
Requirements

## Larger due to the existence of aggregation structures and

history data; requires more indexes than OLTP

Database
Design

## Typically de-normalized with fewer tables; use of star and/or

snowflake schemas

Backup and
Recovery

## Backup religiously; operational data is critical to run the

business, data loss is likely to entail significant monetary
loss and legal liability

## Instead of regular backups, some environments may consider

Q8. How will you enforce referential integrity constraints in Oracle? Explain with the help of
one example.

Answer: Data Integrity explains how to use integrity constraints to enforce the business rules
associated with your database and prevent the entry of invalid information into tables.
Introduction to Data Integrity: It is important that data adhere to a predefined set of rules, as
determined by the database administrator or application developer. As an example of data
integrity, consider the tables employees and departments and the business rules for the
information in each of the tables, as illustrated in Figure

Types of Data Integrity: This section describes the rules that can be applied to table columns to
enforce different types of data integrity.
Null Rule: A null rule is a rule defined on a single column that allows or disallows inserts or
updates of rows containing a null (the absence of a value) in that column.
Unique Column Values: A unique value rule defined on a column (or set of columns) allows the
insert or update of a row only if it contains a unique value in that column (or set of columns).
Primary Key Values: A primary key value rule defined on a key (a column or set of columns)
specifies that each row in the table can be uniquely identified by the values in the key.
Referential Integrity Rules: A referential integrity rule is a rule defined on a key (a column or set
of columns) in one table that guarantees that the values in that key match the values in a key in a
related table (the referenced value).
Referential integrity also includes the rules that dictate what types of data manipulation are
allowed on referenced values and how these actions affect dependent values. The rules
associated with referential integrity are:
Restrict: Disallows the update or deletion of referenced data.

Set to Null: When referenced data is updated or deleted, all associated dependent data is set to
NULL.
Set to Default: When referenced data is updated or deleted, all associated dependent data is set
to a default value.
Cascade: When referenced data is updated, all associated dependent data is correspondingly
updated. When a referenced row is deleted, all associated dependent rows are deleted.
No Action: Disallows the update or deletion of referenced data. This differs from RESTRICT in
that it is checked at the end of the statement, or at the end of the transaction if the constraint is
deferred. (Oracle uses No Action as its default action.)
How Oracle Enforces Data Integrity
Oracle enables you to define and enforce each type of data integrity rule defined in the previous
section. Most of these rules are easily defined using integrity constraints or database triggers.
Integrity Constraints Description: An integrity constraint is a declarative method of defining a
rule for a column of a table. Oracle supports the following integrity constraints:
NOT NULL constraints for the rules associated with nulls in a column
UNIQUE key constraints for the rule associated with unique column values
PRIMARY KEY constraints for the rule associated with primary identification values
FOREIGN KEY constraints for the rules associated with referential integrity. Oracle supports the
use of FOREIGN KEY integrity constraints to define the referential integrity actions, including:
Update and delete No Action
Delete SET NULL
CHECK constraints for complex integrity rules
Note:You cannot enforce referential integrity using declarative integrity constraints if child and
parent tables are on different nodes of a distributed database. However, you can enforce
referential integrity in a distributed database using database triggers.
Database Triggers: Oracle also lets you enforce integrity rules with a non-declarative approach
using database triggers (stored database procedures automatically invoked on insert, update, or
delete operations).
Overview of Integrity Constraints: Oracle uses integrity constraints to prevent invalid data entry
into the base tables of the database. You can define integrity constraints to enforce the business
rules you want to associate with the information in a database. If any of the results of a DML
statement execution violate an integrity constraint, then Oracle rolls back the statement and
returns an error.
For example, assume that you define an integrity constraint for the salary column of the
employees table. This integrity constraint enforces the rule that no row in this table can contain a
numeric value greater than 10,000 in this column. If an INSERT or UPDATE statement attempts to
violate this integrity constraint, then Oracle rolls back the statement and returns an information
error message.
Referential Integrity Constraints: Different tables in a relational database can be related by
common columns, and the rules that govern the relationship of the columns must be maintained.
Referential integrity rules guarantee that these relationships are preserved.
The following terms are associated with referential integrity constraints.

Term
Foreign key
Referenced key

Definition
The column or set of columns included in the definition of the
referential integrity constraint that reference a referenced key.
The unique key or primary key of the same or different table that is
referenced by a foreign key.

Dependent or child
table

The table that includes the foreign key. Therefore, it is the table that
is dependent on the values present in the referenced unique or
primary key.

Referenced or parent
table

The table that is referenced by the child table's foreign key. It is this
table's referenced key that determines whether specific inserts or
updates are allowed in the child table.

A referential integrity constraint requires that for each row of a table, the value in the foreign key
matches a value in a parent key.
Foreign keys can be defined as multiple columns. However, a composite foreign key must
reference a composite primary or unique key with the same number of columns and the same
datatypes. Because composite primary and unique keys are limited to 32 columns, a composite
foreign key is also limited to 32 columns.

Q9. How does PostgeSQL perform storage and indexing of tables? Briefly discuss the type
of indexes involved in PostgeSQL.

PostgreSQL INDEXES: Indexes are special lookup tables that the database search engine can
use to speed up data retrieval. Simply put, an index is a pointer to data in a table. An index in a
database is very similar to an index in the back of a book.
For example, if you want to reference all pages in a book that discuss a certain topic, you
first refer to the index, which lists all topics alphabetically and are then referred to one or more
specific page numbers.
An index helps speed up SELECT queries and WHERE clauses, but it slows down data
input, with UPDATE and INSERT statements. Indexes can be created or dropped with no effect on
the data.
Creating an index involves the CREATE INDEX statement, which allows you to name the
index, to specify the table and which column or columns to index, and to indicate whether the
index is in ascending or descending order.
Indexes can also be unique, similar to the UNIQUE constraint, in that the index prevents duplicate
entries in the column or combination of columns on which there's an index.
The CREATE INDEX Command:
The basic syntax of CREATE INDEX is as follows:
CREATE INDEX index_name ON table_name;
Index Types: PostgreSQL provides several index types: B-tree, Hash, GiST, SP-GiST and GIN.
Each index type uses a different algorithm that is best suited to different types of queries. By
default, the CREATE INDEX command creates B-tree indexes, which fit the most common
situations.
Single-Column Indexes: single-column index is one that is created based on only one table
column. The basic syntax is as follows:
CREATE INDEX index_name
ON table_name (column_name);
Multicolumn Indexes: A multicolumn index is defined on more than one column of a table. The
basic syntax is as follows:
CREATE INDEX index_name
ON table_name (column1_name, column2_name);
Whether to create a single-column index or a multicolumn index, take into consideration the
column(s) that you may use very frequently in a query's WHERE clause as filter conditions.
Should there be only one column used, a single-column index should be the choice. Should
there be two or more columns that are frequently used in the WHERE clause as filters, the
multicolumn index would be the best choice.
Unique Indexes: Unique indexes are used not only for performance, but also for data integrity. A
unique index does not allow any duplicate values to be inserted into the table. The basic syntax is
as follows:
CREATE UNIQUE INDEX index_name
on table_name (column_name);
Partial Indexes: A partial index is an index built over a subset of a table; the subset is defined by
a conditional expression (called the predicate of the partial index). The index contains entries only
for those table rows that satisfy the predicate. The basic syntax is as follows:
CREATE INDEX index_name
on table_name (conditional_expression);

Implicit Indexes: Implicit indexes are indexes that are automatically created by the database
server when an object is created. Indexes are automatically created for primary key constraints
and unique constraints.
Example: Following is an example where we will create an index on COMPANY table for salary
column:
# CREATE INDEX salary_index ON COMPANY (salary);
Now, let's list down all the indices available on COMPANY table using \d company command as
follows:
# \d company
This will produce the following result, where company_pkey is an implicit index which got created
when the table was created.
Table "public.company"
Column

Type

| Modifiers

---------+---------------+----------id

| integer

| not null

name

| text

| not null

age

| integer

| not null

salary

| real

Indexes:
"company_pkey" PRIMARY KEY, btree
(id)
"salary_index" btree (salary)

You can list down the entire indexes database wide using the \di command:
The DROP INDEX Command:
An index can be dropped using PostgreSQL DROP command. Care should be taken when
dropping an index because performance may be slowed or improved.
The basic syntax is as follows:
DROP INDEX index_name;
You can use following statement to delete previously created index:
# DROP INDEX salary_index;