
ADMT Solutions:

Refer at your own risk….

Section 1:
Q1. Explain different control measures of database security mechanisms [CO3]
A DBMS typically includes a database security and authorization subsystem that is responsible for ensuring the security of portions of a database against unauthorized access.
 Access Control: The security mechanism of a DBMS must include provisions for restricting access to the database as a whole; this function is called access control and is handled by creating user accounts and passwords to control the login process by the DBMS.
 Inference control: Controlling access to a statistical database (which is used to provide statistical information or summaries of values based on various criteria) is a security issue, since users may be able to infer individual values from the statistics. The countermeasures to this statistical database security problem are called inference control measures (a small SQL illustration appears after this list).
 Flow control: Prevents information from flowing in such a way that it reaches unauthorized users. Flow control must also cover covert channels (pathways that violate the security policy of an organization).
 Data Encryption: The data is encoded using some coding algorithm. An unauthorized user who accesses the encoded data will have difficulty deciphering it, but authorized users are given decoding or decrypting algorithms (or keys) to decipher the data.
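As a small illustration of inference control, a statistical database can refuse aggregate queries over groups that are too small, because tiny groups let a user infer individual values. A minimal SQL sketch, assuming a hypothetical employee table:

    -- Average salary per department, but only for groups of at least 5 rows,
    -- so that no individual salary can be inferred from the aggregate.
    SELECT dept, AVG(salary)
    FROM   employee
    GROUP BY dept
    HAVING COUNT(*) >= 5;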

2. Explain the mandatory access control based database security mechanism. (05) [CO3]

Mandatory access control is a security mechanism that classifies data and users
based on security classes. It is typically combined with the discretionary access
control mechanisms.

 Typical security classes are Top secret (TS), secret (S), confidential (C), and
unclassified (U), where TS is the highest level and U the lowest: TS ≥ S ≥ C ≥
U.
 The commonly used model for multilevel security, known as the Bell-LaPadula model, classifies each subject (user, account, program) and object (relation, tuple, column, view, operation) into one of the security classifications TS, S, C, or U. We refer to the clearance (classification) of a subject S as class(S) and to the classification of an object O as class(O).
 Two restrictions are enforced on data access based on the subject/object
classifications:
1. A subject S is not allowed read access to an object O unless class(S) ≥
class(O). This is known as the simple security property.
2. A subject S is not allowed to write an object O unless class(S) ≤ class(O). This is known as the star property (or * property).
 Each attribute value in a tuple is associated with a corresponding security
classification.
 A multilevel relation schema R with n attributes would be represented as
R(A1,C1,A2,C2, …, An,Cn,TC)
Where each Ci represents the classification attribute associated with
attribute Ai. And TC is the tuple classification attribute.
 The value of the TC attribute in each tuple t is the highest of all the attribute classification values within t.
 A multilevel relation will appear to contain different data to subjects
(users) with different clearance levels.
 It is possible to store a single tuple in the relation at a higher classification
level and produce the corresponding tuples at a lower-level classification
through a process known as filtering.
 The apparent key of a multilevel relation is the set of attributes that would have formed the primary key in a regular (single-level) relation.
 In some cases, it is necessary to store two or more tuples at different
classification levels with the same value for the apparent key. This leads to
the concept of polyinstantiation where several tuples can have the same
apparent key value but have different attribute values for users at
different classification levels.
 The entity integrity rule for multilevel relations states that:
 All attributes that are members of the apparent key must not be null
and must have the same security classification within each individual
tuple.
 All other attribute values in the tuple must have a security
classification greater than or equal to that of the apparent key. This
constraint ensures that a user can see the key if the user is permitted
to see any part of the tuple at all.
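A multilevel relation can be illustrated in plain SQL by storing the classification attributes explicitly; a real MLS DBMS enforces this inside the engine, so the following is only a hedged sketch with hypothetical names (classes encoded 1=U, 2=C, 3=S, 4=TS):

    CREATE TABLE employee_mls (
        name      VARCHAR(40),
        c_name    INT,             -- classification of the Name value
        salary    DECIMAL(10,2),
        c_salary  INT,             -- classification of the Salary value
        tc        INT              -- tuple classification: highest Ci in the tuple
    );

    -- Filtering for a subject with clearance C (level 2): the apparent key
    -- must be visible for the tuple to appear at all (entity integrity rule),
    -- and higher-classified attribute values are replaced by NULL.
    SELECT name,
           CASE WHEN c_salary <= 2 THEN salary ELSE NULL END AS salary
    FROM   employee_mls
    WHERE  c_name <= 2;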
3. Explain discretionary based security mechanism (05) [CO3]
The typical method of enforcing discretionary access control in a database system is based on the granting and revoking of privileges.
Types of discretionary Privileges:
 The Account level: At this level, the DBA specifies the particular privileges
that each account holds independently of the relations in the database.
 The relation (or table level): At this level, the DBA can control the privilege
to access each individual relation or view in the database.
The privileges at the account level apply to the capabilities provided to the
account itself and can include:
 CREATE SCHEMA/TABLE
 CREATE VIEW
 ALTER
 DROP
 MODIFY
 SELECT etc.
 The second level of privileges applies to the relation level, whether they
are base relations or virtual (view) relations.
 Access matrix model: The granting and revoking of privileges generally follow the access matrix model of authorization for discretionary privileges.
 The rows of a matrix M represents subjects (users, accounts,
programs)
 The columns represent objects (relations, records, columns, views,
operations).
 Each position M(i,j) in the matrix represents the types of privileges
(read, write, update) that subject i holds on object j.
 The owner account holder can pass privileges on any of the owned relations to other users by granting privileges to their accounts.
Suppose that A1 wants to allow A3 to retrieve information from employee
table and also to be able to propagate the SELECT privilege to other
accounts. A1 can issue the command:
GRANT SELECT ON EMPLOYEE TO A3 WITH GRANT OPTION;

 In SQL the following types of privileges can be granted on each individual relation R:
 SELECT
 MODIFY
 REFERENCES
 VIEW: It is an important discretionary authorization mechanism in its
own right. To create a view, the account must have SELECT privilege
on all relations involved in the view definition.
 In some cases it is desirable to grant a privilege to a user temporarily.
In SQL, a REVOKE command is included for the purpose of canceling
privileges.
Suppose that A1 decides to revoke the SELECT privilege on the EMPLOYEE
relation from A3; A1 can issue:
REVOKE SELECT ON EMPLOYEE FROM A3;
 Techniques to limit the propagation of privileges have been developed, namely horizontal and vertical propagation, although they have not yet been implemented in most DBMSs and are not part of SQL.
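Views work together with GRANT as a discretionary mechanism for column- and row-level restriction. A hedged sketch extending the A1/A3 scenario above (the account A4, the view name, and the column names are illustrative):

    -- A1 exposes only department 5 names and birth dates,
    -- without granting any privilege on EMPLOYEE itself.
    CREATE VIEW a4_employee AS
        SELECT name, bdate
        FROM   employee
        WHERE  dno = 5;

    GRANT SELECT ON a4_employee TO A4;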

4. Write short notes on Spatial database management system. [CO4]

 Spatial databases provide structures for the storage and analysis of spatial data, which comprises objects in multi-dimensional space.
 Three types of uses
 Manage spatial data
 Analyze spatial data
 High level utilization
 SDBMS:
 Works with an underlying DBMS
 Allows spatial data models and types
 Supports a query language specific to spatial data types
 Provides handling of spatial data and operations.
 SDBMS has three layers:
 Interface to spatial application(front end)
 Core spatial functionality
 Interface to DBMS(back end)
 Spatial data types
 Point
 Linestring
 Polygon
 Spatial operators
 Topological Operators
 Inside
 Contains
 Touch
 Disjoint
 Covers
 Covered By
 Equal
 Overlap
 Distance Operators
 Within Distance
 Nearest Neighbor
 Spatial queries:
 Range query
 Nearest Neighbour
 Spatial join or overlays
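These query types map naturally onto spatial SQL operators. A PostGIS-style sketch, where the restaurants/cities tables, their columns, and the coordinates are all assumptions:

    -- Range query: all restaurants inside a rectangular window.
    SELECT name
    FROM   restaurants
    WHERE  ST_Within(geom, ST_MakeEnvelope(77.55, 12.90, 77.65, 13.00, 4326));

    -- Nearest neighbour: the 5 restaurants closest to a query point.
    SELECT name
    FROM   restaurants
    ORDER BY geom <-> ST_SetSRID(ST_MakePoint(77.60, 12.95), 4326)
    LIMIT 5;

    -- Spatial join / overlay: pair each restaurant with the city polygon containing it.
    SELECT r.name, c.city_name
    FROM   restaurants r
    JOIN   cities c ON ST_Contains(c.geom, r.geom);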
5. Write Short notes on Temporal database management system. [CO4]
 Temporal databases encompass all database applications that require
some aspect of time when organizing their information (e.g.,
reservation system in hotels, airline, etc.).
 For temporal databases, time is considered to be an ordered sequence of points (e.g., seconds, minutes, days, etc.).
 In SQL2, the temporal data types include DATE, TIME, TIMESTAMP,
INTERVAL, and PERIOD.
 Types of temporal information:
 Point events: associated in the database with a single time point
(e.g., 15/08/1998).
 Duration events: associated in the database with a specific time
period (e.g., [15/08/1998, 15/08/2000]).
 Database types:
 Valid time database: A temporal database in which the time associated with the events is the valid time in the real world.
 Transaction time database: A temporal database in which the time associated with the events is the value of the system time clock.
 Incorporating time in relational databases can be done by adding attributes VST (Valid_Start_Time) and VET (Valid_End_Time) to an ordinary relation.
 Each tuple, V, represents a version of the entity member that is valid in
the interval [V.VST, V.VET].
 The current version has a special value, now.
 Updating: In order to update a tuple, a new version is created and the
current version is closed (by changing its VET to the end time).
 Deletion: When deleting a tuple the current version is closed.
 Insertion: To insert a new entity member, create the first tuple version and make it the current version (i.e., VST being the effective time, and VET = now).
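A hedged sketch of these three operations on a valid time relation, with hypothetical names; the special value now is represented here by the common sentinel date '9999-12-31':

    CREATE TABLE emp_vt (
        ssn     CHAR(9),
        salary  DECIMAL(10,2),
        vst     DATE,        -- Valid Start Time
        vet     DATE         -- Valid End Time; '9999-12-31' stands for now
    );

    -- Insertion: the first version becomes the current version.
    INSERT INTO emp_vt VALUES ('123456789', 30000, DATE '2024-01-01', DATE '9999-12-31');

    -- Updating: close the current version, then create the new version.
    UPDATE emp_vt SET vet = DATE '2024-05-31'
    WHERE  ssn = '123456789' AND vet = DATE '9999-12-31';
    INSERT INTO emp_vt VALUES ('123456789', 35000, DATE '2024-06-01', DATE '9999-12-31');

    -- Deletion: simply close the current version.
    UPDATE emp_vt SET vet = DATE '2024-08-15'
    WHERE  ssn = '123456789' AND vet = DATE '9999-12-31';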

6. What are temporal and bitemporal relations? Give an example of each. (05) [CO4]
Temporal Relations:
A relation in which time related data is incorporated for storing historical data is
known as temporal relation.
This can be done using two methods:
1. Valid time relation: The associated time is a valid time in the real world.
We can convert the two relations EMPLOYEE and DEPARTMENT into valid
time relations by adding the attributes Vst (Valid Start Time) and Vet (Valid
End Time), whose data type is DATE in order to provide day granularity. In
EMP_VT, each tuple V represents a version of an employee’s information
that is valid (in the real world) only during the time period [V.Vst,V.Vet].
2. Transaction Time Relation: The associated time is the value of the system time clock.
 In a transaction time database, whenever a change is applied to the
database, the actual timestamp of the transaction that applied the
change (insert, delete, or update) is recorded.
 Such a database is most useful when changes are applied
simultaneously in the majority of cases—for example, real-time stock
trading or banking transactions.
 The two relations EMPLOYEE and DEPARTMENT are converted into
transaction time relations by adding the attributes Tst (Transaction
Start Time) and Tet (Transaction End Time), whose data type is typically
TIMESTAMP.

Bitemporal Relations: Some applications require both valid time and transaction time, leading to bitemporal relations.
 Figure 26.7(c) shows how the EMPLOYEE and DEPARTMENT nontemporal
relations in Figure 26.1 would appear as bitemporal relations EMP_BT and
DEPT_BT, respectively.
 In these tables, tuples whose transaction end time Tet is uc (until changed) are the ones representing currently valid information, whereas tuples whose Tet is an absolute timestamp were valid until (just before) that timestamp. Hence, the tuples with uc in Figure 26.9 correspond to the valid time tuples in Figure 26.7.

[Figure 26.9: the EMP_BT and DEPT_BT bitemporal relations]

 The transaction start time attribute Tst in each tuple is the timestamp of the transaction that created that tuple.
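Because a bitemporal tuple carries both time pairs, it can answer questions such as "what did the database record, as of transaction time T, about the salary valid on day D". A minimal sketch with hypothetical names; uc is encoded as a sentinel timestamp:

    CREATE TABLE emp_bt (
        ssn     CHAR(9),
        salary  DECIMAL(10,2),
        vst     DATE,       vet DATE,          -- valid time
        tst     TIMESTAMP,  tet TIMESTAMP      -- transaction time; sentinel encodes uc
    );

    -- As recorded on 2024-03-01, what salary was valid on 2024-02-15?
    SELECT salary
    FROM   emp_bt
    WHERE  ssn = '123456789'
      AND  tst <= TIMESTAMP '2024-03-01 00:00:00'
      AND  tet >  TIMESTAMP '2024-03-01 00:00:00'
      AND  vst <= DATE '2024-02-15'
      AND  vet >= DATE '2024-02-15';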
SECTION 2:

Q. Explain OLAP Operations. [CO5]


Roll-up

Roll-up performs aggregation on a data cube in any of the following ways −

 By climbing up a concept hierarchy for a dimension


 By dimension reduction


 Roll-up is performed by climbing up a concept hierarchy for the dimension location.


 Initially the concept hierarchy was "street < city < province < country".
 On rolling up, the data is aggregated by ascending the location hierarchy from the level of city to the level of country.
 The data is then grouped by country rather than by city.
 When roll-up is performed by dimension reduction, one or more dimensions are removed from the data cube.
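In relational terms, rolling up along the location hierarchy is simply aggregating at a coarser GROUP BY level. A hedged sketch over a hypothetical sales star schema:

    -- Before roll-up: sales aggregated by city.
    SELECT l.city, SUM(s.amount) AS sales
    FROM   sales s
    JOIN   location l ON s.location_id = l.location_id
    GROUP BY l.city;

    -- After rolling up from city to country.
    SELECT l.country, SUM(s.amount) AS sales
    FROM   sales s
    JOIN   location l ON s.location_id = l.location_id
    GROUP BY l.country;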

Drill-down

Drill-down is the reverse operation of roll-up. It is performed in either of the following ways −

 By stepping down a concept hierarchy for a dimension


 By introducing a new dimension.

 Drill-down is performed by stepping down a concept hierarchy for the dimension time.
 Initially the concept hierarchy was "day < month < quarter < year."
 On drilling down, the time dimension is descended from the level of quarter to the level of month.
 When drill-down is performed, one or more dimensions from the data cube are added.
 It navigates from less detailed data to more detailed data.

Slice

The slice operation selects one particular dimension from a given cube and provides a new sub-cube.

 Here slice is performed for the dimension "time" using the criterion time = "Q1".
 It forms a new sub-cube by fixing a single value along that dimension.
Dice

Dice selects two or more dimensions from a given cube and provides a new sub-cube.

The dice operation on the cube uses the following selection criteria, which involve three dimensions (a relational rendering follows the list):
 (location = "Toronto" or "Vancouver")
 (time = "Q1" or "Q2")
 (item =" Mobile" or "Modem")
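Relationally, the dice is a conjunction of selections on the three dimensions applied before aggregation. A hedged sketch with hypothetical dimension tables:

    SELECT l.city, t.quarter, i.item_name, SUM(s.amount) AS sales
    FROM   sales s
    JOIN   location l ON s.location_id = l.location_id
    JOIN   time_dim t ON s.time_id     = t.time_id
    JOIN   item     i ON s.item_id     = i.item_id
    WHERE  l.city IN ('Toronto', 'Vancouver')
      AND  t.quarter IN ('Q1', 'Q2')
      AND  i.item_name IN ('Mobile', 'Modem')
    GROUP BY l.city, t.quarter, i.item_name;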
Pivot

The pivot operation is also known as rotation. It rotates the data axes in view in order to provide an alternative presentation of the data.

Q2. Explain MOLAP, ROLAP and HOLAP . [CO5]

ROLAP: Relational OLAP

• To store and manage warehouse data, ROLAP uses relational or extended-relational DBMSs.
• ROLAP includes the following:
 Implementation of aggregation navigation logic.
 Optimization for each DBMS back end.
• ROLAP servers are placed between the relational back-end server and the client front-end tools.
• Advantages:
 Highly Scalable
 Can store and analyse large volumes of highly changeable and volatile data.
• Disadvantages:
 Poor query performance.
 Requires experienced professionals.
MOLAP: Multi-Dimensional OLAP

• MOLAP uses array-based multidimensional storage engines for multidimensional views of data.
• With multidimensional data stores, the storage utilization may be low if the data set is sparse.
• Therefore, many MOLAP servers use a two-level storage representation to handle dense and sparse data sets.
• Advantages:
 Fast and easy to use.
 Can perform complex computations.
• Disadvantages:
 Not capable of storing detailed data.
 DBMS facility is weak.

HOLAP: Hybrid OLAP

• Hybrid OLAP is a combination of both ROLAP and MOLAP.
• It offers the higher scalability of ROLAP and the faster computation of MOLAP.
• HOLAP servers allow storing large volumes of detailed data.
• The aggregations are stored separately in a MOLAP store.

Q3. Explain types of data extraction methods in ETL process. [CO6]

Operational data in the source system falls into two broad categories.
• Current value: The values are transient or transitory. As business
transactions happen, the values change. E.g. Customer name and address.
• Periodic Status: In this category, the value of the attribute is preserved as the
status every time a change occurs. E.g. data about an insurance policy.
• Two major types of data extractions from the source operational
systems: “as is”(static) data and data of revisions.

1. Static Data Capture:
 “As is” or static data is the capture of data at a given point in time.
 For current or transient data, this capture would include all transient data identified for extraction.
 In addition, for data categorized as periodic, this data capture would include each status or event at each point in time as available in the source operational systems.
2. Data of Revisions: (Also known as Incremental data capture.)
 If the source data is transient, the capture of the revisions is not
easy.
 For periodic status data or periodic event data, the incremental
data capture includes the values of attributes at specific times.
 Extract the statuses and events that have been recorded since
the last date of extract.
Other techniques:
3. Immediate Data Extraction (Real-time extraction):
It occurs as the transactions happen at the source databases and
files.
i. Capture through Transaction Logs:
This data extraction technique reads the transaction log and selects all the committed transactions.
If some of the source system data resides in indexed files or other flat files (outside the DBMS), there is no transaction log and this option will not work for those sources.
ii. Capture through Database Triggers:
You can create trigger programs for all events for which you
need data to be captured.
The output of the trigger programs is written to a separate file
that will be used to extract data for the data warehouse.
We can capture both before and after images.
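A hedged PostgreSQL-style sketch of trigger-based capture; all table, column, and function names are hypothetical:

    -- Change table that the warehouse extract will later read.
    CREATE TABLE customer_changes (
        customer_id  INT,
        old_address  VARCHAR(200),   -- before image
        new_address  VARCHAR(200),   -- after image
        changed_at   TIMESTAMP
    );

    CREATE FUNCTION capture_customer_change() RETURNS trigger AS $$
    BEGIN
        -- record both the before and the after image of the changed column
        INSERT INTO customer_changes
        VALUES (OLD.customer_id, OLD.address, NEW.address, current_timestamp);
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER trg_customer_change
    AFTER UPDATE ON customer
    FOR EACH ROW EXECUTE FUNCTION capture_customer_change();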
iii. Capture in Source Applications:
The source application is made to assist in the data capture for the data warehouse.
This technique may be used for all types of source data: databases, indexed files, or other flat files.
4. Deferred Data Extraction:
Deferred data extraction does not capture the changes in real time.
The capture happens later.
i. Capture Based on Date and Time Stamp:
The time stamp provides the basis for selecting records for data
extraction. This technique captures the latest state of the
source data.
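A minimal sketch of timestamp-based selection, assuming each source row carries a last-update column and the ETL job remembers the time of its previous extract:

    -- :last_extract_time is a host variable supplied by the ETL job.
    SELECT *
    FROM   product
    WHERE  last_update_ts > :last_extract_time;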
ii. Capture by Comparing files:
This technique is also called the snapshot differential technique
because it compares two snapshots of the source data.
Do a full file comparison between today’s copy of the product
data and yesterday’s copy.
This technique necessitates the keeping of prior copies of all
the relevant source data.
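Where both snapshots can be staged side by side, the snapshot differential reduces to set operations (EXCEPT is standard SQL; the staging table names are assumptions):

    -- New or changed rows since yesterday's snapshot.
    SELECT * FROM product_today
    EXCEPT
    SELECT * FROM product_yesterday;

    -- Deleted rows: the key was present yesterday but is absent today.
    SELECT y.*
    FROM   product_yesterday y
    WHERE  NOT EXISTS (SELECT 1 FROM product_today t
                       WHERE t.product_id = y.product_id);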

Q4. Write a short note on data warehouse architecture. [CO5]


The three major areas in the data warehouse are:
• Data acquisition
 Data acquisition covers the entire process of extracting
data from the data sources, moving all the extracted data
to the staging area, and preparing the data for loading
into the data warehouse repository.
 The two major architectural components are source data
and data staging.

 Functions and Services:
i. Data Extraction
ii. Data Transformation
iii. Data Staging

• Data storage
 Data storage covers the process of loading the data from
the staging area into the data warehouse repository.
 All functions for transforming and integrating the data are
completed in the data staging area.

• Information delivery
 The information delivery component makes it easy for the
users to access the information either directly from the
enterprise-wide data warehouse, from the dependent data
marts, or from the set of conformed data marts.
 Most of the information access in a data warehouse is
through online queries and interactive analysis sessions.
 Almost all modern data warehouses provide for online analytical processing (OLAP). The users perform complex multidimensional analysis using the information cubes in the MDDBs (multidimensional databases).
Q5.Write a short note on challenges in ETL functions. [CO6]

ETL functions are challenging primarily because of the nature of the source
systems. Most of the challenges in ETL arise from the disparities among the
source operational systems.
Source systems are very diverse and disparate.
• There is usually a need to deal with source systems on multiple platforms
and different operating systems.
• Many source systems are older legacy applications running on obsolete
database technologies.
• Generally, historical data on changes in values are not preserved in source
operational systems. Historical information is critical in a data warehouse.
• Quality of data is dubious in many old source systems that have evolved
over time.
• Source system structures keep changing over time because of new
business conditions. ETL functions must also be modified accordingly.
• A gross lack of consistency among source systems is commonly prevalent. The same data is likely to be represented differently in the various source systems. For example, data on salary may be represented as monthly salary, weekly salary, and bimonthly salary in different source payroll systems.
• Even when inconsistent data is detected among disparate source systems,
lack of a means for resolving mismatches escalates the problem of
inconsistency.
• Most source systems do not represent data in types or formats that are
meaningful to the users. Many representations are cryptic and ambiguous.
• Designing ETL functions is time consuming and arduous.

Q6. List and explain basic tasks involved in data transformation. [CO6]
Selection:
 This takes place at the beginning of the whole process of data
transformation.
 Select either whole records or parts of several records from the
source systems.
 The task of selection usually forms part of the extraction function
itself.
Splitting/joining:
 This task includes the types of data manipulation you need to perform on the selected parts of source records.
 Sometimes, you will be splitting the selected parts even further during data transformation.
 Sometimes you will be joining the parts selected from many different source systems.
Conversion:
Conversion is done for two primary reasons:
1. To standardize among the data extractions from disparate source systems.
2. To make the fields usable and understandable to the users.
Summarization:
 It may be that none of the users ever needs data at the lowest granularity for analysis or querying.
 For example, for a grocery chain, sales data at the lowest level of detail for every transaction at the checkout may not be needed.
 In this case, the data transformation function includes summarization of daily sales by product and by store, which keeps the data at a level the users can readily understand.
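A sketch of that grocery-chain summarization, assuming a transaction-grain staging table and a daily summary target (names hypothetical):

    -- Collapse checkout-level transactions to daily sales by product and store.
    INSERT INTO daily_sales (sale_date, product_id, store_id, total_amount, total_units)
    SELECT sale_date, product_id, store_id, SUM(amount), SUM(units)
    FROM   checkout_transactions
    GROUP BY sale_date, product_id, store_id;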

Q.7 Explain Factless Fact table with example. [CO5]


• Apart from the concatenated primary key, a fact table contains facts or
measures.
• Let us say we are building a fact table to track the attendance of students. For analysing student attendance, the possible dimensions are: student, course, date, room, and professor.
• The attendance may be affected by any of these dimensions. When you mark the attendance relating to a particular student, course, date, room, and professor in a fact table row, the attendance would be indicated with the number one.
• Since every fact table row would then contain the number one as its only fact, storing it adds no information. The very presence of a corresponding fact table row is enough to indicate the attendance.
• This type of situation arises when the fact table represents events. Such fact
tables really do not need to contain facts. They are “factless” fact tables.
A sketch of a typical factless fact table follows.
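A hedged SQL sketch of the attendance factless fact table; all table and column names are hypothetical. The table holds only the concatenated key of dimension references, and questions are answered by counting rows:

    CREATE TABLE attendance_fact (
        student_key    INT REFERENCES student_dim,
        course_key     INT REFERENCES course_dim,
        date_key       INT REFERENCES date_dim,
        room_key       INT REFERENCES room_dim,
        professor_key  INT REFERENCES professor_dim,
        PRIMARY KEY (student_key, course_key, date_key, room_key, professor_key)
    );

    -- Attendance per course for a given month: just count the rows.
    SELECT c.course_name, COUNT(*) AS attendance
    FROM   attendance_fact f
    JOIN   course_dim c ON f.course_key = c.course_key
    JOIN   date_dim   d ON f.date_key   = d.date_key
    WHERE  d.month_name = 'January' AND d.year_number = 2024
    GROUP BY c.course_name;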
