Attribution Non-Commercial (BY-NC)

Attributes

GUIDELINE 1: Informally, each tuple in a relation should represent one entity or relationship instance. (Applies to individual relations and their attributes).

- Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be mixed in the same relation.
- Only foreign keys should be used to refer to other entities.
- Entity and relationship attributes should be kept apart as much as possible.

Bottom Line: Design a schema that can be explained easily relation by relation. The semantics of attributes should be easy to interpret.

Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition

Copyright 2004 Ramez Elmasri and Shamkant Navathe

Tuples and Update Anomalies

Mixing attributes of multiple entities may cause problems:
- Information is stored redundantly, wasting storage.
- Update anomalies arise.

Consider the relation:

EMP_PROJ ( Emp#, Proj#, Ename, Pname, No_hours)

Update Anomaly: Changing the name of project number P1 from Billing to CustomerAccounting may cause this update to be made for all 100 employees working on project P1.

Insert Anomaly: Cannot insert a project unless an employee is assigned to it. Conversely, cannot insert an employee unless he/she is assigned to a project.

Delete Anomaly: When a project is deleted, it will result in deleting all the employees who work on that project. Alternatively, if an employee is the sole employee on a project, deleting that employee would result in deleting the corresponding project.
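The update and delete anomalies can be illustrated with a minimal Python sketch that keeps EMP_PROJ as a single list of rows. All row values (employee names, hours, the second project) are hypothetical sample data, and `rename_project` is a helper introduced only for this illustration.

```python
# EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours) held in a single relation.
# All row values below are hypothetical sample data.
emp_proj = [
    {"emp": 1, "proj": "P1", "ename": "A", "pname": "Billing", "hours": 10},
    {"emp": 2, "proj": "P1", "ename": "B", "pname": "Billing", "hours": 20},
    {"emp": 3, "proj": "P2", "ename": "C", "pname": "Payroll", "hours": 15},
]


def rename_project(rows, proj, new_name):
    """Rename a project in place; return how many rows had to be modified."""
    touched = 0
    for row in rows:
        if row["proj"] == proj:
            row["pname"] = new_name
            touched += 1
    return touched


# Update anomaly: one logical change touches every row of project P1.
rows_changed = rename_project(emp_proj, "P1", "CustomerAccounting")

# Delete anomaly: removing the only employee on P2 also loses project P2.
remaining = [r for r in emp_proj if r["emp"] != 3]
projects_left = {r["proj"] for r in remaining}

print(rows_changed, projects_left)
```

With 100 employees on P1, the same rename would touch 100 rows, and missing even one of them would leave the project with two different names.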

GUIDELINE 2: Design a schema that does not suffer from the insertion, deletion and update anomalies. If any are present, note them so that applications can take them into account.

GUIDELINE 3: Relations should be designed such that their tuples will have as few NULL values as possible. Attributes that are NULL frequently could be placed in separate relations (with the primary key).

Reasons for nulls:
- attribute not applicable or invalid
- attribute value unknown (may exist)
- value known to exist, but unavailable

Bad designs for a relational database may result in erroneous results for certain JOIN operations. The "lossless join" property is used to guarantee meaningful results for join operations.

GUIDELINE 4: The relations should be designed to satisfy the lossless join condition. No spurious tuples should be generated by doing a natural-join of any relations.

There are two important properties of decompositions:
(a) non-additive or losslessness of the corresponding join
(b) preservation of the functional dependencies.

Note that property (a) is extremely important and cannot be sacrificed. Property (b) is less stringent and may be sacrificed. (See Chapter 11).

Spurious Tuples

If we decompose a relation R into smaller relations and then apply a natural join to them, the result of the join may contain many more tuples than the original relation R. The extra tuples are called spurious tuples. Spurious tuples are normally generated when the join is made on

An example follows.

| ENO | PNUMBER | HOURS | PNAME | PLOCATION |
| --- | --- | --- | --- | --- |
| 1 | PROJ1 | 10 | XYZ | DELHI |
| 3 | PROJ3 | 10 | PQR | NOIDA |
| 2 | PROJ2 | 10 | DEF | DELHI |

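| ENAME | PLOCATION |
| --- | --- |
| RAVI | DELHI |
| KIRAN | NOIDA |
| SAHIL | DELHI |

EMP_PROJ1

| ENO | PNUMBER | HOURS | PNAME | PLOCATION |
| --- | --- | --- | --- | --- |
| 1 | PROJ1 | 10 | XYZ | DELHI |
| 2 | PROJ2 | 10 | DEF | DELHI |
| 3 | PROJ3 | 10 | PQR | NOIDA |

| ENO | PNUMBER | HOURS | PNAME | PLOCATION |
| --- | --- | --- | --- | --- |
| 1 | PROJ1 | 10 | XYZ | DELHI |
| 2 | PROJ2 | 10 | DEF | DELHI |
| 3 | PROJ3 | 10 | PQR | NOIDA |
| 1 | PROJ1 | 10 | XYZ | DELHI |
| 2 | PROJ2 | 10 | DEF | DELHI |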
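The effect can be reproduced with a short Python sketch, assuming the ENAME/PLOCATION rows and the EMP_PROJ1 rows listed above are joined on their shared PLOCATION attribute. The names `emp_locs` and `natural_join_on_location` are hypothetical and chosen only for this illustration.

```python
# Rows as listed above: (ENAME, PLOCATION) and
# EMP_PROJ1(ENO, PNUMBER, HOURS, PNAME, PLOCATION).
emp_locs = [("RAVI", "DELHI"), ("KIRAN", "NOIDA"), ("SAHIL", "DELHI")]
emp_proj1 = [
    (1, "PROJ1", 10, "XYZ", "DELHI"),
    (2, "PROJ2", 10, "DEF", "DELHI"),
    (3, "PROJ3", 10, "PQR", "NOIDA"),
]


def natural_join_on_location(locs, projs):
    """Natural join of the two relations on their shared PLOCATION attribute."""
    return [
        (ename, eno, pnumber, hours, pname, ploc)
        for (ename, loc) in locs
        for (eno, pnumber, hours, pname, ploc) in projs
        if loc == ploc
    ]


joined = natural_join_on_location(emp_locs, emp_proj1)
print(len(emp_proj1), "project rows,", len(joined), "joined rows")
```

PLOCATION is not a key of either relation: two employees and two projects share DELHI, so the join pairs every DELHI employee with every DELHI project and yields five tuples from three.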

Need for schema refinement: to reduce redundant storage. Redundant storage means that the same information is stored repeatedly in the database.

1.
2.
3. Insertion anomalies: It may not be possible to store some information unless some other information is stored as well.
4. Deletion anomalies: It may not be possible to delete some information without losing some other information as well.

EMP_PROJ ( Emp#, Proj#, Ename, Pname, No_hours)

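| EmpID | ProjID | Ename | Pname |
| --- | --- | --- | --- |
| 10 | 006 | E1 | P1 |
| 20 | 006 | E2 | P1 |
| 30 | 007 | E3 | P2 |
| 40 | 007 | E4 | P2 |
| 50 | 008 | E5 | P3 |
| 60 | 008 | E6 | P3 |
| 70 | 009 | E7 | P4 |

| EMPID | ENAME |
| --- | --- |
| 10 | E1 |
| 20 | E2 |
| 30 | E3 |
| 40 | E4 |
| 50 | E5 |
| 60 | E6 |
| 70 | E7 |

| PROJID | PNAME |
| --- | --- |
| 006 | P1 |
| 007 | P2 |
| 008 | P3 |
| 009 | P4 |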

Decomposition: replacing a relation with a collection of smaller relations. Each of the smaller relations contains a (strict) subset of the attributes of the original relation.

Decomposition can also create more problems than it solves. Two important questions must be asked repeatedly:

1. Do we need to decompose a relation?
2. What problems (if any) does a given decomposition cause?

For the first question: several normal forms have been proposed for relations. If a relation is in one of these normal forms, we know that certain kinds of problems cannot arise. If we still need to decompose, we must choose the decomposition carefully.

For the second question: two properties of decompositions are of particular interest:
A. Lossless-join property
B. Dependency-preservation property

The lossless-join property enables us to recover any instance of the decomposed relation from the corresponding instances of the smaller relations. The dependency-preservation property enables us to enforce any constraint on the original relation by simply enforcing some constraints on each of the smaller relations.

A serious drawback of decomposition is that queries over the original relation may require us to join the decomposed relations. If such queries are common (regularly performed), the performance penalty of decomposing the relation may not be acceptable.

Let R be a relation schema and let X and Y be nonempty sets of attributes in R. We say that an instance r of R satisfies the FD X -> Y if the following holds for every pair of tuples t1 and t2 in r: if t1.X = t2.X, then t1.Y = t2.Y.

| A | B | C | D |
| --- | --- | --- | --- |
| A1 | B1 | C1 | D1 |
| A1 | B2 | C2 | D1 |
| A2 | B1 | C3 | D1 |
| A1 | B1 | C1 | D2 |
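The definition translates directly into a pairwise check. The sketch below is one possible way to test it in Python; `satisfies_fd` is a hypothetical helper, and the rows assume the instance above is read column by column (first values of A, B, C, D form the first tuple, and so on).

```python
from itertools import combinations


def satisfies_fd(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    for t1, t2 in combinations(rows, 2):
        agree_on_lhs = all(t1[a] == t2[a] for a in lhs)
        differ_on_rhs = any(t1[a] != t2[a] for a in rhs)
        if agree_on_lhs and differ_on_rhs:
            return False
    return True


# Instance r over R(A, B, C, D), assumed row layout of the table above.
r = [
    {"A": "A1", "B": "B1", "C": "C1", "D": "D1"},
    {"A": "A1", "B": "B2", "C": "C2", "D": "D1"},
    {"A": "A2", "B": "B1", "C": "C3", "D": "D1"},
    {"A": "A1", "B": "B1", "C": "C1", "D": "D2"},
]

ab_determines_c = satisfies_fd(r, ["A", "B"], ["C"])
a_determines_b = satisfies_fd(r, ["A"], ["B"])
print(ab_determines_c, a_determines_b)
```

Under that reading, the instance satisfies AB -> C but not A -> B, because the first two tuples agree on A and differ on B. A check like this can only refute an FD on a given instance; it cannot prove that the FD holds for the schema in general.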

It is natural to ask whether we need to decompose relations produced by translating an ER diagram. The following examples will illustrate why decomposition of relations produced through ER design might be necessary.

Only FDs that determine all attributes of relation (i.e. key constraints) can be expressed in the ER Model.

{ssn} -> {ssn, name, lot, rating, hourly_wages, hours_worked}
FD: rating -> hourly_wages

Hourly_Emps

| Ssn | Name | Lot | Rating | Hourly_wage | Hours_worked |
| --- | --- | --- | --- | --- | --- |
| 12369 | E1 | 48 | 8 | 10 | 40 |
| 12345 | E2 | 22 | 8 | 10 | 30 |
| 12569 | E3 | 35 | 5 | 7 | 30 |
| 47895 | E4 | 35 | 5 | 7 | 32 |
| 12567 | E5 | 35 | 8 | 10 | 40 |

[Figure: ER diagram of Parts, Contracts, Suppliers, and Departments, with attributes PID, PName, quantity, Sid, SName, did, and dname.]

| Contractid | Sid | Quantity | Pid | did |
| --- | --- | --- | --- | --- |
| C11 | S1 | 12 | P1 | D2 |
| C12 | S2 | 13 | P2 | D3 |
| C13 | S1 | 14 | P1 | D2 |
| C14 | S1 | 15 | P1 | D2 |

The table contains redundant information.

The company has a policy that a department purchases at most one part from any given supplier. If there are several contracts between the same supplier and department, we know that the same part must be involved in all of them. FD: DS -> P

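| Contractid | Quantity | Sid | did |
| --- | --- | --- | --- |
| C11 | 12 | S1 | D2 |
| C12 | 13 | S2 | D3 |
| C13 | 14 | S1 | D2 |
| C14 | 15 | S1 | D2 |

| Sid | Did | pid |
| --- | --- | --- |
| S1 | D2 | P1 |
| S2 | D3 | P2 |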

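[Figure: ER diagram of Employees, Works_in, and Departments, with attributes EID, Name, since, did, dName, lot, and budget.]

[Figure: ER diagram of Employees, Works_in, and Departments, with attributes EID, Name, since, did, dName, budget, and lot.]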

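Workers table

| Eid | Name | Lot | Did | since |
| --- | --- | --- | --- | --- |
| 101 | E1 | L1 | 10 | 1999 |
| 102 | E2 | L1 | 10 | 1999 |
| 103 | E3 | L2 | 20 | 2001 |
| 104 | E4 | L2 | 20 | 2002 |
| 105 | E5 | L1 | 10 | 2000 |

The table contains redundant information. All employees are assigned parking lots based on their departments, so the FD did -> lot holds.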

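| Eid | Name | Did |
| --- | --- | --- |
| 101 | E1 | 10 |
| 102 | E2 | 10 |
| 103 | E3 | 20 |
| 104 | E4 | 20 |
| 105 | E5 | 10 |

| did | Lot |
| --- | --- |
| 10 | L1 |
| 20 | L2 |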
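A quick way to see that this decomposition loses nothing is to project the Workers rows and join the projections back. The Python sketch below does that under two assumptions: the first projection keeps the `since` column, and the names `workers2` and `dept_lots` are hypothetical labels for the two smaller relations.

```python
# Workers(eid, name, lot, did, since), rows from the Workers table.
workers = [
    (101, "E1", "L1", 10, 1999),
    (102, "E2", "L1", 10, 1999),
    (103, "E3", "L2", 20, 2001),
    (104, "E4", "L2", 20, 2002),
    (105, "E5", "L1", 10, 2000),
]

# Project onto (eid, name, did, since) and (did, lot); sets drop duplicates.
workers2 = {(eid, name, did, since) for (eid, name, lot, did, since) in workers}
dept_lots = {(did, lot) for (_, _, lot, did, _) in workers}

# Natural join on did, rebuilding the original column order.
rejoined = {
    (eid, name, lot, did, since)
    for (eid, name, did, since) in workers2
    for (d, lot) in dept_lots
    if d == did
}

lossless = rejoined == set(workers)
print(sorted(dept_lots), lossless)
```

Because did -> lot holds, each department contributes exactly one row to `dept_lots`, so the join cannot pair an employee with a wrong lot and the original five rows are recovered exactly.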

| Sailor_id | Boat_id | Date | Credit_card_no |
| --- | --- | --- | --- |
| S1 | Interlake | 1 Jan 11 | 123696589 |
| S1 | Redocean | 1 Feb 11 | 123696589 |
| S2 | Interlake | 2 Jan 11 | 593569633 |
| S2 | Redocean | 2 Feb 11 | 593569633 |
| S1 | Seanation | 3 Feb 11 | 123696589 |

| Sailor_id |
| --- |
| S1 |
| S1 |
| S2 |
| S2 |
| S1 |

| Sailor id | Credit card |
| --- | --- |
| S1 | 123696589 |
| S2 | 593569633 |

Given a set of FDs over a relation schema R, there are typically several additional FDs that hold over R whenever all of the given FDs hold.

Example: Workers(ssn, name, lot, did, since)
Stated FDs: ssn -> did, did -> lot
Implied FD: ssn -> lot

The set of all FDs implied by a given set F of FDs is called the closure of F and is denoted by F+.

Informally, we can group FDs into two types:
1. All FDs stated in the set F of FDs.
2. All FDs that can be derived from the set F.

FDs are derived from F using three rules called Armstrong's Axioms. These rules can be applied repeatedly to infer all FDs implied by a set of FDs.

Reflexivity: If Y ⊆ X then X -> Y (trivial dependency)

Augmentation: If X -> Y then XW -> YW
Example: course_no -> subj, so {course_no, grade} -> {subj, grade}

Transitivity: If X -> Y and Y -> Z then X -> Z
Example: eid -> did and did -> lot, so eid -> lot

A trivial FD is one in which the right side contains only attributes that also appear on the left side.

Armstrong's Axioms are sound in that they generate only FDs in F+ when applied to a set F of FDs. They are complete in that repeated applications of these rules will generate all FDs in the closure F+.

Union:

Proof of decomposition:
1. X -> YZ (given)
2. YZ -> Y (using IR1, reflexivity, since YZ ⊇ Y)
3. X -> Y (using IR3, transitivity, on 1 and 2)

(given) (Augmentation) (given) (2, 3 and Transitivity)

Question: Let R(C, S, J, D, P, Q, V). Find other FDs using the rules.

Given FDs:
1. C -> CSJDPQV
2. JP -> C
3. SD -> P

Answer:
4. From 1 and 2, by transitivity: JP -> CSJDPQV
5. Using augmentation on 3 (adding J): SDJ -> JP. By 4, JP -> CSJDPQV, so by transitivity: SDJ -> CSJDPQV

If we just want to check whether a given dependency, say X -> Y, is in the closure of a set F of FDs, we can do so efficiently without computing F+. We first compute the attribute closure X+ with respect to F, which is the set of attributes A such that X -> A can be inferred using Armstrong's Axioms.

if there is an FD U -> V in F such that U ⊆ closure, then set closure = closure ∪ V
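The step above can be turned into a small Python sketch. It assumes the usual surrounding loop (start with closure = X and repeat the step until nothing changes), which the text here does not spell out; `attribute_closure` and `implies` are hypothetical names, and attributes are single letters so that a string such as "SDJ" stands for the set {S, D, J}.

```python
def attribute_closure(attrs, fds):
    """Compute X+ for the attribute set attrs under FDs given as (lhs, rhs) pairs."""
    closure = set(attrs)
    changed = True
    while changed:  # assumed outer loop: repeat until closure stops growing
        changed = False
        for lhs, rhs in fds:
            # If U is contained in closure, set closure = closure union V.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """Return True if lhs -> rhs follows from fds (rhs is contained in lhs+)."""
    return set(rhs) <= attribute_closure(lhs, fds)


# FDs on R(C, S, J, D, P, Q, V): C -> CSJDPQV, JP -> C, SD -> P.
F = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]

sdj_plus = attribute_closure("SDJ", F)
print(sorted(sdj_plus), implies(F, "SDJ", "CSJDPQV"), implies(F, "SD", "C"))
```

Starting from {S, D, J}, the loop adds P through SD -> P, then C through JP -> C, then every remaining attribute through C -> CSJDPQV, so SDJ -> CSJDPQV is confirmed without enumerating F+. By contrast, SD alone only reaches {S, D, P}, so SD -> C is not implied.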

1. First Normal Form (1NF)
2. Second Normal Form (2NF)
3. Third Normal Form (3NF)

The normal forms (abbrev. NF) of relational database theory provide criteria for determining a table's degree of vulnerability to logical inconsistencies and anomalies.

Initially Codd proposed three normal forms; later Boyce and Codd proposed a stronger definition of 3NF known as BCNF.

Normalization of data can be looked upon as a process of analyzing the given relation schemas based on their FDs and primary keys to achieve the desirable properties of (1) minimizing redundancy and (2) minimizing the insertion, deletion, and update anomalies.

The normal form of a relation refers to the highest normal form condition it meets.

The process of normalization should not be considered in isolation; two other properties should also be considered:

1. Lossless join or nonadditive join property, which guarantees that the spurious tuple generation problem does not occur with respect to the relation schemas created after decomposition.
2. Dependency preservation property, which ensures that each functional dependency is represented in some individual relation resulting after decomposition.

If a relation schema has more than one key, each is called a candidate key. One of the candidate keys is arbitrarily designated to be the primary key, and the others are called secondary keys. A prime attribute must be a member of some candidate key. A nonprime attribute is not a prime attribute; that is, it is not a member of any candidate key.

First normal form (1NF) disallows composite attributes, multivalued attributes, and nested relations, that is, attributes whose values for an individual tuple are non-atomic.

Nested Relations: multivalued attributes that are themselves composite are called nested relations.

Definitions:
- Prime attribute: an attribute that is a member of some candidate key.
- Full functional dependency: an FD Y -> Z where removal of any attribute from Y means the FD no longer holds.

Examples:
- {SSN, PNUMBER} -> HOURS is a full FD, since neither SSN -> HOURS nor PNUMBER -> HOURS holds.
- {SSN, PNUMBER} -> ENAME is not a full FD (it is called a partial dependency), since SSN -> ENAME also holds.

Second Normal Form (2)

A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on the primary key. R can be decomposed into 2NF relations via the process of 2NF normalization.

NOTE: If the primary key contains a single attribute, the test need not be applied at all.
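The 2NF test can be sketched in Python as a search for partial dependencies: for each proper subset of the primary key, compute its attribute closure and report any non-prime attribute it determines. This is a minimal illustration, assuming the given key is the only candidate key; `closure` and `partial_dependencies` are hypothetical helper names, and the FDs are the two from the examples above.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under FDs given as (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(attributes, key, fds):
    """List (key_subset, attribute) pairs that violate 2NF for the given key."""
    nonprime = attributes - key  # assumes `key` is the only candidate key
    violations = []
    for size in range(1, len(key)):  # proper, nonempty subsets of the key
        for subset in combinations(sorted(key), size):
            determined = closure(set(subset), fds)
            for attr in sorted(nonprime & determined):
                violations.append((subset, attr))
    return violations


attributes = {"SSN", "PNUMBER", "HOURS", "ENAME"}
key = {"SSN", "PNUMBER"}
fds = [({"SSN", "PNUMBER"}, {"HOURS"}), ({"SSN"}, {"ENAME"})]

violations = partial_dependencies(attributes, key, fds)
print(violations)
```

For a single-attribute key the loop over proper subsets is empty, which matches the note that the test need not be applied in that case.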

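2NF NORMALIZATION

[Figure: the relation (ENO, PNUMBER, HOURS, ENAME, PNAME, PLOCATION) with dependencies FD1, FD2, and FD3 is decomposed into:
- (ENO, PNUMBER, HOURS), carrying FD1
- (ENO, ENAME), carrying FD2
- (PNUMBER, PNAME, PLOCATION), carrying FD3]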

Definition: Transitive functional dependency: an FD X -> Z that can be derived from two FDs X -> Y and Y -> Z.

Examples:
- SSN -> DMGRSSN is a transitive FD, since SSN -> DNUMBER and DNUMBER -> DMGRSSN hold.
- SSN -> ENAME is non-transitive, since there is no set of attributes X where SSN -> X and X -> ENAME.

A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute A in R is transitively dependent on the primary key. R can be decomposed into 3NF relations via the process of 3NF normalization.

NOTE: In X -> Y and Y -> Z, with X as the primary key, we consider this a problem only if Y is not a candidate key. When Y is a candidate key, there is no problem with the transitive dependency. E.g., consider EMP (SSN, Emp#, Salary). Here, SSN -> Emp# -> Salary and Emp# is a candidate key.
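One way to check this mechanically is sketched below in Python. It does not follow the transitive-dependency wording literally; instead it uses the commonly stated equivalent test that an FD X -> A is acceptable if it is trivial, X is a superkey, or A is a prime attribute. The helper names, the attribute name `EMP_NO` (standing in for Emp#), and the extra FD `EMP_NO -> SSN` (implied by Emp# being a candidate key) are assumptions made for this illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under FDs given as (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def third_nf_violations(attributes, candidate_keys, fds):
    """Return (lhs, attribute) pairs where lhs is no superkey and attribute is non-prime."""
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= attributes
        for attr in sorted(rhs - lhs):  # skip trivial parts of the FD
            if not is_superkey and attr not in prime:
                violations.append((tuple(sorted(lhs)), attr))
    return violations


# SSN -> ENAME, SSN -> DNUMBER, DNUMBER -> DMGRSSN, with key SSN.
emp_dept = third_nf_violations(
    {"SSN", "ENAME", "DNUMBER", "DMGRSSN"},
    [{"SSN"}],
    [({"SSN"}, {"ENAME"}), ({"SSN"}, {"DNUMBER"}), ({"DNUMBER"}, {"DMGRSSN"})],
)

# EMP(SSN, Emp#, Salary) from the note, where Emp# is also a candidate key.
emp = third_nf_violations(
    {"SSN", "EMP_NO", "SALARY"},
    [{"SSN"}, {"EMP_NO"}],
    [({"SSN"}, {"EMP_NO"}), ({"EMP_NO"}, {"SALARY"}), ({"EMP_NO"}, {"SSN"})],
)
print(emp_dept, emp)
```

The first call flags DNUMBER -> DMGRSSN, the dependency that makes SSN -> DMGRSSN transitive. The second call reports nothing, in line with the note that a transitive chain through a candidate key is not a problem.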

[Figure: relations (ENAME, ENO, DOB, ADDR, DNO) and (DNO, DNAME, DMGRENO).]
