Based on the analysis of functional dependencies among attributes.
Such a functional dependency is denoted as X → Y
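
A functional dependency X → Y can be checked against sample data: every pair of rows that agrees on X must also agree on Y. The following is a minimal sketch, not a definitive implementation; the function name `holds` and the lowercase column names are illustrative assumptions, and the rows are the Student sample used in these slides.

```python
def holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs is satisfied by every row."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # The first time we see x we remember its y; later rows must match it.
        if seen.setdefault(x, y) != y:
            return False
    return True

# Sample rows (column names are assumptions made for this sketch)
rows = [
    {"student": 38214, "name": "Bright", "major": "IS"},
    {"student": 38214, "name": "Bright", "major": "EE"},
    {"student": 69173, "name": "Smith", "major": "PM"},
]

print(holds(rows, ["student"], ["name"]))   # True
print(holds(rows, ["student"], ["major"]))  # False
```

Note that a sample can only refute a dependency; whether a dependency really holds is a statement about the meaning of the data, not about one set of rows.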
Candidate and Primary Keys

Superkey – a set of one or more attributes that uniquely identifies a specific instance of an entity

Candidate key – any subset of the attributes of a superkey that is also a superkey and not reducible to another superkey

Primary key – a selection from the set of candidate keys, used to index a relation

Student #   Student Name   Major
38214       Bright         IS
38214       Bright         EE
69173       Smith          PM

Primary Keys

Every relation (entity) must have a primary key

To qualify as a primary key, an attribute must have the following properties:
• it must have a non-null value for each instance of the entity
• the value must be unique for each instance of an entity
• the values must not change or become null during the life of each entity instance

Full functional dependence means that when a primary key is composite, the other columns must be identified by the entire key and not just some of the columns that make up the key.
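
The superkey and candidate key definitions can be made concrete with attribute closure: a set of attributes is a superkey if its closure under the functional dependencies covers the whole relation, and a candidate key if no smaller superkey is contained in it. The sketch below assumes a single dependency, student → name, for the Student sample; the function names and attribute names are illustrative and not part of the original material.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    """Enumerate irreducible superkeys, smallest first."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is reducible
            if closure(attrs, fds) == all_attrs:
                keys.append(attrs)
    return keys

attrs = {"student", "name", "major"}
fds = [(frozenset({"student"}), frozenset({"name"}))]  # assumed: student -> name

print([sorted(key) for key in candidate_keys(attrs, fds)])
# [['major', 'student']]
```

Under that assumption the only candidate key is the composite (Student #, Major), which is why Student # alone cannot serve as the primary key of the sample table.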
Foreign Keys

A foreign key is an attribute that completes a relationship by identifying the parent entity.

Foreign keys provide a method for maintaining integrity in the data (called referential integrity) and for navigating between different instances of an entity.

Every relationship in the model must be supported by a foreign key.

Foreign keys are formed in dependent entities by migrating the entire primary key from the parent entity. If the primary key is composite, it may not be split.

Steps in Normalization

Assemble data items from user views
Convert to un-normalized relations
Convert to first normal form (1NF)
Convert to second normal form (2NF)
Convert to third normal form (3NF)

Should result in simple relations that correspond to entities or associations between entity classes
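
Referential integrity, as described for foreign keys, can be demonstrated with Python's built-in `sqlite3` module. This is only a sketch: the table and column names (`student`, `enrollment`, `student_no`, `course_no`) are assumptions chosen for illustration, and SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign-key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE student (student_no INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE enrollment (
        student_no INTEGER NOT NULL REFERENCES student(student_no),
        course_no  TEXT NOT NULL,
        grade      TEXT,
        PRIMARY KEY (student_no, course_no)
    )
    """
)

conn.execute("INSERT INTO student VALUES (38214, 'Bright')")
conn.execute("INSERT INTO enrollment VALUES (38214, 'IS 350', 'A')")

# An enrollment that names a non-existent parent row should be rejected.
try:
    conn.execute("INSERT INTO enrollment VALUES (99999, 'IS 350', 'B')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

The dependent table carries the parent's whole primary key (`student_no`) as part of its own composite key, which mirrors the "migrate the entire primary key" rule.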
Normalized relations: First Normal Form

A relation is in first normal form if the underlying domains contain only atomic values

There are no repeating groups within a tuple

Most relational systems require a database to be in 1NF

First Normal Form

Remove repeating groups and form 2 new relations – migrate the primary key, and ensure there is a valid new primary key

Student #   Student Name   Major
38214       Bright         IS
69173       Smith          PM

Student #   Course #   Course Title   Instructor Name   Instructor Location   Grade
38214       IS 350     Database       Codd              B104                  A
38214       IS 465     Sys Anal       Kemp              B213                  C
69173       IS 465     Sys Anal       Kemp              B213                  A
69173       PM 300     Op Res         Lewis             D317                  B
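
Removing a repeating group can be sketched as follows. The nested `courses` list stands in for the repeating group of an un-normalized record; that nested layout and the field names are assumptions made for this illustration, while the values are those of the sample tables.

```python
# Un-normalized records: each student carries a repeating group of courses.
unnormalized = [
    {"student": 38214, "name": "Bright", "major": "IS",
     "courses": [("IS 350", "Database", "Codd", "B104", "A"),
                 ("IS 465", "Sys Anal", "Kemp", "B213", "C")]},
    {"student": 69173, "name": "Smith", "major": "PM",
     "courses": [("IS 465", "Sys Anal", "Kemp", "B213", "A"),
                 ("PM 300", "Op Res", "Lewis", "D317", "B")]},
]

# Relation 1: the non-repeating attributes, keyed by student number.
students = [(r["student"], r["name"], r["major"]) for r in unnormalized]

# Relation 2: one tuple per repeating-group entry, with the student number
# migrated in so that (student, course) forms the new primary key.
student_courses = [(r["student"], *course)
                   for r in unnormalized
                   for course in r["courses"]]

print(len(students), len(student_courses))  # 2 4
```

Every value in both resulting relations is atomic, and the composite key (Student #, Course #) is unique across the second relation.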
Update anomaly

Changing a course title or course number requires searching all tuples to find every occurrence of a course number or title

Deletion anomaly

Dropping a single student from a course requires dropping the course and losing the associated course and instructor information

Student #   Course #   Course Title   Instructor Name   Instructor Location   Grade
38214       IS 350     Database       Codd              B104                  A
38214       IS 465     Sys Anal       Kemp              B213                  C
69173       IS 465     Sys Anal       Kemp              B213                  A
69173       PM 300     Op Res         Lewis             D317                  B
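
Both anomalies can be reproduced on the four sample tuples. The helper names `retitle` and `drop_enrollment` are hypothetical, introduced only for this sketch, and the new title "Systems Analysis" is an arbitrary example value.

```python
# 1NF relation: (student, course, title, instructor, location, grade)
rows = [
    (38214, "IS 350", "Database", "Codd", "B104", "A"),
    (38214, "IS 465", "Sys Anal", "Kemp", "B213", "C"),
    (69173, "IS 465", "Sys Anal", "Kemp", "B213", "A"),
    (69173, "PM 300", "Op Res", "Lewis", "D317", "B"),
]

def retitle(rows, course_no, new_title):
    """Update anomaly: every tuple that mentions the course must be rewritten."""
    updated = [(s, c, new_title if c == course_no else t, i, loc, g)
               for s, c, t, i, loc, g in rows]
    touched = sum(1 for r in rows if r[1] == course_no)
    return updated, touched

def drop_enrollment(rows, student, course_no):
    """Deletion anomaly: removing one enrollment can erase a course entirely."""
    return [r for r in rows if not (r[0] == student and r[1] == course_no)]

_, touched = retitle(rows, "IS 465", "Systems Analysis")
print(touched)  # 2

remaining = drop_enrollment(rows, 38214, "IS 350")
instructors_left = {r[3] for r in remaining}
print("Codd" in instructors_left)  # False
```

Renaming one course touches two tuples, and dropping the only IS 350 enrollment removes every record of the course and of instructor Codd.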
Second Normal Form

To convert from first to second normal form – remove partial dependencies

Create 2 new relations, one with attributes fully dependent on the primary key, the other with attributes only partially dependent

Student #   Course #   Grade
38214       IS 350     A
38214       IS 465     C
69173       IS 465     B
69173       PM 300     C

Courses are independent of Student # and so can be inserted or deleted independently; only a single tuple needs to be updated in the course relation

Course #   Course Title   Instructor Name   Instructor Location
IS 350     Database       Codd              B104
IS 465     Sys Anal       Kemp              B213
PM 300     Prod man       Lewis             D317
QM 440     Op Res         Kemp              B213

Deleting data for a course results in deleting instructor information

Transitive dependencies

Course #   Course Title   Instructor Name   Instructor Location
IS 350     Database       Codd              B104
IS 465     Sys Anal       Kemp              B213
PM 300     Prod man       Lewis             D317
QM 440     Op Res         Kemp              B213

A non-key attribute is dependent on one or more non-key attributes

[Diagram: Course #, Course Title, Instructor Name, Instructor Location, with two links labeled "one-to-one relationship"]

To update instructor information the entire relation must be searched since instructor information occurs more than once.
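
The move from first to second normal form is a pair of projections: attributes that depend on the whole key (Student #, Course #) stay in one relation, and attributes that depend on Course # alone go to another. The sketch below uses the four-tuple 1NF sample from the First Normal Form slide; the `project` helper and the column names are assumptions for illustration. Joining the two projections back on Course # is one way to confirm that no information was lost.

```python
def project(rows, columns):
    """Project dict rows onto columns, dropping duplicates but keeping order."""
    seen, out = set(), []
    for row in rows:
        t = tuple(row[c] for c in columns)
        if t not in seen:
            seen.add(t)
            out.append(t)
    return out

cols = ["student", "course", "title", "instructor", "location", "grade"]
data = [
    (38214, "IS 350", "Database", "Codd", "B104", "A"),
    (38214, "IS 465", "Sys Anal", "Kemp", "B213", "C"),
    (69173, "IS 465", "Sys Anal", "Kemp", "B213", "A"),
    (69173, "PM 300", "Op Res", "Lewis", "D317", "B"),
]
first_nf = [dict(zip(cols, r)) for r in data]

# Fully dependent on the composite key (student, course)
grades = project(first_nf, ["student", "course", "grade"])
# Partially dependent: determined by course alone
courses = project(first_nf, ["course", "title", "instructor", "location"])

# Natural join on course to check that the decomposition is lossless
course_by_no = {c[0]: c for c in courses}
rejoined = [(s, c, *course_by_no[c][1:], g) for s, c, g in grades]

print(len(grades), len(courses))  # 4 3
print(rejoined == data)           # True
```

After the split, a course title is stored once, so retitling IS 465 changes a single tuple in the course relation.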
Third Normal Form

A relation is in third normal form if it is in 2NF and contains no transitive dependencies

Every non-key attribute is fully dependent on the primary key and there are no transitive dependencies

Instructor Name   Instructor Location
Codd              B104
Kemp              B213
Lewis             D317

Non-key attributes that participate in the transitive dependency form a new relation

Course #   Course Title   Instructor Name
IS 350     Database       Codd
IS 465     Sys Anal       Kemp
PM 300     Prod Man       Lewis
QM 440     Op Res         Kemp

Foreign key – a non-key attribute in one relation that serves as a primary key in another relation

Boyce-Codd Normal Form

Occurs in the case of overlapping candidate keys

Each student can major in several subjects
For each major a student has one advisor
Each major has several advisors
Each advisor advises only one major

Student #   Major     Advisor
123         Physics   Einstein
123         Music     Mozart
456         Biol      Darwin
789         Physics   Bohr
999         Physics   Einstein

There are 2 possible candidate keys: (Student #, Major) or (Student #, Advisor), and they are overlapping.

Attributes that are part of a candidate key are dependent on part of another candidate key.
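
The Boyce-Codd condition can be tested mechanically: for every non-trivial dependency X → Y, X must be a superkey. The sketch below encodes the advising rules as two dependencies, (student, major) → advisor and advisor → major; this encoding and the function names are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """List non-trivial dependencies whose left side is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != all_attrs]

attrs = frozenset({"student", "major", "advisor"})
fds = [
    (frozenset({"student", "major"}), frozenset({"advisor"})),  # one advisor per student and major
    (frozenset({"advisor"}), frozenset({"major"})),             # each advisor advises one major
]

for lhs, rhs in bcnf_violations(attrs, fds):
    print(sorted(lhs), "->", sorted(rhs))
# ['advisor'] -> ['major']
```

Advisor determines Major without being a superkey, so the relation is not in BCNF even though it has no transitive dependency on a non-key attribute. The same `closure` function confirms that (student, advisor) determines all three attributes and is therefore a key.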
Fourth Normal Form

Computer   Package     Outlet
Apple      Visicalc    Computerland
Apple      Applestar   Computerland
Apple      Visicalc    Byte Shop
Zenith     Wordstar    Computershop
Zenith     Supercalc   Computershop
Zenith     Wordstar    Byte Shop

Several redundancies exist in the relation

Can generate deletion and update anomalies

Project to 2 new relations

Computer   Package
Apple      Visicalc
Apple      Applestar
Zenith     Wordstar
Zenith     Supercalc

Computer   Outlet
Apple      Computerland
Apple      Byte Shop
Zenith     Computershop
Zenith     Byte Shop

Normalization Summary

Leads to simpler (to implement) applications and to more maintainable systems

Based on a set of rules that define normal forms – of which first three are most important:

First normal form: All column values are atomic
Second normal form: All column values depend on the value of the primary key: no partial dependencies
Third normal form: No column value depends on the value of any other column except the primary key – no transitive dependencies
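
The Fourth Normal Form projection of the Computer/Package/Outlet relation can be reproduced in a few lines. The `project` helper is a hypothetical name used only in this sketch. The sketch also joins the two projections back together, which is a quick way to see whether package and outlet really are independent for each computer.

```python
relation = [
    ("Apple", "Visicalc", "Computerland"),
    ("Apple", "Applestar", "Computerland"),
    ("Apple", "Visicalc", "Byte Shop"),
    ("Zenith", "Wordstar", "Computershop"),
    ("Zenith", "Supercalc", "Computershop"),
    ("Zenith", "Wordstar", "Byte Shop"),
]

def project(rows, indexes):
    """Project tuples onto the given positions, keeping first-seen order."""
    seen, out = set(), []
    for row in rows:
        t = tuple(row[i] for i in indexes)
        if t not in seen:
            seen.add(t)
            out.append(t)
    return out

computer_package = project(relation, (0, 1))
computer_outlet = project(relation, (0, 2))

# Natural join of the two projections on Computer
joined = {(c, p, o)
          for c, p in computer_package
          for c2, o in computer_outlet
          if c == c2}
extra = sorted(joined - set(relation))

print(len(computer_package), len(computer_outlet))  # 4 4
print(len(joined))                                  # 8
print(extra)
# [('Apple', 'Applestar', 'Byte Shop'), ('Zenith', 'Supercalc', 'Byte Shop')]
```

The join yields eight combinations, two more than the six sample rows. The projection therefore loses nothing only if every package for a computer is available at every outlet that sells that computer, which is the independence that fourth normal form presumes.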
Customer (Name, Street, City, State, Zipcode)
Spatial data objects is one of them