Analysis
Purpose: develop the technology specification. Deliverables: program and data structures, technology purchases, organizational restructuring
Logical Design
Physical Design
Implementation
Maintenance
UBL
PBD - 1
Introduction
DATABASE DESIGN is the process of developing a database structure that fits the data required by users.
STEPS IN DATABASE DESIGN
1. Requirements definition
Purpose: to identify and describe the data required by users in an organization.
2. Conceptual design
Purpose: to build a conceptual data model (or information architecture) that supports the differing information needs of the various users in an organization.
3. Implementation design
Purpose: to map the logical data model into a schema that can be processed by a particular DBMS.
4. Physical design
In this final stage, the logical database structures (normalized relations, trees, networks, etc.) are mapped to physical storage structures such as files and tables.
Stepwise refinement
The whole database design process should be viewed as stepwise refinement, in which the design at each stage is progressively improved through iteration. Refinement should be carried out at the end of each stage before moving on to the next.
UBL Fakultas Teknologi Informasi PBD - 3
Purpose: translate the logical description of data into technical specifications for storing and retrieving data.
Deliverable: a data storage design that provides adequate processing performance and ensures data integrity, security, and recoverability.
Choices
Decide on: the data type of each attribute; physical record descriptions; file organization; indexes and database architecture; query performance optimization.
Step 3, Implementation design: logical database structure (DBMS-processible) and application program specifications.
Step 4, Physical design: physical database structure, shaped by hardware and operating system characteristics.
REQUIREMENTS DEFINITION
Requirements definition is the process of identifying and documenting the data that users need in a database to meet current and future information needs. Two kinds of information must be captured during requirements definition:
1. Information that describes the data structure, such as entities, attributes, and relationships. This is usually expressed graphically, for example as entity-relationship diagrams (E-R diagrams).
2. Information that describes the rules or constraints that preserve data integrity. Usually called business rules, these constraints must be recorded in the organization's data dictionary/directory (or repository).
[Figure: information collected during requirements definition. Data structure: entity, relationship, attribute (simple, composite), class-subclass, candidate key, primary key, foreign key, descriptor. Constraints: domain (name/definition, type, length, format, allowable value) and referential integrity (insert, delete, update).]
Three main types of constraints:
1. Domain constraints. These define the type, length, format, and allowable values of each data item.
2. Referential integrity constraints. These ensure the integrity of references between rows of one table and rows of another table (or of several tables).
3. Other business constraints. These ensure the integrity of a data value in a table, given one or more other data values in the same table or in another table.
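The three constraint types can be sketched with Python's built-in sqlite3 module. This is only an illustration: the `department` and `employee` tables, their columns, and the rule that a hire date must follow the birth date are assumptions made up for the example, not part of the lecture material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on

# Hypothetical schema, invented for this sketch.
conn.executescript("""
CREATE TABLE department (
    dept_no INTEGER PRIMARY KEY,
    dname   TEXT NOT NULL
);
CREATE TABLE employee (
    emp_no     INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    salary     REAL NOT NULL CHECK (salary >= 0),               -- domain constraint
    birth_date TEXT NOT NULL,
    hire_date  TEXT NOT NULL,
    dept_no    INTEGER NOT NULL REFERENCES department(dept_no), -- referential integrity
    CHECK (hire_date > birth_date)                              -- assumed business rule
);
INSERT INTO department VALUES (10, 'Sales');
""")


def rejected(sql, params):
    """Return True if a constraint rejects the statement, False if it is accepted."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


insert_employee = "INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?)"
# Negative salary breaks the domain constraint.
domain_violation = rejected(insert_employee, (1, "Ani", -500.0, "1990-01-01", "2015-06-01", 10))
# Department 99 does not exist, so the reference is invalid.
reference_violation = rejected(insert_employee, (2, "Budi", 4000.0, "1990-01-01", "2015-06-01", 99))
# Hire date before birth date breaks the assumed business rule.
business_violation = rejected(insert_employee, (3, "Citra", 4000.0, "1990-01-01", "1985-06-01", 10))
# A row that satisfies all three constraints is accepted.
valid_row_rejected = rejected(insert_employee, (4, "Dewi", 4000.0, "1990-01-01", "2015-06-01", 10))
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

Declaring the rules in the schema, rather than in each application program, means every program that touches the tables is subject to the same checks.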
2. Selecting a methodology for requirements definition
Selecting an appropriate methodology and CASE tools is essential. A methodology provides standard procedures and data collection formats that bring discipline to the gathering of metadata. CASE tools provide computer-based support for building a repository of metadata and for producing structured views of that metadata. The CASE tools used during requirements definition must be compatible with the CASE tools used during database planning.
This step requires a consistent structure for each of the user views identified in the previous step.
In the previous session we used E-R diagrams to model data structure. Modeling user views as E-R diagrams requires the entities, relationships, attributes, candidate keys, primary keys, and descriptors relevant to each user view we have identified.
5. Model database constraints
During requirements definition, the database analyst must also identify the basic constraints that preserve database integrity: domains, referential integrity, and other business rules. These constraints should be stored in the data dictionary (or repository) using the available CASE tools.
The analyst must also gather information about users' operational requirements for the data. This step covers the requirements in each of the following areas:
1. Security.
2. Response times.
3. Backup and recovery.
4. Archiving.
5. Database growth projections.
[Figure: steps in requirements definition. Identify user views (reports, displays), model data structure, model data constraints, identify operational requirements, then continue to logical database design. CASE tools appear alongside the steps.]
Field Design
Field: the smallest unit of data in a database.
Field design covers: choosing an appropriate data type; coding, compression, and encryption; and safeguarding data integrity.
Physical Records
Physical record: a set of fields stored adjacently in one memory location and accessed together as a unit.
Page: the amount of data read or written by the DBMS in one input/output (I/O) operation.
Blocking factor: the number of physical records in one page.
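A small worked calculation may help fix the idea of the blocking factor. The sketch below assumes fixed-length records, no record spanning two pages, and no page header overhead; the function names and the sample sizes are hypothetical.

```python
def blocking_factor(page_size_bytes: int, record_size_bytes: int) -> int:
    """Number of whole physical records that fit in one page (unspanned records assumed)."""
    if page_size_bytes <= 0 or record_size_bytes <= 0:
        raise ValueError("sizes must be positive")
    return page_size_bytes // record_size_bytes


def pages_needed(record_count: int, page_size_bytes: int, record_size_bytes: int) -> int:
    """Pages required to store record_count records under the same assumptions."""
    factor = blocking_factor(page_size_bytes, record_size_bytes)
    if factor == 0:
        raise ValueError("a record does not fit in one page")
    return -(-record_count // factor)  # ceiling division


# Example with assumed sizes: 4096-byte pages and 300-byte records.
records_per_page = blocking_factor(4096, 300)       # 13 records, 196 bytes left unused
pages_for_1000_records = pages_needed(1000, 4096, 300)  # 77 pages
```

The unused space per page (196 bytes here) is one reason record size and page size are considered together during physical design.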
Denormalization
Transforming normalized relations into unnormalized physical record specifications.
Advantage: it can improve processing performance (speed) by reducing the number of tables involved in a query, which reduces the number of join operations.
Common uses of denormalization: one-to-one relationships; many-to-many relationships with attributes; reference data (a 1:N relationship in which the entity on the one side holds data that is not used in any other relationship).
Data Description
Conceptual model: entity-relationship analysis.
1. Entity
A person, place, event, object, or concept about which data is to be maintained. Examples:
Person: STUDENT, EMPLOYEE, PATIENT
Place: STORE, WAREHOUSE, CITY
Event: SALE, LOAN
Object: CAR, ITEM, BOOK
Concept: COURSE, BANK ACCOUNT, ACCOUNT
[Figure: inappropriate entities (a system user, a system output) contrasted with appropriate entities.]
[Figure: entity symbol, relationship symbol, attribute symbol.]
Attribute Classification
a. Simple attribute
Composite attribute
An entity with a multivalued attribute (Skill) and a derived attribute (Years_Employed). What's wrong with this?
Multivalued: Skill. Derived: Years_Employed, computed from the date employed and the current date.
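One way to see why Years_Employed is a derived attribute is to compute it on demand instead of storing it. The sketch below is illustrative only; the `Employee` class, its field names, and the sample values are assumptions.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Employee:
    employee_id: int
    date_employed: date                      # stored attribute
    skills: set = field(default_factory=set)  # multivalued attribute: zero or more values

    def years_employed(self, today: date) -> int:
        """Derived attribute: whole years between date_employed and today."""
        years = today.year - self.date_employed.year
        # Subtract one if this year's anniversary has not been reached yet.
        if (today.month, today.day) < (self.date_employed.month, self.date_employed.day):
            years -= 1
        return years


employee = Employee(1, date(2010, 6, 15), {"SQL", "COBOL"})
before_anniversary = employee.years_employed(date(2024, 6, 14))  # 13
on_anniversary = employee.years_employed(date(2024, 6, 15))      # 14
```

Because the value changes as the current date advances, storing it would require constant updates, which is why derived attributes are normally not stored.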
Every attribute that describes a given entity or relationship must be named.
Each relationship must include the attributes that identify the entities forming that relationship. Meaningful names should be chosen so that the E-R diagram is self-explanatory.
3. Relationship Sets
Used to model the interactions between entities in entity sets.
[Figure: PERSONS connected to VEHICLES through the DRIVE relationship.]
A relationship set is represented by a diamond in an E-R diagram. The diamond connects the entity sets whose entities interact in the relationship. A relationship can have attributes; these describe characteristics of the entities and the relationship. Two entities can have more than one kind of relationship (multiple relationships). Associative entity = a combination of a relationship and an entity.
Degree of Relationships
Types of relationship degree: unary relationship, binary relationship, ternary relationship.
Examples:
[Figure: unary relationship.]
[Figure: binary relationship.]
[Figure: ternary relationship.]
OCCURRENCE DIAGRAM for the DRIVE relationship between PERSONS and VEHICLES:
Person  Drives
p1      v1
p2      v1, v2
p3      v2
p4      v2, v3
Each of the six occurrences d1 to d6 of DRIVE links one person to one vehicle; for example, d6 is "p4 drives v3".
[Figure: two relationships between COMPANIES and VEHICLES, PUNYA (own) and SEWA (lease), with an occurrence diagram linking companies c1 to c3 and vehicles v1 to v5.]
[Figure labels: OF, FOR, IN, TO, ABOUT, BY.]
Examples
[Figure: example relationships. PROJECT, LEAD, MANAGER; PERSON, USE and MAKE, MATERIAL; TRIP, TO, ISLANDS. Further labels: SUPPLIERS, OF, OF, VEHICLES, BY, PERSONS, ITEMS.]
Cardinality of Relationships
One to One
Each entity in the relationship will have exactly one related entity
One to Many
An entity on one side of the relationship can have many related entities, but an entity on the other side will have a maximum of one related entity
Many to Many
Entities on both sides of the relationship can have many related entities on the other side
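As a rough illustration, the cardinality that a set of relationship occurrences exhibits can be read off by counting partners on each side. The function name below is hypothetical, and the result only describes the sample data: the actual cardinality of a relationship is a business rule, not something that data alone can prove.

```python
from collections import defaultdict


def observed_cardinality(pairs):
    """Classify (left, right) occurrence pairs as '1:1', '1:N', 'N:1', or 'M:N'."""
    left_to_right = defaultdict(set)
    right_to_left = defaultdict(set)
    for left, right in pairs:
        left_to_right[left].add(right)
        right_to_left[right].add(left)
    # Does any left entity relate to more than one right entity, and vice versa?
    left_has_many = any(len(partners) > 1 for partners in left_to_right.values())
    right_has_many = any(len(partners) > 1 for partners in right_to_left.values())
    if left_has_many and right_has_many:
        return "M:N"
    if left_has_many:
        return "1:N"
    if right_has_many:
        return "N:1"
    return "1:1"


# The DRIVE occurrences from the occurrence diagram: persons on the left, vehicles on the right.
drive = [("p1", "v1"), ("p2", "v1"), ("p2", "v2"), ("p3", "v2"), ("p4", "v2"), ("p4", "v3")]
drive_cardinality = observed_cardinality(drive)
```

For the DRIVE occurrences, p2 drives two vehicles and v2 is driven by three persons, so the sample is many-to-many.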
Cardinality Constraints
Cardinality constraints: the number of instances of one entity that can or must be associated with each instance of another entity.
Minimum cardinality: if zero, participation is optional; if one or more, participation is mandatory.
Maximum cardinality: the maximum number.
1 : 1 Relationship
[Figure: occurrence diagram for a 1:1 DRIVE relationship between PERSONS (P1 to P3) and VEHICLES (V1 to V3).]
N :1 and 1 : N Relationships
[Figure: two occurrence diagrams for DRIVE between PERSONS and VEHICLES, one with persons P1 to P3 and vehicles V1 to V4, the other with persons P1 to P3 and vehicles V1 and V2.]
N : M Relationship
[Figure: occurrence diagram for an M:N DRIVE relationship between PERSONS (p1 to p4) and VEHICLES (v1 to v5).]
[Figure: Mahasiswa (student) with attributes NIM, Nama, Alamat; relationship Punya (has) with No_KTP, NIM; KTP (identity card) with No_KTP, Masa Berlaku (validity period).]
Strong entity: exists independently and has a primary key (unique identifier). Drawn as a rectangle with a single line.
Weak entity: depends on a strong entity and cannot exist on its own; it has no primary key (unique identifier) of its own. Drawn as a rectangle with a double line.
Identifying relationship: links a strong entity to a weak entity. Drawn as a diamond with a double line.
Associative Entities
An associative entity is an entity that has attributes and is also a relationship linking entities together.
When should a relationship with attributes become an associative entity?
All relationships of the associative entity should be many.
The associative entity can have a meaning independent of the other entities.
The associative entity preferably has a unique identifier and should also have other attributes.
Ternary relationships should be converted to associative entities.
They may be included initially but are removed in later steps.
Identifiers must be unique.
[Figure: PERSONS (PERSON-ID, DATE-OF-BIRTH, DEPENDENTS*) linked through HAS OBTAINED (PERSON-ID, QUALIFICATION) to QUALIFICATIONS (QUALIFICATION).]
Good Relationships
Do not Include Derived Relationships
OBJECT-ORIENTED REPRESENTATION
Object-oriented representation is an emerging tool that helps overcome some of these limitations of E-R models. The advantage of object-oriented techniques is that they allow us to incorporate integrity rules ( as well as other database operations) directly in the database model or description, in the form of methods or procedures.
OBJECT
An object is a named representation of a real-world entity. For example, the following are objects: customers, products, accounts, students, and courses.
Two types of characteristics are associated with objects: attributes and operations (also called methods).
Attributes are properties of objects that are of interest to the organization. For example, some attributes associated with the object Bicycle are make, model, serial number, frame size, and color.
Operations (Methods)
Operations are actions that may be performed on objects and that may change the values of the object's attributes. For example, some operations performed on bicycle objects are lock bicycle, unlock bicycle, ride bicycle, and paint bicycle.
The combination of all the values of the attributes of a given object represents the state of that object. For example, the values of make, model, serial number, frame size, and color determine the state of a bicycle object at a given point in time. Some operations may change the state of the object. In a larger sense, a database represents the state of an organization; the state changes constantly through time as operations (such as transactions) occur.
CLASSES AND INSTANCES
A class is a logical grouping of objects that share the same attributes and operations. An instance is one member (or materialization) of that class. For example, bicycle is a class of objects; Mary's bicycle is an instance of that class. All of the attributes and operations for the class are described in one place (the class object). Object instances may contain attributes or operations that are peculiar to that instance but are not shared by other instances of the class.
INHERITANCE
An object class is often a subclass of a more general class (called a superclass). For example, bicycles, motorcycles, and automobiles are subclasses of a superclass called vehicles.
[Figure: class hierarchy with vehicles as the superclass and bicycle, motorcycle, and automobile as subclasses.]
Inheritance means that each subclass inherits the attributes and operations of the superclass to which it belongs. For example, the attributes make, model, serial number, and color would apply to the superclass vehicle. These attributes would be defined once for vehicle and would automatically be inherited by the bicycle, motorcycle, and automobile subclasses. However, each subclass often possesses additional attributes and operations that do not apply to the superclass. For example, the attribute frame size would apply to the bicycle subclass only, while the attribute engine displacement would apply to motorcycle and automobile (but not to bicycle).
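The vehicle example maps naturally onto classes in an object-oriented language. The following is a minimal sketch in Python; the class layout, method names, and sample values are assumptions chosen to mirror the attributes and operations named in the text.

```python
class Vehicle:
    """Superclass: attributes and operations shared by all vehicles."""

    def __init__(self, make, model, serial_number, color):
        self.make = make
        self.model = model
        self.serial_number = serial_number
        self.color = color

    def paint(self, new_color):
        """Operation that changes the object's state."""
        self.color = new_color


class Bicycle(Vehicle):
    """Subclass: inherits from Vehicle and adds frame_size plus lock/unlock."""

    def __init__(self, make, model, serial_number, color, frame_size):
        super().__init__(make, model, serial_number, color)
        self.frame_size = frame_size
        self.locked = False

    def lock(self):
        self.locked = True

    def unlock(self):
        self.locked = False


class Motorcycle(Vehicle):
    """Subclass: adds engine_displacement, which does not apply to bicycles."""

    def __init__(self, make, model, serial_number, color, engine_displacement):
        super().__init__(make, model, serial_number, color)
        self.engine_displacement = engine_displacement


# An instance of the Bicycle class (hypothetical values).
marys_bicycle = Bicycle("Acme", "Roadster", "SN-001", "blue", frame_size=54)
marys_bicycle.paint("red")  # inherited operation defined once on Vehicle
marys_bicycle.lock()        # operation defined only on Bicycle
```

`paint` is written once on `Vehicle` yet is available on every subclass, while `frame_size` and `engine_displacement` exist only where they apply.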
Object Oriented Database Management Systems
Motivation
Record-oriented data models are effective when applied to traditional transaction processing applications. However, traditional models are not suitable for advanced database applications such as the following:
CAD/CAM
CIM
Imaging and graphics
Geographic information systems
Scientific databases
Multimedia databases
Such applications have complex requirements:
Complex data structures (beyond simple character, number, date)
New data types for large, unstructured objects (video, audio, multimedia documents)
Reactive/active transactions
Long-duration activities and transactions
Collaborative activities (multiple authors)
Object-oriented data models and OODBMSs are designed to address these requirements.
Object Oriented Concepts
OODBMSs are based on object-oriented models. They add persistence to objects.
Some basics. An object:
Is an instance of a class
Contains both data and the methods (functions) that operate on the data
Hides its internal data structure (data hiding)
May only be accessed using the methods provided (encapsulation)
Objects in an OO program are transient. Objects in an OODBMS are persistent.
An OODBMS provides interfaces to one or more OO programming languages (Smalltalk, C++, Java). Transient and persistent objects are treated uniformly, so the programmer need not be concerned with the difference.
Every object has a unique object identifier (OID) that remains for the life of the object.
An object may be arbitrarily complex, capable of expressing and storing all related data in a single object (unlike the relational model).
Consider an object as a triple O(i, c, v), where i is the object identifier, c is a set of constructors, and v is a set of instance variables.
Instance variables are the object's internal structure that holds its state. A constructor is a basic data type used to make up instance variables: tuple, bag, set, list, etc.
Relationships are captured through OID references.
define type Department: tuple (
    dname: string,
    dnumber: integer,
    manager: tuple (manager: Employee, startdate: Date),
    locations: set(string),
    employees: set(Employees),
    projects: set(Projects)
);
define type Date: tuple ( year: month: day:
class Employee inherit Person {
    type tuple ( hiredate: Date, dept: Department )
    method number_of_employees: integer
}
Operations on Objects
Encapsulation: only a specific set of predefined operations or functions may be applied to a given object. Objects are assumed to be independent.
Each operation has two components:
1. Interface (or signature): the name of the operation and the arguments or parameters it expects. This is visible to other objects.
2. Method (body): the actual implementation of the operation. This is not visible to other objects.
Operations are invoked by passing messages from one object to the next: message (oid, method, arguments).
Thus the implementation of a particular method can be changed without affecting the objects that invoke the method. This gives us object-program independence.
Type and Class Hierarchy
Object type: a set of allowable values.
Object class: a collection of objects meaningful in an application.
New types and classes can be defined based on existing ones.
Inheritance: new types and classes inherit the structure and operations of the types and classes from which they are derived (see the Person and Employee example above). This fosters incremental design and reuse.
Disadvantages
Requires OO programming (in general)
Not much data is presently in object form; there is a huge investment in relational systems
Poor query and reporting tools
Limited concurrency control and transaction management
Unproven performance
Steep learning curve
Hierarchical Data Model
For the hierarchical model, we focus on IBM's Information Management System (IMS), developed in the 1960s. IMS/DB is the data management part of the IMS product and is based on the DL/I language. All data is held in hierarchical structures (trees).
A field is some item of data, like an attribute in the relational model. Fields are grouped into segments. A segment is a node in a tree (hierarchy). Segments also have a sequence field used for sorting.
A particular tree structure is called a Data Base Record. A Data Base Record is a collection of segments organized as a hierarchy.
Physical Data Base Record (PDBR): describes data as they exist on a storage device.
Logical Data Base Record (LDBR): describes data as they appear to application programs. An LDBR may represent multiple PDBRs.
DL/I has a data manipulation language that operates on LDBRs. Commands include:
GET UNIQUE: reads a segment.
GET NEXT: reads the next segment in order.
GET HOLD UNIQUE and GET HOLD NEXT: perform the get operation but hold the pointer on the segment pending a replace or delete.
REPLACE: modifies data in a segment.
DELETE: deletes a segment and all child segments in the data base record.
INSERT: creates a new segment.
Conceptual database design is the process of constructing a detailed architecture for a database that is independent of implementation details such as the target database management system, application programs or programming language, or any other physical consideration.
The primary inputs to conceptual database design are the structured requirements defined during requirements definition.
The conceptual data model should meet the following criteria:
1. Structural validity: consistency with the way the business defines and organizes data.
2. Simplicity: ease of understanding by both IS professionals and nontechnical users.
3. Nonredundancy: each piece of information is represented exactly once in the model.
4. Shareability: all qualified users can share the data in the conceptual model.
5. Extensibility: ability to evolve to support new requirements with minimal impact on existing users.
6. Integrity: consistency with the way the business uses and manages data.
STEPS IN CONCEPTUAL DATABASE DESIGN
A five-step process for conceptual database design:
1. Develop the conceptual data model. During this step, the E-R diagrams that were developed for each user view during requirements definition are combined to form a single, integrated conceptual data model. The database analyst must perform this step carefully to ensure that the resulting model (called the conceptual schema) is nonredundant and logically consistent.
2. Transform the data model to relations. During this step, E-R diagrams (or other logical data models) are converted to relations. This step might also be considered part of implementation design (rather than conceptual design). However, it is a preliminary to normalization, which we consider part of conceptual design.
3. Normalize the relations. During this step, the relations that were derived in the previous step are normalized.
4. Integrate the relations. View integration is the process of merging individual user views (in the form of E-R diagrams or 3NF relations) into an integrated data structure (or conceptual schema).
5. Develop action diagrams. Action diagrams are high-level definitions of data operations that maintain a database in a current and consistent state. Typical database operations add and delete records, modify records, and produce output in the form of reports and displays.
[Figure: the five steps of conceptual database design, Step 1 to Step 5.]
In combining these views, we include all of the relevant components (entities, relationships, attributes) but eliminate redundant components.
As a result, the conceptual data model is a non redundant superset of the individual E-R diagrams
TRANSFORMING E-R DIAGRAMS TO RELATIONS
Each entity in an E-R diagram is transformed to a relation. The primary key (or identifier) of the entity becomes the primary key of the corresponding relation, and the descriptors (non-key attributes) of the entity become non-key attributes of the relation.
For example, consider the CUSTOMER entity: CUST_NO is the primary key of the entity as well as of the CUSTOMER relation.
[Figure: CUSTOMER entity with attributes CUST_NO, NAME, ADDRESS, CITY-STATE-ZIP, DISCOUNT.]
CUSTOMER
CUST_NO  NAME                 ADDRESS         CITY-STATE-ZIP         DISCOUNT
1273     CONTEMPORARY DESIGN  123 OAK ST      AUSTIN, TX 38405       5%
6390     CASUAL CORNER        18 HOOSIER DR.  BLOOMINGTON, IN 45821  3%
RELATIONSHIP
1 : N relationship
A one-to-many (1:N) relationship in an E-R diagram is represented by placing a foreign key in the relation that represents the entity on the many side of the relationship. This foreign key is the primary key of the entity on the one side of the relationship.
[Figure: CUSTOMER (CUST_NO, NAME, ADDRESS, CITY-STATE-ZIP, DISCOUNT) on the one side and ORDER (ORDER_NO, ORDER_DATE, PROMISED_DATE) on the many side of the PLACED BY relationship.]
CUSTOMER
CUST_NO  NAME                 ADDRESS         CITY-STATE-ZIP         DISCOUNT
1273     CONTEMPORARY DESIGN  123 OAK ST      AUSTIN, TX 38405       5%
6390     CASUAL CORNER        18 HOOSIER DR.  BLOOMINGTON, IN 45821  3%
ORDER
ORDER_NO  ORDER_DATE  PROMISED_DATE  CUST_NO
57194     3/15/2003   3/28/2003      6390
63725     3/17/2003   4/01/2003      1273
80149     3/14/2003   3/24/2003      6390
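The CUSTOMER and ORDER relations above can be tried out with Python's built-in sqlite3 module. This is a sketch under a few assumptions: the table is named `customer_order` because ORDER is a reserved word in SQL, dates are rewritten in ISO format, and the discount is stored as a whole-number percentage.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
conn.executescript("""
CREATE TABLE customer (
    cust_no        INTEGER PRIMARY KEY,
    name           TEXT,
    address        TEXT,
    city_state_zip TEXT,
    discount_pct   INTEGER
);
CREATE TABLE customer_order (
    order_no      INTEGER PRIMARY KEY,
    order_date    TEXT,
    promised_date TEXT,
    -- Foreign key on the many side: the primary key of the one side.
    cust_no       INTEGER NOT NULL REFERENCES customer(cust_no)
);
""")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?, ?)", [
    (1273, "CONTEMPORARY DESIGN", "123 OAK ST", "AUSTIN, TX 38405", 5),
    (6390, "CASUAL CORNER", "18 HOOSIER DR.", "BLOOMINGTON, IN 45821", 3),
])
conn.executemany("INSERT INTO customer_order VALUES (?, ?, ?, ?)", [
    (57194, "2003-03-15", "2003-03-28", 6390),
    (63725, "2003-03-17", "2003-04-01", 1273),
    (80149, "2003-03-14", "2003-03-24", 6390),
])
conn.commit()

# Follow the foreign key to list each customer's orders.
orders_by_customer = {}
for name, order_no in conn.execute("""
    SELECT c.name, o.order_no
    FROM customer AS c JOIN customer_order AS o ON o.cust_no = c.cust_no
    ORDER BY o.order_no
"""):
    orders_by_customer.setdefault(name, []).append(order_no)
```

One customer row can be referenced by many order rows, but each order row carries exactly one `cust_no`, which is how the 1:N rule is expressed in relations.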
M : N relationship
For this relationship, we create a new relation (thus, there are three relations: one for each of the two entities and one for the relationship). The key of this relation is a composite key consisting of the primary key of each of the two entities in the relationship.
[Figure: ORDER (ORDER_NO, ORDER_DATE, PROMISED_DATE) and PRODUCT (PRODUCT_NO, DESCRIPTION, ROOM, other attributes) joined by the M:N relationship REQUESTED ON, which carries QUANTITY ORDERED.]
ORDER
ORDER_NO  ORDER_DATE  PROMISED_DATE
61384     2/17/2003   3/01/2003
62009     2/13/2003   3/27/2003
62807     2/15/2003   3/01/2003
ORDER_LINE
ORDER_NO  PRODUCT_NO  QUANTITY_ORDERED
61384     M128        2
61384     A261        1
PRODUCT
PRODUCT_NO  DESCRIPTION  (OTHER ATTRIBUTES)
M128        BOOKCASE     ...
A261        WALL UNIT    ...
R149        CABINET      ...
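The three relations for the M:N case can be sketched in the same way. The table name `customer_order` (ORDER is reserved in SQL), the ISO date format, and the omission of the other PRODUCT attributes are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
conn.executescript("""
CREATE TABLE customer_order (
    order_no      INTEGER PRIMARY KEY,
    order_date    TEXT,
    promised_date TEXT
);
CREATE TABLE product (
    product_no  TEXT PRIMARY KEY,
    description TEXT
);
CREATE TABLE order_line (
    order_no         INTEGER NOT NULL REFERENCES customer_order(order_no),
    product_no       TEXT    NOT NULL REFERENCES product(product_no),
    quantity_ordered INTEGER NOT NULL,
    PRIMARY KEY (order_no, product_no)  -- composite key from both entities
);
""")
conn.executemany("INSERT INTO customer_order VALUES (?, ?, ?)", [
    (61384, "2003-02-17", "2003-03-01"),
    (62009, "2003-02-13", "2003-03-27"),
    (62807, "2003-02-15", "2003-03-01"),
])
conn.executemany("INSERT INTO product VALUES (?, ?)", [
    ("M128", "BOOKCASE"), ("A261", "WALL UNIT"), ("R149", "CABINET"),
])
conn.executemany("INSERT INTO order_line VALUES (?, ?, ?)", [
    (61384, "M128", 2), (61384, "A261", 1),
])
conn.commit()

# Products requested on order 61384, found through the relationship relation.
products_on_order = [row[0] for row in conn.execute("""
    SELECT p.description
    FROM order_line AS ol JOIN product AS p ON p.product_no = ol.product_no
    WHERE ol.order_no = ?
    ORDER BY p.description
""", (61384,))]

# The composite key allows each (order, product) pair only once.
try:
    conn.execute("INSERT INTO order_line VALUES (61384, 'M128', 5)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The attribute of the relationship itself (the quantity ordered) lives in `order_line`, since it belongs to neither the order nor the product alone.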
ISA relationship (class-subclass)
A strategy that a database designer can use to represent an ISA relationship using relations is:
1. Create a separate relation for the class and for each of the subclasses.
2. The table (relation) for the class consists only of the columns that are common to all of the subclasses, plus a subtype identification column.
3. The table for each subclass contains only its primary key and the columns unique to that subclass.
4. The primary keys of the class and of each of the subclasses are from the same domain.
[Figure: PROPERTY class (STREET_ADDR, CITY-STATE-ZIP, NO-ROOMS, TYPICAL-RENT) with subclasses BEACH_PROPERTY (STREET_ADDR, CITY-STATE-ZIP, BLOCKS-TO-BEACH) and MOUNTAIN_PROPERTY (STREET_ADDR, CITY-STATE-ZIP, SKIING).]
PROPERTY
STREET_ADDR    CITY-STATE-ZIP      NO-ROOMS  TYPICAL-RENT  SUBTYPE
120 SURF DR.   HONOLULU, HI 99987  3         500           BEACH
100 MOGUL DR.  JACKSON, WY 89204             250           MOUNTAIN
BEACH
STREET_ADDR   CITY-STATE-ZIP      BLOCKS-TO-BEACH
120 SURF DR.  HONOLULU, HI 99987  2
MOUNTAIN
STREET_ADDR    CITY-STATE-ZIP     SKIING
100 MOGUL DR.  JACKSON, WY 89204  4
Normalization
What you'll learn:
Normalization
De-Normalization
All-in-one example of normalization.
Normalization
Relations can fall into one or more categories (or classes) called normal forms.
Normal form: a class of relations free from a certain set of modification anomalies.
Normal forms are given names such as:
First normal form (1NF)
Second normal form (2NF)
Third normal form (3NF)
Boyce-Codd normal form (BCNF)
Fourth normal form (4NF)
Fifth normal form (5NF)
Domain-Key normal form (DK/NF)
These forms are cumulative. A relation in third normal form is also in 2NF and 1NF.
First Normal Form (1NF)
A relation is in first normal form if it meets the definition of a relation:
1. Each column (attribute) value must be a single value only.
2. All values for a given column (attribute) must be of the same type.
3. Each column (attribute) name must be unique.
4. The order of columns is insignificant.
5. No two rows (tuples) in a relation can be identical.
6. The order of the rows (tuples) is insignificant.
If you have a key defined for the relation, then you can meet the unique row requirement.
Example relation in 1NF: STOCKS (Company, Symbol, Date, Close_Price)
Company   Symbol  Date      Close Price
IBM       IBM     01/05/94  101.00
IBM       IBM     01/06/94  100.50
IBM       IBM     01/07/94  102.00
Netscape  NETS    01/05/94  33.00
Netscape  NETS    01/06/94  112.00
Second Normal Form (2NF)
A relation is in second normal form (2NF) if all of its non-key attributes are dependent on all of the key. Relations that have a single attribute for a key are automatically in 2NF. This is one reason why we often use artificial identifiers as keys.
The following example relation is not in 2NF:
STOCKS (Company, Symbol, Headquarters, Date, Close_Price)
Company   Symbol  Headquarters   Date      Close Price
IBM       IBM     Armonk, NY     01/05/94  101.00
IBM       IBM     Armonk, NY     01/06/94  100.50
IBM       IBM     Armonk, NY     01/07/94  102.00
Netscape  NETS    Sunnyvale, CA  01/05/94  33.00
Netscape  NETS    Sunnyvale, CA  01/06/94  112.00
In this example, Close Price is dependent on (Company, Date) and on (Symbol, Date):
Company, Date -> Close Price
Symbol, Date -> Close Price
Company -> Symbol, Headquarters
Symbol -> Company, Headquarters
Consider that Company, Date -> Close Price, so we might use (Company, Date) as our key. However, Company -> Headquarters: Headquarters depends on only part of the key, which violates the rule for 2NF. Also, consider the insertion and deletion anomalies.
One solution: split this up into two relations:
COMPANY (Company, Symbol, Headquarters)
STOCKS (Symbol, Date, Close_Price)
COMPANY
Company   Symbol  Headquarters
IBM       IBM     Armonk, NY
Netscape  NETS    Sunnyvale, CA
STOCKS
Symbol  Date      Close Price
IBM     01/05/94  101.00
IBM     01/06/94  100.50
IBM     01/07/94  102.00
NETS    01/05/94  33.00
NETS    01/06/94  112.00
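The split into COMPANY and STOCKS can be checked with a few lines of Python: project the original rows onto each new relation, then join them back on Symbol and confirm nothing was lost. The tuple layout and variable names are assumptions made for this sketch.

```python
# Original relation: (Company, Symbol, Headquarters, Date, Close_Price)
stocks_unnormalized = [
    ("IBM", "IBM", "Armonk, NY", "01/05/94", 101.00),
    ("IBM", "IBM", "Armonk, NY", "01/06/94", 100.50),
    ("IBM", "IBM", "Armonk, NY", "01/07/94", 102.00),
    ("Netscape", "NETS", "Sunnyvale, CA", "01/05/94", 33.00),
    ("Netscape", "NETS", "Sunnyvale, CA", "01/06/94", 112.00),
]

# Projection onto COMPANY (Company, Symbol, Headquarters); duplicates collapse.
company = sorted({(c, s, hq) for c, s, hq, _, _ in stocks_unnormalized})

# Projection onto STOCKS (Symbol, Date, Close_Price).
stocks = sorted({(s, d, p) for _, s, _, d, p in stocks_unnormalized})

# Natural join on Symbol reconstructs the original rows.
rejoined = sorted(
    (c, s, hq, d, p)
    for c, s, hq in company
    for s2, d, p in stocks
    if s == s2
)
lossless = rejoined == sorted(stocks_unnormalized)
```

After the split, each headquarters is recorded once per company instead of once per trading day, so a company can exist without any price rows and deleting price rows no longer deletes company facts.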
Third Normal Form (3NF)
A relation is in third normal form (3NF) if it is in second normal form and it contains no transitive dependencies.
Consider a relation R containing attributes A, B, and C. If A -> B and B -> C, then A -> C.
Transitive dependency: three attributes with the above dependencies.
Example, at CUNY:
Course_Code -> Course_Num, Section
Course_Num, Section -> Classroom, Professor
Example, at Rutgers:
Course_Index_Num -> Course_Num, Section
Course_Num, Section -> Classroom, Professor
Example:
Company  County  Tax Rate
IBM      Putnam  28%
AT&T     Bergen  26%
Company -> County and County -> Tax Rate, thus Company -> Tax Rate.
What happens if we remove AT&T? We lose information about two different themes.
Split this up into two relations:
Company  County
IBM      Putnam
AT&T     Bergen
County  Tax Rate
Putnam  28%
Bergen  26%
Boyce-Codd Normal Form (BCNF)
A relation is in BCNF if every determinant is a candidate key. Recall that not all determinants are keys. Those determinants that are keys we initially call candidate keys. Eventually, we select a single candidate key to be the primary key for the relation.
Consider the following example:
Funds consist of one or more investment types.
Funds are managed by one or more managers.
Investment types can have one or more managers.
Managers only manage one type of investment.
FundID  InvestmentType   Manager
99      Common Stock     Smith
99      Municipal Bonds  Jones
33      Common Stock     Green
22      Growth Stocks    Brown
11      Common Stock     Smith
FundID, InvestmentType -> Manager
FundID, Manager -> InvestmentType
Manager -> InvestmentType
In this case, the combination of FundID and InvestmentType forms a candidate key because we can use (FundID, InvestmentType) to uniquely identify a tuple in the relation. Similarly, the combination of FundID and Manager also forms a candidate key because we can use (FundID, Manager) to uniquely identify a tuple. Manager by itself is not a candidate key because we cannot use Manager alone to uniquely identify a tuple in the relation.
Is the relation R(FundID, InvestmentType, Manager) in 1NF, 2NF, or 3NF? Given that we pick (FundID, InvestmentType) as the primary key:
1NF for sure.
2NF because the only non-key attribute (Manager) is dependent on all of the key.
3NF because there are no transitive dependencies.
Consider what happens if we delete the tuple with FundID 22: we lose the fact that Brown manages the InvestmentType "Growth Stocks".
The following are the steps to normalize a relation into BCNF:
1. List all of the determinants.
2. See if each determinant can act as a key (candidate keys).
3. For any determinant that is not a candidate key, create a new relation from the functional dependency. Retain the determinant in the original relation.
For our example, Rorig(FundID, InvestmentType, Manager):
1. The determinants are: (FundID, InvestmentType); (FundID, Manager); Manager.
2. Which determinants can act as keys? (FundID, InvestmentType): YES. (FundID, Manager): YES. Manager: NO.
3. Create a new relation from the functional dependency: Rnew(Manager, InvestmentType) and Rorig(FundID, Manager).
In this last step, we have retained the determinant Manager in the original relation Rorig.
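The three steps can be traced on the sample fund data with a short Python sketch. The helper names `fd_holds` and `project` are hypothetical. Note the limitation: checking sample rows can only show that a dependency is not contradicted by the data; whether it truly holds is a statement about the business rules.

```python
ATTRS = ("FundID", "InvestmentType", "Manager")
funds = [dict(zip(ATTRS, row)) for row in [
    (99, "Common Stock", "Smith"),
    (99, "Municipal Bonds", "Jones"),
    (33, "Common Stock", "Green"),
    (22, "Growth Stocks", "Brown"),
    (11, "Common Stock", "Smith"),
]]


def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs (lhs -> rhs on this sample)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, attrs):
    """Projection onto attrs with duplicate tuples removed."""
    return sorted({tuple(row[a] for a in attrs) for row in rows})


# Step 1: the determinants.
determinants = [("FundID", "InvestmentType"), ("FundID", "Manager"), ("Manager",)]

# Step 2: a determinant can act as a key if it determines every attribute.
acts_as_key = {d: fd_holds(funds, d, ATTRS) for d in determinants}

# Step 3: Manager is a determinant but not a key, so split on Manager -> InvestmentType.
manager_determines_type = fd_holds(funds, ("Manager",), ("InvestmentType",))
r_new = project(funds, ("Manager", "InvestmentType"))
r_orig = project(funds, ("FundID", "Manager"))

# Joining the two relations on Manager reconstructs the original rows.
type_of = dict(r_new)
rejoined = sorted((fund_id, type_of[manager], manager) for fund_id, manager in r_orig)
lossless = rejoined == project(funds, ATTRS)
```

With the split in place, the fact that a manager handles a given investment type is stored in `r_new` and survives even if that manager's only fund row is deleted.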
UBL Fakultas Teknologi Informasi PBD - 103
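The three BCNF steps can be sketched in Python. This is a minimal illustration, not a general normalization tool: the helper names `closure` and `bcnf_violations` are made up for this sketch, and a determinant is treated as "able to act as a key" when its attribute closure covers the whole relation.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Steps 1 and 2: list the FDs whose determinant cannot act as a key."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(relation) <= closure(lhs, fds)]


relation = ("FundID", "InvestmentType", "Manager")
fds = [
    (("FundID", "InvestmentType"), ("Manager",)),
    (("FundID", "Manager"), ("InvestmentType",)),
    (("Manager",), ("InvestmentType",)),
]

violations = bcnf_violations(relation, fds)
print(violations)

# Step 3: move the offending FD into a new relation and keep its
# determinant (Manager) in the original relation.
lhs, rhs = violations[0]
r_new = lhs + rhs
r_orig = tuple(a for a in relation if a not in rhs)
print("Rnew", r_new, "Rorig", r_orig)
```

Only `Manager -> InvestmentType` is reported, which matches the YES/YES/NO answers in step 2.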
Fourth Normal Form (4NF)
A relation is in fourth normal form if it is in BCNF and it contains no multivalued dependencies.
Multivalued Dependency: A type of dependency where the determinant can determine more than one value. More formally, there are 3 criteria:
1. There must be at least 3 attributes in the relation. Call them A, B, and C, for example.
2. Given A, one can determine multiple values of B. Given A, one can determine multiple values of C.
3. B and C are independent of one another.
Example: Student has one or more majors. Student participates in one or more activities.
StudentID  Major       Activities
---------  ----------  ----------
100        CIS         Baseball
100        CIS         Volleyball
100        Accounting  Baseball
100        Accounting  Volleyball
200        Marketing   Swimming

StudentID ->-> Major
StudentID ->-> Activities
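Because Major and Activities are independent of one another, the relation can be split into two projections and rebuilt by joining them on StudentID. The sketch below checks this on the example rows; the names `student_major` and `student_activity` are hypothetical labels for the two projections.

```python
# Rows of the example relation (StudentID, Major, Activities).
student = {
    (100, "CIS", "Baseball"),
    (100, "CIS", "Volleyball"),
    (100, "Accounting", "Baseball"),
    (100, "Accounting", "Volleyball"),
    (200, "Marketing", "Swimming"),
}

# Project onto (StudentID, Major) and (StudentID, Activities).
student_major = {(sid, major) for sid, major, _ in student}
student_activity = {(sid, activity) for sid, _, activity in student}

# Natural join of the two projections on StudentID.
rejoined = {
    (sid, major, activity)
    for sid, major in student_major
    for sid2, activity in student_activity
    if sid == sid2
}

lossless = rejoined == student
print(len(student_major), len(student_activity), lossless)
```

The five original tuples shrink to three rows in each projection, and the join reproduces exactly the original relation, so nothing is lost by the split.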
Stock Fund           Bond Fund
-------------------  ----------------------------------------------
Janus Fund           Municipal Bonds
Janus Fund           Dreyfus Short-Intermediate Municipal Bond Fund
Scudder Global Fund  Municipal Bonds
Scudder Global Fund  Dreyfus Short-Intermediate Municipal Bond Fund
Kaufmann Fund        T. Rowe Price Emerging Markets Bond Fund
A few characteristics:
1. No regular functional dependencies.
2. All three attributes taken together form the key.
3. Latter two attributes are independent of one another.
4. Insertion anomaly: Cannot add a stock fund without adding a bond fund (NULL value). Must always maintain the combinations to preserve the meaning.
Stock Fund and Bond Fund form a multivalued dependency on Portfolio ID.
PortfolioID ->-> Stock Fund
PortfolioID ->-> Bond Fund
Bond Fund
----------------------------------------------
Municipal Bonds
Dreyfus Short-Intermediate Municipal Bond Fund
T. Rowe Price Emerging Markets Bond Fund
Fifth Normal Form (5NF) There are certain conditions under which after decomposing a relation, it cannot be reassembled back into its original form.
Domain Key Normal Form (DK/NF)
A relation is in DK/NF if every constraint on the relation is a logical consequence of the definition of keys and domains.
Constraint: A rule governing static values of an attribute such that we can determine whether the constraint is true or false. Examples:
1. Functional dependencies
2. Multivalued dependencies
3. Inter-relation rules
4. Intra-relation rules
However, this does not include time-dependent constraints.
Key: Unique identifier of a tuple.
Domain: The physical (data type, size, NULL values) and semantic (logical) description of what values an attribute can hold.
There is no known algorithm for converting a relation directly into DK/NF.
De-Normalization
Consider the following relation:
CUSTOMER (CustomerID, Name, Address, City, State, Zip)
This relation is not in DK/NF because it contains a functional dependency not implied by the key:
Zip -> City, State
We can normalize this into DK/NF by splitting the CUSTOMER relation into two:
CUSTOMER (CustomerID, Name, Address, Zip)
CODES (Zip, City, State)
We may pay a performance penalty: each customer address lookup requires that we look in two relations (tables). In such cases, we may de-normalize the relations to achieve a performance improvement.
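The trade-off can be tried out with Python's built-in `sqlite3` module. This is only a sketch: the single customer row, the zip code and the table name `customer_flat` are invented for illustration, and SQLite stands in for whatever DBMS is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE codes (zip TEXT PRIMARY KEY, city TEXT, state TEXT);
CREATE TABLE customer (
    customer_id TEXT PRIMARY KEY, name TEXT, address TEXT,
    zip TEXT REFERENCES codes (zip));
INSERT INTO codes VALUES ('00001', 'Springfield', 'NJ');
INSERT INTO customer VALUES ('C1', 'Pat Doe', '1 Main St.', '00001');

-- De-normalized copy: city and state are stored again with every customer.
CREATE TABLE customer_flat AS
    SELECT c.customer_id, c.name, c.address, z.city, z.state, c.zip
    FROM customer c JOIN codes z ON c.zip = z.zip;
""")

# Normalized design: an address lookup has to join two tables.
normalized = conn.execute("""
    SELECT c.name, c.address, z.city, z.state, c.zip
    FROM customer c JOIN codes z ON c.zip = z.zip
    WHERE c.customer_id = 'C1'""").fetchone()

# De-normalized design: the same lookup reads a single table.
flat = conn.execute("""
    SELECT name, address, city, state, zip
    FROM customer_flat WHERE customer_id = 'C1'""").fetchone()

print(normalized == flat)
```

Both lookups return the same address; the de-normalized table avoids the join at the price of storing City and State redundantly.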
All-in-One Example
Many of you asked for a "complete" example that would run through all of the normal forms from beginning to end using the same tables. This is tough to do, but here is an attempt:
Example relation: EMPLOYEE (Name, Project, Task, Office, Floor, Phone)
Note: The key is (Name, Project, Task).
Example data:

Name  Project  Task  Office  Floor  Phone
----  -------  ----  ------  -----  -----
Bill  100X     T1    400     4      1400
Bill  100X     T2    400     4      1400
Bill  200Y     T1    400     4      1400
Bill  200Y     T2    400     4      1400
Sue   100X     T33   442     4      1442
Sue   200Y     T33   442     4      1442
Sue   300Z     T33   442     4      1442
Ed    100X     T2    588     5      1588
Name is the employee's name.
Project is the project they are working on. Bill is working on two different projects, Sue is working on 3.
Task is the current task being worked on. Bill is now working on Tasks T1 and T2. Note that Tasks are independent of the project. Examples of a task might be faxing a memo or holding a meeting.
Office is the office number for the employee. Bill works in office number 400.
Floor is the floor on which the office is located.
Phone is the phone extension. Note this is associated with the phone in the given office.
Question: First Normal Form
Assume the key is Name, Project, Task. Is EMPLOYEE in 1NF?
Second Normal Form
List all of the functional dependencies for EMPLOYEE. Are all of the non-key attributes dependent on all of the key?
Split into two relations EMPLOYEE_PROJECT_TASK and EMPLOYEE_OFFICE_PHONE.

EMPLOYEE_PROJECT_TASK (Name, Project, Task)
Name  Project  Task
----  -------  ----
Bill  100X     T1
Bill  100X     T2
Bill  200Y     T1
Bill  200Y     T2
Sue   100X     T33
Sue   200Y     T33
Sue   300Z     T33
Ed    100X     T2

EMPLOYEE_OFFICE_PHONE (Name, Office, Floor, Phone)
Name  Office  Floor  Phone
----  ------  -----  -----
Bill  400     4      1400
Sue   442     4      1442
Ed    588     5      1588
Third Normal Form
Assume each office has exactly one phone number. Are there any transitive dependencies? Where are the modification anomalies in EMPLOYEE_OFFICE_PHONE?
Split EMPLOYEE_OFFICE_PHONE.
EMPLOYEE_PROJECT_TASK (Name, Project, Task)

EMPLOYEE_OFFICE (Name, Office, Floor)
Name  Office  Floor
----  ------  -----
Bill  400     4
Sue   442     4
Ed    588     5

EMPLOYEE_PHONE (Office, Phone)
Office  Phone
------  -----
400     1400
442     1442
588     1588
Boyce-Codd Normal Form List all of the functional dependencies for EMPLOYEE_PROJECT_TASK, EMPLOYEE_OFFICE and EMPLOYEE_PHONE. Look at the determinants. Are all determinants candidate keys ?
Fourth Normal Form
Are there any multivalued dependencies? What are the modification anomalies?
Split EMPLOYEE_PROJECT_TASK.

EMPLOYEE_PROJECT (Name, Project)
Name  Project
----  -------
Bill  100X
Bill  200Y
Sue   100X
Sue   200Y
Sue   300Z
Ed    100X

Name  Task
----  ----
Bill  T1
Bill  T2
Sue   T33
Ed    T2

Name  Office  Floor
----  ------  -----
Bill  400     4
Sue   442     4
Ed    588     5

R4 (Office, Phone)
At each step of the process, we did the following:
1. Write out the relation.
2. (Optionally) write out some example data.
3. Write out all of the functional dependencies.
4. Starting with 1NF, go through each normal form and state why the relation is in the given normal form.
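When writing out functional dependencies (step 3), the example data can serve as a quick sanity check. The sketch below uses a hypothetical helper, `fd_holds`, that tests whether a candidate dependency is contradicted by the EMPLOYEE rows. Sample data can only refute a dependency; it can never prove that one holds for all possible data.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


columns = ("Name", "Project", "Task", "Office", "Floor", "Phone")
data = [
    ("Bill", "100X", "T1", 400, 4, 1400),
    ("Bill", "100X", "T2", 400, 4, 1400),
    ("Bill", "200Y", "T1", 400, 4, 1400),
    ("Bill", "200Y", "T2", 400, 4, 1400),
    ("Sue", "100X", "T33", 442, 4, 1442),
    ("Sue", "200Y", "T33", 442, 4, 1442),
    ("Sue", "300Z", "T33", 442, 4, 1442),
    ("Ed", "100X", "T2", 588, 5, 1588),
]
rows = [dict(zip(columns, r)) for r in data]

print(fd_holds(rows, ["Name"], ["Office", "Floor", "Phone"]))  # partial dependency on the key
print(fd_holds(rows, ["Office"], ["Phone"]))                   # basis of the 3NF split
print(fd_holds(rows, ["Name"], ["Project"]))                   # refuted: Bill has two projects
```

The first two checks are consistent with the 2NF and 3NF splits shown earlier, while Name -> Project is refuted by the data.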
Another short example Consider the following example of normalization for a CUSTOMER relation.
Relation Name
CUSTOMER (CustomerID, Name, Street, City, State, Zip, Phone)
Example Data

CustomerID  Name        Street         City           State  Zip  Phone
----------  ----------  -------------  -------------  -----  ---  -----
C101        Bill Smith  123 First St.  New Brunswick  NJ
C102        Mary Green  11 Birch St.   Old Bridge     NJ

Functional Dependencies
CustomerID -> Name, Street, City, State, Zip, Phone
Zip -> City, State
Normalization
1NF: Meets the definition of a relation.
2NF: All non-key attributes are dependent on all of the key.
3NF: Not satisfied. CustomerID -> Zip together with Zip -> City, State is a transitive dependency.
BCNF: Relation CUSTOMER is not in BCNF either, because one of the determinants, Zip, cannot act as a key for the entire relation.
Solution: Split CUSTOMER into two relations:
CUSTOMER (CustomerID, Name, Street, Zip, Phone)
ZIPCODES (Zip, City, State)
Check both CUSTOMER and ZIPCODES to ensure they are both in 1NF up to BCNF.
4NF: There are no multivalued dependencies in either CUSTOMER or ZIPCODES.
As a final step, consider de-normalization.
The TIME data type can be written with additional positions for fractions of a second. For example:
HH:MM:SS:dd
Data type                  Storage size              Range
-------------------------  ------------------------  -----
Single                     4 bytes                   -3.402823E38 to -1.401298E-45 for negative values; 1.401298E-45 to 3.402823E38 for positive values.
Double                     8 bytes                   -1.79769313486232E308 to -4.94065645841247E-324 for negative values; 4.94065645841247E-324 to 1.79769313486232E308 for positive values.
Currency                   8 bytes                   -922,337,203,685,477.5808 to 922,337,203,685,477.5807.
Date                       8 bytes
Object                     4 bytes
String (variable-length)   10 bytes + string length
String (fixed-length)      Length of string
Variant (with numbers)     16 bytes
Variant (with characters)  22 bytes + string length

Note: There is no need to memorize this table; it is for reference only.
Data Definition Language
DDL is used to define the schema of the database.
Create a database schema
Create a domain
Create, Drop or Alter a table
Create or Drop an Index
Define Integrity constraints
Define access privileges to users
Define access privileges on objects
The SQL2 specification supports the creation of multiple schemas per database, each with a distinct owner and authorized users.
Creating Domains
The goal is to create a standard definition for a data type and size and to give it a name. This is useful for standardization across all tables.
CREATE DOMAIN d_last_name AS VARCHAR(30);
CREATE DOMAIN d_gender AS VARCHAR(1);
CREATE DOMAIN d_salary AS NUMBER(12,2);
CREATE DOMAIN d_soc_sec AS VARCHAR(11);
Creating a Schema
Creating a Table:
CREATE TABLE employee (
    Last_Name      d_last_name  NOT NULL,
    First_name     VARCHAR(18)  NOT NULL,
    Soc_Sec        d_soc_sec    NOT NULL,
    Date_of_Birth  DATE,
    Salary         d_salary
);
CREATE TABLE dependant (
    Last_Name         d_last_name  NOT NULL,
    First_name        VARCHAR(18)  NOT NULL,
    Soc_Sec           d_soc_sec    NOT NULL,
    Date_of_Birth     DATE,
    Employee_Soc_Sec  d_soc_sec    NOT NULL
);
Index File
CREATE INDEX order_index
    ON order_header (order_number ASC);
Example from MS Access:
CREATE TABLE employee (
    FirstName TEXT,
    LastName TEXT,
    ssn INTEGER CONSTRAINT ssnConstraint PRIMARY KEY
);
CREATE INDEX employee_index
    ON employee (ssn);
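To see whether an index is actually used, most systems can display a query plan. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in DBMS (an assumption; the slides show Oracle-style and MS Access syntax), with invented sample rows. `EXPLAIN QUERY PLAN` is SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_header (
    order_number INTEGER NOT NULL,
    order_date   TEXT,
    sales_person TEXT);
CREATE INDEX order_index ON order_header (order_number ASC);
""")

# Invented sample rows so the table is not empty.
conn.executemany(
    "INSERT INTO order_header VALUES (?, ?, ?)",
    [(n, "2024-01-01", "Smith") for n in range(1, 1001)],
)

# Ask SQLite how it would execute a lookup by order_number.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM order_header WHERE order_number = 500"
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)
```

The printed plan names `order_index`, showing the lookup goes through the index instead of scanning every row. The exact wording of the plan differs between SQLite versions.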
Examples of ON DELETE and ON UPDATE
CREATE TABLE order_items (
    order_number  NUMBER(10,0)  NOT NULL,
    line_item     NUMBER(4,0)   NOT NULL,
    part_number   VARCHAR(12)   NOT NULL,
    quantity      NUMBER(4,0),
    PRIMARY KEY (order_number, line_item),
    FOREIGN KEY (order_number)
        REFERENCES order_header (order_number)
        ON DELETE SET DEFAULT
        ON UPDATE CASCADE,
    FOREIGN KEY (part_number)
        REFERENCES parts (part_number)
);
CREATE TABLE order_header (
    order_number     NUMBER(10,0)  NOT NULL,
    order_date       DATE,
    sales_person     VARCHAR(25),
    bill_to          VARCHAR(35),
    bill_to_address  VARCHAR(45),
    bill_to_city     VARCHAR(20),
    bill_to_state    VARCHAR(2),
    bill_to_zip      VARCHAR(10),
    CONSTRAINT order_header_pk PRIMARY KEY (order_number)
);
CREATE TABLE order_items (
    order_number  NUMBER(10,0)  NOT NULL,
    line_item     NUMBER(4,0)   NOT NULL,
    part_number   VARCHAR(12)   NOT NULL,
    quantity      NUMBER(4,0),
    CONSTRAINT order_items_pk PRIMARY KEY (order_number, line_item),
    CONSTRAINT order_items_fk1 FOREIGN KEY (order_number)
        REFERENCES order_header (order_number)
        ON DELETE SET DEFAULT
        ON UPDATE CASCADE,
    CONSTRAINT order_items_fk2 FOREIGN KEY (part_number)
        REFERENCES parts (part_number)
        ON DELETE SET DEFAULT
        ON UPDATE CASCADE
);
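The effect of ON UPDATE CASCADE and ON DELETE SET DEFAULT can be observed with a small SQLite experiment. Several things here are assumptions made for the sketch: SQLite replaces the Oracle-style types, the tables are trimmed to a few columns, `order_number` gets `DEFAULT 0`, and a placeholder order 0 exists so that the default value still references a valid parent row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE order_header (
    order_number INTEGER PRIMARY KEY,
    order_date   TEXT);
CREATE TABLE order_items (
    order_number INTEGER NOT NULL DEFAULT 0,
    line_item    INTEGER NOT NULL,
    quantity     INTEGER,
    PRIMARY KEY (order_number, line_item),
    FOREIGN KEY (order_number)
        REFERENCES order_header (order_number)
        ON DELETE SET DEFAULT
        ON UPDATE CASCADE);
INSERT INTO order_header VALUES (0, NULL);          -- placeholder parent for the default
INSERT INTO order_header VALUES (100, '2024-01-15');
INSERT INTO order_items VALUES (100, 1, 5);
""")

# ON UPDATE CASCADE: renumbering the order also renumbers its items.
conn.execute("UPDATE order_header SET order_number = 200 WHERE order_number = 100")
after_update = conn.execute("SELECT * FROM order_items").fetchall()

# ON DELETE SET DEFAULT: deleting the order resets the items to order 0.
conn.execute("DELETE FROM order_header WHERE order_number = 200")
after_delete = conn.execute("SELECT * FROM order_items").fetchall()

print(after_update, after_delete)
```

The item row follows the parent from order 100 to order 200, then falls back to the default order 0 once the parent is deleted.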
Dropping attributes (not widely implemented):
ALTER TABLE student DROP home_phone;
Quotation marks are placed around data values depending on the data type and on the specific RDBMS used:
Text data types:  TEXT: either " or '      VARCHAR: '   VARCHAR: '   CHAR and VARCHAR: "
Dates:            DATETIME: either " or '  DATE: '      DATE: '      DATE: "
Some example queries:
SELECT employee_id, last_name, first_name
FROM employees
WHERE last_name = "Smith"
ORDER BY first_name DESC

SELECT employee_id, last_name, first_name
FROM employees
WHERE salary > 40000
ORDER BY last_name, first_name DESC
Relational operators each have implementations in SQL.
SELECT employee_id, last_name, first_name
FROM employee
WHERE salary > 40000

SELECT AVG(salary)
FROM employee
WHERE state = 'NJ'

SELECT *
FROM employee
WHERE last_name = 'Smith' AND state = 'NY'
Find the name of the student with the highest grade, an example of a subquery:
SELECT name, grade
FROM students
WHERE grade = ( SELECT MAX(grade) FROM students );
Results:
NAME            GRADE
--------------  -----
Mary            98
SELECT name, major, grade
FROM students s1
WHERE grade = ( SELECT MAX(grade)
                FROM students s2
                WHERE s1.major = s2.major )
ORDER BY grade DESC;
Results:
GRADE
-----
98
92
89
Selecting from 2 or More Tables
In the FROM clause, list all of the tables separated by commas. This is called a join. The WHERE clause carries the join condition.
Example table EMPLOYEE:
Name   Department  Salary
-----  ----------  ------
Joe    Finance     50000
Alice  Finance     52000
Jill   MIS         48000
Jack   MIS         32000
Fred   Accounting  33000
List the employee names and the location where each of them works:
SELECT employee.name, department.location
FROM employee, department
WHERE employee.department = department.department
ORDER BY department.location, employee.name;

LOCATION
-------------
CA
CA
CA
NJ
NJ
List each department and all of the employees who work there:
SELECT department.department, department.location, employee.name
FROM employee RIGHT JOIN department
ON employee.department = department.department
ORDER BY department.location, employee.name;

LOCATION
----------------
CA
CA
CA
NJ
NJ
NY

What is the highest paid salary in California?
SELECT MAX(employee.salary)
FROM employee, department
WHERE employee.department = department.department
AND department.location = 'CA';
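The join queries on the EMPLOYEE and DEPARTMENT tables can be run end to end with SQLite. The EMPLOYEE rows come from the example table, but the DEPARTMENT rows are an assumption chosen only to reproduce the CA/CA/CA/NJ/NJ/NY pattern of the results; "Marketing" in particular is a made-up department with no employees. The outer join is written as `department LEFT JOIN employee`, which is equivalent to `employee RIGHT JOIN department` and also runs on SQLite versions that lack RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (name TEXT, department TEXT, salary INTEGER);
CREATE TABLE department (department TEXT, location TEXT);
INSERT INTO employee VALUES
    ('Joe', 'Finance', 50000), ('Alice', 'Finance', 52000),
    ('Jill', 'MIS', 48000), ('Jack', 'MIS', 32000),
    ('Fred', 'Accounting', 33000);
-- Assumed department rows (not taken from the slides).
INSERT INTO department VALUES
    ('Finance', 'NJ'), ('MIS', 'CA'), ('Accounting', 'CA'), ('Marketing', 'NY');
""")

# Inner join: only employees whose department has a matching row.
inner = conn.execute("""
    SELECT employee.name, department.location
    FROM employee, department
    WHERE employee.department = department.department
    ORDER BY department.location, employee.name""").fetchall()

# Outer join: every department appears, even one without employees.
outer = conn.execute("""
    SELECT department.department, department.location, employee.name
    FROM department LEFT JOIN employee
    ON employee.department = department.department
    ORDER BY department.location, employee.name""").fetchall()

max_ca = conn.execute("""
    SELECT MAX(employee.salary)
    FROM employee, department
    WHERE employee.department = department.department
    AND department.location = 'CA'""").fetchone()[0]

print(inner, outer, max_ca, sep="\n")
```

The outer join adds one extra row for the department without employees, with NULL (Python `None`) in the name column.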
FROM
department;
From our Bank Accounts example. List the customer names and their total account holdings:
SELECT customers.LastName, Sum(Balance)
FROM customers, accounts
WHERE customers.CustomerID = accounts.customerid
GROUP BY customers.LastName
Results:
LASTNAME
---------
Axe
Builder
Jones
Smith
We can also use a column alias to change the column heading.
SELECT customers.LastName, Sum(Balance) AS TotalBalance
FROM customers, accounts
WHERE customers.CustomerID = accounts.customerid
GROUP BY customers.LastName
Results:
LASTNAME   TotalBalance
---------  ------------
Axe        $15,000.00
Builder    $1,300.00
Jones      $1,000.00
Smith      $6,000.00
SELECT name, department, salary AS CurrentSalary,
       (salary * 1.03) AS ProposedRaise
FROM employee;
ProposedRaise is the proposed salary.
Results:
name      department    CurrentSalary  ProposedRaise
--------  ------------  -------------  -------------
Alice     Finance       52000          53560
Fred      Accounting    33000          33990
Jack      MIS           32000          32960
Jill      MIS           48000          49440
Joe       Finance       50000          51500
Recursive Queries and Aliases
Recall that some of the E-R diagrams and relations we dealt with had a recursive relationship. For example: A student can tutor one or more other students. A student has only one tutor.
STUDENTS (StudentID, Name, Student_TutorID)

StudentID  Name   Student_TutorID
---------  -----  ---------------
S101       Bill   NULL
S102       Alex   S101
S103       Mary   S101
S104       Liz    S103
S105       Ed     S103
S106       Sue    S101
S107       Petra  S106
List each student together with their tutor:
SELECT s1.name AS Student, tutors.name AS Tutor
FROM students s1, students tutors
WHERE s1.student_tutorid = tutors.studentid;
Results:
Student     Tutor
----------  ----------
Alex        Bill
Mary        Bill
Sue         Bill
Liz         Mary
Ed          Mary
Petra       Sue
Something is missing from this result: we do not see who Bill's tutor is. Use a LEFT JOIN:
SELECT s1.name AS Student, tutors.name AS Tutor
FROM students s1 LEFT JOIN students tutors
ON s1.student_tutorid = tutors.studentid;
Results:
Student     Tutor
----------  ----------
Bill        NULL
Alex        Bill
Mary        Bill
Sue         Bill
Liz         Mary
Ed          Mary
Petra       Sue
To find out how many students are tutored, a RIGHT JOIN can be used.
How many students does each tutor work with?
SELECT s1.name AS TutorName,
       COUNT(tutors.student_tutorid) AS NumberTutored
FROM students s1, students tutors
WHERE s1.studentid = tutors.student_tutorid
GROUP BY s1.name;
Results:
TutorName   NumberTutored
----------  -------------
Bill        3
Mary        2
Sue         1
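The recursive (self-join) queries on STUDENTS can be verified against the example data with SQLite. The ORDER BY clauses are additions in this sketch so that the output order is predictable; everything else follows the queries shown for the tutor example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (studentid TEXT PRIMARY KEY, name TEXT, student_tutorid TEXT);
INSERT INTO students VALUES
    ('S101', 'Bill', NULL), ('S102', 'Alex', 'S101'), ('S103', 'Mary', 'S101'),
    ('S104', 'Liz', 'S103'), ('S105', 'Ed', 'S103'), ('S106', 'Sue', 'S101'),
    ('S107', 'Petra', 'S106');
""")

# The same table is used twice under two aliases: s1 (student) and tutors.
pairs = conn.execute("""
    SELECT s1.name AS Student, tutors.name AS Tutor
    FROM students s1 LEFT JOIN students tutors
    ON s1.student_tutorid = tutors.studentid
    ORDER BY s1.studentid""").fetchall()

# Here s1 plays the tutor role and tutors holds the tutored students.
counts = conn.execute("""
    SELECT s1.name AS TutorName, COUNT(tutors.student_tutorid) AS NumberTutored
    FROM students s1, students tutors
    WHERE s1.studentid = tutors.student_tutorid
    GROUP BY s1.name
    ORDER BY s1.name""").fetchall()

print(pairs)
print(counts)
```

Bill appears with no tutor (`None`) only because of the LEFT JOIN; with the plain join he would be dropped from the result.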
WHERE Clause Expressions
There are a number of expressions one can use in a WHERE clause.
Typical logic expressions: COLUMN = value
Also: <, >, <=, >=, !=
Also consider BETWEEN:
SELECT name, grade, "You Got an A"
FROM students
WHERE grade BETWEEN 91 AND 100
Comparing a column against a subquery with = assumes the subquery returns only one tuple as a result. This is typically used with aggregate functions.
Subqueries using IN:
SELECT name
FROM employee
WHERE department IN ('Finance', 'MIS');

SELECT name
FROM employee
WHERE department IN (SELECT department
                     FROM departments
                     WHERE location = 'CA');
In the above case, the subquery returns a set of tuples. The IN clause returns true when a tuple matches a member of the set.
The above query shows all employees names and salaries where there is at least one person who makes more money (the first exists) and at least one person who makes less money (second exists).
NOT EXISTS:
SELECT name, salary
FROM employee
WHERE NOT EXISTS (SELECT name
                  FROM employee e2
                  WHERE e2.salary > employee.salary)

salary
----------
52000

The above query shows all employees for whom there does not exist an employee who is paid more, that is, the highest paid employee.
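Both forms can be tried on the EMPLOYEE example data with SQLite. The EXISTS query below is a reconstruction from the description "at least one person who makes more money and at least one person who makes less money", so treat its exact wording as an assumption. The outer table is aliased as `e1` to keep the correlation unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (name TEXT, department TEXT, salary INTEGER);
INSERT INTO employee VALUES
    ('Joe', 'Finance', 50000), ('Alice', 'Finance', 52000),
    ('Jill', 'MIS', 48000), ('Jack', 'MIS', 32000),
    ('Fred', 'Accounting', 33000);
""")

# NOT EXISTS: nobody earns more, so this is the highest paid employee.
highest = conn.execute("""
    SELECT name, salary FROM employee e1
    WHERE NOT EXISTS (SELECT name FROM employee e2
                      WHERE e2.salary > e1.salary)""").fetchall()

# EXISTS twice: someone earns more AND someone earns less.
middle = conn.execute("""
    SELECT name, salary FROM employee e1
    WHERE EXISTS (SELECT name FROM employee e2 WHERE e2.salary > e1.salary)
      AND EXISTS (SELECT name FROM employee e3 WHERE e3.salary < e1.salary)
    ORDER BY salary""").fetchall()

print(highest)
print(middle)
```

The NOT EXISTS query returns the 52000 salary shown in the result above, and the double EXISTS query leaves out both the highest and the lowest paid employee.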
Deleting Tuples with DELETE
DELETE is used to remove tuples from a table. With no WHERE clause, DELETE removes all of the tuples from the table.
Remove all employees:
DELETE FROM employee;
Changing Values with UPDATE
The UPDATE statement is used to change attribute values in a table. UPDATE uses a SET clause to overwrite the value.
Change the last name of an employee:
UPDATE employee
SET last_name = 'Smith'
WHERE employee_id = 'E1001';
Give an employee a raise:
UPDATE employee
SET salary = salary * 1.05
WHERE employee_id = 'E1001';
Defining Views
A view defines a particular way of looking at a table. For example, if we usually access only 2 or 3 columns of a table, we can define a view on the table and use the view name in those specific queries.
CREATE VIEW emp_address AS
SELECT first_name, last_name, street, city, state, zip
FROM employee;

CREATE VIEW emp_salary AS
SELECT first_name, last_name, salary
FROM employee;

CREATE VIEW avg_sal_dept AS
SELECT department, AVG(salary)
FROM employee
GROUP BY department;
One can then query these views as if they were tables:
SELECT * FROM emp_address
ORDER BY last_name;

SELECT *
FROM avg_sal_dept
WHERE department = 'Finance';
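A view stores the query, not a copy of the data, which can be checked with SQLite. The sketch reuses the EMPLOYEE example rows; the alias `avg_salary` and the raise given to Joe are additions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (name TEXT, department TEXT, salary INTEGER);
INSERT INTO employee VALUES
    ('Joe', 'Finance', 50000), ('Alice', 'Finance', 52000),
    ('Jill', 'MIS', 48000), ('Jack', 'MIS', 32000),
    ('Fred', 'Accounting', 33000);

CREATE VIEW avg_sal_dept AS
    SELECT department, AVG(salary) AS avg_salary
    FROM employee
    GROUP BY department;
""")

query = "SELECT * FROM avg_sal_dept WHERE department = 'Finance'"
before = conn.execute(query).fetchone()

# Changing the base table is immediately visible through the view.
conn.execute("UPDATE employee SET salary = 54000 WHERE name = 'Joe'")
after = conn.execute(query).fetchone()

print(before, after)
```

The Finance average moves from 51000 to 53000 without the view being redefined, because the view is evaluated against the current contents of `employee` each time it is queried.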
System Administration
Database Administration
Data Administration
1. Legal and ethical considerations: Who has the right to read the information?
2. Policy issues: Who is responsible for enforcing security (government, corporations)?
3. System-level issues: Where in the system is security enforced, and how?
As database administrators, we must provide ways to prevent unauthorized use of the data in a database. There are 3 areas to consider:
1. Access Control: Who should be allowed access to which database. This is typically enforced through system-level user accounts with passwords.
2. Authorization: For the purposes of:
   Reading data, such as looking up an employee's salary (using a SELECT statement, for example).
   Writing data, to change values in the database (using UPDATE or DELETE).
3. Statistical Information: Who should be allowed access to information derived from databases (census example).
Authorization Subsystem
Typically, security for database authorization purposes is implemented in an authorization subsystem that monitors every transaction in the database. This is part of the DBMS,
Constraint
----------------
None
None
None
None
Salary < 50,000
Total < 1,000
Total < 1,000
None

What happens as the number of subjects and objects grows? Presently, no commercial DBMS supports this level of authorization flexibility. Typically, a DBMS supports some basic authorization models. Application developers must provide more complex constraint enforcement.
Subject-Based Security
Subjects are individually defined in the DBMS and each object and action is specified. For example, user SMITH (a Subject) has the following authorizations:

           Actions
Objects    Read  Insert  Modify  Delete  Grant
---------  ----  ------  ------  ------  -----
EMPLOYEES  Y     N       N       N       N
ORDERS     Y     Y       Y       N       N
PRODUCTS   Y     N       Y       N       N
...        ...   ...     ...     ...     ...
Object-Based Security
Objects are individually defined in the DBMS and each subject and action is specified. For example, the EMPLOYEES table (an Object) has the following authorizations:

           Actions
Subjects   Read  Insert  Modify  Delete  Grant
---------  ----  ------  ------  ------  -----
SMITH      Y     N       N       N       N
JONES      Y     N       N       N       N
GREEN      N     N       N       N       N
DBA        Y     Y       Y       Y       Y
...        ...   ...     ...     ...     ...
The SQL GRANT and REVOKE Statements
SQL provides two main statements for setting up authorization.
GRANT is used to grant an action on an object to a subject.
GRANT action1, action2 ...
ON object1, object2, ...
TO subject1, subject2 ...
Another option of the GRANT statement is WITH GRANT OPTION. This allows the grantee to propagate the authorization to another subject.
For example, assume we have a database with
Tables: employees, departments, orders, products
Users: smith, jones, green, dba

GRANT INSERT, DELETE, UPDATE, SELECT
ON employees, departments
TO smith ;

GRANT INSERT, SELECT
ON orders
TO smith ;

GRANT SELECT
ON products
TO smith ;

GRANT SELECT
ON employees, departments
TO jones ;

GRANT INSERT, DELETE, UPDATE, SELECT
ON employees, departments, orders, products
TO dba
WITH GRANT OPTION ;
Grants can also be done on specific columns in a table:
GRANT SELECT ON products TO jones ;
GRANT UPDATE ON products (price) TO jones ;
If no GRANT statement is issued, it is assumed that no authorization is given (e.g., user GREEN above).
REVOKE is used to revoke authorizations from subjects.
REVOKE action1, action2, ...
ON object1, object2, ...
FROM subject1, subject2, ...
If Ms. Smith leaves the company, then we should:
REVOKE INSERT, DELETE, UPDATE, SELECT
ON employees, departments
FROM smith ;

REVOKE INSERT, SELECT
ON orders
FROM smith ;

REVOKE SELECT
ON products
FROM smith ;

Many RDBMS have an ALL PRIVILEGES option that will revoke all of the privileges on an object from a subject:
REVOKE ALL PRIVILEGES
ON employees, departments, orders, products
FROM smith ;
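The bookkeeping behind GRANT and REVOKE can be pictured as a table of (subject, object) pairs, each holding a set of allowed actions. The class below is a toy model written for this purpose only; `AuthorizationTable` and its methods are hypothetical names and do not correspond to how any particular DBMS stores privileges. Column-level grants and WITH GRANT OPTION are left out.

```python
from collections import defaultdict


class AuthorizationTable:
    """Toy model: (subject, object) -> set of granted actions."""

    def __init__(self):
        self._grants = defaultdict(set)

    def grant(self, actions, objects, subjects):
        for subject in subjects:
            for obj in objects:
                self._grants[(subject, obj)].update(actions)

    def revoke(self, actions, objects, subjects):
        for subject in subjects:
            for obj in objects:
                self._grants[(subject, obj)].difference_update(actions)

    def is_allowed(self, subject, action, obj):
        # No GRANT issued means no authorization.
        return action in self._grants.get((subject, obj), set())


auth = AuthorizationTable()
auth.grant(["INSERT", "DELETE", "UPDATE", "SELECT"], ["employees", "departments"], ["smith"])
auth.grant(["INSERT", "SELECT"], ["orders"], ["smith"])
auth.grant(["SELECT"], ["products"], ["smith"])
auth.grant(["SELECT"], ["employees", "departments"], ["jones"])

print(auth.is_allowed("smith", "INSERT", "orders"))     # granted
print(auth.is_allowed("smith", "DELETE", "orders"))     # never granted
print(auth.is_allowed("green", "SELECT", "employees"))  # no GRANT at all

# Ms. Smith leaves the company.
auth.revoke(["INSERT", "DELETE", "UPDATE", "SELECT"], ["employees", "departments"], ["smith"])
print(auth.is_allowed("smith", "SELECT", "employees"))
```

A lookup for a pair that was never granted simply finds nothing, which mirrors the rule that no GRANT means no authorization.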
System Administration
Database Administration
Transaction Processing
Concurrency Control and Locking
    Characteristics of Locks
    Two Phase Locking
    Deadlock
Database Recovery and Backup
    Reprocessing
    Rollback / Rollforward
    Database Backup
Database Administration
Any reasonably sized organization that relies on a database for its business processes will probably have a Database Administrator or DBA. The DBA is responsible for:
1. Installing and configuring the DBMS
2. Assisting in the implementation of information systems
3. Monitoring the performance of the database and tuning the DBMS for optimal performance
4. Maintaining documentation, including recording all changes to the database and DBMS
5. Ensuring data integrity is maintained and appropriate backups are made
Thus a DBA is mainly concerned with the day-to-day operational aspects of database systems.
Data Administration
Many organizations, in addition to DBAs, will also have a Data Administrator. DAs are concerned with the data needs and data flows throughout the entire organization. Thus DAs are responsible for:
1. Specifying data standards across databases
2. Establishing policies: data usage, security and authorization, data flows into and out of the organization
3. Assisting the application development process by identifying data resources in the organization
4. Arbitrating the sharing of data across departments
5. Increasing the return on an organization's data investment
MultiUser Databases
Multiuser database: more than one user processes the database at the same time. Several issues arise:
1. How can we prevent users from interfering with each other's work?
2. How can we safely process transactions on the database without corrupting or losing data?
3. If there is a problem (e.g., power failure or system crash), how can we recover without losing all of our data?
Transaction Processing
A transaction is a set of read and write operations that must either commit or abort. Consider the following transaction that reserves a seat on an airplane flight and charges the customer:
1. Read customer information
2. Write reservation information
3. Write charges
We need the ability to control how transactions are run in a multiuser database.
Suppose that after the second step, the database crashes. Or for some reason, changes cannot be written...
Transactions can either reach a commit point, where all actions are permanently saved in the database, or they can abort, in which case none of the actions are saved.
Another way to say this is that transactions are atomic. All operations in a transaction must be executed as a single unit, a Logical Unit of Work.
Consider two users, each executing similar transactions:
Example #1:
User A
Read Salary for emp 101
Multiply salary by 1.03
Write Salary for emp 101
User B
Read Salary for emp 101
Multiply salary by 1.04
Write Salary for emp 101
Example #2:
User A
Read inventory for Prod 200
Decrement inventory by 5
Write inventory for Prod 200
User B
Read inventory for Prod 200
Decrement inventory by 7
Write inventory for Prod 200
First, what should the values for salary (in the first example) really be? The DBMS must find a way to execute these two transactions concurrently and ensure the result is what the users (and designers) intended.
These two are examples of the Lost Update or Concurrent Update problem. Some changes to the database can be overwritten.
Consider how the operations for users A and B might be interleaved as in example #2. Assume there are 10 units in inventory for Prod 200:
Read inventory for Prod 200 for user A
Read inventory for Prod 200 for user B
Decrement inventory by 5 for user A
Decrement inventory by 7 for user B
Write inventory for Prod 200 for user A
Write inventory for Prod 200 for user B
PERANCANGAN BASIS DATA (3 SKS)
In the first case, the incorrect amount (3) is written to the database. This is called the Lost Update problem because we lost the update from User A: it was overwritten by User B.
The second example works because we let User A write the new value of Prod 200 before User B can read it. Thus User B's decrement operation will fail.
Here is another example. Users A and B share a bank account. Assume an initial balance of $200.
User A reads the balance
User A deducts $100 from the balance
User B reads the balance
User A writes the new balance of $100
User B deducts $100 from the balance
User B writes the new balance of $100
The reason we get the wrong final result (remaining balance of $100) is that transaction B was allowed to read stale data. This is called the inconsistent read problem.
Suppose, instead of interleaving (mixing) the operations of the two transactions, we execute one after the other (note it makes no difference which order: A then B, or B then A):
User A reads the balance
User A deducts $100 from the balance
User A writes the new balance of $100
User B reads the balance (which is now $100)
User B deducts $100 from the balance
User B writes the new balance of $0
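The two bank account schedules can be replayed in a few lines of Python. This is a simulation only: the function `run` and the (user, operation) step format are invented for this sketch, and each user is modelled as holding a private working copy of the balance between reading and writing.

```python
def run(schedule, balance):
    """Replay (user, operation) steps against one shared balance."""
    local = {}  # each user's private working copy
    for user, op in schedule:
        if op == "read":
            local[user] = balance
        elif op == "deduct":
            local[user] -= 100
        elif op == "write":
            balance = local[user]
    return balance


interleaved = [
    ("A", "read"), ("A", "deduct"), ("B", "read"),
    ("A", "write"), ("B", "deduct"), ("B", "write"),
]
serial = [
    ("A", "read"), ("A", "deduct"), ("A", "write"),
    ("B", "read"), ("B", "deduct"), ("B", "write"),
]

print(run(interleaved, 200))  # B works from the stale value it read
print(run(serial, 200))       # both deductions are applied
```

In the interleaved schedule User B reads 200 before User A has written, so B's write of 100 overwrites A's result and one deduction is lost. The serial schedule ends at 0 as intended.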
If we insist only one transaction can execute at a time, in serial order, then performance will be quite poor.
Concurrency Control is a method for controlling or scheduling the operations in such a way that concurrent transactions can be executed.
If we do concurrency control properly, then we can maximize transaction throughput while avoiding any chance.
Transaction throughput: The number of transactions we can perform in a given time period. Often reported as Transactions per second or TPS.
A group of two or more concurrent transactions is serializable if we can order their operations so that the final result is the same as if we had run them in serial order (one after another).
Consider transactions A, B and C. Each has 3 operations. If executing:
A1, B1, A2, C1, C2, B2, A3, B3, C3
has the same result as executing:
A1, A2, A3, B1, B2, B3, C1, C2, C3
then the above schedule of transactions and operations is serializable.
Characteristics of Locks
Locks may be applied to data items in two ways:
Implicit Locks are applied by the DBMS.
Explicit Locks are applied by application programs.
Locks may be applied to:
1. a single data item (value)
2. an entire row of a table
3. a page (memory segment) (many rows worth)
4. an entire table
5. an entire database
This is referred to as the lock granularity.
Locks may be of two types depending on the requirements of the transaction:
1. An Exclusive Lock prevents any other transaction from reading or modifying the locked item.
2. A Shared Lock allows another transaction to read an item but prevents another transaction from writing the item.
Here is a more involved example:
User A places a shared lock on item raise_rate
User A reads raise_rate
User A places an exclusive lock on item Amy_salary
User A reads Amy_salary
User B places a shared lock on item raise_rate
User B reads raise_rate
User A calculates a new salary as Amy_salary * (1+raise_rate)
User B places an exclusive lock on item Bill_salary
User B reads Bill_salary
User B calculates a new salary as Bill_salary * (1+raise_rate)
User B writes Bill_salary
User A writes Amy_salary
User A releases exclusive lock on Amy_salary
User B releases exclusive lock on Bill_salary
User B releases shared lock on raise_rate
User A releases shared lock on raise_rate
User A places an exclusive lock on item Amy_salary
User A reads raise_rate
User A releases shared lock on raise_rate
User B places an exclusive lock on raise_rate
Deadlock
Locking can cause problems, however. Consider:
User A places an exclusive lock on item 1001
User B places an exclusive lock on item 2002
User A attempts to place an exclusive lock on item 2002
User A placed into a wait state
User B attempts to place an exclusive lock on item 1001
User B placed into a wait state
...
This is called a deadlock. One transaction has locked some of the resources and is waiting for locks so it can complete. A second transaction has locked those needed items but is awaiting the release of locks the first transaction is holding so it can continue.
There are two main ways to deal with deadlock:
Prevent it in the first place by giving each transaction exclusive rights to acquire all locks needed before proceeding.
Allow the deadlock to occur, then break it by aborting one of the transactions.
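For the second strategy, the DBMS first has to notice that a deadlock exists. One common way to picture this is a wait-for graph, where a cycle means deadlock. The sketch below builds that graph for the item 1001 / item 2002 scenario; `find_deadlock` and the dictionary layout are illustrative assumptions, not a description of any specific DBMS.

```python
def find_deadlock(waits_for):
    """Return the transactions forming a cycle in the wait-for graph, or None."""
    def visit(node, path):
        if node in path:
            return path[path.index(node):]
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in waits_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# Exclusive locks currently held: item -> user.
held = {1001: "A", 2002: "B"}
# Lock requests that had to wait: (user, item).
requests = [("A", 2002), ("B", 1001)]

# A user waits for whoever holds the item it requested.
waits_for = {}
for user, item in requests:
    waits_for.setdefault(user, []).append(held[item])

cycle = find_deadlock(waits_for)
print(cycle)

# Break the deadlock by aborting one transaction in the cycle.
victim = cycle[-1]
del waits_for[victim]
print(find_deadlock(waits_for))
```

Once one transaction in the cycle is aborted and its locks are released, the other can proceed. Which transaction to abort is a policy choice; the last one found is used here only for simplicity.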
Reprocessing
In a reprocessing approach, the database is periodically backed up (a database save) and all transactions applied since the last save are recorded. If the system crashes, the latest database save is restored and all of the transactions are re-applied (by users) to bring the database back up to the point just before the crash.
Several shortcomings:
Time required to re-apply transactions
Transactions might have other (physical) consequences
Re-applying concurrent transactions is not straightforward.
Rollback / Rollforward
We apply a similar technique: make periodic saves of the database (a time-consuming operation). However, maintain a more intelligent log of the transactions that have been applied. This transaction log includes before images and after images.
Before Image: A copy of the table record (or page) of data before it was changed by the transaction.
After Image: A copy of the table record (or page) of data after it was changed by the transaction.
Rollback: Undo any partially completed transactions (ones in progress when the crash occurred) by applying the before images to the database.
Rollforward: Redo the transactions by applying the after images to the database. This is done for transactions that were committed before the crash.
The recovery process uses both rollback and rollforward to restore the database. In the worst case, we would need to roll back to the last database save and then roll forward to the point just before the crash.
Checkpoints can also be taken (less time consuming) in between database saves. The DBMS flushes all pending transactions and writes all data to disk and the transaction log. The database can be recovered from the last checkpoint in much less time.
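A transaction log with before and after images can be modelled as a list of (transaction, item, before, after) entries. The sketch below is a simplified illustration: the item names, the values and the helper functions `rollback` and `rollforward` are assumptions, and real logs also record commit markers and checkpoints.

```python
def rollback(db, log, committed):
    """Undo uncommitted changes by applying before images, newest first."""
    db = dict(db)
    for txn, item, before, after in reversed(log):
        if txn not in committed:
            db[item] = before
    return db


def rollforward(db, log, committed):
    """Redo committed changes by applying after images, oldest first."""
    db = dict(db)
    for txn, item, before, after in log:
        if txn in committed:
            db[item] = after
    return db


# Log entries: (transaction, item, before image, after image).
log = [
    ("T1", "emp101_salary", 50000, 51500),
    ("T2", "prod200_inventory", 10, 5),
]
committed = {"T1"}  # T2 was still in progress at the crash

# Case 1: the database on disk survived but contains T2's partial work.
crashed_db = {"emp101_salary": 51500, "prod200_inventory": 5}
print(rollback(crashed_db, log, committed))

# Case 2: the database is restored from the last save, then rolled forward.
saved_db = {"emp101_salary": 50000, "prod200_inventory": 10}
print(rollforward(saved_db, log, committed))
```

Both paths arrive at the same state: the committed salary change is kept and the unfinished inventory change is gone.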
Database Backup
When secondary media (disk) fails, data may become unreadable. We typically rely on backing up the database to cheaper magnetic tape or another backup medium for a copy that can be restored.
However, when a DBMS is running, it is not possible to back up its files, as the resulting backup copy on tape may be inconsistent.
One solution: Shut down the DBMS (and thus all applications), do a full backup by copying everything onto tape, then start up again. This may be infeasible to do often.
Most modern DBMSs allow for incremental backups. An incremental backup will back up only the data changed or added since the last full backup. It is sometimes called a delta backup.
A typical schedule looks something like this:
Weekend: Do a shutdown of the DBMS and a full backup of the database onto fresh tape(s).
Nightly: Do an incremental backup onto different tapes for each night of the week.