Data Mining
1. Data mining is one of the steps of KDD (Knowledge Discovery in Databases).
Note:
Data Mining:
The process of extracting powerful / interesting knowledge from data stored in
large databases.
[Figure: DB -> Data Mining -> Knowledge. Data mining draws on standard mathematics and on AI techniques: Neural Network, Fuzzy Logic, Genetic Algorithm, Rough Set, Soft Set.]
Reading and understanding journals (a paper is written and published for a fee)
and proceedings (a paper is written, submitted to a seminar, and presented
there).
Association Rule:
An association / correlation among a number of items (a set of items) in the data stored in
a database.
General form:
X -> Y [ Support, Confidence ]
Support and confidence are the values used to evaluate an association rule.
Basic concepts:
1. Support(X -> Y) = count(X U Y) / N
   Support(Y -> X) = count(Y U X) / N
   where count(X U Y) is the number of transactions that contain both X and Y,
   and N is the total number of transactions.
   Note: Support(X -> Y) = Support(Y -> X)
2. Confidence(X -> Y) = Support(X -> Y) / Support(X)
   Confidence(Y -> X) = Support(Y -> X) / Support(Y)
   Note: in general, Confidence(X -> Y) ≠ Confidence(Y -> X)
3. An association rule is said to be interesting if it has Support >=
   Minimum Support and Confidence >= Minimum Confidence.
   Note: the minimum support and minimum confidence values are set by the
   DOMAIN EXPERT.
Example:
1. Minimum Support = 50%
   Minimum Confidence = 50%
   Transaction   Items Bought
   2000          A, B, C
   1000          A, C
   4000          A, D
   5000          B, E, F
Answer (with X = A and Y = C):
Support(A -> C) = count(A U C) / N = 2/4 = 0.5 = 50%
Support(C -> A) = count(C U A) / N = 2/4 = 0.5 = 50%
Confidence(A -> C) = Support(A -> C) / Support(A) = (2/4) / (3/4) = 2/3 = 0.667 = 66.7%
Confidence(C -> A) = Support(C -> A) / Support(C) = (2/4) / (2/4) = 2/2 = 1 = 100%
Therefore both rules meet the minimum support and minimum confidence and are interesting:
A -> C [ 50%, 66.7% ]
C -> A [ 50%, 100% ]
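The arithmetic above can be checked with a short Python sketch. It uses the four transactions from the example; the function names (support, confidence, is_interesting) are chosen here for illustration and are not part of the notes.

```python
from fractions import Fraction

# Transactions from the example table (transaction id -> items bought).
transactions = {
    2000: {"A", "B", "C"},
    1000: {"A", "C"},
    4000: {"A", "D"},
    5000: {"B", "E", "F"},
}

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return Fraction(hits, len(transactions))

def confidence(lhs, rhs, transactions):
    """Support(lhs U rhs) divided by Support(lhs)."""
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)

def is_interesting(lhs, rhs, transactions, min_support, min_confidence):
    """True when the rule lhs -> rhs reaches both minimum thresholds."""
    s = support(set(lhs) | set(rhs), transactions)
    c = confidence(lhs, rhs, transactions)
    return s >= min_support and c >= min_confidence

print(support({"A", "C"}, transactions))        # 1/2
print(confidence({"A"}, {"C"}, transactions))   # 2/3
print(confidence({"C"}, {"A"}, transactions))   # 1
```

Exact fractions are used instead of floats so that 2/3 is not rounded before it is compared with the threshold.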
2-Itemsets
Itemset   Support
A, B      1/4 = 25%
A, C      2/4 = 50%   (frequent itemset)
B, C      1/4 = 25%
2. Using frequent itemsets (starting from the itemsets).
A -> C { s = 50%, c = 66.7% }
The resulting knowledge is expressed as IF ... THEN rules.
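As a rough illustration of going from frequent itemsets to rules, the sketch below enumerates every 2-itemset by brute force, keeps those that reach the minimum support, and derives rules in both directions. It does not do the candidate pruning a full Apriori implementation would, and the helper names are assumptions made for this example.

```python
from itertools import combinations

# The four transactions from the example.
transactions = [
    {"A", "B", "C"},
    {"A", "C"},
    {"A", "D"},
    {"B", "E", "F"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

def frequent_itemsets(transactions, size, min_support):
    """All itemsets of the given size whose support reaches min_support."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    result = {}
    for candidate in combinations(items, size):
        count = support_count(candidate, transactions)
        if count / n >= min_support:
            result[candidate] = count / n
    return result

def rules_from_pairs(transactions, min_support, min_confidence):
    """Rules x -> y and y -> x from every frequent 2-itemset."""
    rules = []
    for (x, y), s in frequent_itemsets(transactions, 2, min_support).items():
        for lhs, rhs in ((x, y), (y, x)):
            c = support_count((x, y), transactions) / support_count((lhs,), transactions)
            if c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules

print(frequent_itemsets(transactions, 2, 0.5))   # {('A', 'C'): 0.5}
for lhs, rhs, s, c in rules_from_pairs(transactions, 0.5, 0.5):
    print(f"{lhs} -> {rhs} [s = {s:.0%}, c = {c:.1%}]")
```

Only {A, C} survives the 50% threshold, so the two candidate rules are A -> C and C -> A.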
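[Table residue: decision table and discernibility matrix. Recoverable parts: the decision attribute Income takes the values None, Low, Low, Low, Medium, High; the objects are grouped into equivalence classes Ec1 to Ec5; the matrix entries are combinations of the attributes A, B, and C, with entries such as AB, C, and X.]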
4. Reduction
1. Derived from the discernibility matrix modulo D, using Boolean algebra:
a. How to work in Boolean form:
Note: ^ = * (multiplication, AND) and v = + (addition, OR)
Identities: AA = A
A + AB = A ( 1 + B ), where ( 1 + B ) = 1
= A
Determine the Boolean expression for each row of the matrix modulo D. If several entries are
identical, keep only one of them, e.g. ( A v B v C ) ^ ( A v B v C ) becomes ( A v B v C ):
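1. C ^ A ^ ( A v B ) ^ ( A v B v C )
Simplification:
C * A * ( A + B ) * ( A + B + C ) = C * ( AA + AB ) * ( A + B + C )
= C * ( A + AB ) * ( A + B + C )
= C * A * ( A + B + C )
= C * ( AA + AB + AC )
= C * ( A + AB + AC )
= C * A
= AC
2. C ^ ( A v B )
Simplification:
C * ( A + B ) = CA + CB
= AC + BC
3. A ^ ( A v B v C )
Simplification:
A ^ ( A v B v C ) = AA + AB + AC
= A + AB + AC
= A ( 1 + B ) + AC
= A + AC
= A ( 1 + C ) = A
4. ( A v B ) ^ ( A v B v C )
Simplification:
( A + B ) * ( A + B + C ) = ( A + B )( A + B ) + ( A + B ) C
= ( A + B ) + ( A + B ) C
= ( A + B )( 1 + C ) = A + B
5. ( A v B v C ) ^ ( A v B )
Simplification:
( A + B + C ) * ( A + B ) = A + B   (the same product as in 4, with the factors swapped)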
So, the result of the reduction (with A = Studies, B = Education, C = Work):
1. { A, C } = { Studies, Work }
2. { B, C } = { Education, Work }
3. { A } = { Studies }
4. { B } = { Education }
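The simplifications above can be mechanized. The sketch below multiplies out a conjunction of OR-clauses one clause at a time and applies the absorption law (X + XY = X) after each step. The encoding of a clause as a string of attribute letters and the function names are assumptions made for this example.

```python
from itertools import product

def absorb(terms):
    """Drop every term that is a strict superset of another term (X + XY = X)."""
    terms = set(terms)
    return {t for t in terms if not any(other < t for other in terms)}

def reducts(cnf):
    """Expand a conjunction of OR-clauses into its minimal AND-terms.

    `cnf` is a list of clauses; each clause is a string of attribute letters,
    e.g. ["C", "A", "AB", "ABC"] stands for C ^ A ^ (A v B) ^ (A v B v C).
    """
    terms = {frozenset()}
    for clause in cnf:
        # Multiply the partial result by the next clause, then absorb.
        terms = absorb(frozenset(t | {attr}) for t, attr in product(terms, clause))
    return sorted("".join(sorted(t)) for t in terms)

expressions = [
    ["C", "A", "AB", "ABC"],   # expression 1
    ["C", "AB"],               # expression 2
    ["A", "ABC"],              # expression 3
    ["AB", "ABC"],             # expression 4
    ["ABC", "AB"],             # expression 5
]
for cnf in expressions:
    print(" ^ ".join(cnf), "->", reducts(cnf))
```

Collecting the distinct terms over the five expressions gives AC, BC, A, and B, which are the four attribute sets listed in the result.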
5. Generate Rules
Using the reduct attributes:
a. { Studies, Work }
1. IF Studies = Poor and Work = Poor, THEN Income = Low
2. IF Studies = Poor and Work = Good, THEN Income = Low
3. IF Studies = Moderate and Work = Poor, THEN Income = Low
4. IF Studies = Good and Work = Good, THEN Income = Medium or Income = High
b. { Education, Work }
1. IF Education = SMU and Work = Poor, THEN Income = None
2. IF Education = SMU and Work = Good, THEN Income = Low
3. IF Education = Diploma and Work = Poor, THEN Income = Low
4. IF Education = MSc and Work = Good, THEN Income = Medium or Income = High
c. { Studies }
1. IF Studies = Poor, THEN Income = None or Income = Low
2. IF Studies = Moderate, THEN Income = Low
3. IF Studies = Good, THEN Income = Medium or Income = High
d. { Education }
1. IF Education = SMU, THEN Income = None or Income = Low
2. IF Education = Diploma, THEN Income = Low
3. IF Education = MSc, THEN Income = Medium or Income = High
So there are 14 rules (pieces of knowledge) in total.
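A minimal sketch of how such rules can be read off a decision table once a reduct is chosen: group the rows by their values on the reduct attributes and collect the decision values seen in each group. The original decision table is not available here, so the rows below are hypothetical, chosen only so that they agree with rule groups b and d; the function name is also an assumption.

```python
from collections import defaultdict

# Hypothetical rows, chosen only so that they agree with rule groups b and d.
table = [
    {"Education": "SMU",     "Work": "Poor", "Income": "None"},
    {"Education": "SMU",     "Work": "Good", "Income": "Low"},
    {"Education": "Diploma", "Work": "Poor", "Income": "Low"},
    {"Education": "MSc",     "Work": "Good", "Income": "Medium"},
    {"Education": "MSc",     "Work": "Good", "Income": "High"},
]

def generate_rules(table, reduct, decision):
    """One IF-THEN rule per distinct combination of reduct attribute values."""
    outcomes = defaultdict(list)
    for row in table:
        key = tuple(row[a] for a in reduct)
        if row[decision] not in outcomes[key]:
            outcomes[key].append(row[decision])
    rules = []
    for key, values in outcomes.items():
        lhs = " and ".join(f"{a} = {v}" for a, v in zip(reduct, key))
        rhs = " or ".join(f"{decision} = {v}" for v in values)
        rules.append(f"IF {lhs} THEN {rhs}")
    return rules

for rule in generate_rules(table, ["Education"], "Income"):
    print(rule)
```

When a group contains more than one decision value, the values are joined with "or", as in the rules with Income = Medium or Income = High.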
Assignment:
1. Analyze the results produced by ROSETTA:
LHS Support   = number of objects that satisfy the IF part
RHS Support   = number of objects that satisfy the IF part and the THEN part
RHS Accuracy  = RHS Support / LHS Support
LHS Coverage  = LHS Support / number of objects in the DS
RHS Coverage  = RHS Support / number of objects that satisfy the THEN part
RHS Stability = RHS Support / number of objects that satisfy the rule
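The sketch below computes these measures for a single rule. It rests on two assumptions: RHS support counts the objects that satisfy both the IF part and the THEN part, and the decision table rows are hypothetical, since the original table is not available here. Stability is left out.

```python
# Hypothetical decision table, for illustration only.
table = [
    {"Education": "SMU",     "Work": "Poor", "Income": "None"},
    {"Education": "SMU",     "Work": "Good", "Income": "Low"},
    {"Education": "Diploma", "Work": "Poor", "Income": "Low"},
    {"Education": "MSc",     "Work": "Good", "Income": "Medium"},
    {"Education": "MSc",     "Work": "Good", "Income": "High"},
]

def matches(row, conditions):
    """True when the row has the given value for every listed attribute."""
    return all(row[a] == v for a, v in conditions.items())

def rule_measures(table, lhs, rhs):
    """Support, accuracy, and coverage of the rule IF lhs THEN rhs."""
    lhs_support = sum(1 for r in table if matches(r, lhs))
    # Assumption: RHS support counts rows matching both the IF and THEN parts.
    rhs_support = sum(1 for r in table if matches(r, lhs) and matches(r, rhs))
    rhs_total = sum(1 for r in table if matches(r, rhs))
    return {
        "lhs_support": lhs_support,
        "rhs_support": rhs_support,
        "rhs_accuracy": rhs_support / lhs_support,
        "lhs_coverage": lhs_support / len(table),
        "rhs_coverage": rhs_support / rhs_total,
    }

# Rule: IF Education = SMU THEN Income = Low
print(rule_measures(table, {"Education": "SMU"}, {"Income": "Low"}))
```

For this rule, two rows match the IF part and one of them also has Income = Low, so the accuracy is 1/2.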
[Table residue: only the decision column d survives, listed twice with the same values: 1, 0, 0, 1, 0, 1, 1.]
MD-Heuristic Algorithm
Steps:
1. Prepare a table, e.g. table A, based on the discernibility formula.
   Columns: Pa1, Pa2, Pa3, Pa4, Pb1, Pb2, Pb3, D*
   Rows: (U1, U2), (U1, U3), (U1, U5), (U4, U2), (U4, U3), (U4, U5),
         (U6, U2), (U6, U3), (U6, U5), (U7, U2), (U7, U3), (U7, U5), New
   Last row: the number of 1s in each column
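The row labels of this table can be generated from the decision column: each row is a pair of objects with different decision values. The sketch below assumes decision values 1, 0, 0, 1, 0, 1, 1 for U1 to U7, which reproduces the twelve pairs listed. The cell values used to demonstrate the column count are hypothetical, because the cells of the original table are not available.

```python
# Assumed decision values for U1..U7; they reproduce the row labels of the table.
decision = {"U1": 1, "U2": 0, "U3": 0, "U4": 1, "U5": 0, "U6": 1, "U7": 1}

def discernible_pairs(decision):
    """Pairs (a, b) where a has decision 1 and b has decision 0."""
    ones = [u for u, d in decision.items() if d == 1]
    zeros = [u for u, d in decision.items() if d == 0]
    return [(a, b) for a in ones for b in zeros]

def ones_per_column(rows, columns):
    """Number of 1s in each column, given one {column: 0 or 1} dict per row."""
    return {c: sum(row[c] for row in rows.values()) for c in columns}

pairs = discernible_pairs(decision)
print(len(pairs), pairs[:3])   # 12 [('U1', 'U2'), ('U1', 'U3'), ('U1', 'U5')]

# Hypothetical cell values for two rows and two columns, for illustration only.
rows = {
    ("U1", "U2"): {"Pa1": 1, "Pb1": 0},
    ("U1", "U3"): {"Pa1": 1, "Pb1": 1},
}
print(ones_per_column(rows, ["Pa1", "Pb1"]))   # {'Pa1': 2, 'Pb1': 1}
```

A column with more 1s separates more object pairs, which is what the count in the last row of the table records.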