Data Mining Final Project

GROUP 9F
10523002 Wahyu Purnomo *
10523010 Anang Andrianto
10523092 Depri Harpindo *
10523118 Rizky Yusuf Yulizar *
10523239 Farah Fauziyah Hanum
10523355 Rahima Wahyu Sabilla
* did not contribute fully
DATA SET DESCRIPTION
Number of objects in the data set: 180000
Number of students who recorded the data set: 6
Type of language/sound recorded: Morse code (whistle)
Number of attributes: 27
PRE-PROCESSING DESCRIPTION
Type of pre-processing used: Supervised: Attribute Selection
Pre-processing result: the 27 attributes were reduced to 6.
CLASSIFICATION ALGORITHMS
Final results
Naive Bayes

=== Summary ===

Correctly Classified Instances       23028        12.7933 %
Incorrectly Classified Instances    156972        87.2067 %
Kappa statistic                          0.031
Mean absolute error                      0.178
Root mean squared error                  0.302
Relative absolute error                 98.8811 %
Root relative squared error            100.6551 %
Total Number of Instances           180000
=== Detailed Accuracy By Class ===

TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
0.028     0.023     0.119       0.028    0.046       0.516      delapan
0.011     0.008     0.132       0.011    0.021       0.531      dua
0.017     0.019     0.091       0.017    0.029       0.509      empat
0.009     0.01      0.09        0.009    0.017       0.567      enam
0.666     0.53      0.122       0.666    0.207       0.614      lima
0.409     0.268     0.145       0.409    0.214       0.601      nol
0.082     0.057     0.139       0.082    0.103       0.559      satu
0.041     0.037     0.11        0.041    0.06        0.553      sembilan
0.005     0.005     0.099       0.005    0.01        0.534      tiga
0.011     0.011     0.099       0.011    0.02        0.55       tujuh
0.128     0.097     0.115       0.128    0.072       0.553      Weighted Avg.
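The F-Measure column in the table above can be cross-checked against the Precision and Recall columns. The sketch below assumes F-Measure is the harmonic mean of precision and recall (the usual F1 definition); the helper name `f_measure` is ours. Because the inputs are already rounded to three decimals, the recomputed values may differ from the reported ones in the last digit. The Weighted Avg. row is an average of per-class values, so the formula is not expected to hold there.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-Measure) copied from three rows of the table.
rows = {
    "lima": (0.122, 0.666, 0.207),
    "nol": (0.145, 0.409, 0.214),
    "satu": (0.139, 0.082, 0.103),
}

for name, (p, r, reported) in rows.items():
    print(f"{name}: computed {f_measure(p, r):.3f}, reported {reported}")
```

A class such as lima shows why F-Measure is more informative than TP Rate alone here: its recall is high (0.666) only because the classifier predicts it very often, which the low precision (0.122) pulls back down.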
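Random Forest

=== Summary ===

Correctly Classified Instances       52106        41.354  %
Incorrectly Classified Instances     73894        58.646  %
Kappa statistic                          0.3484
Mean absolute error                      0.1389
Root mean squared error                  0.2709
Relative absolute error                 77.1564 %
Root relative squared error             90.3014 %
Total Number of Instances           126000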

=== Detailed Accuracy By Class ===

TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
0.483     0.085     0.388       0.483    0.43        0.803      delapan
0.421     0.078     0.375       0.421    0.397       0.781      dua
0.423     0.076     0.382       0.423    0.401       0.78       empat
0.421     0.072     0.393       0.421    0.407       0.787      enam
0.399     0.065     0.405       0.399    0.402       0.801      lima
0.451     0.059     0.458       0.451    0.454       0.821      nol
0.404     0.053     0.46        0.404    0.43        0.8        satu
0.376     0.053     0.441       0.376    0.406       0.798      sembilan
0.394     0.048     0.478       0.394    0.432       0.8        tiga
0.363     0.063     0.391       0.363    0.377       0.783      tujuh
0.414     0.065     0.417       0.414    0.414       0.795      Weighted Avg.
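J48

=== Summary ===

Correctly Classified Instances       43775        27.0216 %
Incorrectly Classified Instances    118225        72.9784 %
Kappa statistic                          0.1891
Mean absolute error                      0.1486
Root mean squared error                  0.3458
Relative absolute error                 82.5465 %
Root relative squared error            115.2564 %
Total Number of Instances           162000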
=== Detailed Accuracy By Class ===

TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
0.295     0.095     0.257       0.295    0.275       0.63       delapan
0.25      0.085     0.247       0.25     0.248       0.617      dua
0.266     0.093     0.241       0.266    0.253       0.612      empat
0.284     0.087     0.267       0.284    0.275       0.626      enam
0.264     0.084     0.258       0.264    0.261       0.626      lima
0.317     0.071     0.33        0.317    0.323       0.658      nol
0.275     0.084     0.267       0.275    0.271       0.626      satu
0.265     0.077     0.277       0.265    0.271       0.631      sembilan
0.263     0.07      0.295       0.263    0.278       0.624      tiga
0.223     0.066     0.275       0.223    0.246       0.619      tujuh
0.27      0.081     0.271       0.27     0.27        0.627      Weighted Avg.
=== Confusion Matrix ===

    a    b    c    d    e    f    g    h    i    j   <-- classified as
 4782 1028 1240 1239 1102 1405 1214 1968  824 1430 |    a = delapan
 1271 4049 1531 1195 1426 1031 1952 1022 1757  979 |    b = dua
 1427 1353 4303 1611 1819 1144 1320 1044 1216  932 |    c = empat
 1406 1212 1708 4594 1733 1000 1204  988 1001 1305 |    d = enam
 1338 1434 2013 1983 4277  831 1160  837 1215 1102 |    e = lima
 1691 1183 1409 1160  980 5133 1241 1755  862  789 |    f = nol
 1360 1859 1486 1288 1248 1090 4466 1059 1451  940 |    g = satu
 2039 1182 1241 1223  989 1790 1339 4284  849 1229 |    h = sembilan
 1287 1840 1486 1222 1478 1074 1691 1004 4267  861 |    i = tiga
 1986 1251 1412 1702 1502 1056 1160 1510 1021 3620 |    j = tujuh
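The confusion matrix can be used to recompute the accuracy and the per-class rates. The sketch below takes the matrix with each row reassembled to ten entries; that row assignment is our reconstruction of the extracted text, chosen so that every diagonal entry matches the TP Rate reported for J48. Rows are treated as actual classes and columns as predicted classes, which is the convention the "classified as" header suggests.

```python
classes = ["delapan", "dua", "empat", "enam", "lima",
           "nol", "satu", "sembilan", "tiga", "tujuh"]

# Rows = actual class, columns = predicted class (reconstructed row layout).
matrix = [
    [4782, 1028, 1240, 1239, 1102, 1405, 1214, 1968, 824, 1430],
    [1271, 4049, 1531, 1195, 1426, 1031, 1952, 1022, 1757, 979],
    [1427, 1353, 4303, 1611, 1819, 1144, 1320, 1044, 1216, 932],
    [1406, 1212, 1708, 4594, 1733, 1000, 1204, 988, 1001, 1305],
    [1338, 1434, 2013, 1983, 4277, 831, 1160, 837, 1215, 1102],
    [1691, 1183, 1409, 1160, 980, 5133, 1241, 1755, 862, 789],
    [1360, 1859, 1486, 1288, 1248, 1090, 4466, 1059, 1451, 940],
    [2039, 1182, 1241, 1223, 989, 1790, 1339, 4284, 849, 1229],
    [1287, 1840, 1486, 1222, 1478, 1074, 1691, 1004, 4267, 861],
    [1986, 1251, 1412, 1702, 1502, 1056, 1160, 1510, 1021, 3620],
]

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(len(classes)))
accuracy = correct / total

# Recall (TP Rate): diagonal entry divided by its row sum.
recall = {c: matrix[i][i] / sum(matrix[i]) for i, c in enumerate(classes)}
# Precision: diagonal entry divided by its column sum.
precision = {c: matrix[i][i] / sum(row[i] for row in matrix)
             for i, c in enumerate(classes)}

print(f"total={total} correct={correct} accuracy={accuracy:.4%}")
for c in classes:
    print(f"{c:9s} recall={recall[c]:.3f} precision={precision[c]:.3f}")
```

The totals recovered this way (43775 correct out of 162000) agree with the J48 summary, which is why the matrix is read as J48's.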
Accuracy Comparison

                                   Naive Bayes          Random Forest        J48
Correctly Classified Instances     23028 (12.7933 %)    52106 (41.354 %)     43775 (27.0216 %)
Incorrectly Classified Instances   156972 (87.2067 %)   73894 (58.646 %)     118225 (72.9784 %)
Kappa statistic                    0.031                0.3484               0.1891
Mean absolute error                0.178                0.1389               0.1486
Root mean squared error            0.302                0.2709               0.3458
Relative absolute error            98.8811 %            77.1564 %            82.5465 %
Root relative squared error        100.6551 %           90.3014 %            115.2564 %
Total Number of Instances          180000               126000               162000
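The three classifiers were evaluated on different numbers of instances (180000, 126000 and 162000), so they are better compared by rate than by raw count. The sketch below recomputes each accuracy from the reported counts and approximates the kappa statistic under one assumption that the report does not state: that the ten classes are equally frequent, so chance agreement is 1/10. The function name `summarize` is ours.

```python
NUM_CLASSES = 10  # digits nol through sembilan

# (correct, incorrect, reported kappa) per classifier, as reported above.
results = {
    "Naive Bayes": (23028, 156972, 0.031),
    "Random Forest": (52106, 73894, 0.3484),
    "J48": (43775, 118225, 0.1891),
}


def summarize(correct, incorrect, num_classes=NUM_CLASSES):
    """Return (total, accuracy, approximate kappa) for one classifier."""
    total = correct + incorrect
    accuracy = correct / total
    chance = 1 / num_classes  # assumption: balanced classes
    kappa = (accuracy - chance) / (1 - chance)
    return total, accuracy, kappa


for name, (correct, incorrect, reported_kappa) in results.items():
    total, accuracy, kappa = summarize(correct, incorrect)
    print(f"{name:14s} total={total} accuracy={accuracy:.4%} "
          f"kappa~{kappa:.4f} (reported {reported_kappa})")
```

Read this way, Naive Bayes at about 12.8 % is only slightly above the 10 % expected from guessing among ten classes, which is what its near-zero kappa expresses.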
Best Choice
For Morse code (whistle) recordings with 27 attributes and 180000 objects, the best pre-processing choice is AttributeSelection. It filters the attributes before classification, so that processing takes less time and puts less load on the computer. Our data had 27 attributes; after filtering, 6 remained.
The most suitable algorithm for this data is Random Forest, which achieved the highest accuracy (41.354 %) and the highest kappa statistic (0.3484) of the three classifiers.
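To illustrate what filter-style attribute selection does, here is a minimal sketch that ranks attributes by information gain and keeps the top k. This is a stand-in for illustration only: the report does not say which evaluator or search method was used in Weka, the attribute names and toy data below are hypothetical, and the sketch assumes discrete attribute values (numeric audio features would need to be discretized first).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction in `labels` after splitting on a discrete attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_top_k(columns, labels, k):
    """Return the names of the k attributes with the highest information gain."""
    ranked = sorted(columns,
                    key=lambda name: information_gain(columns[name], labels),
                    reverse=True)
    return ranked[:k]


# Toy data (hypothetical, not from the report).
labels = ["nol", "nol", "satu", "satu"]
columns = {
    "att_a": ["short", "short", "long", "long"],  # separates the classes fully
    "att_b": ["x", "y", "x", "y"],                # carries no class information
    "att_c": ["p", "p", "p", "q"],                # partially informative
}

selected = select_top_k(columns, labels, k=2)
print(selected)
```

In the report's setting the same idea would be applied with k = 6 to the 27 recorded attributes; fewer attributes mean less work per instance for every classifier trained afterwards.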
