
Int. J. Decision Support Systems, Vol. 1, No. 1, 2015

A comparative study of naive Bayes classifier and Bayes net classifier for fault diagnosis of roller bearing using sound signal
Rahul Kumar Sharma* and V. Sugumaran
Department of Mechanical Engineering,
School of Mechanical and Building Sciences,
VIT University,
Vandalur Kelambakkam Road,
Chennai 600127, Tamil Nadu, India
Email: rahulkumar.sharma2011@vit.ac.in
Email: sugumaran.v@vit.ac.in
*Corresponding author

Hemantha Kumar
Department of Mechanical Engineering,
National Institute of Technology Karnataka,
Srinivasanagar, Surathkal, Mangalore-575025 Karnataka, India
Email: hemanta76@gmail.com

M. Amarnath
Department of Mechanical Engineering,
Indian Institute of Information Technology,
Design and Manufacturing, Jabalpur,
Dumna Airport Road,
P.O. Khamaria, Jabalpur 482 005, Madhya Pradesh, India
Email: amarnath@iiitdm.in
Abstract: Bearings are essential components of machinery of all sizes, and machinery can work properly only when its bearings are in good condition. Hence, there is a requirement for continuous bearing condition monitoring, for which sound signals can be used. This paper uses sound signals for condition monitoring of a roller bearing with the naive Bayes and Bayes net algorithms. Statistical features were extracted from the sound signals, and the features giving better results were then selected using the J48 decision tree algorithm. These selected features were classified using the naive Bayes and Bayes net algorithms. The classification results of both algorithms for fault diagnosis of the roller bearing using sound signals were compared and tabulated.
Keywords: naive Bayes; NB; Bayes net; BN; machine learning approach; fault
diagnosis; bearing; sound signal; decision tree algorithm; statistical features;
decision-making; condition monitoring; decision support systems.

Copyright © 2015 Inderscience Enterprises Ltd.



Reference to this paper should be made as follows: Sharma, R.K., Sugumaran, V., Kumar, H. and Amarnath, M. (2015) 'A comparative study of naive Bayes classifier and Bayes net classifier for fault diagnosis of roller bearing using sound signal', Int. J. Decision Support Systems, Vol. 1, No. 1, pp.115–129.
Biographical notes: Rahul Kumar Sharma is an engineering student at the VIT
University, Chennai Campus, Chennai. He is currently pursuing his BTech in
Mechanical Engineering. His research interests include design engineering,
machine learning and robotics.
V. Sugumaran is working as an Associate Professor in School of Mechanical
and Building Sciences at the VIT University, Chennai Campus, Chennai. He
has published 57 international refereed journal papers. He is a reviewer in nine
international journals and editor in four international journals. He has also filed
two patents and authored one book on Instrumentation and Control Systems.
Hemantha Kumar is currently working as an Assistant Professor at the National
Institute of Technology, Karnataka. He received his BE and MTech from
Mysore University and Visvesvaraya Technological University. He received his
PhD in Mechanical Engineering from Indian Institute of Technology Madras,
India. His research interests include vehicle dynamics, vibrations, condition
monitoring and design engineering.
M. Amarnath is an Assistant Professor in Mechanical Engineering Department
at the Indian Institute of Information Technology Design and Manufacturing,
Jabalpur. He received his BE and MTech degrees from Mysore University and
Visvesvaraya Technological University respectively. He carried out his doctoral
research work in Machine Design from Indian Institute of Technology Madras,
Chennai and received his PhD degree in 2008. His areas of research interest are
condition monitoring of rotating machines, cutting tool tribology and
biomedical signal processing.

1 Introduction

In almost all industries, machinery contains many rotating parts. It is desirable to
know the condition of a bearing so that maintenance can be carried out in time and major
losses avoided. There are many mechanisms that lead to bearing damage, such as mechanical damage,
wear, lubrication problems, corrosion of moving parts and plastic deformation of
elements. If a bearing fault is not detected, it leads to machinery shutdown, which
finally results in economic losses and further damage. Hence, it is very important to
conduct a study which provides a method for proper monitoring and fault diagnosis.
Machine fault diagnosis is a branch of mechanical engineering concerned with
identifying faults occurring in machinery. Many methods are used to identify the
most commonly arising faults that eventually lead to failure, such as vibration analysis, sound
signal analysis, oil particle analysis and thermal imaging. The most commonly used methods
are vibration and sound signal analysis, which indicate when maintenance of the machinery is required.
In the machine learning approach, various algorithms have been used for fault
classification with better classification accuracy than classification by decision tree. In


terms of feature selection, naive Bayes (NB) and Bayes net (BN) classifiers are highly
efficient, simple and sensitive. The NB classifier can be trained efficiently for various types
of models in a supervised learning setting, and it is possible to use NB without any other
Bayesian methods. Another advantage of NB is that it can estimate the parameters needed for
classification from only a small amount of training data. In a BN, direct dependencies and
local distributions are easy for a human to understand. A BN can save a considerable amount
of memory, which is highly economical, and effective algorithms exist to perform learning and
inference with it. However, no known research has used NB and BN classifiers for fault
diagnosis of roller bearings using sound signals. Hence, there is a need to study fault
diagnosis of bearings with the NB and BN algorithms. In the present study, an attempt is made
to classify various faults occurring in the bearing with the NB and BN classifiers using sound signals.
The contributions of the present study are:
1   Statistical feature extraction was performed and the important features were selected using the decision tree algorithm. The features selected by the decision tree algorithm showed better classification accuracy when provided to the classifiers in order of their importance. Maximum classification accuracy was found for the BN and decision tree algorithms when only the most important of the selected features were used.
2   The fault diagnosis of the roller bearing using sound signals is studied for the following simulated faults: inner race fault (IRF), outer race fault (ORF) and inner and outer race fault (IORF), with the NB and BN classifiers. Both classifiers have different abilities and advantages, and this study compares their results on the specified system. The NB and BN classifiers were found to give good classification accuracy in the fault diagnosis of the roller bearing using sound signals.
3   The classification accuracy of the BN and decision tree algorithms was found to be maximum when only the richer features were used, whereas the classification accuracy of NB increased with the number of features used.

2 Literature review

Fault diagnosis is performed by feature extraction, feature selection and feature
classification from a signal. Different kinds of features can be used, such as statistical
features (Samanta et al., 2003), histogram features (Sakthivel et al., 2011) and wavelet
features (Muralidharan and Sugumaran, 2012). The process of fault diagnosis was carried out
through several steps: feature extraction from the signals, feature selection from the
extracted features and classification using the selected features. In the present study,
statistical features were used.
From the sound signals acquired for the various conditions of the bearing, sets of
statistical features were extracted. The techniques available for the feature selection
process include principal component analysis (Soman and Ramachandran, 2005), genetic
algorithms (Zhang et al., 2005), decision trees (Sugumaran and Ramachandran, 2007) and so on.
In a study, Amarnath et al. (2013) used the decision tree technique for feature selection from
a given set of samples extracted from acquired sound signals (Samanta et al., 2003). The
feature that appears at the top of the decision tree is the most important feature and


hence will be the first selected feature. All other features will also be selected using the
same method.
Many classifiers, namely artificial neural network (ANN), fuzzy, decision tree,
support vector machine (SVM), proximal support vector machine (PSVM), NB and BN,
have been used for feature classification after feature selection. In a study, Wang and
Chen (2007) used frequency domain signals for fault diagnosis of a centrifugal pump with
fuzzy and neural network classifiers for fault classification (Sugumaran and
Ramachandran, 2007; Wang and Chen, 2007). The efficiency of a fuzzy classifier depends
on the rules generated by the algorithm. Two different ANN approaches, namely a feed
forward network and a binary adaptive resonance network (ART1), were used to develop a
fault classification system for a centrifugal pump (Rajakarunakaran et al., 2008). ANN
gave very good and efficient results; however, training an ANN classifier is a
complex and time-consuming process.
For the fault diagnosis of various machine elements, data mining algorithms
like SVM and PSVM have been used. A method that jointly optimises the feature
selection and the SVM parameters was presented in one such study, in which the fault
features and the SVM parameters were described by a hybrid vector taken as a constraint
condition (Sakthivel et al., 2010). Yuan and Chu (2006, 2007) used SVM for fault diagnosis
in the form of a binary tree classifier composed of several two-class classifiers. However,
the computational complexity, training time and pattern size increase for PSVM (Yuan
and Chu, 2006, 2007). Sugumaran et al. (2007) used SVM and PSVM for fault
classification using statistical features. The classification accuracy of PSVM was found
to be better than that of SVM when the classification results of both were compared
(Sugumaran et al., 2007; Sakthivel et al., 2012).
A classifier with simple operation and high classification accuracy is required for
the feature selection and feature classification process. The J48 decision tree algorithm
satisfies these conditions and is being used in many applications. In a study, a novel
hybrid classification system based on the J48 algorithm and a one-against-all approach
was proposed by Polat and Gunes (2009) to classify multi-class problems. Sugumaran and
Ramachandran (2007) discussed condition monitoring of a roller bearing using the decision
tree algorithm and described the use of a decision tree to identify the best features, in
order of their importance, from a given set of samples.
Jin (2012) presented a fault diagnosis system for a vehicle hydraulic brake system by
applying virtual instrumentation technology to the online examination of the brake
system. The system collects signals from the hydraulic brake system and compares
the faults occurring in the signals (Jin, 2012). Addin and Sapuan (2008) presented an
article on damage detection in engineering materials. In the past, fault diagnosis of
components using a machine learning approach with algorithms like SVM, BN and NB
has been reported. A feature subset method was used for feature extraction and NB was used
for feature classification and to compare the results (Chen and Huang, 2009). NB and BN
have been used with good efficiency for condition monitoring (Elangovan, 2010).
Ananthapadmanaban and Radhakrishnan (1983) studied the effect of surface
irregularities in sliding and rolling contacts on noise generation. According to their results,
an increase in roughness increases the overall noise level of the system; the noise
level increases as a fine surface wears but decreases as a rough surface wears.
Hence, monitoring the overall noise level and discrete frequency patterns provides
information about the condition of the contacting pair (Ananthapadmanaban and
Radhakrishnan, 1983). Baydar and Ball (2003) demonstrated the effectiveness of acoustic
and vibration signals for various


fault detection in a two-stage gearbox. The results of the acoustic and vibration signals
after signal analysis were compared, and acoustic signals were found to be very effective
compared with vibration signals for early fault detection in rotating machine elements
(Baydar and Ball, 2003). Shibata et al. (2000) used the summarised dot pattern method in the
fault diagnosis of a fan bearing using sound signals. The results were represented
diagrammatically, from which healthy and faulty bearings can be easily distinguished by
maintenance personnel (Shibata et al., 2000). Heng and Nor (1997) used sound pressure and
vibration signals to detect faults in rolling elements using a statistical parameter
estimation method. Statistical parameters such as the crest factor and the distribution
moments, including kurtosis, skewness and other parameters obtained from the beta
distribution function, were used in that study. After testing, kurtosis and crest factor from
both sound and vibration signals were found to provide better fault diagnostic information
than the beta function parameters (Heng and Nor, 1997).
Wang and Chen (2011) used an intelligent method for fault diagnosis of bearings on
the basis of a fuzzy neural network and possibility theory, with frequency domain features
extracted from vibration signals. When tested, faults were automatically identified on the
basis of the possibilities of symptom parameters (Wang and Chen, 2011). Amarnath et al.
(2013) used sound signals and built a model using a data modelling technique for fault
diagnosis of a roller bearing. The decision tree algorithm was used for feature learning and
classification, and the model was found to give an accuracy of 95.5% when tested with
ten-fold cross validation (Amarnath et al., 2013). Fernández-Francos et al. (2013) used an
automatic method for fault diagnosis of a roller bearing by pattern recognition and signal
processing techniques. They combined m-SVM, envelope analysis and a rule-based expert
system for early detection and diagnosis of the faulty component of the bearing. Bearing
fault detection and identification of the failure location were achieved at a very early stage
(Fernández-Francos et al., 2013).

3 Experimental setup

In the present study, the experiment is conducted on the bearing (SKF R7 NB 62) of a pump motor.
The motor speed is normally 1,200 rpm. A roller bearing basically has two parts,
known as the outer raceway and the inner raceway, with a set of rolling elements
which rotate on tracks. A bearing is fixed in the test rig. The four test conditions
investigated are:
1   healthy bearing (Good)
2   inner race fault (IRF)
3   outer race fault (ORF)
4   inner and outer race fault (IORF).

To introduce pits in the inner and outer races of the bearing, faults are simulated using
electric discharge machining. The depth and diameter of the cylindrical pits in the inner
and outer races are approximately 0.7 mm. The sound signal from the bearing is acquired by
mounting a microphone and a data acquisition system on the machinery. The healthy bearing is
then replaced by a defective bearing and sound signals are recorded for the remaining three cases
separately under the same operating conditions. The block diagram of the experimental setup is


shown in Figure 1. The bearing that is used in the present study has the following
specifications:
1   average diameter = 14 mm
2   ball diameter = 4 mm
3   no. of rolling elements = 7
4   contact angle = 0 deg
5   inner ring speed = 0 rpm
6   outer ring speed = 1,200 rpm
7   healthy condition = 33.3 Hz
8   frequency of IRF = 149 Hz
9   frequency of ORF = 84 Hz.

Figure 1   Experimental setup

As mentioned, there are four bearing conditions, and these conditions are the classes
into which the acquired sound signals need to be classified. Each class has
30 observations acquired using the microphone.

4 Feature extraction and feature selection

The process of computing a few parameters from a signal which represent that
particular signal is called feature extraction. For this study, a set of
statistical parameters was computed. The features are mean, standard error, sample
variance, kurtosis, skewness, minimum, maximum, standard deviation, count, mode and
median. These features were extracted from the sound signals. After feature extraction, the next
step is to select suitable features for classification, which is called the feature selection
process. In the present study, feature selection is done with the decision tree algorithm.

A comparative study of naive Bayes classifier and Bayes net classifier

121

A decision tree is a set of branches, nodes, a root and leaves. The chain of nodes from root
to leaf is a branch. Each node involves an attribute, and the attribute's position in the tree
provides information about its importance. The data set is taken from the source and
given to the J48 algorithm, and the output is a decision tree. The decision tree is used to
classify feature vectors: classification starts from the root of the tree and moves down until
a leaf node is reached. The algorithm works on the basis of the highest entropy reduction, and
this criterion is used to find the best feature from the list of features. The decision tree
(J48) algorithm has been applied to the sound signal data from the roller bearing, and the
resulting tree is shown in Figure 2. The computational environment used in this study is Weka 3.7.9.
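As a minimal sketch of the entropy-reduction criterion mentioned above, the functions below compute the information gain of a candidate threshold split on a labelled dataset. The threshold-based split on a continuous feature is an assumption for illustration; Weka's J48 additionally normalises the gain, so this is the idea rather than the exact implementation.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label array, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(feature_values, labels, threshold):
    """Entropy reduction obtained by splitting on feature <= threshold."""
    left = labels[feature_values <= threshold]
    right = labels[feature_values > threshold]
    if len(left) == 0 or len(right) == 0:
        return 0.0
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted

# The feature/threshold pair with the largest information gain becomes the next node
# of the tree; 'maximum' appearing at the root means it gave the largest reduction
# in entropy on the full training set.
```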
Figure 2   Decision tree using J48 algorithm

The method for feature selection is as follows (a code sketch of the ranking idea is given after the list):
1   The decision tree (shown in Figure 2) is obtained using the J48 algorithm when the 13 statistical features and the condition (good, IRF, ORF and IORF) are given to the algorithm.
2   As shown in the decision tree, only six statistical features were selected by the algorithm for deciding the fault condition, as only six of them appear in the tree.
3   Only these six statistical features are selected for determining the classification accuracy of the NB and BN classifiers.
4   The selected features are maximum, mean, median, mode, range and kurtosis.
5   As maximum is at the top of the decision tree, it is the most important statistical feature and is therefore selected first.
6   The other two statistical features that appear next in the decision tree are mean and mode. As mean has more branching than mode, mean is selected second and mode third.
7   Using the same procedure, median, range and kurtosis are selected as the fourth, fifth and sixth statistical features.
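The sketch below illustrates this ranking idea using scikit-learn's decision tree as a stand-in for Weka's J48 (an assumption, since the two implementations differ): features are ranked by the impurity reduction they contribute and the top six are retained. The feature matrix X, the label vector y and the feature name list are hypothetical placeholders.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

feature_names = ["mean", "standard error", "median", "mode", "standard deviation",
                 "sample variance", "kurtosis", "skewness", "range",
                 "minimum", "maximum", "sum", "count"]

def select_features(X, y, k=6):
    """Rank the 13 statistical features with a decision tree and keep the top k."""
    tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
    order = np.argsort(tree.feature_importances_)[::-1]   # most important first
    return [feature_names[i] for i in order[:k]]

# Hypothetical usage with the 13-column feature matrix X and condition labels y:
# print(select_features(X, y))   # e.g. ['maximum', 'mean', 'mode', ...]
```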


5 Classification based on NB algorithm

The NB algorithm works on the basis of Bayes' rule together with the assumption that all attributes $X_1, \ldots, X_n$ are conditionally independent of each other given $Y$. This assumption dramatically simplifies the representation of $P(X \mid Y)$ and the difficulty of estimating it from training data. Consider, for example, the case where $X = (X_1, X_2)$. Then

\[ P(X \mid Y) = P(X_1, X_2 \mid Y) = P(X_1 \mid X_2, Y)\, P(X_2 \mid Y) = P(X_1 \mid Y)\, P(X_2 \mid Y) \tag{1} \]

More generally, when $X$ contains $n$ attributes which are conditionally independent of each other given $Y$,

\[ P(X_1, \ldots, X_n \mid Y) = \prod_{i=1}^{n} P(X_i \mid Y) \tag{2} \]

When $Y$ and the $X_i$ are Boolean variables, only $2n$ parameters are needed to define $P(X_i = x_{ik} \mid Y = y_j)$ for the necessary $i$, $j$ and $k$. This is a considerable reduction compared with the $2(2^n - 1)$ parameters required when no conditional independence assumption is made.

The algorithm is derived under the assumption that $Y$ is a discrete-valued variable and $X_1, \ldots, X_n$ are attributes with real or discrete values. The main objective is to train the classifier. By Bayes' rule, the probability that $Y$ takes its $k$th value is

\[ P(Y = y_k \mid X_1, \ldots, X_n) = \frac{P(Y = y_k)\, P(X_1, \ldots, X_n \mid Y = y_k)}{\sum_j P(Y = y_j)\, P(X_1, \ldots, X_n \mid Y = y_j)} \tag{3} \]

where the sum is over all possible values $y_j$ of $Y$. If we assume that the $X_i$ are conditionally independent given $Y$, equation (3) becomes

\[ P(Y = y_k \mid X_1, \ldots, X_n) = \frac{P(Y = y_k) \prod_i P(X_i \mid Y = y_k)}{\sum_j P(Y = y_j) \prod_i P(X_i \mid Y = y_j)} \tag{4} \]

This fundamental equation represents the NB classifier. Given a new instance $X^{\mathrm{new}} = (X_1, \ldots, X_n)$, it shows how to calculate the probability that $Y$ takes on any given value from the observed attribute values of $X^{\mathrm{new}}$ and the distributions $P(Y)$ and $P(X_i \mid Y)$ estimated from the training data. The most probable value of $Y$ according to the NB classification rule is

\[ Y \leftarrow \arg\max_{y_k} \frac{P(Y = y_k) \prod_i P(X_i \mid Y = y_k)}{\sum_j P(Y = y_j) \prod_i P(X_i \mid Y = y_j)} \tag{5} \]

which simplifies to the following, since the denominator does not depend on $y_k$:

\[ Y \leftarrow \arg\max_{y_k} P(Y = y_k) \prod_i P(X_i \mid Y = y_k) \tag{6} \]
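The classification rule in equation (6) can be illustrated with a short sketch. The snippet below is a minimal Gaussian naive Bayes implementation in Python/NumPy; it assumes the statistical features are arranged in an array X with one row per signal segment and an integer label vector y for the four bearing conditions. It illustrates the rule itself and is not the Weka implementation used in the study, which handles the feature distributions differently.

```python
import numpy as np

class SimpleGaussianNB:
    """Minimal Gaussian naive Bayes illustrating equation (6):
    choose the class maximising P(Y=y_k) * prod_i P(X_i | Y=y_k)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_ = np.array([np.mean(y == c) for c in self.classes_])
        # Per-class mean and variance of each feature (Gaussian likelihood model).
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.vars_ = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes_])
        return self

    def predict(self, X):
        # Work in log-space: log P(y_k) + sum_i log P(x_i | y_k), then take argmax.
        log_prior = np.log(self.priors_)                                  # (n_classes,)
        log_lik = -0.5 * (np.log(2 * np.pi * self.vars_)[None, :, :]
                          + (X[:, None, :] - self.means_[None, :, :]) ** 2
                          / self.vars_[None, :, :]).sum(axis=2)           # (n_samples, n_classes)
        return self.classes_[np.argmax(log_prior[None, :] + log_lik, axis=1)]

# Hypothetical usage with feature rows (maximum, mean, mode, median, range, kurtosis)
# and labels 0 = Good, 1 = IRF, 2 = ORF, 3 = IORF:
# model = SimpleGaussianNB().fit(X_train, y_train)
# predictions = model.predict(X_test)
```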

6 Classification based on BN algorithm

A Bayesian network consists of a set of variables $V$ and a set of directed edges $E$ between the variables, which together form a directed acyclic graph (DAG) $G = (V, E)$. The joint distribution of the variables is represented as the product of conditional distributions, one for each variable. A conditional dependency between $A_i$ and $A_j$ is represented by a directed edge from $A_i$ to $A_j$, $(A_i, A_j) \in E$, and each random variable is represented by a node. In a Bayesian network, every variable is independent of its non-descendants given the values of its parents in $G$, and the reduction in the number of parameters required to characterise the joint distribution depends on the independence structure encoded in $G$, which makes computation of posterior distributions efficient.

For the Bayesian network variables $V = \{A_1, A_2, \ldots, A_n\}$, the joint distribution of $V$ is the product of the conditional probabilities of each $A_i$ given its parent set $Pa_i$, as specified by the network:

\[ P(A_1, A_2, \ldots, A_n) = \prod_{i=1}^{n} P(A_i \mid Pa_i) \tag{7} \]

Here $P(A_i \mid Pa_i)$ denotes the conditional probability of $A_i$ given its parent set $Pa_i$. In this study, the BN and NB classifiers are used for classification. The feature vectors (maximum, mean, median, mode, range and kurtosis) were used as input for training the classifiers. For a particular bearing fault condition, 550 sets of data points were considered. After testing and validation of the classifiers, the accuracy of the classifiers with the statistical features is discussed in the next section.
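Equation (7) can be illustrated with a small numerical sketch. The snippet below evaluates the joint probability of a toy three-node network (condition → feature A, condition → feature B) from hand-written conditional probability tables and then normalises over the conditions; the structure and the numbers are illustrative assumptions, not the network learned by Weka in this study.

```python
# Toy Bayesian network: Condition -> FeatA, Condition -> FeatB.
# Equation (7) gives the joint probability P(C, A, B) = P(C) * P(A | C) * P(B | C).

# Illustrative (made-up) conditional probability tables over discretised features.
p_condition = {"Good": 0.25, "IRF": 0.25, "ORF": 0.25, "IORF": 0.25}
p_feat_a = {"Good": 0.1, "IRF": 0.8, "ORF": 0.6, "IORF": 0.9}  # P(FeatA = high | C)
p_feat_b = {"Good": 0.2, "IRF": 0.7, "ORF": 0.8, "IORF": 0.9}  # P(FeatB = high | C)

def joint(condition, feat_a_high, feat_b_high):
    """P(Condition, FeatA, FeatB) as the product of the local conditionals."""
    pa = p_feat_a[condition] if feat_a_high else 1.0 - p_feat_a[condition]
    pb = p_feat_b[condition] if feat_b_high else 1.0 - p_feat_b[condition]
    return p_condition[condition] * pa * pb

# Posterior over conditions for observed evidence (FeatA = high, FeatB = high),
# obtained by normalising the joint probabilities -- the same Bayes-rule step a BN
# classifier performs when predicting the bearing condition.
evidence = [joint(c, True, True) for c in p_condition]
total = sum(evidence)
posterior = {c: v / total for c, v in zip(p_condition, evidence)}
print(posterior)
```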

7 Results and discussion

The fault diagnosis of the roller bearing is carried out using a machine learning approach with
statistical features and the NB and BN classifiers. The results of the study are discussed
below.

7.1 Statistical features selection and classification


The feature selection was carried out using decision tree, NB and BN classifiers. The
features were selected based on their appearance in the decision tree. The selected
features in the order of their precedence are maximum, mean, mode, median, range and
kurtosis. Other features were used directly as they appear in the feature list of Weka
(computational environment). The Decision tree algorithm is used instead of other
algorithm for feature selection because it has ability to handle training data with missing
attribute values and can also handle both discrete and continuous data. Decision tree is
used as it is simple, compact and easily understandable. For good feature selection in

124

R.K. Sharma et al.

order of their importance, decision tree technique is used in the present study. The
features selected from decision tree algorithm were used initially as they appeared in the
top of decision tree and as they have richer information. Features that did not appear in
the decision tree can be considered almost irrelevant for most of the algorithm. Including
these algorithms, can reduce the classification accuracy and hence they are used at the
end of study.
There are 13 statistical features in the sound signals received from the bearing: mean,
standard error, median, mode, standard deviation, sample variance, kurtosis, skewness, range,
minimum, maximum, sum and count. Using the decision tree, six effective features for the
machine learning approach were selected (those which appear in the decision tree) and then
ordered as per their importance. The order found is maximum, mean, mode, median, range and
kurtosis. All these features were used for classification with the above algorithms, and the
comparative classification accuracy results were deduced in the present study. The effect of
the number of features on classification accuracy for the various algorithms (decision tree,
NB and BN) is given in Table 1. The computational environment used in this study is
Weka 3.7.9 with ten-fold cross validation.
Table 1   Effect of number of features on classification accuracy

No. of features    Classification accuracy (%)
                   C4.5 decision tree    Naive Bayes    Bayes net
1                  57.50                 54.16          48.33
2                  78.33                 74.16          77.50
3                  77.50                 75.00          76.66
4                  78.33                 75.83          74.16
5                  86.67                 76.66          88.33
6                  82.50                 75.83          89.16
7                  82.50                 80.00          86.67
8                  82.17                 82.50          85.00
9                  84.16                 81.67          81.66
10                 85.00                 85.00          86.67
11                 83.33                 84.17          84.17
12                 85.00                 85.33          87.50
13                 85.00                 85.83          87.50

As shown in Figure 2, only six statistical features appear in the decision tree; therefore,
the possibility of achieving a higher classification accuracy lies within this range. Table 1
presents the classification accuracy achieved by each algorithm for different numbers of
features. Selecting the right number of features depends upon the requirement, which can be
either lower computational time or maximum classification accuracy; this can be decided based
on the available computational resources and the system. Training of the NB classifier is done
using the sound signal of the roller bearing, and the confusion matrix for NB at maximum
classification accuracy is shown in Table 2(a). A ten-fold cross-validation approach is used
for training and validation.
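A sketch of how such a table can be generated is shown below, assuming the features are already ordered by importance in the columns of a matrix X_ordered. scikit-learn's GaussianNB and DecisionTreeClassifier stand in for Weka's naive Bayes and J48 (a BN column would need a separate library), so the exact percentages would differ from Table 1.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

def accuracy_vs_feature_count(X_ordered, y, folds=10):
    """Ten-fold cross-validated accuracy using the first k columns, k = 1..n."""
    rows = []
    for k in range(1, X_ordered.shape[1] + 1):
        Xk = X_ordered[:, :k]
        tree_acc = cross_val_score(DecisionTreeClassifier(random_state=0), Xk, y,
                                   cv=folds).mean()
        nb_acc = cross_val_score(GaussianNB(), Xk, y, cv=folds).mean()
        rows.append((k, 100 * tree_acc, 100 * nb_acc))
    return rows

# Hypothetical usage:
# for k, tree_acc, nb_acc in accuracy_vs_feature_count(X_ordered, y):
#     print(f"{k:2d} features: tree {tree_acc:5.2f}%  NB {nb_acc:5.2f}%")
```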


7.2 Statistical features using NB algorithm


For the NB algorithm, the accuracy is maximum for fault diagnosis of roller bearing
using sound signal when all the 13 features are used for classification. Hence,
classification accuracy for the NB algorithm for 13 features are calculated as 85.83%.
Below is the confusion matrix for all the features by NB algorithm.
Table 2(a)   Confusion matrix for NB algorithm

          Good   IRF   ORF   IORF
Good        28     0     2      0
IRF          1    29     0      0
ORF          –     –    20      –
IORF         –     –     –     26

Table 2(b)   Confusion matrix for BN algorithm

          Good   IRF   ORF   IORF
Good        25     –     –      –
IRF          0    30     0      0
ORF          –     –    26      –
IORF         –     –     –     26

The terms used in the confusion matrices are Good, IRF, ORF and IORF. The best way to
visualise the result is the confusion matrix, as shown in Table 2(a) for the NB algorithm with
all features. For each condition of the roller bearing, 30 samples were considered. The
diagonal elements of the confusion matrix represent the number of correctly classified data
points. The procedure for reading the confusion matrix, which takes the form of a square
matrix, is as follows. Referring to Table 2(a), the first row represents the data points which
correspond to the good bearing condition (Good), and its first column entry shows how many of
them are correctly classified as the good bearing condition. In the first row, 28 out of 30
are correctly classified; there are no misclassifications as IRF or IORF, but there are two
misclassifications as ORF. The IRF entry in the first row is zero, which means that none of
the good-condition data points are misclassified as the IRF condition. The second row
represents the data points corresponding to the IRF condition; its first column entry gives
the number of those data points misclassified as the good condition, even though they belong
to the IRF condition. In this case, one data point is misclassified as the good condition. The
entry in the second row and second column shows how many IRF data points have been correctly
classified as IRF: out of 30 data points, 29 are correctly classified and one is misclassified
as the good bearing condition. The remaining rows are read in the same way.
An observation of Table 1 for the NB classifier reveals an important behaviour of the
classifier: its classification accuracy increased almost steadily with the number of features
used for classification, whereas the highest classification accuracy of the decision tree
algorithm was obtained when only the top five features were used.


7.3 Statistical features using BN algorithm


The effect of number of features on classification accuracy for the BN algorithm is given
in Table 1. In the present study, BN algorithm was showing maximum classification
accuracy when top six features were used. Unlike NB, BN does not show an increase in
classification accuracy with increase in the number of features used for classification.
The BN algorithm is trained with statistical features of sound signal from roller
bearings. The confusion matrix for BN classification of statistical feature is shown in
Table 2(b). Ten folds cross validation approach is used training.
BN classifier has the highest classification accuracy of 89.16% with the specified
system. The reason for high accuracy of BN is no misclassification in IRF condition and
most of the data points lie on the diagonal of the confusion matrix. Classification
accuracy for BN algorithm was calculated as 89.16 % for top six features obtained from
decision tree algorithm.

7.4 Detailed accuracy by class


True positive rate (TP rate) stands for the number of correctly labelled items like data
belonging to the positive class and also called TP rate. False positive rate (FP rate) stands
for the result which shows a positive outcome; however actually the outcome is not
positive and hence, it is also called FP rate. When a classifier retrieves information and
recognises pattern in given data then the fraction of retrieved instances that are relevant is
Precision. While the fraction of relevant instances that are retrieved are Recall. Both
Precision and Recall are based on understanding and measure of relevance.
TP rate means the rate to which algorithm is classifying correct data as correct. FP
rate means the rate to which algorithm is classifying incorrect data as correct. Matthews
correlation coefficient is (MCC) which accounts true positive, false positive, true
negative and false negative and provides a balanced measure which can be used even
with different size classes. ROC area is the area under receiver operating characteristic
(ROC) curve. PRC area is area under precision-recall curve (PRC). A mathematical
equation that combines precision and recall is the harmonic mean of precision and recall
is known as traditional F-measure and it is denoted by F.
\[ F = 2 \cdot \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{8} \]

Here, precision and recall are

\[ \mathrm{Precision} = \frac{\lvert\{\text{relevant instances}\} \cap \{\text{retrieved instances}\}\rvert}{\lvert\{\text{retrieved instances}\}\rvert}, \qquad \mathrm{Recall} = \frac{\lvert\{\text{relevant instances}\} \cap \{\text{retrieved instances}\}\rvert}{\lvert\{\text{relevant instances}\}\rvert} \]
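As a small sketch of how these measures follow from a confusion matrix, the function below computes per-class precision, recall and F-measure from a square confusion matrix such as Table 2(a). It assumes rows are actual classes and columns are predicted classes.

```python
import numpy as np

def per_class_metrics(conf):
    """Precision, recall and F-measure for each class of a confusion matrix
    (rows = actual class, columns = predicted class)."""
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)
    precision = tp / conf.sum(axis=0)   # correct / all predicted as this class
    recall = tp / conf.sum(axis=1)      # correct / all actually in this class (TP rate)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# For the 'Good' row recoverable from Table 2(a) (28 correct, 2 misclassified as ORF),
# the recall works out to 28/30 = 0.933, matching the TP rate reported for NB below.
```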

The detailed accuracy by class is calculated. The final results for the NB classifier and the
BN classifier at their maximum accuracy are shown in Figure 3(a) and Figure 3(b)
respectively.

Figure 3   (a) Detailed accuracy by class for NB algorithm (b) Detailed accuracy by class for BN algorithm

Figure 3(a) shows the detailed accuracy by class for the NB classification used in the study
with all 13 features. In the first row of the confusion matrix in Table 2(a), 28 out of 30 are
correctly classified as the good bearing condition (Good). According to the detailed accuracy
by class in Figure 3(a), the precision for this class is 0.778. Two data points are
misclassified as ORF, so the TP rate for this class is 93.3% (0.933) and the FP rate is 8.9%
(0.089); the F-measure value is 84.8% (0.848). Hence, the detailed accuracy of the datasets
can easily be studied using these measures.
Figure 3(b) shows the detailed accuracy by class for the BN classification. According to the
BN algorithm, in the IRF row 30 out of 30 data points are correctly classified, as shown in
the confusion matrix in Table 2(b), and there is no misclassification; hence from Figure 3(b)
the TP rate and recall are 100% (1.000). The FP rate is 3.3% (0.033), with a precision of
0.909 and an F-measure of 0.952. The overall accuracy of BN was found to be 89.17% and that of
NB 85.83%, and hence the classification accuracy of the BN algorithm is higher than that of
the NB classification.
The final result of this study is that the classification accuracy of the BN algorithm using
statistical features from the sound signal of the roller bearing is comparatively higher. The
BN classifier achieves high classification accuracy using only the top six features (maximum,
mean, mode, median, range and kurtosis) selected by the decision tree algorithm, which also
saves computational time compared with using all the features. The results obtained in this
study are specific to this particular dataset, and this classification does not assure similar
performance for all feature datasets under other conditions.

8 Conclusions

This study deals with sound signal-based fault diagnosis of a roller bearing of machinery
using a machine learning approach. Four classification states were simulated on the test
setup. Feature sets were extracted from the data using the statistical feature extraction
method, and feature classification was done using the NB, BN and decision tree algorithms. The
results of NB and BN were compared. From the results of this study, one can confidently say
that, for the same dataset, the classification accuracy using BN is higher than that of the NB
classification. The BN classifier gave a good result, with an accuracy of 89.16%, when tested
with six statistical features. From Tables 2(a) and 2(b) it is found that, for BN, the entries
in the first column of the second, third and fourth rows are relatively small, which means
that very few bad-condition bearings are classified as good-condition bearings. This is an
important advantage: if a good-condition bearing is classified as a bad-condition bearing, the
loss may only be the cost of a bearing, but classification of a bad-condition bearing as a
good-condition bearing can lead to a complete shutdown and heavy losses. In a study, Amarnath
et al. (2013) found a classification accuracy of 95.9% with the decision tree algorithm using
sound signals acquired from a roller bearing. In a similar study, we found a classification
accuracy of 86.67% using the decision tree algorithm with the representative dataset, and the
same dataset achieves a classification accuracy of 89.16% using the BN algorithm. Comparing
these results, it is found that for a given dataset BN can achieve higher classification
accuracy than the decision tree algorithm. The results are calculated and presented only for
the representative dataset and fault conditions.

References
Addin, O. and Sapuan, S.M. (2008) 'A naive-Bayes classifier for damage detection in engineering materials', Materials and Design, Vol. 28, No. 8, pp.2379–2386.
Amarnath, M., Sugumaran, V. and Kumar, H. (2013) 'Exploiting sound signals for fault diagnosis of bearings using decision tree', Measurement, Vol. 46, No. 3, pp.1250–1256.
Ananthapadmanaban, T. and Radhakrishnan, V. (1983) 'An investigation of the role of surface irregularities in the noise spectrum of rolling and sliding contact', Wear, Vol. 83, No. 2, pp.300–409.
Baydar, N. and Ball, A. (2003) 'Detection and diagnosis of gear failure via vibration and acoustic signals using wavelet transform', Mechanical Systems and Signal Processing, Vol. 17, No. 4, pp.787–804.
Chen, J. and Huang, H. (2009) 'Feature selection for text classification with naive Bayes', Expert Systems with Applications, Vol. 36, No. 3, pp.5432–5435.
Elangovan, M. (2010) 'Studies on Bayes classifier for condition monitoring of single point carbide tipped tool based on statistical and histogram features', Expert Systems with Applications, Vol. 37, No. 3, pp.2059–2065.
Fernández-Francos, D., Martínez-Rego, D., Fontenla-Romero, O. and Alonso-Betanzos, A. (2013) 'Automatic bearing fault diagnosis based on one-class m-SVM', Computers & Industrial Engineering, Vol. 64, No. 1, pp.357–365.
Heng, R.B.W. and Nor, M.J.M. (1997) 'Statistical analysis of sound and vibration signals for monitoring rolling element bearing condition', Applied Acoustics, Vol. 53, Nos. 1–3, pp.211–266.
Jin, Y. (2012) 'Design of hydraulic fault diagnosis system based on LabVIEW', Advanced Materials Research, Vols. 457–458, pp.257–260.
Muralidharan, V. and Sugumaran, V. (2012) 'A comparative study of naive Bayes classifier and Bayes net classifier for fault diagnosis of monoblock centrifugal pump using wavelet analysis', Journal of Applied Soft Computing, Vol. 12, No. 8, pp.1–7.
Polat, K. and Gunes, S. (2009) 'A novel hybrid intelligent method based on C4.5 decision tree classifier and one-against-all approach for multi-class classification problems', Expert Systems with Applications, Vol. 36, No. 2, pp.1587–1592.


Rajakarunakaran, S., Venkumar, P., Devaraj, D. and Surya Prakasa Rao, K. (2008) 'Artificial neural network approach for fault detection in rotary system', Applied Soft Computing, Vol. 8, No. 1, pp.740–748.
Sakthivel, N.R., Indira, V., Nair, B.B. and Sugumaran, V. (2011) 'Use of histogram features for decision tree based fault diagnosis of monoblock centrifugal pump', International Journal of Granular Computing, Rough Sets and Intelligent Systems (IJGCRSIS), Vol. 2, No. 1, pp.23–36.
Sakthivel, N.R., Sugumaran, V. and Nair, B.B. (2010) 'Application of support vector machine (SVM) and proximal support vector machine (PSVM) for fault classification of monoblock centrifugal pump', International Journal of Data Analysis Techniques and Strategies, Vol. 2, No. 1, pp.38–61.
Sakthivel, N.R., Sugumaran, V. and Nair, B.B. (2012) 'Automatic rule learning using rough set for fuzzy classifier in fault categorization of centrifugal pump', International Journal of Applied Soft Computing, Vol. 12, No. 1, pp.196–203.
Samanta, B., Al-balushi, K.R. and Al-Araim, S.A. (2003) 'Artificial neural networks and support vector machines with genetic algorithm for bearing fault detection', Engineering Applications of Artificial Intelligence, Vol. 16, Nos. 7–8, pp.657–665.
Shibata, K., Takahashi, A. and Shirai, T. (2000) 'Fault diagnosis of rotating machinery through visualization of sound signals', Mechanical Systems and Signal Processing, Vol. 14, No. 2, pp.229–241.
Soman, K.P. and Ramachandran, K.I. (2005) Insight into Wavelets from Theory to Practice, Prentice-Hall of India Private Limited, New Delhi, India.
Sugumaran, V. and Ramachandran, K.I. (2007) 'Automatic rule learning using decision tree for fuzzy classifier in fault diagnosis of roller bearing', Mechanical Systems and Signal Processing, Vol. 21, No. 5, pp.2237–2247.
Sugumaran, V., Muralidharan, V. and Ramachandran, K.I. (2007) 'Feature selection using decision tree and classification through proximal support vector machine for fault diagnostics of roller bearing', Mechanical Systems and Signal Processing, Vol. 21, No. 2, pp.930–942.
Wang, H. and Chen, P. (2011) 'Intelligent diagnosis method for rolling element bearing faults using possibility theory and neural network', Computers & Industrial Engineering, Vol. 60, No. 4, pp.511–518.
Wang, H.Q. and Chen, P. (2007) 'Sequential condition diagnosis for centrifugal pump system using fuzzy neural network', Neural Information Processing: Letters and Reviews, Vol. 2, No. 3, pp.41–50.
Yuan, F. and Chu, F-L. (2006) 'Support vector machines-based fault diagnosis for turbo-pump rotor', Mechanical Systems and Signal Processing, Vol. 20, No. 4, pp.939–952.
Yuan, S-F. and Chu, F-L. (2007) 'Fault diagnostics based on particle swarm optimization and support vector machines', Mechanical Systems and Signal Processing, Vol. 21, No. 4, pp.1787–1798.
Zhang, L., Jack, L.B. and Nandi, A.K. (2005) 'Fault detection using genetic programming', Mechanical Systems and Signal Processing, Vol. 19, No. 2, pp.271–289.
