IEEE TRANSACTIONS ON CYBERNETICS, VOL. 43, NO. 1, FEBRUARY 2013

Abstract—Forecasting the future states of a complex system is a complicated challenge that is encountered in many industrial applications covered by the community of prognostics and health management. Practically, states can be either continuous or discrete: Continuous states generally represent the value of a signal, while discrete states generally depict functioning modes reflecting the current degradation. For each case, specific techniques exist. In this paper, we propose an approach based on case-based reasoning that jointly estimates the future values of the continuous signal and the future discrete modes. The main characteristics of the proposed approach are the following: 1) it relies on the K-nearest neighbor algorithm based on belief function theory; 2) belief functions allow the user to represent his/her partial knowledge concerning the possible states in the training data set, particularly concerning transitions between functioning modes, which are imprecisely known; and 3) two distinct strategies are proposed for state prediction, and the fusion of both strategies is also considered. Two real data sets were used in order to assess the performance in estimating the future breakdown of a real system.

Index Terms—Belief functions, pattern analysis, partially supervised learning, prognostics, similarity-based reasoning.

I. INTRODUCTION

A. Problem Statement

Practically, states can be either continuous or discrete.
1) Continuous states generally represent the value of a signal (an observation or a feature), and their prediction can be made by Kalman-like procedures or by neural networks [4], [5].
2) Discrete states generally depict functioning modes reflecting the current degradation, and their prediction can be performed by state machines such as hidden Markov models [21].

In both cases, data-driven prognostics generally involves a training procedure in which statistical models of the degradation are built. However, in real applications, one faces a lack of knowledge about the system, since many unknown factors cannot be identified, for example, environmental conditions and particular functioning modes. These factors may play a crucial role in the data collection process and, therefore, in the degradation modeling, limiting the applicability of statistical models. In [7], this problem is underlined and tackled in the context of neuro-fuzzy systems.

To cope with this lack of knowledge, case-based reasoning (CBR) was proposed as an efficient alternative for prognostics. For example, the method described in [3] demonstrated better performance than neural networks for continuous state prediction in a turbofan engine.

The proposed algorithm, EVIPRO-KNN, relies on few parameters and has the following characteristics.
1) EVIPRO-KNN is a new prognostics approach based on belief functions: A trajectory similarity-based approach based on belief functions is proposed for prognostics. Belief functions were proposed precisely to cope with the lack of data in data representation, combination, and decision making [8]–[11], and they account for both variability and incomplete knowledge by drawing benefits from both probability theory and set-membership approaches in one common and sound framework.
2) EVIPRO-KNN takes into account partial labeling on states: In some applications, the training data set is composed of continuous trajectories and of a set of labels reflecting the current system state. These labels can be obtained by manual annotation, as is commonly done in supervised classification [4], by a clustering method [12], or by a posteriori classification [6]. If these labels are known only partially, then belief functions can be used [13].
3) EVIPRO-KNN manages trajectories with different temporal lengths: The weighted sum of trajectories used to compute the prediction of observations requires trajectories with the same length, which is generally false in most applications. As far as we know, this problem of length was not treated in the past, while it can have a strong influence on the final result. We describe two approaches to solve it.
4) EVIPRO-KNN is able to predict continuous and discrete states jointly: The prediction of the future sequence of states is performed jointly with the prediction of continuous observations. As far as we know, this joint prediction was considered neither in PHM applications nor in CBR-based prediction.

The possibility to manage prior information on the possible states in the training data set is one of the main advantages of EVIPRO-KNN. Indeed, in most papers, only two states are considered: normal and faulty. However, in many real cases, more states have to be considered. When knowledge on states is not always certain and precise, belief functions can be used as described in this paper.

The other main asset of EVIPRO-KNN is the possibility to predict the sequence of continuous observations jointly with discrete states. These sequences give the user access to the online segmentation of the currently observed data and may be practically useful. As shown in the experiments, sequences of states generate accurate estimates of the RUL of the system.

Finally, RUL estimation is generally based on the study of a 1-D degradation signal. If this signal becomes greater than a given threshold, then the system is said to enter a potentially dangerous mode. This system health assessment [14] requires the threshold to be tuned precisely, which can be a practical problem, particularly when the signal does not have a physical meaning. Moreover, the use of multidimensional degradation signals is preferred to ensure reliable RUL estimates, making the use of thresholding techniques practically difficult. In comparison, the proposed EVIPRO-KNN algorithm is based on a classification process, which enables one to assess the discrete state (functioning mode) of the system while allowing the use of a multidimensional degradation signal [6].

The remainder of this paper is organized as follows. Section II presents the related work. Section III is dedicated to the description of the basics of belief functions and of the notations. The proposed algorithm is described in Section IV. Finally, Section V demonstrates the performance of the approach on real data.

II. RELATED WORK

In PHM applications, approaches generally aim at computing long-term predictions of continuous observations followed by a thresholding in order to detect the fault mode and to finally estimate the RUL of the system. Long-term prediction of continuous observations can be made by several data-driven techniques such as neural networks and statistical models. These approaches require a significant amount of representative data and generally produce black boxes. However, in PHM applications, the lack of data is an important problem, particularly for faulty states, which are difficult to obtain without degrading the system. The cost of obtaining these data may be high, and solutions have thus to be proposed. The K-nearest neighbor-based approach, a well-known nonparametric solution for pattern analysis [4], was shown to be well suited because it fully exploits the training data set and remains applicable when the latter is small. For example, in [3], KNN demonstrated performance better than previous techniques, with lower complexity and better interpretability.

For prognostics applications [1]–[3], the estimation of the RUL by KNN-based approaches generally involves three tasks [3]: instance retrieval in the training data set (as in the usual KNN), prediction through local models, and aggregation of local predictions. The actual life of the training instance is generally used as the prediction of the test instance's life. The local predictions are aggregated to obtain the final RUL estimate using the weighted sum of the local predictions. Recent works such as [3] and [15] focus on associating confidence information with predictions made by KNN, but solutions generally rely on density estimation or require representative prior information in the training data set.

One can note that KNN-based systems for prediction were proposed in the 1990s, such as [16], where empirical and comparative studies are made between CBR and other alternative approaches. Several applications in time-series forecasting were also developed based on KNN. For example, in the telecom industry, customer churn prediction represents a key priority. In [17], Ruta et al. addressed the weakness of static churn prediction and proposed a new temporal churn prediction system. It uses the K-nearest sequence algorithm, which learns from the whole available customer data path and is capable of generating future data sequences along with precisely timed predicted churn events. Many applications concern finance, for example, [18], where KNNs are used to build a prediction model of the return on assets of a company, [19], where an ensemble of KNNs is exploited for bankruptcy prediction, and [20], where KNN is used for forecasting the final price of an ongoing online auction.
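The three KNN tasks just described (instance retrieval, local prediction through the retrieved instance's actual life, and weighted aggregation) can be sketched in a few lines of Python. This is a generic illustration of the literature's pipeline, not the EVIPRO-KNN algorithm itself; the function and variable names are hypothetical.

```python
import numpy as np

def knn_rul(train_trajs, y_window, K=3):
    """Sketch of KNN-based RUL estimation: retrieve the best-matching
    window in each run-to-failure trajectory, use the instance's
    remaining life as a local prediction, and aggregate by a
    distance-based weighted sum."""
    W = len(y_window)
    dists, local_ruls = [], []
    for traj in train_trajs:
        # Retrieval: best-matching window of length W inside the trajectory.
        d_best, t_best = min(
            (np.linalg.norm(traj[t:t + W] - y_window), t)
            for t in range(len(traj) - W + 1)
        )
        dists.append(d_best)
        # Local prediction: remaining life of the instance after the match.
        local_ruls.append(len(traj) - (t_best + W))
    # Aggregation: weighted sum over the K nearest instances.
    idx = np.argsort(dists)[:K]
    w = np.exp(-np.asarray(dists)[idx])
    w /= w.sum()
    return float(np.dot(w, np.asarray(local_ruls)[idx]))
```

The exponential weighting of distances mirrors the similarity weights used later in the paper; any decreasing function of the distance would serve the same purpose in this sketch.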
RAMASSO et al.: JOINT PREDICTION OF CONTINUOUS AND DISCRETE STATES IN TIME-SERIES

The K selected trajectories $T_k = \{(X_t^k, m_t^k)\}_{t=c_k}^{|T_k|}$, $k = 1 \ldots K$, are composed of both a set of features $X_t^k$ and knowledge on the possible states represented by belief masses $m_t^k$ defined on $\Omega$.
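The structure of one training case can be sketched as a small container type pairing features with per-step belief masses; the names and array shapes below are assumptions made for illustration, not the paper's notation.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Trajectory:
    """One training case T_k: feature vectors X_t^k plus (possibly
    partial) knowledge on the state, one belief mass m_t^k per step."""
    X: np.ndarray  # shape (|T_k|, d): one feature vector per time step
    m: list        # one dict {frozenset(states): mass} per time step

    def __len__(self):
        # |T_k|: temporal length of the trajectory
        return self.X.shape[0]
```

A fully supervised case would put the whole mass on a singleton at each step; a partially labeled case spreads mass over subsets of states.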
B. Step 2—Predicted Observation Trajectory

Fig. 2. Illustration of the observation prediction process. Each data point is a value in the space of reals.

A simple and usual way (common in KNN-based approaches) to define a predicted observation trajectory $X_t$ linked to the observation block $Y_t$ is to compute the weighted average of the K sets of features

  $X_{t+h} = \sum_{k=1}^{K} F^k X_l^k, \quad l = c_k \ldots |T_k|, \; h = 1 \ldots P$   (15)

where

  $P = |T_k| - c_k + 1.$   (16)
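As a concrete illustration of (15), the weighted average reduces to a few lines of NumPy once the K selected trajectories have been truncated to a common length P; the function name and array shapes are assumptions of this sketch.

```python
import numpy as np

def predict_observations(selected, weights):
    """Weighted average of K truncated trajectories, as in (15).

    selected : list of K arrays of shape (P, d), the features X_l^k
               for l = c_k ... |T_k|, here assumed equal in length
    weights  : K normalized similarity weights F^k
    """
    X = np.stack(selected)                 # shape (K, P, d)
    w = np.asarray(weights)[:, None, None]
    return (w * X).sum(axis=0)             # shape (P, d): X_{t+h}, h = 1..P
```

The equal-length assumption is exactly what fails in practice, which is why the paper introduces the cautious and bold strategies below.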
2) Bold Strategy: Let us consider that the kth best trajectory $T_k$ starts from $c_k$ (the time instant corresponding to the beginning of the best block in $T_k$). Therefore, trajectories in L are truncated (from $c_k$) before computing the prediction, and thus, the length of trajectory $T_{(k)}$ is reduced to $|T_{(k)}| - c_k + 1$. The truncated trajectories are then sorted according to their length: $|T_{(1)}| < |T_{(2)}| < \ldots < |T_{(k)}| < \ldots < |T_{(K)}|$. The average [a process similar to (15) but adapted for the bold strategy in (20)] is then taken over the K trajectories from their beginning to $|T_{(1)}|$, $k = 1 \ldots K$; then it is taken over $K-1$ trajectories ($T_{(2)} \ldots T_{(K)}$) from $|T_{(1)}| + 1$ to $|T_{(2)}|$, $k = 2 \ldots K$; and so on until the last trajectory $T_{(K)}$, for which the average equals $T_{(K)}$ from $|T_{(K-1)}| + 1$ to $|T_{(K)}|$.

The predicted features at time $t + h$ are thus formally given by

  $X_{t+h_i}^{BS} = \sum_{k=i}^{K} F_i^k X_{l_i}^k, \quad i = 1 \ldots K$
  $h_i = |T_{(i-1)}| + 1 \ldots |T_{(i)}|$
  $l_i = c_k \ldots |T_{(i)}|$   (20)

where BS stands for bold strategy and $|T_{(0)}| = 0$. The value of the weight $F_i^k$ is given by

  $F_i^k = \dfrac{\exp(-D_j^{(k)})}{\sum_{k'=i}^{K} \exp(-D_j^{(k')})}, \quad k = i, \ldots, K$   (21)

where $D_j^{(k)}$ is the distance between block $B_j^{(k)}$ and $Y_t$ after sorting the distances as detailed in the previous step.

The main advantage of this strategy is to provide long-term predictions (as long as the length of the largest selected trajectories), while the main drawback is the possible lack of reliability according to the horizon.

At the end of Step 2 (represented in lines 7–13 of Algorithm 1), the prediction of the observation trajectory $X_t$ is known according to the observation block $Y_t$ and to the training data set L. Note that exponential smoothing using the past prediction ($X_{t-1}$)

The main feature of this classifier is the possibility to manage the belief functions $m_t^i$ provided in the training data set L (partially supervised classification). Moreover, it is a model-free classifier, which is a well-suited approach for PHM applications, where we generally face imbalanced data [32].

The CPS thus applies the classifier to the predicted observations $X_{t+h}$, $\forall h$, given the training data set L, and provides a belief mass on the possible states

  $m_{t+h}^{CPS} = \mathrm{EvKNN\_classifier}(L, X_{t+h})$   (22)

where CPS stands for classification of prediction strategy. From this belief mass, a hard decision can be made to estimate the state of the current block by using the pignistic transform [10], which computes a probability distribution (suited for decision making) from the belief mass $m_{t+h}^{CPS}$ [see (8)]. Repeating this process on the blocks composing the predicted observation $X_t$, one simply obtains a sequence of states.

2) DPS: The previous strategy for state sequence prediction is efficient if the prediction of observations $X_t$ is sufficiently reliable. Indeed, if the predictions $X_t$ are far from the true trend, then the estimated states will be far from the truth too. However, even if, in some cases, the reliability can be questioned, the state sequence predicted by the first strategy allows the user to have a rough trend of the future states.

In order to avoid the dependence between state sequence prediction and observation prediction, we propose to exploit another strategy, the DPS. This second strategy draws benefits directly from the training data set. The main idea is to apply a reasoning similar to the one used for the features $X_t$, but now for the belief masses $m_t$.

To go further into details, let us consider the set of belief masses for the K nearest neighbors (gray-filled squares in Fig. 4, i.e., $m_t^k$, $k = 1 \ldots K$, $t = c_k \ldots |T_k|$). These K belief masses can be considered as coming from distinct pieces of evidence so that the conjunctive rule of combination [see (2)] can be used

  $m_{t+h}^{DPS} = \bigcap_{k=1}^{K} m_l^k, \quad l = c_k \ldots |T_k|$   (23)
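The standard transferable-belief-model operations used by the CPS and DPS, namely the conjunctive rule of combination, the discounting of (24), and the pignistic transform, can be sketched as follows. The dictionary-over-frozensets encoding of mass functions is an illustrative choice, not the paper's implementation.

```python
from itertools import product

def conjunctive(m1, m2):
    """Unnormalized conjunctive rule: combine two mass functions
    (dicts mapping frozensets of states to masses); the mass ending
    up on the empty set measures the conflict between the sources."""
    out = {}
    for (A, a), (B, b) in product(m1.items(), m2.items()):
        out[A & B] = out.get(A & B, 0.0) + a * b
    return out

def discount(m, weight, frame):
    """Discounting as in (24): keep a fraction `weight` of every mass
    and transfer the remainder onto the whole frame Omega."""
    omega = frozenset(frame)
    out = {A: weight * v for A, v in m.items() if A != omega}
    out[omega] = (1.0 - weight) + weight * m.get(omega, 0.0)
    return out

def pignistic(m, frame):
    """Pignistic transform: spread each mass uniformly over the
    elements of its focal set (conflict removed by renormalization)."""
    betp = {s: 0.0 for s in frame}
    open_mass = 1.0 - m.get(frozenset(), 0.0)
    for A, v in m.items():
        for s in A:  # the empty focal set contributes nothing
            betp[s] += v / (len(A) * open_mass)
    return betp
```

Folding `conjunctive` over the K neighbor masses gives a sketch of (23), and `pignistic` yields the probability distribution from which the hard state decision is taken.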
b) Managing the conflict: The amount of conflict may increase when the BBAs present contradictions, e.g., when two BBAs give important mass to subsets with an empty intersection. Moreover, the conflict is potentially higher when the BBAs are focused on singletons.

To decrease the amount of conflict during the fusion process, we propose to use a process called discounting [9], [33], whose interest is to transfer a part of all masses onto the set $\Omega$. By doing so, Dempster's rule generates less conflict. The transferred part is proportional to the weights [see (17)] so that the discounting becomes

  $m_t^k(A) \leftarrow F^k\, m_t^k(A), \quad \forall A \subset \Omega$
  $m_t^k(\Omega) \leftarrow (1 - F^k) + F^k\, m_t^k(\Omega).$   (24)

The higher the weight, the smaller the discount, meaning that the related BBA is trusted. Once the BBAs have been discounted, the estimated belief mass at time t in DPS is given by (23).

Fig. 5 shows an illustration of a particular DPS proposal where the belief masses have only categorical assignments (i.e., all masses are focused on only one singleton at each time step). The state is precisely known for each data point in the training set, and therefore, the states at each time step of the prediction (t + h) can be estimated by direct projection of their values using a process similar to that of the CPS (see Fig. 2). The dashed line represents the sequence obtained after fusion (Step 4), and the continuous bold line depicts the sequence obtained after fusion at the current iteration of the algorithm. The median value and the standard deviation of the time instants of transitions from state q to state r within the set of sequences computed at each iteration can be computed to locate transitions. In this case, the direct propagation acts like a projection of possible sequences of states.

Step 3 is represented in lines 14–15 of Algorithm 1.

D. Step 4—RUL Estimation

Step 4 is the final step of the EVIPRO-KNN algorithm. The goal of this step is to estimate the transition to a potential faulty state indicating potential damage of the monitored system.

1) CPS and DPS Fusion: To draw benefits from both the CPS and DPS approaches, the BBAs $m_{t+h}^{CPS}$ [see (22)] and $m_{t+h}^{DPS}$ [see (23)] are combined, and the resulting BBA is converted into a probability distribution from which a decision can be made [34]. Dempster's rule is not adapted for the fusion of the CPS and DPS BBAs because $m_{t+h}^{CPS}$ and $m_{t+h}^{DPS}$ cannot be considered as coming from distinct bodies of evidence. Indeed, the following statements are true.
1) CPS is a classification of predictions resulting from the weighted combination of continuous predictions.
2) DPS generates belief masses discounted by the weights.
Therefore, both approaches depend on the weights. Moreover, both rely on the BBAs in the training data set L.

Thus, the fusion may be performed using the cautious rule [see (3)]. The main disadvantage of the cautious rule comes from the fact that the neutral element is not always the vacuous BBA [26]. For EVIPRO-KNN, this can be a problem in practice if the belief masses are not available in the training data set (for DPS). Indeed, in this case, the fusion process between CPS and DPS does not lead to CPS. However, for a particular type of BBA called separable BBA [26], [35], the neutral element is the vacuous BBA, thus leading to the expected behavior. Moreover, when using the EvKNN classifier as proposed in this paper, the generated BBAs are separable, and this is also the case for other commonly used classifiers [24], [26].

Conclusively, the fusion process relies on the cautious rule and leads to a predicted BBA [see (5), (4), and (3), followed by (6) and (7)]

  $m_{t+h} = m_{t+h}^{CPS} \owedge m_{t+h}^{DPS}$   (25)

from which a decision concerning the state at time t + h can be made using (8). The result of this phase is the estimation of a sequence of states at times t + h. An illustration of such sequences is shown in Fig. 5.

2) RUL Estimates: Let us now consider this sequence of states and also all previously predicted sequences (see Fig. 5). Based on this set of sequences, we propose to compute some
expected sufficient statistics of the time instants of transition between states. Since each sequence is composed of possible transitions between some states q and r, the set of time instants of transitions between both states is

  $I_{qr} = \{t : \omega_{t-1} = q \text{ and } \omega_t = r\}.$   (26)

To estimate the RUL of the system, it is sufficient to determine the location of the critical transition from state q = "degrading state" to state r = q + 1 = "fault state"

  $\text{transition } q \to r \text{ critical} \;\Rightarrow\; RUL = \mu_{q,r} - t$   (27)

where $\mu_{q,r}$ is the estimated time instant of the transition between the degrading state q and the faulty state r, which can be computed by a median. It can be associated with a dispersion $\sigma_{qr}$ that we computed using the interquartile range

  $\mu_{qr} = \mathrm{median}(I_{qr})$
  $\sigma_{qr} = Q_3 - Q_1$   (28)

where $Q_i$ is the $i$th quartile and $n_I = |I_{qr}|$ is the number of elements in the set of time instants of transition $I_{qr}$. Step 4 is represented in lines 16–19 of Algorithm 1.

Therefore, both methods for sequence prediction, CPS (classification) and DPS (direct projection), assume that each trajectory in the training data set is made of at least two states, for example, "normal state" and "abnormal state," and that knowledge on these states can be uncertain and imprecise and represented by belief functions.

The final algorithm is given in Algorithm 1, and the flowchart of the whole algorithm is shown in Fig. 6.

Algorithm 1 EVIPRO-KNN
Require: Training data set L {set of trajectories with observations and belief masses, (9) and (10)}
Require: Current window Yt {W ∈ [20, 40], overlap: W/2}
Require: Number of nearest neighbors {K = 3}
Require: Labels of critical states {if S = 4, then 3 and 4}
Require: Averaging strategy {cautious or bold}
Require: A classifier of states {for CPS}
Ensure: Prediction of observations X
Ensure: RUL estimation
{Step 1—Find nearest neighbors}
1: for all trajectories i do
2:   Find the closest block to Yt {(11) and (12)}
3:   Store distances {(13)}
4: end for
5: Keep the K best trajectories {(14)}
6: Compute weights {(17)}
{Step 2—Compute prediction of observations}
7: if all Tk have the same length, k = 1 . . . K then
8:   X ← Apply (15)
9: else if cautious strategy then
10:   X ← Apply (19) (and (18), (17))
11: else if bold strategy then
12:   X ← Apply (20) (and (21))
13: end if
{Step 3—Prediction of states}
14: mCPS_{t+h} ← Apply the classifier {Step 3-1, (22)}
15: mDPS_{t+h} ← Apply direct projection {Step 3-2, (23), (24)}
{Step 4—RUL estimation}
16: m_{t+h} ← (25) {fusion of mCPS_{t+h} and mDPS_{t+h}}
17: Iqr ← Find transitions between critical states and store time instants {(26), (8)}
18: Find median and standard deviation {(28)}
19: RUL ← Apply (27)

Fig. 6. Sequence of operations involved in EVIPRO-KNN.

V. EXPERIMENTS

The goal is to illustrate the capability of the EVIPRO-KNN algorithm to provide reliable health assessment and long-term predictions.

We considered the challenge data set concerning diagnostics and prognostics of machine faults from the first International Conference on PHM [36]. The data set consists of multiple multivariate time series (26 variables) with sensor noise. Each time series was from a different engine of the same fleet, and each engine started with a different degree of initial wear and manufacturing variation unknown to the user and considered normal. The engine was operating normally at the start and developed a fault at some point. The fault grew in magnitude until system failure.

In this paper, we used the first experiment found in the text file train_FD001.txt, which is composed of 100 time series, and we considered only five features among the 26 (columns 7, 8, 9, 13, and 16 in the text file, as proposed by Wang et al. [37] for the first five features²).

² In [37], the authors proposed to use sensor measurements 2, 3, 4, 7, 11, 12, and 15, corresponding to columns 7, 8, 9, 12, 16, 17, and 20, and we believe that 12 should be replaced by 13 because 12 is almost flat and does not bring information.
The meaning of the thresholds is represented in Fig. 8. In the sequel, the interval $[t_{FN}, t_{FP}]$ is denoted by I.

³ The segmentation is available at http://www.femto-st.fr/emmanuel.ramasso/PEPS_INSIS_2011_PHM_by_belief_functions.html
We considered I = [−10, +13], which is a severe condition on the prediction compared to the literature [3], [38]. These conditions are similar to the ones considered in the PHM'08 data challenge [37] as well as in the PHM'12 data challenge [39], except that, in both papers, a particular parameterized function was used which decreases with respect to the bounds of I.

The performance of the algorithm is quantified by the percentage of predictions provided at the so-called critical time tc and falling in I. The critical time sets the difficulty of the prediction task by specifying how far the prediction has to start [38]. For example, given a trajectory T with a total length equal to |T| (on which the algorithm is applied), setting tc = 20 means that the RUL estimated at |T| − 20 is evaluated. Therefore, the greater the value of tc, the more difficult the task. However, tc is generally set arbitrarily.

Since the variability of the length of the trajectories in the data set is important [as shown in Fig. 7(b)], we proposed to set tc = |T|/4, meaning that the value of the critical time tc is adapted according to the trajectory's length (e.g., if |T| = 240, then tc = 60). The distribution of the critical times will be given in the sequel (see Fig. 9), and we will show that tc > 22, which corresponds to mid/long-term predictions.

This way of evaluating the predictions is of practical interest because, in a real-world application, one does not know a priori the length of the current trajectory (which is under analysis). For example, for the second application (next section), only pieces of trajectories with different lengths are available for testing, and nothing is known about the associated system's status. In this second application, the evaluation of the RUL proposals made by the EVIPRO-KNN algorithm will thus inherently consist in setting tc equal to the length of the current piece of trajectory.

2) Sensitivity of K and W: Fig. 10 shows the performance of EVIPRO-KNN with respect to K = [1 3 5 9] (number of neighbors) and W = [10 20 30 40] (window size), with the bold strategy. The algorithm is robust to both parameters, with a performance close to 92% for K > 9, W > 30. For low values of W, the performances are the lowest (around 68%); however, these cases correspond to greedy ones, since the algorithm requires more iterations, and they are therefore not practically interesting. Increasing W makes EVIPRO-KNN less sensitive to K (e.g., W = 40) because the window size provides the best choice of the nearest neighbors. This effect is emphasized when studying the effect of the number of neighbors. Indeed, low values of W (for example, < 25) require low values for K (for example, < 3). For high values of W (for example, ≥ 25), K = 3 or K = 5 yields satisfying performance. Practically, W should not be too high, to avoid missing changes in the trajectories, and therefore, its setting is application dependent. In the considered application, the fastest changes took between 20 and 30 time units, as illustrated in Fig. 7(a) (see the shortest trajectories).

Fig. 10. Performance with respect to K and W for midterm prediction for the bold strategy.

As mentioned previously, the distribution of the critical times is given in Fig. 9, which shows that tc > 22. These critical times can be considered as difficult ones compared to the literature on PHM.

3) Cautious or Bold Strategy?: Fig. 11 pictorially represents the evolution of the performance of EVIPRO-KNN with respect to K = [1 3 5 9] and W = [10 20 30 40], with the cautious strategy. Remembering that both strategies differ in the way that the farthest predictions are computed, the small differences with Fig. 10 can be explained by the relatively small values taken by the critical times. Even though these critical times correspond to quite difficult cases (midterm predictions), they do not enable one to discriminate the prediction capability of both strategies.

Fig. 11. Performance with respect to K and W for midterm prediction for the cautious strategy.
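The evaluation criterion used throughout these experiments (percentage of RUL predictions at the critical time falling in I) can be sketched as follows. The sign convention for the error, estimated minus actual RUL, and the inclusive handling of the bounds are assumptions of this illustration.

```python
def in_interval(rul_pred, rul_true, lo=-10, hi=13):
    """A prediction counts as correct when its error falls inside
    I = [t_FN, t_FP], here [-10, +13] as in the experiments."""
    return lo <= (rul_pred - rul_true) <= hi

def percentage_in_I(preds, truths, lo=-10, hi=13):
    """Percentage of predictions falling in I over a test set."""
    hits = sum(in_interval(p, t, lo, hi) for p, t in zip(preds, truths))
    return 100.0 * hits / len(preds)
```

The asymmetric interval penalizes late predictions and early predictions differently, in the same spirit as the parameterized scoring functions of the PHM challenges cited above.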
Fig. 12. Expected horizon for long-term prediction with the bold and cautious strategies. (a) Bold strategy. (b) Cautious strategy.
Fig. 13. Differences between estimated RUL and real RUL for each window.
Fig. 14. (Top) Error between predictions and real observations and (bottom) the length of the predictions.
Fig. 16. Trajectory hits describing which trajectory is chosen at each window.
Fig. 17. Distribution of the length of the trajectories in the second data set.
Fig. 18. Distribution of the RULs in the second data set.
Fig. 19. Distribution of errors of the RUL estimates for the second data set.
current observed data and may be practically useful. The joint prediction is made by two strategies, which finally provide an estimate of the RUL: the CPS and the DPS.

To increase the accuracy of the predictions, particularly those made by the CPS, we are currently studying classifiers and fusion rules [40]. However, to perform online detection and prediction as considered in this paper, it is required to develop online classifiers and online fusion tools.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their helpful comments that greatly improved this paper.

REFERENCES

[1] P. Bonissone, A. Varma, and K. Aggour, "A fuzzy instance-based model for predicting expected life: A locomotive application," in Proc. IEEE Int. Conf. Comput. Intell. Meas. Syst. Appl., 2005, pp. 20–25.
[2] A. Saxena, B. Wu, and G. Vachtsevanos, "Integrated diagnosis and prognosis architecture for fleet vehicles using dynamic case-based reasoning," in Proc. Autotestcon, 2005, pp. 96–102.
[3] T. Wang, "Trajectory similarity based prediction for remaining useful life estimation," Ph.D. dissertation, Univ. Cincinnati, Cincinnati, OH, 2010.
[4] C. Bishop, Pattern Recognition and Machine Learning. New York: Springer-Verlag, Aug. 2006.
[5] K. P. Murphy, "Dynamic Bayesian networks: Representation, inference and learning," Ph.D. dissertation, UC Berkeley, Berkeley, CA, 2002.
[6] E. Ramasso and R. Gouriveau, "Prognostics in switching systems: Evidential Markovian classification of real-time neuro-fuzzy predictions," in Proc. IEEE Int. Conf. Prognostics Syst. Health Manag., Macau, China, 2010, pp. 1–10.
[7] M. El-Koujok, R. Gouriveau, and N. Zerhouni, "Reducing arbitrary choices in model building for prognostics: An approach by applying parsimony principle on an evolving neuro-fuzzy system," Microelectron. Rel., vol. 51, no. 2, pp. 310–320, 2011.
[8] A. Dempster, "Upper and lower probabilities induced by multiple valued mappings," Ann. Math. Statist., vol. 38, no. 2, pp. 325–339, Apr. 1967.
[9] G. Shafer, A Mathematical Theory of Evidence. Princeton, NJ: Princeton Univ. Press, 1976.
[10] P. Smets and R. Kennes, "The transferable belief model," Artif. Intell., vol. 66, no. 2, pp. 191–234, Apr. 1994.
[11] P. Smets, "What is Dempster–Shafer's model?" in Advances in the Dempster–Shafer Theory of Evidence, I. R. Yager, M. Fedrizzi, and J. Kacprzyk, Eds. Hoboken, NJ: Wiley, 1994, pp. 5–34.
[12] L. Serir, E. Ramasso, and N. Zerhouni, "Time-sliced temporal evidential networks: The case of evidential HMM with application to dynamical system analysis," in Proc. IEEE Int. Conf. Prognostics Health Manag., 2011, pp. 1–10.
[13] E. Come, L. Oukhellou, T. Denoeux, and P. Aknin, "Learning from partially supervised data using mixture models and belief functions," Pattern Recog., vol. 42, no. 3, pp. 334–348, Mar. 2009.
[14] O. E. Dragomir, R. Gouriveau, N. Zerhouni, and R. Dragomir, "Framework for a distributed and hybrid prognostic system," in Proc. 4th IFAC Conf. Manag. Control Prod. Logist., Sibiu, Romania, 2007.
[15] H. Papadopoulos, V. Vovk, and A. Gammerman, "Regression conformal prediction with nearest neighbors," J. Artif. Intell. Res., vol. 40, no. 1, pp. 815–840, Jan. 2011.
[16] G. Nakhaeizadeh, "Learning prediction of time series: A theoretical and empirical comparison of CBR with some other approaches," in Proc. 1st Eur. Workshop Topics Case-Based Reason., 1993, pp. 65–76.
[17] D. Ruta, D. Nauck, and B. Azvine, "K-nearest sequence method and its
[21] M. Dong and D. He, "A segmental hidden semi-Markov model (HSMM)-based diagnostics and prognostics framework and methodology," Mech. Syst. Signal Process., vol. 21, no. 5, pp. 2248–2266, Jul. 2007.
[22] O. Ayad, M. S. Mouchaweh, and P. Billaudel, "Switched hybrid dynamic systems identification based on pattern recognition approach," in Proc. IEEE Int. Conf. Fuzzy Syst., 2010, pp. 1–7.
[23] K. Goebel, N. Eklund, and P. Bonanni, "Fusing competing prediction algorithms for prognostics," in Proc. IEEE Aerosp. Conf., 2006, pp. 1–10.
[24] T. Denoeux and P. Smets, "Classification using belief functions: The relationship between the case-based and model-based approaches," IEEE Trans. Syst., Man, Cybern., vol. 36, no. 6, pp. 1395–1406, Dec. 2006.
[25] L. Serir, E. Ramasso, and N. Zerhouni, "Evidential evolving Gustafson–Kessel algorithm for data streams partitioning using belief functions," in Proc. 11th Eur. Conf. Symbolic Quantitative Approaches Reason. Uncertainty, vol. 6717, LNAI, 2011.
[26] T. Denoeux, "Conjunctive and disjunctive combination of belief functions induced by non distinct bodies of evidence," Artif. Intell., vol. 172, no. 2/3, pp. 234–264, Feb. 2008.
[27] A. Dempster, "A generalization of Bayesian inference," J. Roy. Statist. Soc., vol. 30, no. 2, pp. 205–247, 1968.
[28] P. Smets, "Analyzing the combination of conflicting belief functions," Inf. Fusion, vol. 8, no. 4, pp. 387–412, Oct. 2007.
[29] R. Kennes and P. Smets, "Fast algorithms for Dempster–Shafer theory," in Uncertainty in Knowledge Bases, vol. 521, Lecture Notes in Computer Science, B. Bouchon-Meunier, R. R. Yager, and L. A. Zadeh, Eds. Berlin, Germany: Springer-Verlag, 1991, pp. 14–23.
[30] J. Denker and Y. LeCun, "Transforming neural-net output levels to probability distributions," in Advances in Neural Information Processing Systems 3. San Mateo, CA: Morgan Kaufmann, 1991, pp. 853–859.
[31] T. Denoeux, "A k-nearest neighbor classification rule based on Dempster–Shafer theory," IEEE Trans. Syst., Man, Cybern., vol. 25, no. 5, pp. 804–813, May 1995.
[32] H. He and E. Garcia, "Learning from imbalanced data," IEEE Trans. Knowl. Data Eng., vol. 21, no. 9, pp. 1263–1284, Sep. 2009.
[33] P. Smets, "Belief functions: The disjunctive rule of combination and the generalized Bayesian theorem," Int. J. Approx. Reason., vol. 9, no. 1, pp. 1–35, Aug. 1993.
[34] P. Smets, "Decision making in the TBM: The necessity of the pignistic transformation," Int. J. Approx. Reason., vol. 38, no. 2, pp. 133–147, Feb. 2005.
[35] P. Smets, "The canonical decomposition of a weighted belief," in Proc. Int. Joint Conf. Artif. Intell., San Mateo, CA, 1995, pp. 1896–1901.
[36] A. Saxena, K. Goebel, D. Simon, and N. Eklund, "Damage propagation modeling for aircraft engine run-to-failure simulation," in Proc. Int. Conf. Prognost. Health Manage., Denver, CO, 2008, pp. 1–9.
[37] T. Wang, J. Yu, D. Siegel, and J. Lee, "A similarity-based prognostics approach for remaining useful life estimation of engineered systems," in Proc. IEEE Int. Conf. Prognost. Health Manage., 2008, pp. 1–8.
[38] K. Goebel and P. Bonissone, "Prognostic information fusion for constant load systems," in Proc. 7th Annu. Conf. Inf. Fusion, 2003, pp. 1247–1255.
[39] L. Serir, E. Ramasso, P. Nectoux, O. Bauer, and N. Zerhouni, "Evidential Evolving Gustafson–Kessel algorithm (E2GK) and its application to PRONOSTIA's data streams partitioning," in Proc. IEEE Int. Conf. Decision and Control, 2011, pp. 8273–8278.
[40] E. Ramasso and S. Jullien, "Parameter identification in Choquet integral by the Kullback–Leibler divergence on continuous densities with application to classification fusion," in Proc. Eur. Soc. Fuzzy Logic Technol., Aix-les-Bains, France, 2011, pp. 132–139, Ser. Advances in Intelligent Systems Research.

Emmanuel Ramasso received the B.Sc. and M.Sc. degrees in automation science and computing from Polytech'Savoie (engineering school of the University of Chambéry, France) in 2004. He defended a Ph.D. (University of Grenoble, France) in 2007,
application to churn prediction, in Proc. Intell. Data Eng. Autom. Learn., focused on human motion analysis in videos based
2006, pp. 207215. on the belief function theory.
[18] Q. Yu, A. Sorjamaa, Y. Miche, and E. Severin, A methodology for time During 2008, he worked at the Commissariat
series prediction in finance, in Proc. Eur. Symp. Time Series Prediction, lEnergie Atomique et aux Energies Alternatives,
2008, pp. 285293. Saclay, France. Since September 2008, he has been
[19] Q. Yu, A. Lendasse, and E. Severin, Ensemble KNNs for bankruptcy working at the engineering school of mechanics and
prediction, in Proc. Int. Conf. Comput. Econ. Finance, 2009, pp. 19. microtechniques of Besancon (France) as an Associate Professor. His research
[20] S. Zhang, W. Jank, and G. Shmueli, Real-time forecasting of online activities have been carried out at the FEMTO-ST Institute on the development
auctions via functional k-nearest neighbors, Int. J. Forecast., vol. 26, of new methodologies for prognostics and health management solutions using
no. 4, pp. 666683, Oct./Dec. 2010. pattern recognition tools.
Michèle Rombaut received the B.Sc. degree from the Ecole Universitaire d'Ingénieurs de Lille, Lille, France, and the Ph.D. degree in electronic systems from the University of Lille, Lille.
Since 1985, she has been with the Université Technologique de Compiègne, Compiègne, France, as an Associate Professor and, since 1994, as a Professor. Since 2002, at the University Joseph Fourier of Grenoble, Grenoble, France, she has been conducting her research at the Grenoble Images Speech Signals and Automatics Laboratory, particularly in data fusion and the transferable belief model.

Noureddine Zerhouni received the engineer degree from the National Engineers and Technicians School of Algiers (ENITA), Algiers, Algeria, in 1985. After a short period in industry as an engineer, he received the Ph.D. degree in automatic control from the Grenoble National Polytechnic Institute, Grenoble, France, in 1991.
In September 1991, he joined the National Engineering School of Belfort (ENIB), Belfort, France, as an Associate Professor. At that time, his main research activity concerned the modeling, analysis, and control of manufacturing systems. Since September 1999, he has been a Professor at the national high school of mechanics and microtechniques of Besançon. He is the Head of the Automatic Control and Micro-Mechatronic Systems (AS2M) Department of the FEMTO-ST Institute. His main research activities concern intelligent maintenance systems and maintenance support based on information and communications technologies. He has been and is involved in various European and national projects on intelligent maintenance systems, such as the FP5 European Integrated Project of the Information Technology for European Advancement program PROTEUS, the Naval E-Maintenance Oriented System with the Direction des Constructions Navales, Systèmes et Services, and the Reliability Improvement of Embedded Machines (AMIMAC-FAME) project with ALSTOM.