

IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 3, NO. 3, AUGUST 1995

Optimization of Fuzzy Expert Systems Using Genetic Algorithms and Neural Networks
Christiaan Perneel, Member, IEEE, Jean-Marc Themlin, Jean-Michel Renders, and Marc Acheroy, Member, IEEE
Abstract--In this paper, fuzzy logic theory is used to build a specific decision-making system for heuristic search algorithms. Such algorithms are typically used in expert systems. To improve the performance of the overall system, a set of important parameters of the decision-making system is identified. Two optimization methods for the learning of the optimum parameters, namely genetic algorithms and gradient-descent techniques based on a neural network formulation of the problem, are used to obtain an improvement of the performance. The decision-making system and both optimization methods are tested on a target recognition system.

I. INTRODUCTION
DECISION-making systems and expert systems are usually designed to solve complex problems with numerous candidate solutions. The exploration of all possible solutions and uninformed search methods like depth-first or breadth-first (see, e.g., [1], chapter 2) are often impractical. To guide the expert system toward a solution in a more efficient manner, knowledge has to be given by experts. This can be done by heuristics, when the complex problem can be considered as a graph-search problem in an uncertain environment. Then, well-known heuristic search techniques (see, e.g., [1] and [2]) can be used to find a solution in an efficient way. To deal with imperfect information in expert systems, three main classes of approaches can be used (see [3], chapter 1). The most classic approach to model uncertainty is the probabilistic one (see, e.g., [4] and [5]). A second approach is based on the Dempster-Shafer theory of evidence [6] (e.g., [7]). Bellman and Zadeh [8] and Zadeh [9] introduced the fuzzy logic approach. A survey of the different approaches is made by Prade [10]. These approaches are criticized by Lindley [11] and Cheeseman [12] from a probabilistic viewpoint. A defence of the fuzzy logic approach is made, however, by Zadeh [13] and Kosko [14]. The interest in fuzzy expert systems has grown considerably over the past few years. The fuzzy reasoning approach is motivated by the following advantages: a) it provides an efficient way to cope with imperfect information, especially imprecision in the available knowledge, b) it offers some kind of flexibility in the decision-making process, and c) it
Manuscript received August 12, 1993; revised January 25, 1995. C. Perneel and M. Acheroy are with the Signal and Image Center, Royal Military Academy, B-1040 Brussels, Belgium. J.-M. Themlin is with the Groupe de Physique des États Condensés, Sciences Faculty of Luminy, URA CNRS 783, Case 901, F-13288 Marseille, France. J.-M. Renders is with the Artificial Intelligence Group, Computer Systems Department of Tractebel, B-1200 Brussels, Belgium. IEEE Log Number 9412930.

gives an interesting man/machine interface by simplifying rule extraction from (human) experts and by allowing a simpler a posteriori interpretation of the system reasoning. The design of fuzzy expert systems, however, relies on a particular modelling of imperfect information. Now, modelling imperfections usefully and efficiently remains a delicate and sometimes critical task. Moreover, fuzzy reasoning is based on definitions and conventions chosen somewhat arbitrarily. For example, a large choice of membership functions can be used in the definition of the linguistic terms; in the same way, there are several possible definitions of fuzzy operators such as AND, OR, fuzzy implication, and the defuzzification scheme (see, e.g., [15]). The potential user is thus confronted with a variety of choices in the design of a fuzzy expert system for a particular application and does not know the optimum choice in advance. To mention a few interesting contributions, Sugeno and Kang [16] and Sugeno and Yasukawa [17] employ heuristic rules of thumb to identify a good structure of a fuzzy model. Once a fuzzy expert system has been designed, it depends on a large set of parameters, e.g., the shape of the membership functions, weights, etc. To improve the performance of the overall system (to decrease the computing time and to obtain better global results), the parameters have to be tuned by an appropriate optimization method. To solve a similar problem in the field of the optimization of fuzzy logic controllers, gradient-descent methods are used by, e.g., Nomura et al. [18] and Bersini et al. [19], and genetic algorithms (GAs) by Karr [20], Karr and Gentry [21], and Thrift [22].
The aim of this paper is twofold: a) to build a fuzzy expert system in the field of decision making with imperfect information, solving hierarchical graph-search problems, and to identify a set of important parameters whose automatic tuning should improve the performance of the overall system, and b) to use and to compare two optimization methods for the learning of the optimum parameters, namely GAs and gradient-descent techniques based on a neural network formulation of the problem. This paper is organized as follows. In Section II, the decision-making problem and the traditional approaches to solve it are developed. A specific decision-making system for heuristic search algorithms, based on the fuzzy logic approach, is presented in Section III. In Section IV, two optimization methods for the learning of the optimum parameters of the decision-making system are presented. After the statement of the optimization problem (Section IV-A), the first optimization method based on GAs is described (Section IV-B). Section IV-

1063-6706/95$04.00 © 1995 IEEE


C presents the second optimization method based on the neural network approach. The particular decision-making application is described in Section V, while the results obtained with the application for the described methods are presented in Section VI.
II. PROBLEM STATEMENT

We consider the following decision-making problem: to find a decision consisting of a sequence of decision elements (or hypotheses) optimizing some criteria in an environment characterized by imperfect information. Let D be a candidate decision consisting of n decision elements d_i; each decision element d_i belongs to a finite, discrete set D^i

$$D = (d_1, d_2, \ldots, d_n), \qquad d_i \in D^i.$$

Let $\mathcal{D}$ be the discrete set of all the global decisions, so

$$\mathcal{D} = D^1 \times D^2 \times \cdots \times D^n, \qquad n_i = |D^i|, \qquad |\mathcal{D}| = \prod_{i=1}^{n} n_i.$$

As a link between the decision and the imperfect information, a number M of measurements (or observations) are available

$$m_i : \mathcal{D} \to \mathcal{M} : D \to m_i(D), \qquad i = 1, \ldots, M$$

where $\mathcal{M}$ is the measurement space. Heuristic functions rate the different candidate decisions according to these measurements. These ratings describe how well (or how likely) a decision (and its associated measurement) fits in with the environment

$$h_i : \mathcal{M} \to R : m_i \to h_i(m_i), \qquad i = 1, \ldots, M$$

with R the space of the possible rating values (mostly a subset of the real numbers). Each heuristic can be considered as a piece of knowledge, usually coming from an expert, which partially assesses the quality of the decision, taking into account the stochastic nature of the environment. Heuristics are combined to form a global rating r, which is a measure of the quality of the decision

$$r = O[h_1(m_1(D)), h_2(m_2(D)), \ldots, h_M(m_M(D))]$$

where O is the combination operator across all heuristics. Let

$$q_i(D) = h_i(m_i(D)), \qquad i = 1, \ldots, M.$$

The set of the ratings of the M different heuristics, which are based on observations and measurements, for the decision D is represented by

$$Q(D) = (q_1(D), q_2(D), \ldots, q_M(D))$$

and let L be the function which associates a global rating r with each global decision D

$$L : \mathcal{D} \to R : D \to r = L(D) = O(Q(D)).$$

We suppose that the set of heuristics can be chosen in such a way that:

1) L can be defined for each element of $\mathcal{D}$;
2) L(D) takes its maximum value at the optimum decision.

The initial decision-making problem can now be formalized as an optimization problem: to find the maximum of L(D). In the remainder of this paper, the decision-making problem will be considered as a hierarchical graph-searching problem: the graph consists of several nodes grouped by levels, each node of level k representing a partial decision

$$d_{1\to k} = (d_1, d_2, \ldots, d_k).$$

$D = d_{1\to n}$ can be associated with a specific path in the decision graph. At this stage, there is no means of evaluating the quality of a partial decision. In other words, only the terminal nodes (the nodes of the last level n) are given a rating, which can be used to guide the search for the best decision. Let us further assume that knowledge is revealed partially at each level in an incremental fashion; this means that Q can be decomposed in the following manner

$$Q(d_{1\to n}) = [q_{11}(d_{1\to 1}), \ldots, q_{1M_1}(d_{1\to 1}), q_{21}(d_{1\to 2}), \ldots, q_{2M_2}(d_{1\to 2}), \ldots, q_{n1}(d_{1\to n}), \ldots, q_{nM_n}(d_{1\to n})]^T$$

with

$$\sum_{i=1}^{n} M_i = M$$

and M_i the number of heuristics of level i. Each q_{ij} is the individual (or local) rating of heuristic j at level i and depends only on the partial decision d_{1→i}. At each level i, it is assumed that the individual heuristic ratings q_{ij} can be combined into a partial rating L_i(d_{1→i}) which represents the quality of the partial decision up to this level, given the partial knowledge (heuristics) of this level i. The global rating can be expressed as a combination of the partial ratings L_i

$$L[L_1(d_{1\to 1}), L_2(d_{1\to 2}), \ldots, L_n(d_{1\to n})].$$

The previous assumptions allow more efficient strategies to guide the search in a smarter way by exploiting the partial information (partial rating) available at each node. For instance, classic graph-search strategies such as A* may be well adapted in certain cases (see, e.g., [1] and [2]). In particular, Branch and Bound methods can be applied provided that, for every node of level k, it is possible to find an upper limit to the global rating of all the decisions containing the partial decision represented by the node considered. Let $L^{\sup}(d_{1\to k})$ be this upper limit. Let $\{N_1, N_2, \ldots, N_L\}$ be the set of nodes which are still to be developed into their successors (or the partial decisions which are still to be completed). Initially, these nodes are formed by the $n_1$ possible decision elements of level 1: $\{d_{1\to 1}^1, d_{1\to 1}^2, \ldots, d_{1\to 1}^{n_1}\}$. We then consider the node $N_i$ (or $(d_{1\to k})$) for which $L^{\sup}(d_{1\to k})$ is greater than or equal to the upper limit $L^{\sup}$ of all other nodes in the list. This particular node is developed into its successors $(d_{1\to k}, d_{k+1}^1), (d_{1\to k}, d_{k+1}^2), \ldots, (d_{1\to k}, d_{k+1}^{n_{k+1}})$, which are evaluated. During the search, when a final node (a complete decision) is evaluated and its (global) rating is greater than or equal to the (partial) ratings of all other nodes in the list, the search is halted and this final node is guaranteed to be an optimum decision.
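The Branch and Bound strategy just described can be sketched in a few lines. The following is a toy illustration, not the authors' implementation: the level structure, the rating function, and the (assumed admissible) upper-bound function are hypothetical stand-ins.

```python
import heapq

def branch_and_bound(levels, rating, upper_bound):
    """Best-first Branch and Bound over a leveled decision graph.

    levels[k] lists the candidate decision elements of level k+1;
    rating(d) scores a complete decision d (a tuple of elements);
    upper_bound(d) must never under-estimate the best rating of any
    completion of the partial decision d (admissibility).
    """
    n = len(levels)
    # Max-heap via negated keys, seeded with the level-1 candidates.
    heap = [(-upper_bound((e,)), (e,)) for e in levels[0]]
    heapq.heapify(heap)
    while heap:
        neg_key, d = heapq.heappop(heap)
        if len(d) == n:
            # A complete decision whose rating is >= every remaining
            # upper bound is guaranteed to be optimal.
            return d, -neg_key
        for e in levels[len(d)]:
            child = d + (e,)
            key = rating(child) if len(child) == n else upper_bound(child)
            heapq.heappush(heap, (-key, child))
    return None, float("-inf")

# Toy problem: maximize the sum of the elements over three levels.
levels = [[1, 2], [0, 5], [3, 4]]
score = lambda d: sum(d)
bound = lambda d: sum(d) + 5 * (len(levels) - len(d))  # optimistic completion
best, best_rating = branch_and_bound(levels, score, bound)  # -> (2, 5, 4), 11
```

Because the bound is optimistic, the first complete decision popped from the queue dominates every remaining node, which is exactly the stopping condition described above.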


III. DECISION MAKING IN AN UNCERTAIN ENVIRONMENT WITH FUZZY LOGIC

To emphasize the structure and the parameters of the global rating function

$$L(d_{1\to n}) = L[L_1(d_{1\to 1}), L_2(d_{1\to 2}), \ldots, L_n(d_{1\to n})] = O(Q(d_{1\to n}))$$

we derive in this section an expression of the global rating function in the framework of fuzzy logic (see [23], [14], and [8]). We have explained how the heuristics rate the candidate decision based on some measurements or observations. Very often, the range of the measured variable is more important than its exact value. The range of the space of the measurement values can also be divided into a number of classes, each characterized by a membership function and a linguistic variable, describing how well it fits the hypothesis that the candidate is the solution to the problem. Furthermore, the transitions between these categories are often blurred. Therefore, there must also be a smooth transition in the rating of the different candidate solutions. The decision elements d_k are assumed to belong to nonfuzzy sets D^k. The measurements or observations are assumed to provide crisp values as well. Each heuristic returns a fuzzy vector of size T, the number of different linguistic terms. The elements of this fuzzy vector correspond to the degrees of membership to which the measurement or observation belongs to the different classes. Each linguistic term is a fuzzy set which designates a category partially qualifying a candidate solution in the sense of the considered heuristic (e.g., good, average, and bad: see Fig. 1). The set of heuristics forms a knowledge base of fuzzy rules whose antecedents are related to the measurements or observations and whose consequent part determines the fuzzy (partial) quality of the decision. Starting from some measurements or observations, a nonfuzzy global rating must be inferred using the rules of the knowledge base and fuzzy inference mechanisms. This global rating function should logically reinforce candidate solutions whose heuristics give mostly a good rating and should disadvantage candidate solutions whose heuristics give mostly a bad rating.

To build such a global rating function L(D), we start with the design of the rating function corresponding to one heuristic h. Each heuristic returns a fuzzy vector of size T. These T different membership values must be combined into one unique nonfuzzy value which is the rating given by the heuristic. The transformation of the fuzzy vector h^T(m) into a unique nonfuzzy rating value is called defuzzification. This is usually done by assigning to each linguistic term t (e.g., bad, average, and good) a rating membership function g_t(r), where r is the nonfuzzy rating value, obeying the following rules: g_t : R → [0, 1], and the rating must increase when the linguistic term expresses an improvement and must decrease when the linguistic term expresses a worsening. Assuming that the rating values must belong to the interval [K_min, K_max], then g_t(K_min) = 1 if t is the worst degree-of-fit linguistic term and g_t(K_max) = 1 if t is the best degree-of-fit linguistic term. An example of the rating membership functions for T = 3 is given in Fig. 2. Given a measurement or observation m and a heuristic h with its membership functions, we can define the cumulated rating membership function g_c(r), using, e.g., the min operator as fuzzy inference rule

$$g_c(r) = \sum_t \min\big(g_t(r), \mu_h^t(m)\big).$$

For example, the well-known fuzzy centroid defuzzification scheme can be chosen to obtain a nonfuzzy rating q for the heuristic h

$$q = \frac{\int_{-\infty}^{+\infty} r\, g_c(r)\, dr}{\int_{-\infty}^{+\infty} g_c(r)\, dr} = \frac{\sum_t \int r \min\big(g_t(r), \mu_h^t(m)\big)\, dr}{\sum_t \int \min\big(g_t(r), \mu_h^t(m)\big)\, dr}.$$

Now, individual heuristic ratings must be combined for each level. Suppose that (d_{1→i}) is a partial decision up to level i; the partial or level rating L_i(d_{1→i}) is the combination of the M_i heuristic ratings q_{ij}. To avoid the overly pessimistic (see [15]) classical combination rule, which consists in taking the minimum heuristic rating as the level rating (see, e.g., [8], [9], and [23]), the weighted-sum approach is used, where a weight w_{ij} is given to each heuristic according to its reliability

$$L_i(d_{1\to i}) = \frac{\sum_{j=1}^{M_i} w_{ij}\, q_{ij}(d_{1\to i})}{\sum_{j=1}^{M_i} w_{ij}}.$$

To get a rating of the partial decision (d_{1→i}) up to level i, different methods (e.g., weighted-sum combination, centroid level combination, minimum level combination, etc.) can be used. This rating will be a combination of the i different level ratings L_1(d_{1→1}) (= L_1(d_1, d_2, …, d_i) = L_1(d_1)), L_2(d_{1→2}), …, L_i(d_{1→i}). Assuming that the first approach is used, the rating up to level i will be

$$L_{1\to i}(d_{1\to i}) = \frac{\sum_{k=1}^{i} \beta_k L_k(d_{1\to k})}{\sum_{k=1}^{i} \beta_k}.$$

To compare two partial decisions at different levels, the rating of the decision at the lowest level must be extrapolated up to the highest level.


Fig. 1. Example of heuristic fuzzy membership functions (T = 3: Bad, average, and good).


Fig. 2. An example of membership functions with T = 3.
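The min-inference centroid defuzzification can be sketched numerically. The triangular rating membership functions below are assumptions for illustration (the paper's g_t shapes are those of Fig. 2, which are not fully recoverable here), and the grid integration is a toy stand-in for the integrals.

```python
def defuzzify(memberships, rating_mfs, r_min=0.0, r_max=1.0, n=1001):
    """Centroid defuzzification of a heuristic's fuzzy output.

    memberships[t] is the degree mu_t to which the measurement belongs
    to linguistic term t; rating_mfs[t] is the rating membership
    function g_t(r). Each g_t is clipped at mu_t (min inference) and
    the clipped functions are summed into the cumulated g_c(r).
    """
    num = den = 0.0
    for k in range(n):
        r = r_min + (r_max - r_min) * k / (n - 1)
        g_c = sum(min(g(r), mu) for g, mu in zip(rating_mfs, memberships))
        num += r * g_c
        den += g_c
    return num / den if den else 0.5 * (r_min + r_max)

# Toy triangular rating membership functions for T = 3 terms on [0, 1]
# (the triangular shapes are an assumption, not taken from the paper).
bad = lambda r: max(0.0, 1.0 - 2.0 * r)
avg = lambda r: max(0.0, 1.0 - 4.0 * abs(r - 0.5))
good = lambda r: max(0.0, 2.0 * r - 1.0)

q = defuzzify([0.1, 0.3, 0.8], [bad, avg, good])  # biased toward "good"
```

With the membership vector dominated by the "good" term, the centroid lands in the upper half of the rating interval, as the smooth-transition argument above suggests.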

Taking into account that several estimations have to be done, the global rating can be written as

$$L(d_{1\to n}) = L(D) = \frac{\sum_{k=1}^{p} \beta_k L_k(d_{1\to k}) + \sum_{k=p+1}^{n} \beta_k \hat{L}_k(d_{1\to p})}{\sum_{k=1}^{n} \beta_k}$$

where the estimated level ratings $\hat{L}_k(d_{1\to p})$ are computed from $E[\mu_{kj}^t(m_{kj}(d_{1\to p}))]$, the estimated membership values for the heuristics of the levels p+1 up to the highest level. Different ways can be followed to obtain these estimated values: the best-case approach consists in taking the maximum value for the best linguistic term,

$$\big(\mu_{kj}^1(m), \mu_{kj}^2(m), \ldots, \mu_{kj}^T(m)\big) = (0, 0, \ldots, 1)$$

if the last linguistic term is the one corresponding to the best degree of fit. This approach provides an upper limit $L^{\sup}(d_{1\to p})$ to the global rating of all partial decisions beginning with $(d_{1\to p})$. The best-case approach is a cautious one, because the partial decisions at the lower levels might be advantaged with respect to the decisions at the higher levels. The upper-limit rating can be used in, e.g., the Branch and Bound search method (see, e.g., [1]). If a less cautious approach is desired, one can choose an arbitrary value for

$$\big(\mu_{kj}^1(m), \mu_{kj}^2(m), \ldots, \mu_{kj}^T(m)\big).$$

This approach is useful for, e.g., the A-algorithm (see, e.g., [1]). A complete and detailed description of the mathematical background of this method can be found in Perneel et al. [24].

IV. OPTIMIZATION OF THE FUZZY EXPERT SYSTEM

A. Introduction

In the decision-making system described above, the heuristic set plays a crucial role and strongly affects the quality of the decision adopted by the search method. Now, it is not an easy task to determine an optimal set of heuristics, especially as the heuristics often rely on implicit representations of the experts and, consequently, on uncertain, imprecise, or inconsistent knowledge. One can usually isolate a number of heuristic parameters modelling the shape of the heuristic membership functions or describing the way of combining heuristics (operator O). It is then possible to express the global measure of quality or global rating as a function of a decision D and of a parameter vector θ

$$r = L(D, \theta).$$

From the fuzzy logic approach described in Section III, we can extract the parameter vector

$$\theta = (\beta_1, \ldots, \beta_n, w_{11}, \ldots, w_{nM_n}, \theta_{H_{11}}, \ldots, \theta_{H_{nM_n}})$$

with $\beta_i$ the weight corresponding to level i, $w_{ij}$ the weight associated with heuristic j of level i, and $\theta_{H_{ij}}$


the vector which describes the shape of the T membership functions of heuristic j of level i. For the example in Fig. 1, $\theta_H$ would be $(m_1, m_2, \ldots, m_7)$. Normally, these parameters must be properly set by statistics on a large number of experiments. Unfortunately, this task is very time consuming in practice. In this section, two different methods are used to optimize the parameter set of the expert system, namely GAs and gradient-descent techniques.

B. Optimization Using GAs


If the first approximation of the parameter set θ, usually derived from rules of thumb or given by experts, is not satisfactory, it is preferable to have recourse to some automatic tuning of the heuristic parameters. When adopting such an approach, we are faced with two interwoven optimization problems. The first (inner) problem was considered in Section II; it consists of finding a good decision D, preferably the decision D* which maximizes the global rating (or global measure of the quality) L

$$L(D^*, \theta) = \max_D L(D, \theta).$$

This can be solved, e.g., by the fuzzy logic method described in Section III, with the best-case approach (Branch and Bound). The second (outer) problem consists of finding the parameter vector θ which minimizes some kind of error E(θ) between the optimum decision D*(θ) resulting from the solution of the first problem for a given parameter set θ, and a reference (desired) decision given by a teacher for a set of particular decision problems constituting a learning database

$$E(\theta^*) = \min_\theta E(\theta)$$

with the error function E(θ) defined as

$$E(\theta) = \sum_i \left\| D_i^*(\theta) - D_i^{teacher} \right\|_\alpha \qquad (1)$$
where $D_i^{teacher}$ is the desired decision for the ith particular decision problem of the learning database, $D_i^*(\theta)$ is the solution of the inner optimization problem related to the ith particular decision problem, using θ as heuristic parameters, and $\|X - Y\|_\alpha$ denotes some user-defined distance between X and Y. It is hoped that solving the whole problem with a limited learning database will result in heuristic parameters well fitted to a larger number of decision problems (generalization capability). As the set of possible decisions is finite and discrete (for any θ), the landscape of the global error function is composed of terraces or flat plateaus. This peculiar landscape jeopardizes the applicability of traditional methods relying on hill-climbing principles. For example, steepest-descent methods (see, e.g., [25]) will fail because the derivatives are generally zero (except on the terrace boundaries). The simplex method (see [26]) will also fail as soon as all points of the polyhedron lie on the same terrace. On the contrary, this kind of landscape does not constitute an obstacle to the robustness of GAs. It has indeed been

shown that a GA can solve similar optimization problems such as the optimization of the third De Jong function (see [27], chapter 4). A GA will be used to solve the outer optimization problem. First, GA theory will be briefly reviewed.

1) The GA: The GA method (see [27]) is an iterative search algorithm based on an analogy with the process of natural selection (Darwinism) and evolutionary genetics. The search aims to optimize some user-defined function called the fitness function. To perform this task, a GA maintains a population of candidate points, called individuals, spread over the entire search space. At each iteration, called a generation, a new population is created. This new generation generally consists of individuals which fit better than the previous ones into the external environment as represented by the fitness function. As the population iterates through successive generations, the individuals will in general tend toward the optimum of the fitness function. To generate a new population on the basis of a previous one, a GA performs three steps:

1) it evaluates the fitness score of each individual of the old population;
2) it selects individuals on the basis of their fitness scores; and
3) it recombines these selected individuals using genetic operators such as mutation and crossover, which can be considered, from an algorithmic point of view, as means to change the current solutions locally and to combine them, respectively.

What makes a GA attractive is its ability to accumulate information about an initially unknown search space and to exploit this knowledge to guide the subsequent search into useful subspaces. The fundamental implicit mechanism underlying this search consists of the combination of high-performance building blocks discovered during past trials.
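The terraced error landscape mentioned earlier can be checked numerically. In this sketch (a toy piecewise-constant cost with hypothetical names, not the paper's error function), a finite-difference gradient is zero everywhere inside a plateau, which is exactly what defeats steepest-descent methods while leaving a GA unaffected.

```python
import math

def terraced_error(theta, width=0.25):
    """Piecewise-constant (terraced) toy cost: constant on each plateau."""
    return math.floor(math.hypot(*theta) / width)

def finite_diff_grad(f, theta, h=1e-6):
    """Central finite-difference gradient estimate."""
    grad = []
    for i in range(len(theta)):
        up = list(theta); up[i] += h
        dn = list(theta); dn[i] -= h
        grad.append((f(up) - f(dn)) / (2 * h))
    return grad

g = finite_diff_grad(terraced_error, [0.6, 0.4])
# Inside a plateau both perturbed points give the same cost, so the
# estimated derivatives vanish and steepest descent has nowhere to go.
```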
Three major differences from classical optimization methods (e.g., steepest descent, simplex, etc.) are to be noted:

1) The GA works in parallel on a number of search points (potential solutions) and not on a unique solution, which means that the search method is not local in scope but rather global over the search space.
2) The GA requires from the environment only an objective function measuring the fitness score of each individual, and no other information or assumption such as derivatives and differentiability.
3) Both the selection and the recombination steps are performed by using probabilistic rather than deterministic rules, to maintain a globally exploratory search.

2) GA and the Optimization of the Fuzzy Expert System: To solve the parameter tuning problem with a GA, a population of individuals is formed. Each individual (or chromosome) consists of a particular parameter vector θ_i grouped in the population

$$\{\theta_1, \theta_2, \ldots, \theta_P\}.$$
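A minimal real-coded GA of this kind might look as follows. Tournament selection, uniform crossover, Gaussian mutation, and one-individual elitism are common textbook choices, not necessarily the exact operators used by the authors, and the cost function standing in for E(θ) is a toy.

```python
import random

def genetic_algorithm(error, dim, pop_size=30, generations=60,
                      p_mut=0.1, sigma=0.1, seed=0):
    """Minimal real-coded GA minimizing error(theta).

    Tournament selection, uniform crossover, Gaussian mutation,
    and one-individual elitism.
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(0.0, 1.0) for _ in range(dim)]
           for _ in range(pop_size)]
    for _ in range(generations):
        elite = min(pop, key=error)          # elitism: keep the best

        def tournament():
            a, b = rng.sample(pop, 2)
            return a if error(a) < error(b) else b

        children = [list(elite)]
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            child = [p1[i] if rng.random() < 0.5 else p2[i]
                     for i in range(dim)]                     # crossover
            child = [g + rng.gauss(0.0, sigma)
                     if rng.random() < p_mut else g
                     for g in child]                          # mutation
            children.append(child)
        pop = children
    return min(pop, key=error)

# Toy cost standing in for E(theta): distance to a known parameter vector.
target = [0.2, 0.8, 0.5]
E = lambda th: sum((a - b) ** 2 for a, b in zip(th, target))
best = genetic_algorithm(E, dim=3)
```

In the paper's setting, evaluating `error` for one chromosome means running the full inner Branch and Bound search over the learning database, which is why the number of individuals and generations matters for the computing time.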

The chromosome length depends on the number of levels, the number of heuristics, and the number of parameters describing the set of heuristic membership functions. The cost function of the individual θ_i is taken as the error function


TABLE I
DATABASE FOR THE NEURAL APPROACH
(For each decision D of the learning database, the table lists the input vector of heuristic membership values $[(\mu_1^1(m_1(D)), \ldots, \mu_1^T(m_1(D))), \ldots, (\mu_M^1(m_M(D)), \ldots, \mu_M^T(m_M(D)))]$ together with the desired rating.)

Fig. 3. Schematic diagram of the heuristic parameter optimization method using GAs.

Fig. 4. Schematic diagram of the heuristic parameter optimization method using the neural network approach.

E(θ_i) defined by (1). Fig. 3 outlines the principle of the method. The optimization consists in minimizing the cost function or error.

C. Optimization Using the Gradient-Descent Method Formulated as a Neural Network Problem

For tuning the parameters, a neural network (NN) approach can be used instead of the GA. Indeed, the optimization problem can be seen as the training of a neural net in such a way that the network represents a global rating function L(D) whose maximum value is obtained when the true optimum decision is provided as input of the network. A schematic diagram sketching the NN approach is given in Fig. 4. The learning database is made of:

1) several (C) decision problems P_k, k = 1, 2, …, C;
2) a group of p_k decisions D for each problem P_k; and
3) the desired global rating L_teacher(D) for each decision D, which is chosen to be maximum for the true optimum decision of each problem and to associate lower global ratings with poorer decisions.

The transformation module Q implements both the measurements and the evaluation of the corresponding heuristic ratings, according to the definitions of Section II. The output of the module consists of a vector

$$[(\mu_1^1(m_1(D)), \ldots, \mu_1^T(m_1(D))), \ldots, (\mu_M^1(m_M(D)), \ldots, \mu_M^T(m_M(D)))]$$

for the fuzzy reasoning system. The structure of the database, including the transformation module Q, is illustrated in Table I, where the data used for training the NN (inputs and desired outputs) are emphasized. The NN structure is designed in such a way that the adjustable connection weights are nothing else than the parameter vector θ. The NN is trained to realize a global rating function L(D) as close as possible to the desired global rating L_teacher(D). Once the NN is trained, the optimized weights (or parameters) are extracted and can be used in the selected search methods to solve the decision-making problem (inner problem). One may consider that the transformation module Q contains tunable parameters as well, becoming part of an "extended" neural network. Indeed, it could be necessary in certain cases to refine heuristic parameters such as those describing the shape of the heuristic membership functions. In this work, we considered only the adjustment of the heuristic weights (which do not intervene in the transformation module Q, but only in the combination of the heuristic ratings). To transform the fuzzy reasoning scheme, presented in Section III, into a network (this is always possible; see, e.g., [28] and [29]), we can rewrite the final equation of Section III in network form.

The saturation function in the central part of Fig. 5 is not explicitly calculated but results from the previous formula, which gives a value limited to [K_min, K_max] (see also Fig. 2).

The training of the neural network is performed by using a generalized backpropagation algorithm (see, e.g., [30], chapter 5); it aims to minimize the global error with respect to θ

$$E(\theta) = \sum_{i=1}^{C} \sum_{j=1}^{p_i} \left[ L(D_{ij}, \theta) - L_{teacher}(D_{ij}) \right]^2 \qquad (3)$$

by a gradient descent

$$\Delta\theta = -\eta \frac{\partial E(\theta)}{\partial \theta}$$

where η (0 < η < 1) is a learning rate and

$$\frac{\partial E(\theta)}{\partial \theta} = 2 \sum_{i=1}^{C} \sum_{j=1}^{p_i} \left[ L(D_{ij}, \theta) - L_{teacher}(D_{ij}) \right] \frac{\partial L(D_{ij}, \theta)}{\partial \theta}.$$

Fig. 5. Schematic representation of the tunable part of the neural network approach.
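For the heuristic weights, the gradient above is analytic because L is a normalized weighted sum of the heuristic ratings. The sketch below illustrates one such gradient-descent loop on a toy single-level model; the training batch, the learning rate, and the normalization are assumptions for illustration, not the paper's actual network.

```python
def rating(w, q):
    """Global rating L as a normalized weighted sum of heuristic ratings."""
    s = sum(w)
    return sum(wk * qk for wk, qk in zip(w, q)) / s

def grad_step(w, batch, eta=0.1):
    """One gradient-descent step on E(w) = sum_j (L_j - L_teacher_j)^2.

    For the normalized weighted sum, dL/dw_k = (q_k - L) / sum(w),
    so the gradient of E is accumulated analytically over the batch.
    """
    g = [0.0] * len(w)
    for q, target in batch:
        L = rating(w, q)
        err = L - target
        s = sum(w)
        for k in range(len(w)):
            g[k] += 2.0 * err * (q[k] - L) / s
    return [wk - eta * gk for wk, gk in zip(w, g)]

# Toy training batch: heuristic rating vectors q with desired global ratings.
batch = [([0.9, 0.2], 0.8), ([0.3, 0.7], 0.4)]
w = [1.0, 1.0]
for _ in range(2000):
    w = grad_step(w, batch)
```

After training, the weight of the first heuristic dominates, since both training examples reward it: this is the mechanism by which the reliability weights w_ij are learned.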

V. DESCRIPTION OF THE APPLICATION

Both methods were tested on a target recognition system. The recognition problem consists in identifying armored vehicles from short-distance two-dimensional infrared images. The major difficulty lies in the lack of knowledge of the position and orientation of the vehicle with respect to the camera. Indeed, it is completely impracticable to implement a close matching of the image of the vehicle with a template given

all the possible positions and orientations of the vehicle, and also given all the possible vehicles. Therefore, the problem is divided into two subproblems: the first subproblem is the computation of the orientation and position of the vehicle based on a crude model of the vehicle, and the second subproblem is the identification of the vehicle in a reference position and orientation. An expert system was designed to accomplish these two tasks. The first task, position and orientation detection, consists of putting a system of three axes, {X, Y, Z}, on the image of the vehicle according to predefined conventions. In our application, the conventions are the following. The X axis is the line on the side of the vehicle between the wheel train and the ground. The Y axis is the line on the front or on the rear of the vehicle between the front or rear wheels and the ground. The origin of the axes system is, by convention, at the extremity of the X axis on the engine side if the image gives a side view of the vehicle. The origin is on the left of the Y axis if the image gives a front or rear view of the vehicle and the engine is at the front of the image. The origin is on the right of the Y axis if the image gives a front or rear view of the vehicle and the engine is at the rear of the image. The Z axis is the vertical direction of the vehicle, normalized on the wheel train height if the image gives a side view of the vehicle, or on the distance between the floor of the vehicle body and the ground if the image gives a front or rear view of the vehicle. An illustration of these conventions can be found in Fig. 6. The position and orientation detection is then divided into five (N = 5) subtasks. These five subtasks correspond to five different levels of increasing knowledge of the position and orientation of the vehicle. Level 1: Determination of one principal direction out of the eight most important orientations found on the image.
This principal orientation is either the direction of the X axis if the image is a side view, or the direction of the Y axis if the image is a front or rear view.


TABLE II
HEURISTIC DESCRIPTION OF THE RECOGNITION SYSTEM

Description of the heuristics:
- importance of the orientation (see Perneel et al. [31])
- rejects orientations near the orientation horizon + 90
- favours the orientations of candidate wheels (collinear regions)
- length measure of the line
- favours X-candidates if candidate wheels have been found for the orientation
- rejects Y-candidates if candidate wheels have been found for the orientation
- measure of length-height ratio
- measure of the quantity of white areas below the line
- measure of the gradient of the line
- measure of the quantity of white areas above the line
- comparison of the relative position of the lines
- measure of the relative position of the wheels with respect to the line
- measure of the absolute position of the line
- measure of gray level statistics at the outer points of the line
- comparison of the orientation of the line with the chosen orientation
- measure of the quantity of white areas either at the outer points of the X-line or above the Y-line
- importance of the orientation (global and local)
- comparison with the orientation Y axis + 90
- comparison with the orientation horizon + 90
- importance of the orientation
- relative orientation of the third axis with the first one
- relative orientation of the third axis with the vertical one

Fig. 6. Axes conventions. TABLE 111 POSITION AND ORIEUTATION DETECTION RESULTS (MANUAL TUNING)
Orientation first axis ( X or U) Orientation second XIS ( Z ) Orientation third axis (U o r X) &ition of the coordinate system

1 S i i%

I 96 3% I 45 2% I 80.0% I

+ +

TABLE IV
NUMERICAL VALUES USED FOR THE OPTIMIZATION WITH GA
p: Number of individuals: 40
U: Number of learning images: 46
Probability of mutation
Selection pressure
Elitism strategy: yes
Weights: discrete values in [0.0, 1.0] with interval of 0.05

Level 2: Determination, out of two sets of lines (a set of X-candidates and a set of Y-candidates), of the line of the X axis if it is a side view, or the line of the Y axis if it is a front or rear view. At this level, the expert system has to decide whether it is dealing with a side or front/rear view, and it also chooses the correct X or Y line.
Level 3: Determination of the position of the origin if the image is a side view, and of the engine position if the image is a front/rear view.
Level 4: Determination of the Z axis from nine candidates.
Level 5: Determination of the third axis from 16 candidates: either the Y axis if it is a side view, or the X axis if it is a front or rear view.
The size of the solution space is 69 120. Table II gives some details about the heuristics of the expert system.

Once the first task of detecting the position and orientation of the vehicle is accomplished, it remains to identify the type of vehicle. This is done by defining characteristic details for each vehicle to be recognized, such as the number of wheels, engine position, track size, exhaust system, etc. The location and the shape of these characteristic details are known in advance. Therefore, templates can be created with their location on the vehicle specified. Since the position and orientation of the vehicle are known, it is sufficient to verify that the templates match the corresponding areas on the image. This is done by using pattern recognition techniques such as cross-correlation or neural networks (cf. [32] and [33]). Only the first part of this method, position and orientation detection, is described in this paper.

VI. RESULTS

The fuzzy reasoning method described in Section II is implemented in the automatic target recognition system. Trapezoidal rating functions, as in Fig. 1, are used. Three linguistic terms have been selected: good, average, and bad. The overlapping factor is 50% for all heuristics and rating membership functions. The best-case value approach is used (Branch and Bound).
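A rating scheme of this kind can be sketched as follows. The breakpoint values below are illustrative assumptions, chosen only so that neighbouring terms overlap by 50% on a normalized score; they are not the paper's actual parameters.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], flat on [b, c], falls on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Three linguistic terms on a normalized heuristic score in [0, 1];
# neighbouring terms overlap by 50% (hypothetical breakpoints).
TERMS = {
    "bad":     (-0.25, 0.0, 0.25, 0.5),
    "average": (0.25, 0.5, 0.5, 0.75),   # degenerates to a triangle
    "good":    (0.5, 0.75, 1.0, 1.25),
}

def fuzzify(score):
    """Membership degree of the score in each linguistic term."""
    return {term: trapezoid(score, *p) for term, p in TERMS.items()}
```

With these breakpoints the memberships of adjacent terms sum to one in the overlap regions, e.g., a score of 0.375 is half "bad" and half "average".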
A. Manually Tuned Fuzzy Expert System

The results for a database containing 135 infrared images of eight different vehicles in different positions are shown in Table III. These results were obtained with a manually tuned system. The tuning of these parameters was based on common sense, so a learning database was not necessary. Note that the knowledge of two axes is sufficient to perform the second phase of the recognition problem, the identification of the vehicle.

B. System Tuned with GA's


We chose to limit the parameter vector θ to

θ = (w1, ..., wM).          (21)

The chromosome length is M, each gene wi of individual k taking 20 discrete allele values (integers from 1 to 20). A set of U different images Ij is considered as the learning database. For each of these images, the desired axis system Dj^teacher has been determined, to compare it with the coordinate system Dj*(θi) = f(Ij | θi) found by solving the inner problem. The norm ||X - Y||a appearing in the definition of the error function E(θi) is a measure of the distance between the two coordinate systems X and Y. The numerical values used in our application are given in Table IV. In Fig. 7, the
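The outer optimization can be sketched as a simple generational GA. The truncation selection, one-point crossover, and the mutation probability value below are illustrative assumptions (the paper's Table IV parameters are only partly legible), and `solve`/`distance` stand in for the inner fuzzy search f(I | θ) and the norm ||·||a.

```python
import random

N_GENES, ALLELES = 21, range(1, 21)     # M weights, integer alleles 1..20
POP, GENS, P_MUT = 40, 200, 0.05        # population, generations; P_MUT is a guess

def error(theta, images, solve, distance, desired):
    """E(theta): summed distance between the solver's decision and the teacher's."""
    return sum(distance(solve(img, theta), desired[img]) for img in images)

def evolve(images, solve, distance, desired):
    """Minimize E(theta) over integer-coded weight vectors."""
    pop = [[random.choice(ALLELES) for _ in range(N_GENES)] for _ in range(POP)]
    for _ in range(GENS):
        pop.sort(key=lambda th: error(th, images, solve, distance, desired))
        elite, parents = pop[:1], pop[: POP // 2]          # elitism + truncation
        children = []
        while len(elite) + len(children) < POP:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, N_GENES)             # one-point crossover
            child = a[:cut] + b[cut:]
            child = [random.choice(ALLELES) if random.random() < P_MUT else g
                     for g in child]
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda th: error(th, images, solve, distance, desired))
```

Each fitness evaluation calls the inner solver once per learning image, which is why the paper reports 46 x 40 x 200 = 368 000 inner problems.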

TABLE V
GLOBAL COMPARISON OF THE RESULTS

TABLE VI
POSITION AND ORIENTATION DETECTION RESULTS
cost of the best individual (lower curve) is represented, as well as the mean cost (upper curve) of the population. The learning database contains 46 images of eight different vehicles in different positions and in different light environments (different adjustments of the IR camera). After the learning process, the new weights were tested on a larger database of 135 different images, containing the learning database. The results are compared with those obtained for the manually tuned weights in Tables V and VI. Table V expresses the percentage of same, better, and worse results obtained with the new weights. A better result is obtained when at least one of the orientations or positions of an axis is more exact; a worse result is obtained when at least one of them is less exact. Table VI shows the improvement in more detail.

It has to be mentioned that this optimization method is time consuming. In the considered case, the problem of putting a coordinate system on a vehicle is solved 368 000 times (46 images, 40 individuals, and 200 generations) to tune the different parameters. Solving one problem takes on average about 15 minutes on an HP425 workstation (64 MB RAM), so to reduce the computing time all the image processing work was done beforehand. Even so, optimizing the fuzzy system (without any image processing) required three weeks of computing time on the same workstation. This high computing time is not a real disadvantage of the method, because the optimization has to be done only once. The advantage of this optimization method is its simplicity.

C. System Tuned with Neural Network Approach

Fig. 7. Fitness of best individual and mean fitness.

Instead of searching the parameter vector θ with GA's, we use the neural-like method with gradient descent to determine the parameter vector θ. Although the application is the same for both methods, it must be kept in mind that the strategies adopted are completely different. On the one hand, the problem as seen by GA's is a parametric optimization problem, and the role of GA's is to perform the outer optimization on the basis of a distance criterion (see Section III-B). On the other hand, the problem as seen by neural networks is the learning of a mapping L(D), so that the learned mapping can be used afterwards in the original decision-making problem.

The learning database is made of 69 images I (eight vehicles in different positions) where the coordinate system of an armored vehicle has to be determined (see Table I). For each of these images, a group of 20 decisions is provided out of the total set of possible solutions (69 120). To obtain a representative database, containing about 50% good examples and 50% bad examples, we have used a part of the final population given by a GA solving the decision-making problem related to the same image. This method of solving the decision-making problem is explained in detail in the Appendix. The desired rating is chosen as

Lteacher(D) = 1 - ||D - Dteacher||a / Dmax    if ||D - Dteacher||a < Dmax
Lteacher(D) = 0                               otherwise

where ||·||a has been used previously (see Section V-B) and Dmax is a threshold. The function Lteacher(D) is schematically represented in Fig. 8. Another choice, which corresponds to some kind of classification, could also be used (see Fig. 9):

Lteacher(D) = 1    if D is a correct decision (as judged by the teacher)
Lteacher(D) = 0    otherwise.

Fig. 8. Schematic representation of the function Lteacher(D).

The learning is performed in batch mode: one iteration consists of the presentation of 69 x 20 = 1380 patterns followed by one generalized backpropagation step. The learning rate was chosen as η = 0.01 to ensure stability.
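The two teacher-rating choices can be sketched as follows. The linear-ramp form of the graded rating is a reconstruction from the garbled original (consistent with the triangular shape of Fig. 8 and with Dmax being a threshold); `dist` stands in for the norm ||·||a.

```python
def l_teacher_graded(d, d_teacher, dist, d_max):
    """Graded target rating (Fig. 8): 1 at the teacher's decision,
    falling linearly to 0 at distance d_max."""
    e = dist(d, d_teacher)
    return 1.0 - e / d_max if e < d_max else 0.0

def l_teacher_binary(d, d_teacher, is_correct):
    """Classification-style target rating (Fig. 9): 1 for a correct
    decision as judged by the teacher, 0 otherwise."""
    return 1.0 if is_correct(d, d_teacher) else 0.0
```

The graded form rewards near-misses partially, which gives the gradient-descent learner a smoother error surface than the binary form.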

TABLE VII
GLOBAL COMPARISON OF THE RESULTS

Comparison of the requirements of Branch and Bound (BB) and GA's:

                               BB      GA
Requirements for efficiency    Strong  Medium
Alternative solutions          No      Yes
Unstructured information       No      Yes

Fig. 9. Schematic representation of another function Lteacher(D).

Fig. 10. Learning curve of the network.

Fig. 10 shows that the mean relative error levels off after a few iterations. Contrary to usual applications of neural networks in function approximation, which try to render the error as small as possible by increasing the network complexity (e.g., the number of nodes), the structure and the number of parameters are here fixed beforehand. This explains the relatively high error rate.

After the learning phase, the optimum weights found by generalized backpropagation were tested on a larger database of 135 different images (including the learning database), using the fuzzy Branch and Bound method for solving the decision problem. The results are compared in Table VII with those obtained for the manually tuned weights. It is observed that the neural-like tuning outperforms the manual tuning. The performance of the NN method, however, does not achieve the performance level of the tuning with GA's. Three reasons can be set forth to explain this inferiority:
1) The difficulty of choosing an adequate learning database. For example, one could choose learning patterns at random. This generally results in poor discrimination among good decisions, since the number of learning patterns representing good decisions (close to the maximum of Lteacher) is insufficient. On the contrary, choosing the learning patterns only among good candidate decisions would penalize some heuristics which could nevertheless be necessary to discriminate decisions of poor quality. For both of these possibilities, the results obtained were very poor. To combine both possibilities, that is, to have enough good examples and also to take some examples at random, the final population given by a GA solving the decision-making problem was used (see the Appendix).
2) The difficulty of choosing an adequate desired global rating function (Lteacher). The definition of Lteacher is arbitrary and somewhat artificial; some definitions may be better than others, given the structure and the limited number of parameters of the neural network. It must be kept in mind that the real objective of the learning is to build a rating function L(D) whose maximum coincides with the maximum of Lteacher(D), rather than to approximate Lteacher(D) as closely as possible everywhere.
3) There is always a chance of finding a local minimum with the neural-like approach instead of the global one. This problem does not exist with the GA-based method, because many candidates are considered in parallel; also, due to the cross-over and the mutation, jumps are made in the solution space.
The time complexity to train the NN is limited in comparison with the GA. It has to be mentioned, however, that the time needed to build a representative database may not be ignored.

VII. CONCLUSIONS

In this paper, the fuzzy logic theory has been used to build a specific decision-making system for heuristic search algorithms. To improve the performance of the overall system, two appropriate optimization methods, namely GA's and gradient-descent techniques based on a neural-like formulation of the problem, have been tested to tune the parameters of the decision-making system. The decision-making system has been tested on a target recognition problem with good results. The genetic algorithm optimization method provides a marked improvement of the results. Due to the difficulty of choosing an efficient database for the neural-like approach, the improvements obtained with this optimization method are not so spectacular.

APPENDIX
DECISION MAKING IN UNCERTAIN ENVIRONMENT WITH GENETIC ALGORITHMS

We have investigated the usefulness of GA's for the resolution of the graph-search problem. The application of GA's to a simpler decision problem, namely a binary multiple-fault diagnosis problem, has already been proposed (see [34]). The GA approach is motivated by the following arguments:
- The basic requirements for efficiency of GA's are less stringent than those of Branch and Bound or other heuristic methods.
- At the end of the graph search, GA's result in a family of alternative solutions. This can be useful in several applications with a human interface, where the decision maker prefers to have a set of alternative strategies and to make the final decision himself. In our application, GA's turn out to be very useful by providing interesting solutions used as a database for training learning systems such as neural networks.
- As we have already said, GA's do not require structured information (availability of partial ratings).
- The time before finding a satisfactory solution is variable for Branch and Bound methods (short if the efficiency requirements are satisfied, long otherwise), whereas GA's need an approximately constant time in between.
Although GA's are well known to be highly reliable, they are not absolutely guaranteed to find the global best solution, which will certainly be found by Branch and Bound methods.

To solve the graph-searching problem with GA's, a population of individuals is formed. Each individual (or chromosome) consists of a particular sequence of decision elements

TABLE IX
GA PARAMETERS USED IN OUR APPLICATION
Fig. 11. Learning curves for the fuzzy logic approach.

Di = (di1, di2, ..., din)

grouped in the population

P = {D1, D2, ..., Dp}.

The chromosome length is thus n, each gene di taking ni allele values. The fitness function of individual Di is taken as the global measure of quality (rating) of the decision, L(Di). The fitness function therefore depends on the particular approach adopted to cope with the uncertainties: Bayesian approach (see [35]), fuzzy approach (Section II), belief functions (see [36]), etc. When structured information is available, it is also possible to exploit it by designing more efficient strategies based on GA's, which involve a GA-based search several times and in an incremental fashion (multi-stage GA). For example, a multi-stage GA can be designed which first solves the partial decision problem limited to the first k levels, and then solves the complete decision problem by gradually adding grouped levels of decision.
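The chromosome representation and fitness can be sketched as follows. The paper gives the candidate counts for levels 1, 4, and 5 (8, 9, and 16) but not the split between levels 2 and 3; the values 30 and 2 below are assumptions chosen so the product matches the stated solution space of 69 120, and using min as the combining operator is likewise only one possible fuzzy conjunction.

```python
import random

# Candidate counts per level: 8 orientations, then a hypothetical split for
# levels 2-3 (30 candidate lines x 2 origin choices), 9 Z-axis candidates,
# and 16 third-axis candidates.
N_ALLELES = (8, 30, 2, 9, 16)

def random_decision(n_alleles=N_ALLELES):
    """A chromosome D = (d1, ..., dn): one decision element per level."""
    return tuple(random.randrange(n) for n in n_alleles)

def fitness(decision, level_ratings, combine=min):
    """Global rating L(D): the level ratings Li, each evaluated on the
    partial decision (d1, ..., di), combined by the chosen operator."""
    return combine(level_ratings[i](decision[: i + 1])
                   for i in range(len(decision)))
```

Because each level rating only needs the partial decision up to its own level, the same `fitness` works unchanged for the multi-stage variant, which first restricts the chromosome to the first two genes.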

A. Results
The method described above is tested on the armored vehicle recognition system, described in the main part of the text, solving a hierarchical graph search problem. The GA

parameters used in our application are summarized in Table IX. The fitness value of each individual is the global rating of the decision represented by the individual. The fitness values are scaled to belong to the interval [0, 500]. For computing fitness values, Bayesian and fuzzy logic approaches are examined and compared.

1) Single-phase GA: Fig. 11 shows the learning curves (evolution of the fitness) for the fuzzy logic approach (see Section II) over 50 experiments with different initial populations. The diagrams are given for one particular image, but similar results are observed for most of the available images in our database. From top to bottom of the diagrams, the curves represent, respectively:
Curve 1: the highest value (over 50 experiments) of the best fitness of each population;
Curve 2: the mean value (over 50 experiments) of the best fitness of each population;
Curve 3: the lowest value (over 50 experiments) of the best fitness of each population;
Curve 4: the highest value (over 50 experiments) of the mean fitness of each population;
Curve 5: the mean value (over 50 experiments) of the mean fitness of each population;
Curve 6: the lowest value (over 50 experiments) of the mean fitness of each population.
As expected, the reliability of the GA search is excellent, since all experiments result in the optimum (or near-optimum)


Fig. 12. Comparison of the success rates of the Bayesian and fuzzy logic approaches.

Fig. 13. Learning curves for the multi-stage GA.

decision. Curves 1 and 2 of Fig. 11 show that a satisfactory result can already be expected after 40 generations, corresponding to at most 1600 tested nodes (well below the 69 120 terminal nodes of the overall tree). The success rates of the Bayesian (see [35]) and fuzzy logic approaches are compared in Fig. 12. In this figure, the lowest curve corresponds to the success rate using the Bayesian rating, while the curve in the middle relates to the fuzzy-derived rating. The third curve corresponds to the multi-stage GA and will be described later. Considering the first two curves, it turns out that the fuzzy logic approach converges more rapidly. This could be explained by the fact that the operators used to build the global rating are better adapted to the fundamental GA mechanisms and requirements. As already mentioned, similar results were obtained with other armored vehicles on different images.

The percentage of success (frequency of achieving a decision identical or nearly identical to the true optimum one) was 96% for all the 135 images of our database. Statistical tests have been done to examine the final population of the search method: 52.5% of the individuals of the final population correspond to a sufficient result (at least two correct axes), while the other individuals correspond to an insufficient result.

2) Multi-stage GA: We chose to subdivide the GA search into two stages. In the first stage, the GA performs a search limited to the first two levels, which carry the most important part of the information. An individual therefore represents a partial decision (d1, d2). Only the level ratings L1 and L2 (of levels 1 and 2, respectively) are used and combined to compute the fitness of a partial decision. At the beginning of phase 2, a new population of complete decisions is formed from the last generation of phase 1 by randomly adding three decision elements. During phase 2, which involves a normal GA search on the five levels, the first two genes (or levels) are still allowed to change. Phase 1 can thus be considered as a means to choose a biased starting population for the usual single-phase GA. Additional GA parameters for the multi-stage search are given in Table X. Fig. 13 shows the learning curves for a multi-stage GA using the same conventions as Fig. 11. Looking again at Fig. 12, it is observed that the two-stage GA outperforms the single-stage methods: a faster convergence is achieved. This efficiency is easily explained by the fact that the particular structure of the information is exploited in an improved way.

TABLE X
SUMMARY OF ADDITIONAL TWO-STAGE GENETIC ALGORITHM PARAMETERS
Phase 1 | Chromosome length | 2

REFERENCES

[1] N. J. Nilsson, Principles of Artificial Intelligence. San Mateo, CA: Morgan Kaufmann, 1980.
[2] J. Pearl, Heuristics: Intelligent Search Strategies for Computer Problem Solving. Redwood City, CA: Addison-Wesley, 1984.
[3] P. Torasso and L. Console, Diagnostic Problem Solving: Combining Heuristic, Approximate and Causal Reasoning. London: North Oxford Academic, 1989.
[4] N. J. Nilsson, "Probabilistic logic," Artificial Intell., vol. 28, pp. 71-87, 1986.
[5] D. J. Spiegelhalter, "Probabilistic reasoning in predictive expert systems," in Uncertainty in Artificial Intelligence, Kanal and Lemmer, Eds. Amsterdam: North-Holland, 1986.
[6] G. Shafer, A Mathematical Theory of Evidence. Princeton, NJ: Princeton Univ. Press, 1976.
[7] G. Shafer, "Probability judgment in artificial intelligence and expert systems," Statistical Sci., vol. 2, no. 1, pp. 3-16, 1987.
[8] R. E. Bellman and L. A. Zadeh, "Decision making in a fuzzy environment," Management Sci., vol. 17, pp. 141-164, 1970.
[9] L. A. Zadeh, "Outline of a new approach to the analysis of complex systems and decision processes," IEEE Trans. Syst., Man, Cybern., vol. SMC-3, pp. 28-44, 1973.

[10] H. Prade, "A computational approach to approximate and plausible reasoning with applications to expert systems," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-7, pp. 260-283, 1985.
[11] D. V. Lindley, "The probability approach to the treatment of uncertainty in artificial intelligence and expert systems," Statistical Sci., vol. 2, no. 1, pp. 17-24, 1987.
[12] P. Cheeseman, "Probabilistic versus fuzzy reasoning," in Uncertainty in Artificial Intelligence, Kanal and Lemmer, Eds. Amsterdam: North-Holland, 1986, pp. 85-102.
[13] L. A. Zadeh, "Is probability theory sufficient for dealing with uncertainty in AI: A negative view," in Uncertainty in Artificial Intelligence, Kanal and Lemmer, Eds. Amsterdam: North-Holland, 1986, pp. 103-116.
[14] B. Kosko, Neural Networks and Fuzzy Systems. Englewood Cliffs, NJ: Prentice-Hall, 1992.
[15] B. Kosko, "Fuzzy knowledge combination," Int. J. Intell. Syst., vol. 1, pp. 293-320, 1986.
[16] M. Sugeno and G. Kang, "Structure identification of fuzzy model," Fuzzy Sets Syst., vol. 28, pp. 15-33, 1988.
[17] M. Sugeno and T. Yasukawa, "A fuzzy-logic-based approach to qualitative modeling," IEEE Trans. Fuzzy Syst., vol. 1, pp. 7-31, 1993.
[18] H. Nomura, I. Hayashi, and N. Wakami, "A learning method of fuzzy inference rules by descent method," in Proc. IEEE Int. Conf. Fuzzy Syst., 1992, pp. 203-210.
[19] H. Bersini, J. Nordvik, and A. Bonarini, "A simple direct fuzzy controller derived from its neural equivalent," in Proc. 2nd IEEE Int. Conf. Fuzzy Syst., 1993, pp. 345-350.
[20] C. L. Karr, "Design of an adaptive fuzzy logic controller using a genetic algorithm," in Proc. 4th Int. Conf. Genetic Algorithms, 1991, pp. 450-457.
[21] C. L. Karr and E. J. Gentry, "Fuzzy control of pH using genetic algorithms," IEEE Trans. Fuzzy Syst., vol. 1, pp. 46-53, 1993.
[22] P. Thrift, "Fuzzy logic synthesis with genetic algorithms," in Proc. 4th Int. Conf. Genetic Algorithms, 1991, pp. 509-513.
[23] E. H. Mamdani, "Application of fuzzy logic to approximate reasoning using linguistic synthesis," IEEE Trans. Computers, vol. C-26, pp. 1182-1191, 1977.
[24] C. Perneel, M. de Mathelin, and M. Acheroy, "Automatic target recognition fuzzy system for thermal infrared images," in Proc. 2nd IEEE Int. Conf. Fuzzy Syst., San Francisco, Mar. 28-Apr. 1, 1993, pp. 576-581.
[25] G. S. Beveridge and R. S. Schechter, Optimization: Theory and Practice. New York: McGraw-Hill, 1970.
[26] J. A. Nelder and R. Mead, "A simplex method for function optimization," Comput. J., vol. 7, pp. 308-313, 1965.
[27] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning. Redwood City, CA: Addison-Wesley, 1989.
[28] J. S. R. Jang and C. T. Sun, "Functional equivalence between radial basis function networks and fuzzy inference systems," IEEE Trans. Neural Networks, vol. 4, Jan. 1993.
[29] J. S. R. Jang, "ANFIS: Adaptive-network-based fuzzy inference systems," IEEE Trans. Syst., Man, Cybern., to appear.
[30] R. Hecht-Nielsen, Neurocomputing. Redwood City, CA: Addison-Wesley, 1989.
[31] C. Perneel, M. de Mathelin, and M. Acheroy, "Detection of important directions on thermal infrared images with application to target recognition," in Proc. Forward Looking Infrared Image Processing, SPIE, Orlando, Apr. 12-16, 1993.
[32] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.
[33] R. P. Lippmann, "An introduction to computing with neural nets," IEEE ASSP Mag., pp. 4-22, Apr. 1987.
[34] W. D. Potter, J. A. Miller, and O. R. Weyrich, "A comparison of methods for diagnostic decision making," Expert Syst. Applicat., vol. 1, pp. 425-436, 1990.
[35] M. de Mathelin, C. Perneel, and M. Acheroy, "Bayesian estimation versus fuzzy logics for heuristic search algorithms," in Proc. 2nd IEEE Int. Conf. Fuzzy Syst., San Francisco, Mar. 28-Apr. 1, 1993, pp. 944-951.
[36] C. Perneel, H. V. D. Velde, and M. Acheroy, "An heuristic search algorithm based on belief functions," in Proc. AI 94: 14th Avignon Int. Conf., Paris, May 30-June 3, 1994.

Christiaan Perneel (M'94) was born in 1963. In 1986, he received the master's degree of engineer in telecommunications at the Royal Military Academy of Brussels, Belgium, and he received the Ph.D. degree in 1994 at the Vrije Universiteit Brussel. Currently, he is a Lecturer at the Department of Applied Mathematics of the Royal Military Academy, where he teaches probability and statistics. His research interests include image processing, pattern recognition, and imperfect information. He is head of the pattern recognition cell of the Signal and Image Center of the Royal Military Academy, and he is the Belgian representative of the IAPR (International Association for Pattern Recognition).

Jean-Marc Themlin was born on November 4, 1963. He received the bachelor's degree in physics in 1985 at the Facultés Universitaires Notre-Dame de la Paix in Namur, Belgium, where he received the Ph.D. degree in 1991. After post-doctoral studies in Marseille, France, in 1991-1992, he joined the Signal and Image Center of the Royal Military Academy of Brussels, Belgium, where he worked on GA's and fuzzy systems applied to artificial intelligence and image processing. Since 1993, he has been Maître de conférences involved in teaching and research at the Faculté des Sciences de Luminy in Marseille. His primary research interests in experimental physics are photoemission and inverse photoemission, applied to solids, surfaces, and interfaces to reveal their electronic properties.

Jean-Michel Renders was born in Brussels, Belgium. He received the master's degree in mechanical and electrical engineering from the Université Libre de Bruxelles in 1987, and the Ph.D. degree from the same university in 1993. He is currently working at Tractebel Energy Engineering (Artificial Intelligence Section), Belgium. His research interests include neural networks, GA's, and artificial intelligence techniques applied to process control and power systems.

Marc Acheroy (M'90) was born in 1948. He received the master's degree of engineer in transport-mechanics at the Royal Military Academy of Brussels in 1971. In 1978, he became Engineer of the Military Material, in 1981 Engineer in Automation Control, and he received the Ph.D. degree in 1983 at the Université Libre de Bruxelles. Since 1985, he has been teaching at the Royal Military Academy, as an assistant professor and, since 1991, as a professor. He is the head of the Electrical Engineering Department and of the Signal and Image Center of the Royal Military Academy. His research interests include signal and image processing, especially image compression and restoration.
