
14. Neuro-Fuzzy Systems
„ Building a fuzzy system requires
z prior knowledge (fuzzy rules, fuzzy sets)
z manual tuning: time-consuming and error-prone

„ Therefore: support this process by learning
z learning fuzzy rules (structure learning)
z learning fuzzy sets (parameter learning)

Approaches from neural networks can be used

Learning Fuzzy Sets: Problems in Control

„ Reinforcement learning must be used to compute an error value
(note: the correct output is unknown)

„ After an error has been computed, any fuzzy set learning procedure can be used

„ Example: GARIC (Berenji/Khedkar 1992)
online approximation to gradient descent

„ Example: NEFCON (Nauck/Kruse 1993)
online heuristic fuzzy set learning using a rule-based fuzzy error measure

Example: Prognosis of the Daily Proportional Changes of the DAX at
the Frankfurt Stock Exchange (Siemens)

„ Database: time series from 1986 - 1997

z DAX
z Composite DAX
z German 3-month interest rates
z Return Germany
z Morgan Stanley index Germany
z Dow Jones industrial index
z DM / US-$
z US treasury bonds
z Gold price
z Nikkei index Japan
z Morgan Stanley index Europe
z Price-earnings ratio

Fuzzy Rules in Finance
„ Trend Rule
IF DAX = decreasing AND US-$ = decreasing
THEN DAX prediction = decrease
WITH high certainty
„ Turning Point Rule
IF DAX = decreasing AND US-$ = increasing
THEN DAX prediction = increase
WITH low certainty
„ Delay Rule
IF DAX = stable AND US-$ = decreasing
THEN DAX prediction = decrease
WITH very high certainty
„ In general
IF x1 is P1 AND x2 is P2
THEN y=K
WITH weight k

Classical Probabilistic Expert Opinion Pooling Method

„ DM analyzes each source (human expert, data + forecasting model) in terms of
(1) statistical accuracy and (2) informativeness, by asking the source to
assess quantities (quantile assessment)

„ DM obtains a “weight” for each source

„ DM “eliminates” bad sources

„ DM determines the weighted sum of source outputs

„ Determination of "Return on Investment"


„ E experts, R quantiles for N quantities
z each expert has to assess R·N values
„ statistical accuracy:
C = 1 − χ²_R(2N · I(s, p)),  with I(s, p) = Σ_{i=0}^{R} s_i ln(s_i / p_i)

„ information score:
I = (1/N) Σ_{i=1}^{N} [ ln(v_{i,R+1} − v_{i,0}) + Σ_{r=1}^{R+1} p_r ln( p_r / (v_{i,r} − v_{i,r−1}) ) ]

„ weight for expert e:
w_e = c_e · I_e · 1_α(c_e) / Σ_{e'=1}^{E} c_{e'} · I_{e'} · 1_α(c_{e'})

„ pooled output:
output_t = Σ_{e=1}^{E} w_e · output_t^e

„ roi = Σ_{t=1}^{T} y_t · sign(output_t^{DM})
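The weighting and pooling steps above can be sketched in Python. The chi-square calibration c_e is taken as given here (evaluating χ²_R would need a chi-square CDF), and all function names are illustrative, not from the slides:

```python
import math

def relative_information(s, p):
    # I(s, p) = sum_i s_i * ln(s_i / p_i); empty bins (s_i == 0) contribute 0
    return sum(si * math.log(si / pi) for si, pi in zip(s, p) if si > 0)

def expert_weights(calibration, information, alpha=0.05):
    # w_e proportional to c_e * I_e * 1[c_e >= alpha]:
    # badly calibrated sources are "eliminated" by the indicator
    raw = [c * i if c >= alpha else 0.0 for c, i in zip(calibration, information)]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else raw

def pooled_output(weights, outputs):
    # DM output: weighted sum of the source outputs
    return sum(w * o for w, o in zip(weights, outputs))
```

With the cut-off α, an expert whose calibration falls below α receives weight 0 and no longer influences the pooled forecast.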
Formal Analysis

„ Sources of information
z R1: rule set given by expert 1
z R2: rule set given by expert 2
z D: data set (time series)

„ Operator schema
z fuse(R1, R2): fuse two rule sets
z induce(D): induce a rule set from D
z revise(R, D): revise a rule set R by D

Formal Analysis

„ Strategies:
z fuse(fuse (R1, R2), induce(D))
z revise(fuse(R1, R2), D)
z fuse(revise(R1, D), revise(R2, D))

„ Technique: Neuro-Fuzzy Systems


z Nauck, Klawonn, Kruse, Foundations of Neuro-Fuzzy
Systems, Wiley 97
z SENN (commercial neural network environment, Siemens)

From Rules to Neural Networks
1. Evaluation of membership degrees

2. Evaluation of rules (rule activity)


μ_l : IRⁿ → [0, 1],  x ↦ Π_{j=1}^{n} μ_j^{(l)}(x_j)
(the activity of rule l is the product of its antecedent membership degrees)

3. Accumulation of rule inputs and normalization


NF : IRⁿ → IR,  x ↦ Σ_{l=1}^{r} w_l · k_l·μ_l(x) / Σ_{j=1}^{r} k_j·μ_j(x)
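The three evaluation steps can be sketched as follows; the Gaussian membership functions and the dictionary-based rule format are assumptions for illustration, not from the slides:

```python
import math

def gaussian(c, s):
    # step 1: a fuzzy set as a membership function (Gaussian shape assumed)
    return lambda x: math.exp(-(((x - c) / s) ** 2))

def rule_activity(rule, x):
    # step 2: rule activity = product of the antecedent membership degrees
    act = 1.0
    for mf, xi in zip(rule["antecedent"], x):
        act *= mf(xi)
    return act

def nf_output(rules, x):
    # step 3: normalized weighted sum of the rule conclusions
    acts = [r["k"] * rule_activity(r, x) for r in rules]
    total = sum(acts)
    return sum(r["w"] * a for r, a in zip(rules, acts)) / total if total else 0.0
```

Because of the normalization, a system with a single active rule of conclusion weight w outputs exactly w, independent of how strongly the rule fires.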

Neuro-Fuzzy Architecture

The Semantics-Preserving Learning Algorithm

Reduction of the dimension of the weight space


1. Membership functions of different inputs share their parameters,
e.g. μ_dax^stable ≡ μ_cdax^stable
2. Membership functions of the same input variable are not allowed to pass
each other; they must keep their original order,
e.g. μ^decreasing ≺ μ^stable ≺ μ^increasing

Benefits:
z the optimized rule base can still be interpreted
z the number of free parameters is reduced

Return-on-Investment Curves of the Different Models

Validation data from March 01, 1994 until April 1997

Neuro-Fuzzy Systems in Data Analysis

„ Fuzzy system:
z System of linguistic rules (fuzzy rules).
z Not rules in a logical sense, but function approximation.
z Fuzzy rule = vague prototype / sample.

„ Neuro-fuzzy system:
z Adds a learning algorithm inspired by neural networks.
z Feature: local adaptation of parameters.

A Neuro-Fuzzy System
„ is a fuzzy system trained by heuristic learning techniques derived from neural
networks

„ can be viewed as a 3-layer neural network with fuzzy weights and special
activation functions

„ is always interpretable as a fuzzy system

„ uses constrained learning procedures

„ is a function approximator (classifier, controller)

Learning Fuzzy Rules
„ Cluster-oriented approaches
=> find clusters in data, each cluster is a rule

„ Hyperbox-oriented approaches
=> find clusters in the form of hyperboxes

„ Structure-oriented approaches
=> use predefined fuzzy sets to structure the data space, pick rules from grid cells

Hyperbox-Oriented Rule Learning
z Search for hyperboxes in the data space
z Create fuzzy rules by projecting the hyperboxes
z Fuzzy rules and fuzzy sets are created at the same time
z Usually very fast

[Figure: a hyperbox enclosing a cluster in the (x, y) data space]

Hyperbox-Oriented Rule Learning
[Figure: four stages of hyperbox detection in the (x, y) data space]

„ Detect hyperboxes in the data, example: XOR function


„ Advantage over fuzzy cluster analysis:
z No loss of information when hyperboxes are represented as fuzzy rules
z Not all variables need to be used; "don't care" variables can be discovered
„ Disadvantage: each fuzzy rule uses individual fuzzy sets, i.e. the rule base is
complex.
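The projection idea can be sketched as follows. This toy version builds only one bounding box per class; a real hyperbox learner (e.g. one that handles the XOR data) splits boxes that contain points of other classes. All names are illustrative:

```python
def bounding_hyperboxes(points, labels):
    # one axis-parallel hyperbox per class label: the bounding box of its points
    boxes = {}
    for p, c in zip(points, labels):
        if c not in boxes:
            boxes[c] = ([*p], [*p])
        lo, hi = boxes[c]
        for d, v in enumerate(p):
            lo[d] = min(lo[d], v)
            hi[d] = max(hi[d], v)
    return boxes

def project_to_fuzzy_rule(box, slack=0.5):
    # projecting a hyperbox yields one trapezoidal fuzzy set per variable:
    # full membership inside [lo, hi], linearly decreasing over the slack zone
    lo, hi = box
    return [(l - slack, l, h, h + slack) for l, h in zip(lo, hi)]
```

The projection step loses no information: the core of each trapezoid is exactly the side of the hyperbox, which is the "no loss" advantage mentioned above.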

Structure-Oriented Rule Learning
z Provide initial fuzzy sets for all variables
z The data space is partitioned by a fuzzy grid
z Detect all grid cells that contain data (approach by Wang/Mendel 1992)
z Compute best consequents and select best rules (extension by Nauck/Kruse 1995, NEFCLASS model)

[Figure: (x, y) data space partitioned by a fuzzy grid with fuzzy sets small / medium / large on each axis]


Structure-Oriented Rule Learning
„ Simple: Rule base available after two cycles through the training data
z 1. Cycle: discover all antecedents
z 2. Cycle: determine best consequents

„ Missing values can be handled


„ Numeric and symbolic attributes can be processed at the same time (mixed
fuzzy rules)

„ Advantage: All rules share the same fuzzy sets


„ Disadvantage: Fuzzy sets must be given
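The two cycles can be sketched as follows; min-activation and the representation of fuzzy sets as dictionaries of membership functions are assumptions, since the slides leave both open:

```python
def learn_rule_base(data, fuzzy_sets):
    # data: list of (x_vector, class_label)
    # fuzzy_sets: one dict {term_name: membership_function} per variable
    # cycle 1: collect the antecedent with maximal degree for each pattern
    antecedents = set()
    for x, _ in data:
        ant = tuple(max(fs, key=lambda term: fs[term](v))
                    for fs, v in zip(fuzzy_sets, x))
        antecedents.add(ant)
    # cycle 2: give each antecedent the consequent (class) it supports best
    rules = {}
    for ant in antecedents:
        votes = {}
        for x, c in data:
            act = min(fs[t](v) for fs, t, v in zip(fuzzy_sets, ant, x))
            votes[c] = votes.get(c, 0.0) + act
        rules[ant] = max(votes, key=votes.get)
    return rules
```

Note that only grid cells actually hit by training data become rules, so the rule base stays far smaller than the full grid.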

Learning Fuzzy Sets
„ Gradient descent procedures
only applicable if differentiation is possible, e.g. for Sugeno-type fuzzy
systems.

„ Special heuristic procedures that do not use gradient information.

„ The learning algorithms are based on the idea of backpropagation.

Learning Fuzzy Sets: Constraints
„ Mandatory constraints:
z Fuzzy sets must stay normal and convex
z Fuzzy sets must not exchange their relative positions (they must
not "pass" each other)
z Fuzzy sets must always overlap
„ Optional constraints
z Fuzzy sets must stay symmetric
z Degrees of membership must add up to 1.0
„ The learning algorithm must enforce these constraints.
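One possible repair step that enforces the order and overlap constraints after a parameter update (a sketch for triangular fuzzy sets; the slides do not prescribe a concrete repair mechanism):

```python
def repair_partition(sets):
    # sets: list of (a, b, c) triangles, sorted by their original centers b
    fixed = [list(s) for s in sets]
    for left, right in zip(fixed, fixed[1:]):
        if right[1] < left[1]:            # sets must not pass each other
            right[1] = left[1]
        if right[0] > left[2]:            # adjacent sets must always overlap
            mid = 0.5 * (right[0] + left[2])
            left[2] = right[0] = mid
    return [tuple(s) for s in fixed]
```

Running this after every batch of updates keeps the partition interpretable regardless of what the raw gradient or heuristic step proposed.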

Example: Medical Diagnosis

„ Results from patients tested for breast cancer


(Wisconsin Breast Cancer Data).

„ Decision support: Do the data indicate a malignant or a benign


case?

„ A surgeon must be able to check the classification for


plausibility.

„ We are looking for a simple and interpretable classifier:
⇒ knowledge discovery.

Example: WBC Data Set
„ 699 cases (16 cases have missing values).

„ 2 classes: benign (458), malignant (241).

„ 9 attributes with values from {1, ... , 10}


(ordinal scale, but usually interpreted as a numerical scale).

„ Experiment: x3 and x6 are interpreted as nominal attributes.

„ x3 and x6 are usually seen as "important" attributes.

Applying NEFCLASS-J
„ Tool for developing Neuro-Fuzzy Classifiers

„ Written in Java

„ Free version for research available

„ Project started at Neuro-Fuzzy Group of University of Magdeburg, Germany

NEFCLASS: Neuro-Fuzzy Classifier

Output variables (class labels)

Unweighted connections

Fuzzy rules

Fuzzy sets (antecedents)

Input variables (attributes)

NEFCLASS: Features

„ Automatic induction of a fuzzy rule base from data


„ Training of several forms of fuzzy sets
„ Processing of numeric and symbolic attributes
„ Treatment of missing values (no imputation)
„ Automatic pruning strategies
„ Fusion of expert knowledge and knowledge obtained
from data

Representation of Fuzzy Rules

Example: 2 Rules

R1: if x is large and y is small, then class is c1.
R2: if x is large and y is large, then class is c2.

The connections x → R1 and x → R2 are linked: the fuzzy set large is a
shared weight. That means the term large always has the
same meaning in both rules.

[Figure: network with inputs x, y, fuzzy sets small / large, rule units R1, R2 and class outputs c1, c2]
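The shared weight can be modelled by letting both rules reference one mutable fuzzy set object (a sketch; class and variable names are illustrative):

```python
class FuzzySet:
    # mutable triangular fuzzy set, usable as a shared weight (a < b < c assumed)
    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c
    def __call__(self, x):
        if self.a <= x < self.b:
            return (x - self.a) / (self.b - self.a)
        if self.b <= x <= self.c:
            return (self.c - x) / (self.c - self.b)
        return 0.0

large = FuzzySet(4.0, 7.0, 10.0)  # one object for the term "large" on x
r1_antecedent = [large]           # rule R1 references it ...
r2_antecedent = [large]           # ... and so does rule R2
large.b = 8.0                     # training the shared set once ...
assert r1_antecedent[0](8.0) == r2_antecedent[0](8.0) == 1.0  # ... affects both
```

Because both rules see the same object, no training step can give "large" two different meanings.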

1. Training Step: Initialisation
Specify initial fuzzy partitions for all input variables.

[Figure: network with fuzzy sets small / medium / large on each input variable x and y, and class outputs c1, c2]
2. Training Step: Rule Base
Algorithm:
for (all patterns p) do
  find antecedent A such that A(p) is maximal;
  if (A ∉ L) then add A to L;
end;
for (all antecedents A ∈ L) do
  find best consequent C for A;
  create rule base candidate R = (A, C);
  determine the performance of R;
  add R to B;
end;
select a rule base from B;

Variations: fuzzy rule bases can also be created by using prior knowledge,
fuzzy cluster analysis, fuzzy decision trees, genetic algorithms, ...

Selection of a Rule Base

Performance of a rule:
P_r = (1/N) Σ_{p=1}^{N} (−1)^c · R_r(x_p),  with
c = 0 if class(x_p) = con(R_r), and c = 1 otherwise.

z Order rules by performance.
z Either select the best r rules or the best r/m rules per class.
z r is either given or determined automatically such that all patterns are covered.
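The performance measure translates directly into code; using min-activation for R_r(x_p) and the list-based rule representation are assumptions:

```python
def rule_performance(antecedent_mfs, consequent, patterns):
    # P_r = (1/N) * sum_p (-1)^c * R_r(x_p): activations of correctly
    # classified patterns count positive, all others count negative
    total = 0.0
    for x, cls in patterns:
        activation = min(mf(v) for mf, v in zip(antecedent_mfs, x))
        total += activation if cls == consequent else -activation
    return total / len(patterns)
```

A rule that fires strongly on its own class and weakly elsewhere scores close to its mean activation; a rule that fires equally on all classes scores near zero and is a pruning candidate.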

Rule Base Induction
NEFCLASS uses a modified Wang-Mendel procedure
[Figure: fuzzy grid over the (x, y) data space with fuzzy sets small / medium / large on each axis; rules R1, R2, R3 are picked from the grid cells that contain data, with class consequents c1, c2]

Computing the Error Signal

Fuzzy error (jth output):
E_j = sgn(d) · (1 − γ(d)),  with d = t_j − o_j
and γ : IR → [0, 1],  γ(d) = exp(−(a·d / d_max)²)
(t: correct output, o: actual output)

Rule error:
E_r = τ_r · (1 − τ_r + ε) · E_{con(R_r)},  with 0 < ε < 1
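The fuzzy error can be sketched as follows; the default values for a and d_max are placeholders, not taken from the slides:

```python
import math

def fuzzy_error(target, output, a=2.0, d_max=1.0):
    # E = sgn(d) * (1 - gamma(d)) with gamma(d) = exp(-(a*d/d_max)^2):
    # gamma acts as a "fuzzy zero", so small differences d barely count
    d = target - output
    gamma = math.exp(-((a * d / d_max) ** 2))
    sgn = (d > 0) - (d < 0)
    return sgn * (1.0 - gamma)
```

The sign carries the direction of the correction while the magnitude saturates near 1 for large differences, which keeps single outliers from dominating the updates.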

3. Training Step: Fuzzy Sets
Example: triangular membership function

μ_{a,b,c} : IR → [0, 1],
μ_{a,b,c}(x) = (x − a) / (b − a)  if x ∈ [a, b)
             = (c − x) / (c − b)  if x ∈ [b, c]
             = 0                  otherwise

Parameter updates for an antecedent fuzzy set:
f = σ · μ(x)        if E < 0
f = σ · (1 − μ(x))  otherwise
Δb = f · E · (c − a) · sgn(x − b)
Δa = −f · E · (b − a) + Δb
Δc =  f · E · (c − b) + Δb
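The updates translate almost literally into code (σ is the learning rate; its value here is an arbitrary placeholder, and a < b < c is assumed):

```python
def tri(a, b, c, x):
    # triangular membership function mu_{a,b,c}, assuming a < b < c
    if a <= x < b:
        return (x - a) / (b - a)
    if b <= x <= c:
        return (c - x) / (c - b)
    return 0.0

def update_fuzzy_set(a, b, c, x, E, sigma=0.1):
    # heuristic parameter updates for one antecedent fuzzy set
    mu = tri(a, b, c, x)
    f = sigma * mu if E < 0 else sigma * (1.0 - mu)
    sgn = 1.0 if x >= b else -1.0
    db = f * E * (c - a) * sgn     # shift the center away from / towards x
    da = -f * E * (b - a) + db     # widen or narrow the left flank
    dc = f * E * (c - b) + db      # widen or narrow the right flank
    return a + da, b + db, c + dc
```

For a positive error at x > b, the center b moves towards x, which raises μ(x), matching the heuristics on the following slide.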

Training of Fuzzy Sets
Heuristics: a fuzzy set is moved away from x (towards x) and its support is
reduced (enlarged), in order to reduce (enlarge) the degree of membership of x.

[Figure: an initial triangular fuzzy set over x; reducing it lowers μ(x) from 0.55 to 0.30, enlarging it raises μ(x) to 0.85]

Training of Fuzzy Sets
Algorithm:
repeat
  for (all patterns) do
    accumulate parameter updates;
    accumulate error;
  end;
  modify parameters;
until (no change in error);

Variations:
z adaptive learning rate
z online / batch learning
z optimistic learning (n-step look-ahead)
z observing the error on a validation set (to avoid getting stuck in a local minimum)

Constraints for Training Fuzzy Sets

• Valid parameter values
• Non-empty intersection of adjacent fuzzy sets
• Keep relative positions
• Maintain symmetry
• Complete coverage (degrees of membership add up to 1 for each element)

[Figure: correcting a partition in three steps after modifying the parameters]

4. Training Step: Pruning
Goal: remove variables, rules and fuzzy sets, in order to
improve interpretability and generalisation.

Pruning
Algorithm:
repeat
  select pruning method;
  repeat
    execute pruning step;
    train fuzzy sets;
    if (no improvement) then undo step;
  until (no improvement);
until (no further method);

Pruning methods:
1. Remove variables (use correlations, information gain, etc.)
2. Remove rules (use rule performance)
3. Remove terms (use degree of fulfilment)
4. Remove fuzzy sets (use fuzziness)
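The pruning loop can be sketched generically; the method and evaluation interfaces are assumptions, and retraining of the fuzzy sets is folded into the evaluation call:

```python
def prune(model, methods, evaluate):
    # greedy pruning: for each method, keep executing pruning steps as long
    # as the evaluation does not get worse; otherwise undo (keep old model)
    best = evaluate(model)
    for method in methods:
        while True:
            candidate = method(model)     # one pruning step, None if exhausted
            if candidate is None:
                break
            score = evaluate(candidate)
            if score >= best:
                model, best = candidate, score
            else:
                break                     # no improvement: undo the step
    return model
```

Because each step is undone on failure, the result is never worse than the unpruned model under the chosen evaluation measure.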

WBC Learning Result: Fuzzy Rules
R1: if uniformity of cell size is small and bare nuclei is fuzzy0 then benign
R2: if uniformity of cell size is large then malignant

WBC Learning Result: Classification Performance

Predicted class:
          malign          benign          not classified   sum
malign    228 (32.62%)    13 (1.86%)      0 (0%)           241 (34.99%)
benign    15 (2.15%)      443 (63.38%)    0 (0%)           458 (65.01%)
sum       243 (34.76%)    456 (65.24%)    0 (0%)           699 (100.00%)

Estimated Performance on Unseen Data (Cross Validation)

„ NEFCLASS-J: 95.42%
„ NEFCLASS-J (numeric): 94.14%
„ Discriminant analysis: 96.05%
„ Multilayer perceptron: 94.82%
„ C 4.5: 95.10%
„ C 4.5 rules: 95.40%

WBC Learning Result: Fuzzy Sets
[Plot: learned fuzzy sets "sm" and "lg" for uniformity of cell size, over the domain 1.0–10.0]
[Plot: learned membership functions for bare nuclei, over the domain 1.0–10.0]

NEFCLASS-J

Resources
Detlef Nauck, Frank Klawonn & Rudolf Kruse:

Foundations of Neuro-Fuzzy Systems


Wiley, Chichester, 1997, ISBN: 0-471-97151-0

Neuro-Fuzzy Software (NEFCLASS, NEFCON, NEFPROX):


http://www.neuro-fuzzy.de

Beta-Version of NEFCLASS-J:
http://www.neuro-fuzzy.de/nefclass/nefclassj

Download NEFCLASS-J

Download the free version of NEFCLASS-J at


http://fuzzy.cs.uni-magdeburg.de

Conclusions
„ Neuro-Fuzzy-Systems can be useful for knowledge discovery.

„ Interpretability enables plausibility checks and improves acceptance.

„ (Neuro-)Fuzzy systems exploit tolerance for sub-optimal solutions.

„ Neuro-fuzzy learning algorithms must observe constraints in order not to jeopardise the semantics of the model.

„ It is not an automatic model creator; the user must work with the tool.

„ Simple learning techniques support explorative data analysis.

