ANN slides

Uploaded by Mészáros Balázs

Attribution Non-Commercial (BY-NC)


Overview

- Background
- Artificial neurons, what they can and cannot do
- The multilayer perceptron (MLP)
- Three forms of learning
- The back propagation algorithm
- Radial basis function networks
- Competitive learning (and relatives)


An artificial neuron

Inputs x_1 … x_n (plus a constant bias input x_0 = +1) and weights w_0 … w_n. The neuron computes a weighted sum S and passes it through a sigmoid transfer function f:

    y = f(S)

    S = Σ_{i=0}^{n} w_i x_i  =  w_0 + Σ_{i=1}^{n} w_i x_i

    f(S) = 1 / (1 + e^(−S))
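As a minimal Python sketch (the function name and argument layout are my own; w[0] is the bias weight, paired with the constant input x_0 = +1):

```python
import math

def neuron(x, w):
    """Sigmoid neuron: y = f(S), S = w0 + sum(wi * xi), f(S) = 1/(1 + e^-S).

    w[0] is the bias weight (its input x_0 is the constant +1);
    w[1:] pair up with the inputs x.
    """
    s = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-s))

# With all weights zero, S = 0 and y = f(0) = 0.5.
print(neuron([0.0, 0.0], [0.0, 0.0, 0.0]))  # → 0.5
```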

The neuron can be used as a classifier: y < 0.5 → class 0, y > 0.5 → class 1. The decision boundary (where S = 0) is a linear discriminant, i.e. a hyperplane. 2D example: a line,

    x_2 = −(w_1 x_1 + w_0) / w_2

Only linearly separable classification problems can be solved by a single neuron. If the classes are not linearly separable, two (or more) linear discriminants must be combined.
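For example, a single neuron with hand-picked (illustrative) weights realises the linearly separable AND function; the line x_2 = 1.5 − x_1 is its discriminant:

```python
import math

def neuron(x, w):
    """Sigmoid neuron with bias weight w[0] (bias input x_0 = +1)."""
    s = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-s))

# AND is linearly separable: the line x_2 = -(w_1 x_1 + w_0)/w_2 = 1.5 - x_1
# separates (1,1) from the other three corners.  Weights are illustrative.
w = [-1.5, 1.0, 1.0]  # w_0, w_1, w_2
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    y = neuron(x, w)
    print(x, "class", 1 if y > 0.5 else 0)
# Only (1, 1) falls on the class-1 side of the line.
```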

(Figure: decision regions over x_1, x_2 for the AND and NOR combinations.)

Two sigmoids implement fuzzy AND and NOR.

Neural networks:

- store information in the weights, not in the nodes
- are trained, by adjusting the weights, not programmed
- can generalize to previously unseen data
- are adaptive
- are fast computational devices, well suited for parallel simulation and/or hardware implementation
- are fault tolerant


The multilayer perceptron (MLP)

(Figure: a layered network with inputs on one side and outputs on the other.)

An MLP can implement any function, given a sufficiently rich internal structure (number of nodes and layers).

Application areas

- Finance: forecasting, fraud detection
- Medicine: image analysis
- Consumer market: household equipment, character recognition, speech recognition
- Industry: adaptive control, signal analysis, data mining

Why neural networks? (Statistical methods are always at least as good, right?) Neural networks are statistical methods. Reasons to use them:

- Model independence
- Adaptivity/flexibility
- Concurrency
- Economical reasons (rapid prototyping)

Three forms of learning

- Supervised learning: the learning system receives an input and produces an output (y); an error function compares y with the desired output (d), and the error drives the learning. Back propagation is the standard example.
- Unsupervised learning: the system receives inputs only, with no targets, and must find structure in the data by itself.
- Reinforcement learning: an agent observes the state of an environment, an action selector suggests actions, and the environment returns a reward that the agent uses to improve its suggested actions.

The back propagation algorithm

Each weight w_ji contributes to the error E according to the partial derivative ∂E/∂w_ji. The weight should be moved in proportion to that contribution, but in the other direction:

    Δw_ji = −η ∂E/∂w_ji


Training procedure

- The network is initialised with small random weights.
- Split the data in two: a training set and a test set.
- The training set is used for training and is passed through many times. Either update the weights after each presentation (pattern learning), or accumulate the weight changes (Δw) until the end of the training set is reached (epoch or batch learning).
- The test set is used to test for generalization (to see how well the net does on previously unseen data). This is the result that counts!

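The two update schedules can be sketched with a single linear unit trained by gradient descent on the squared error (the data, learning rate, and epoch count are made up for illustration):

```python
def lms_train(data, eta=0.1, epochs=50, batch=False):
    """Train a single linear unit y = w0 + w1*x on squared error.

    batch=False: pattern learning (update after each presentation).
    batch=True:  epoch learning (accumulate delta-w over the whole set).
    """
    w = [0.0, 0.0]  # small initial weights
    for _ in range(epochs):
        acc = [0.0, 0.0]
        for x, d in data:
            y = w[0] + w[1] * x
            err = d - y
            dw = [eta * err, eta * err * x]  # -eta * dE/dw for each weight
            if batch:
                acc = [a + g for a, g in zip(acc, dw)]  # accumulate
            else:
                w = [wi + g for wi, g in zip(w, dw)]    # update immediately
        if batch:
            w = [wi + a for wi, a in zip(w, acc)]       # one update per epoch
    return w

# Learn y = 2x + 1 from four samples (illustrative data).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
print(lms_train(data))              # pattern learning, approx [1, 2]
print(lms_train(data, batch=True))  # epoch learning, approx [1, 2]
```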

Assumptions: the error is the squared error

    E = (1/2) Σ_{j=1}^{n} (d_j − y_j)²

and each node's transfer function is the sigmoid

    y_j = f(S_j) = 1 / (1 + e^(−S_j))

(the factor y_j (1 − y_j) below is the derivative of the sigmoid). The weight update then becomes

    Δw_ji = −η ∂E/∂w_ji = η δ_j x_i

where, for output nodes (the derivative of the error gives the factor d_j − y_j),

    δ_j = y_j (1 − y_j) (d_j − y_j)

and for hidden nodes (the sum runs over all nodes k in the next layer, closer to the outputs),

    δ_j = y_j (1 − y_j) Σ_k δ_k w_kj
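A sketch of these formulas in Python, for one hidden layer (layer sizes, weights, and the training pair are illustrative; `forward`, `deltas`, and `update` are my own names):

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def forward(x, W_h, W_o):
    """One-hidden-layer MLP; each weight row is [bias, w_1, w_2, ...]."""
    h = [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))) for w in W_h]
    y = [sigmoid(w[0] + sum(wi * hi for wi, hi in zip(w[1:], h))) for w in W_o]
    return h, y

def deltas(x, d, W_h, W_o):
    """delta_j = y_j (1 - y_j) (d_j - y_j) for output nodes;
    delta_j = h_j (1 - h_j) * sum_k delta_k w_kj for hidden nodes."""
    h, y = forward(x, W_h, W_o)
    d_out = [yj * (1 - yj) * (dj - yj) for yj, dj in zip(y, d)]
    d_hid = [hj * (1 - hj) * sum(dk * W_o[k][j + 1] for k, dk in enumerate(d_out))
             for j, hj in enumerate(h)]
    return h, y, d_hid, d_out

def update(x, d, W_h, W_o, eta=0.5):
    """One pattern-learning step: delta-w_ji = eta * delta_j * input_i."""
    h, y, d_hid, d_out = deltas(x, d, W_h, W_o)
    for w, dj in zip(W_o, d_out):
        w[0] += eta * dj                 # bias input is +1
        for i, hi in enumerate(h):
            w[i + 1] += eta * dj * hi
    for w, dj in zip(W_h, d_hid):
        w[0] += eta * dj
        for i, xi in enumerate(x):
            w[i + 1] += eta * dj * xi

# Illustrative weights and one training pair (made-up numbers).
W_h = [[0.1, 0.4, -0.3], [-0.2, 0.5, 0.2]]
W_o = [[0.3, -0.4, 0.6]]
x, d = [0.7, -0.1], [1.0]
```

The analytic gradient can be checked against a finite difference of E, which is a standard way to verify a backprop implementation.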

Overtraining

(Figure: typical error curves over time (epochs); the training set error keeps decreasing, while the test or validation set error reaches a minimum and then rises again.)

Overtraining is more likely to occur

- if we train on too little data,
- if the network has too many hidden nodes,
- if we train for too long.

Cross validation: use a third set, a validation set, to decide when to stop (find the minimum of the error for this set, and retrain for that number of epochs).

Network size

The network should be slightly larger than the size necessary to represent the target function. Unfortunately, the target function is unknown ... We need much more training data than the number of weights!

1. Start with a small network, train, increase the size, train again, etc., until the error on the training set can be reduced to acceptable levels.
2. If an acceptable error level was found, increase the size by a few percent and retrain, this time using the cross-validation procedure to decide when to stop. Publish the result on the independent test set.
3. If the network failed to reduce the error on the training set, despite a large number of nodes and attempts, something is likely to be wrong with the data.
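The stopping rule itself is simple to state in code (a sketch with a synthetic validation-error curve; real curves come from training runs):

```python
def best_epoch(val_errors):
    """Cross-validation early stopping: pick the epoch at which the
    validation-set error is minimal, then retrain for that many epochs."""
    return min(range(len(val_errors)), key=lambda e: val_errors[e])

# Synthetic U-shaped validation curve (illustrative numbers only):
# the error falls, bottoms out, then rises again as overtraining sets in.
curve = [0.9, 0.5, 0.3, 0.25, 0.27, 0.33, 0.41]
print(best_epoch(curve))  # → 3
```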

Practical considerations

- What happens if the mapping represented by the data is not a function? For example, what if the same input does not always lead to the same output?
- In what order should data be presented? Sequentially? At random?
- How should data be represented? Compact? Distributed?
- What can be done about missing data?

Trick of the trade: monotonic functions are easier to learn than non-monotonic functions (at least for the MLP)!


Radial basis function networks

Layered structure, like the MLP, with one hidden layer. The output nodes are conventional. Each hidden node measures the distance between its weight vector and the input vector (instead of computing a weighted sum).

(Figure: a layered network with inputs on one side and outputs on the other.)

Geometric interpretation

The input space is covered with overlapping Gaussians.

RBF training

Could use backprop (the transfer function is still differentiable). Better: train the layers separately.

- Hidden layer: find the position and size of the Gaussians by unsupervised learning (e.g. competitive learning, K-means).
- Output layer: supervised learning, e.g. the Delta rule, LMS, backprop.

RBF vs. MLP

- RBF (hidden) nodes work in a local region; MLP nodes are global.
- MLPs do better in high-dimensional spaces.
- MLPs require fewer nodes and generalize better.
- RBFs can learn faster.
- RBFs are less sensitive to the order in which data is presented.
- RBFs make fewer false-yes classification errors.
- MLPs extrapolate better.
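A compact sketch of the two-stage idea (here the Gaussian centres are picked from the data by hand, standing in for the unsupervised K-means step; all names and numbers are illustrative):

```python
import math

def rbf_hidden(x, centers, width=1.0):
    """Each hidden node outputs a Gaussian of the distance between
    its centre (weight vector) and the input -- not a weighted sum."""
    return [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c))
                     / (2 * width ** 2)) for c in centers]

def train_output(data, centers, eta=0.3, epochs=500, width=1.0):
    """Supervised delta rule (LMS) for a single linear output node."""
    w = [0.0] * (len(centers) + 1)          # bias + one weight per Gaussian
    for _ in range(epochs):
        for x, d in data:
            h = rbf_hidden(x, centers, width)
            y = w[0] + sum(wi * hi for wi, hi in zip(w[1:], h))
            err = d - y
            w[0] += eta * err
            for i, hi in enumerate(h):
                w[i + 1] += eta * err * hi
    return w

# Hypothetical 1-D data; centres chosen by hand in place of K-means.
data = [([0.0], 0.0), ([1.0], 1.0), ([2.0], 0.0)]
centers = [[0.0], [1.0], [2.0]]
w = train_output(data, centers)
```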

Unsupervised learning

Classifying unlabeled data: nearest neighbour classifiers classify the unknown sample (vector) x to the class of its closest previously classified neighbour.

(Figure: the new pattern, x, is classified according to its nearest neighbour.)

Problem 1: the closest neighbour may be an outlier from the wrong class.
Problem 2: we must store lots of samples and compute the distance to each one, for every new sample.

K-means

K-means, for K = 2:

1. Make a codebook of two vectors, c1 and c2.
2. Sample (at random) two vectors from the data as initial values of c1 and c2.
3. Split the data in two subsets, D1 and D2, where D1 is the set of all points with c1 as their closest codebook vector, and vice versa.
4. Move c1 towards the mean of D1 and c2 towards the mean of D2.
5. Repeat from 3 until convergence (until the codebook vectors stop moving).
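The five steps translate almost directly into code (1-D data and the cluster values are illustrative):

```python
import random

def kmeans2(data, seed=1):
    """K-means for K = 2 on 1-D data, following the steps above:
    sample two codebook vectors from the data, split the data into the
    subsets D1/D2 closest to each, move each vector to its subset's mean,
    and repeat until the codebook vectors stop moving."""
    rng = random.Random(seed)
    c1, c2 = rng.sample(data, 2)
    while True:
        D1 = [x for x in data if abs(x - c1) <= abs(x - c2)]
        D2 = [x for x in data if abs(x - c1) > abs(x - c2)]
        new1 = sum(D1) / len(D1) if D1 else c1
        new2 = sum(D2) / len(D2) if D2 else c2
        if (new1, new2) == (c1, c2):   # converged: the vectors stopped moving
            return c1, c2
        c1, c2 = new1, new2

# Two well-separated 1-D clusters (illustrative data).
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
print(sorted(kmeans2(data)))  # → roughly [0.1, 5.1]
```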

Voronoi regions

K-means forms so-called Voronoi regions in the input space. The Voronoi region around a codebook vector c_i is the region in which c_i is the closest codebook vector.

(Figure: Voronoi regions around 10 codebook vectors.)

Competitive learning

M linear, threshold-less nodes (only weighted sums), N inputs.

1. Present a pattern (sample), x.
2. The node with the largest output (node k) is declared the winner.
3. The weights of the winner are updated so that it will become even stronger the next time the same pattern is presented; all other weights are left unchanged:

    Δw_ki = η (x_i − w_ki),  1 ≤ i ≤ N

With normalised weights, this is equivalent to finding the node with the minimum distance between its weight vector and the input vector. Network node = codebook vector.
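A sketch of hard competitive learning, using the equivalent minimum-distance formulation (data and parameters are illustrative; weights are initialised from the data, as recommended for avoiding invincible nodes):

```python
import random

def competitive(data, M=2, eta=0.1, epochs=50, seed=0):
    """Hard competitive learning: the winner (the node whose weight vector
    is closest to the input) is moved towards the input by
    delta-w_k = eta * (x - w_k); all other nodes are left unchanged."""
    rng = random.Random(seed)
    w = [list(v) for v in rng.sample(data, M)]   # initialise from the data
    for _ in range(epochs):
        for x in data:
            k = min(range(M), key=lambda j: sum((xi - wi) ** 2
                                                for xi, wi in zip(x, w[j])))
            w[k] = [wi + eta * (xi - wi) for wi, xi in zip(w[k], x)]
    return w

# Two 2-D clusters (illustrative); each node should settle near one cluster.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
codebook = competitive(data)
print(codebook)
```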

Problem with competitive learning: a node may become invincible.

(Figure: data clusters A and B, weight vectors initialised at W.)

Poor initialisation: the weight vectors have been initialised to small random numbers (in W), but these are far from the data (A and B). The first node to win will move from W towards A or B, and will always win henceforth. Solution: use the data to initialise the weights (as in K-means), or include the winning frequency in the distance measure, or move more nodes than only the winner.

Topographic maps

The cerebral cortex is a two-dimensional structure, yet we can reason in more than two dimensions. Different neurons in the auditory cortex respond to different frequencies, and these neurons are located in frequency order! Topological preservation / topographic map.

Dimensional reduction

Non-linear, topologically preserving, dimensional reduction (like pressing a flower).

SOM

Competitive learning, extended in two ways:

1. The nodes are organised in a two-dimensional grid (in competitive learning, there is no defined order between the nodes). Example: a 3x3 grid, making a two-dimensional map of a four-dimensional input space.
2. Find the winner, node k, and then update all weights by

    Δw_ji = η f(j, k) (x_i − w_ji),  1 ≤ i ≤ N

(not only the winner is updated, but also its closest neighbours in the grid).

f(j, k) is a neighbourhood function in the range [0, 1], with a maximum for the winner (j = k) and decreasing with distance from the winner, e.g. a Gaussian. Gradually decrease the neighbourhood radius (the width of the Gaussians) and the learning rate (η) over time. Result: vectors that are close in the high-dimensional input space will activate areas that are close on the grid.
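A sketch of the SOM update with a Gaussian neighbourhood and decaying η and radius (grid size, decay schedules, and data are illustrative):

```python
import math
import random

def som_step(w, grid, x, eta, radius):
    """One SOM update: find the winner k (closest weight vector), then move
    every node j by delta-w_j = eta * f(j, k) * (x - w_j), where f is a
    Gaussian of the grid distance between j and k."""
    k = min(range(len(w)), key=lambda j: sum((xi - wi) ** 2
                                             for xi, wi in zip(x, w[j])))
    for j in range(len(w)):
        g = sum((a - b) ** 2 for a, b in zip(grid[j], grid[k]))  # grid dist^2
        f = math.exp(-g / (2 * radius ** 2))
        w[j] = [wi + eta * f * (xi - wi) for wi, xi in zip(w[j], x)]
    return k

# A 3x3 grid of nodes mapping 2-D inputs (illustrative setup).
rng = random.Random(0)
grid = [(r, c) for r in range(3) for c in range(3)]
w = [[rng.random(), rng.random()] for _ in grid]
data = [[rng.random(), rng.random()] for _ in range(200)]

# Gradually decrease the learning rate and the neighbourhood radius.
for t, x in enumerate(data):
    frac = t / len(data)
    som_step(w, grid, x, eta=0.5 * (1 - frac), radius=2.0 * (1 - frac) + 0.1)
```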

Example: a 10x10 SOM is trained on a chemical analysis of 178 wines from one region in Italy, where the grapes have grown on three different types of soil. The input is 13-dimensional. After training, wines from different soil types activate different regions of the SOM. Note that the network is not told that the difference between the wines is the soil type, nor how many such types (how many classes) there are.

WEBSOM (http://websom.hut.fi): a two-dimensional, clickable map of Usenet news articles (from comp.ai.neural-nets).

Growing neural gas (GNG)

- Growing unsupervised network (starting from two nodes)
- Dynamic neighbourhood
- Constant parameters
- Very good at following moving targets; can also follow jumping targets
- Current work: using GNG to define and train the hidden layer of Gaussians in an RBF network

Node positions

- Start with two nodes.
- Each node has a set of neighbours, indicated by edges.
- The edges are created and destroyed dynamically during training.
- For each sample, the closest node, k, and all its current neighbours are moved towards the input.

Node creation

A new node (blue in the figure) is created every λ-th time step, unless the maximum number of nodes has been reached. The new node is placed halfway between the node with the greatest error and the node among its current neighbours with the greatest error; the node with the greatest error is the most unstable one.

(Figure: a fourth node has just been created.) In effect, new nodes are created close to where they are most likely needed. The exact position of the new node is not crucial, since nodes move around.

(Figure: after a while, 7 nodes.)

Neighbourhood

Neighbourhood edges are created and destroyed as follows. For each sample, let k denote the winner (the node closest to the sample) and r the runner-up (the second closest):

- If an edge exists between k and r, reset its age to 0; otherwise, create such an edge and set its age to 0.
- Increment the age of all other edges emanating from node k.
- Edges older than a_max are removed, as are any nodes that in this way lose their last remaining edge.
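The edge bookkeeping can be sketched as follows (a simplified fragment: removing nodes that lose their last edge, and the rest of the GNG loop, are omitted; names are my own):

```python
def update_edges(edges, k, r, a_max=5):
    """GNG neighbourhood bookkeeping for one sample, given the winner k and
    the runner-up r.  `edges` maps a frozenset pair of node ids to the
    edge's age.  (Removing nodes that lose their last edge is omitted.)"""
    for pair in list(edges):
        if k in pair and pair != frozenset((k, r)):
            edges[pair] += 1             # age all other edges at k
            if edges[pair] > a_max:
                del edges[pair]          # older than a_max: remove
    edges[frozenset((k, r))] = 0         # reset (or create) the k-r edge
    return edges

edges = {}
update_edges(edges, 0, 1)   # creates edge {0,1} with age 0
update_edges(edges, 0, 2)   # creates {0,2}; {0,1} ages to 1
print(edges)
```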

Delaunay triangulation

Connecting the codebook vectors in all adjacent Voronoi regions gives the Delaunay triangulation.

Dead units

There is only one way for an edge to get younger: when the two nodes it interconnects are the two closest to the input. If one of the two nodes wins but the other one is not the runner-up, then, and only then, the edge ages. If neither of the two nodes wins, the edge does not age!

(Figure: the input distribution has jumped from the lower left to the upper right corner.)

The lab

(In room 1515!)

- Classification of bitmaps, by supervised learning (back propagation), using the SNNS simulator.
- An illustration of some unsupervised learning algorithms, using the GNG demo applet: LBG/LBG-U (≈ K-means), HCL (hard competitive learning), Neural gas, CHL (competitive Hebbian learning), Neural gas with CHL, GNG/GNG-U (growing neural gas), SOM (self-organising map), Growing grid.
