
ADVANCED MACHINE LEARNING
Module 3: Get introduced to convolutional neural networks
Instructor: Amit Sethi
Co-developer: Neeraj Kumar
TAs: Gaurav Yadav, Niladri Bhatacharya
Page: AdvancedMachineLearning.weebly.com
IITG Course No: EE 622

Module objectives
Understand the basics of CNNs:
Convolutions
Pooling
Fully connected layers
Understand the basics of backpropagation through CNNs
Be able to use LeNet-level CNNs for simple tasks

Receptive fields of neurons

Levine and Shefner (1991) define a receptive field as an "area in which stimulation leads to response of a particular sensory neuron" (p. 671).

Source: http://psych.hanover.edu/Krantz/receptive/

The concept of the best stimulus

Depending on excitatory and inhibitory connections, there is an optimal stimulus that falls only in the excitatory region
On-center retinal ganglion cell example shown here

Source: http://psych.hanover.edu/Krantz/receptive/

On-center vs. off-center

Source: https://en.wikipedia.org/wiki/Receptive_field

Bar detection example

Source: http://psych.hanover.edu/Krantz/receptive/

Gabor filters model simple cells in the visual cortex

Source: https://en.wikipedia.org/wiki/Gabor_filter

Modeling oriented edges using Gabor filters

Feature maps using Gabor filters

Source: https://en.wikipedia.org/wiki/Gabor_filter

Haar filters

More feature maps

Source: http://www.cosy.sbg.ac.at/~hegenbart/

Convolution
Let's convolve. In 1-D:

(f * g)[n] = \sum_m f[m] \, g[n - m]

Defining the flipped kernel g*[n] = g[-n] converts the convolution into a sum of element-wise products (a sliding dot product):

(f * g)[n] = \sum_m f[m] \, g*[m - n]

Now convert it to 2-D:

(f * g)[m, n] = \sum_i \sum_j f[i, j] \, g[m - i, n - j]

This sliding-window form admits fast implementations on multiple processing units (PUs).
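As a concrete sketch of this sliding sum of element-wise products (assuming NumPy; conv2d_valid and the test arrays are illustrative names, not from the slides):

import numpy as np

def conv2d_valid(image, kernel):
    # 'Valid' 2-D convolution: flip the kernel, then slide it and
    # take a sum of element-wise products at each position
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]   # g[-m, -n]: convolution, not correlation
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * flipped)
    return out

# Tiny check with the sharpening kernel from the next slide
img = np.arange(25, dtype=float).reshape(5, 5)
sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=float)
print(conv2d_valid(img, sharpen).shape)   # (3, 3)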

Convolution animation

Source: http://bmia.bmt.tue.nl/education/courses/fev/course/notebooks/triangleblockconvolution.gif

Convolution in 2-D (sharpening filter)

Source: https://upload.wikimedia.org/wikipedia/commons/4/4f/3D_Convolution_Animation.gif

Convolution in 2-D (X-shaped)

Source: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution

Source: http://www.slideshare.net/srkrishna341/matched-filter-12735596

CNN: Let the network learn conv kernels

Number of weights with and without convolution

Fully connected layer:
Input: 32x32x3
Hidden: 28x28x25
Weights: (32x32x3) x (28x28x25) = 60,211,200

With convolutions (weight sharing):
Input: 32x32x3
Hidden: 28x28x25
Weights: 5x5x3 x 25 = 1,875
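A quick arithmetic check of these counts (a minimal sketch; the variable names are illustrative):

# Fully connected: every input unit connects to every hidden unit
fc_weights = (32 * 32 * 3) * (28 * 28 * 25)
print(fc_weights)      # 60211200

# Convolution: 25 filters of size 5x5x3, shared across all 28x28 positions
conv_weights = (5 * 5 * 3) * 25
print(conv_weights)    # 1875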

How will backpropagation work?

Backpropagation will treat each input patch (not the whole image) as a sample!
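A minimal sketch of what this means for the kernel gradient (assuming NumPy and a correlation-style forward pass, as in most CNN libraries; the names are illustrative): because the same kernel is reused at every location, each patch contributes its own gradient term, and weight sharing sums them all:

import numpy as np

def conv_kernel_grad(image, d_out, kh, kw):
    # d_out[r, c] holds dL/d(output[r, c]); each input patch acts
    # like a separate sample, and the shared kernel accumulates
    # the gradient over all of them
    d_kernel = np.zeros((kh, kw))
    for r in range(d_out.shape[0]):
        for c in range(d_out.shape[1]):
            d_kernel += d_out[r, c] * image[r:r + kh, c:c + kw]
    return d_kernel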

Feature maps
Convolutional layer:
Input: a (set of) layer(s)
Convolutional filter(s)
Bias(es)
Nonlinear squashing
Output: another (set of) layer(s), AKA feature maps

A feature map is a map of where each feature was detected
A shift in the input => a shift in the feature map (see the demo below)
Is it important to know where exactly the feature was detected?
Notion of invariances: translation, scaling, rotation, contrast
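A small demonstration of the shift property (a sketch, assuming NumPy and SciPy; the arrays are illustrative): shifting the input shifts the feature map by the same amount, i.e., convolution is translation-equivariant:

import numpy as np
from scipy.signal import correlate2d   # cross-correlation, as CNN layers compute

img = np.zeros((8, 8))
img[2, 2] = 1.0                        # a single bright 'feature'
kernel = np.ones((3, 3))

fm1 = correlate2d(img, kernel, mode='valid')
fm2 = correlate2d(np.roll(img, shift=2, axis=1), kernel, mode='valid')

# The response moves by the same 2 pixels: equivariance, not invariance
print(np.unravel_index(fm1.argmax(), fm1.shape))   # (0, 0)
print(np.unravel_index(fm2.argmax(), fm2.shape))   # (0, 2)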

Pooling is subsampling

Source: "Gradient-based learning applied to document recognition" by Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, in Proc. IEEE, Nov. 1998.

Types of pooling
Two types of pooling:
Average
Max
How do these differ?
How do their gradient computations differ? (see the sketch below)
Other considerations:
Bias and nonlinearity after pooling
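A minimal sketch of the difference, for one 2x2 window (assuming NumPy; the names are illustrative): average pooling spreads the upstream gradient evenly over the window, while max pooling routes all of it to the argmax element:

import numpy as np

x = np.array([[1., 3.],
              [2., 4.]])              # one 2x2 pooling window

avg, mx = x.mean(), x.max()           # forward: 2.5 and 4.0

g = 1.0                               # upstream gradient dL/d(pooled value)
d_avg = np.full_like(x, g / x.size)   # every input gets g/4
d_max = np.zeros_like(x)
d_max[np.unravel_index(x.argmax(), x.shape)] = g   # only the max gets g
print(d_avg)   # [[0.25 0.25] [0.25 0.25]]
print(d_max)   # [[0. 0.] [0. 1.]]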

A bi-pyramid approach: map size decreases, but the number of maps increases

Why?
Source: "Gradient-based learning applied to document recognition" by Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, in Proc. IEEE, Nov. 1998.

Fully connected layers

Multi-layer non-linear decision making

Source: "Gradient-based learning applied to document recognition" by Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, in Proc. IEEE, Nov. 1998.

Visualizing weights, conv layer 1

Source: http://cs231n.github.io/understanding-cnn/

Visualizing feature map, conv layer 1

Source: http://cs231n.github.io/understanding-cnn/

Visualizing weights, conv layer 2

Source: http://cs231n.github.io/understanding-cnn/

Visualizing feature map, conv layer 2

Source: http://cs231n.github.io/understanding-cnn/

CNN for speech processing

Source: "Convolutional neural networks for speech recognition" by Ossama Abdel-Hamid et al., in IEEE/ACM Trans. ASLP, Oct, 2014

CNN for DNA-protein binding

Source: "Convolutional neural network architectures for predicting DNAprotein binding by Haoyang Zeng et al., Bioinformatics 2016, 32 (12)


In summary, CNNs
Use convolutions to:
Reduce the number of free parameters
Regularize the model
Detect features at various locations
Utilize spatial (grid) contiguity
Use pooling to:
Reduce feature map size
Introduce translation invariance
Introduce scale invariance
Introduce rotation invariance
Detect the presence of features
Together, these make CNNs very powerful tools for:
Learning feature hierarchies on data with local structure on a grid
