
Image Processing Techniques for Speech Recognition

Presented by: Amr Medhat, Mostafa Fathy and Sameh Serag
Supervised by: Dr. Magda Fayek


Date: 5-5-2004

Agenda
Introduction
Audio Visual Modeling
Spectrogram Reading
Spectrogram Filtering

Introduction
What is Speech Recognition?
Speech Recognition Problems:
  Noise
  Inter- and intra-speaker variation
  Continuous speech: no boundaries between words

Image Processing is a possible solution

Audio Visual Speech Modeling


Reading speech from facial and lip movements.
Categorizing mouth shapes into visual phonemes (visemes).
  Phoneme: the smallest distinctive unit of speech sound.
Why?
  Distinguish between easily confused phonemes (such as f/s and m/n).
  Improve recognition performance in noisy environments.

Demo
Ready for the challenge?
Listen to this audio and try to understand the speech content: vox_mix[1].mov
Listen to the speech with the video image: dig_tranexp[1].mov
Did you understand the content? Get a prize from IBM.
Play the answer: vox_clean[1].mov (5893642)

How?

Audio Feature Extraction
Visual Feature Extraction
Audio-Visual Integration

Visual Features
Geometric lip dimensions
Lip shape: height/width of the inner/outer lip

Visibility of the tongue/teeth
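
As a minimal sketch of the geometric features listed above, the width and height of a lip contour can be read off its bounding box. The function name `lip_dimensions`, the contour points, and the aspect-ratio feature are all hypothetical illustrations, not the presenters' method; a real system would obtain the contour from a lip tracker.

```python
def lip_dimensions(contour):
    """Return (width, height) of a lip contour given as (x, y) pixel points."""
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    return max(xs) - min(xs), max(ys) - min(ys)


# Hypothetical outer-lip contour: left corner, top, right corner, bottom.
outer_lip = [(10, 20), (25, 12), (40, 20), (25, 30)]
width, height = lip_dimensions(outer_lip)

# Assumed feature: a larger height/width ratio suggests a more open mouth.
aspect_ratio = height / width
```

The same function could be applied to the inner-lip contour to obtain the second pair of dimensions.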

Audio-Visual Integration
Feature Fusion
Synchronization Problem

Use low-resolution image
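
One simple way to picture feature fusion and the synchronization problem: audio features are usually produced at a higher frame rate than video frames, so each audio frame has to be paired with a visual frame before the two vectors are concatenated. The sketch below assumes both streams cover the same time span and pairs frames by proportional index; `fuse_features` and the sample values are hypothetical.

```python
def fuse_features(audio_frames, visual_frames):
    """Concatenate each audio frame with the visual frame covering the same instant."""
    n_audio, n_visual = len(audio_frames), len(visual_frames)
    fused = []
    for i, audio in enumerate(audio_frames):
        # Map the audio frame index onto the slower visual frame timeline.
        j = min(i * n_visual // n_audio, n_visual - 1)
        fused.append(list(audio) + list(visual_frames[j]))
    return fused


audio = [[0.1], [0.2], [0.3], [0.4]]   # 4 audio frames, 1 feature each
visual = [[1.0, 2.0], [3.0, 4.0]]      # 2 video frames, 2 features each
fused = fuse_features(audio, visual)   # 4 fused frames, 3 features each
```

Here every visual frame is reused for two consecutive audio frames; interpolating between visual frames would be an alternative design.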

Spectrograms
A spectrogram: a translation of speech into the visual domain.
[Figure: a spectrogram, with axes labelled frequency and time]
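
To make the idea concrete, a spectrogram can be sketched as a sequence of short windowed frames, each transformed into frequency magnitudes. The code below is an illustrative pure-Python version using a Hann window and a direct DFT; the frame length, hop size, and test tone are assumptions chosen for the example, and a practical system would use an FFT library instead.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return one list of DFT magnitudes (bins 0..frame_len/2) per time frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Test tone with 8 cycles per 64-sample frame, so its energy falls in bin 8.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Each row of `spec` is one time step and each column one frequency bin, which is the grid that is then treated as an image.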

Spectrogram Reading

Waveform and Spectrogram of the word: "phonetician"

Spectrogram Filtering
Required:

How?

Using: Morphological Processing

Morphological Processing
Based on the theory of Mathematical Morphology.
Stresses the role of shape in image preprocessing; used for region identification.
Important morphological operations:
  Erosion
  Dilation
  Opening
  Closing

Erosion & Dilation


Erosion: the meaning
Used to shrink objects.

Dilation: the meaning


Dual of erosion.
Used to fill small gaps or valleys between shapes.

Both are irreversible
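
The two operations can be sketched on a binary image stored as a list of rows of 0s and 1s. This example assumes a 3x3 square structuring element and treats pixels outside the image as background; both choices are illustrative, since the slides do not specify them.

```python
def window(img, r, c):
    """Return the 3x3 neighbourhood of (r, c); pixels outside the image count as 0."""
    rows, cols = len(img), len(img[0])
    return [img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
            for rr in (r - 1, r, r + 1) for cc in (c - 1, c, c + 1)]


def erode(img):
    """Keep a pixel only if every pixel in its 3x3 neighbourhood is set."""
    return [[int(all(window(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]


def dilate(img):
    """Set a pixel if any pixel in its 3x3 neighbourhood is set."""
    return [[int(any(window(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]


def count(img):
    """Number of set pixels in a binary image."""
    return sum(map(sum, img))


# A 3x3 square centred in a 5x5 image, and a single isolated pixel.
square = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)]
          for r in range(5)]
dot = [[1 if (r, c) == (2, 2) else 0 for c in range(5)] for r in range(5)]

shrunk = erode(square)       # only the centre pixel survives
grown = dilate(square)       # the square grows to fill the 5x5 image
lost = dilate(erode(dot))    # the dot is erased and dilation cannot restore it
```

The last line illustrates the irreversibility: once erosion removes the isolated pixel, no later dilation can recover it.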

Opening & Closing


Both are used for smoothing an object contour.
Opening: erosion followed by dilation
  Smooths the contour from the inside of the object.
  Separates objects.
  [Figure: erosion, then dilation]

Closing: dilation followed by erosion
  Smooths the contour from the outside of the object.
  Fills in small holes.
  [Figure: dilation, then erosion]
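
Opening and closing can be sketched by composing the two basic operations. As before, the 3x3 square structuring element, the background border, and the helper names (`parse`, `_apply`) are assumptions made for illustration.

```python
def _apply(img, combine):
    """Apply `combine` (all or any) over each pixel's 3x3 neighbourhood."""
    rows, cols = len(img), len(img[0])

    def window(r, c):
        # Pixels outside the image are treated as background (0).
        return [img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
                for rr in (r - 1, r, r + 1) for cc in (c - 1, c, c + 1)]

    return [[int(combine(window(r, c))) for c in range(cols)]
            for r in range(rows)]


def erode(img):
    return _apply(img, all)


def dilate(img):
    return _apply(img, any)


def opening(img):
    """Erosion followed by dilation."""
    return dilate(erode(img))


def closing(img):
    """Dilation followed by erosion."""
    return erode(dilate(img))


def parse(rows):
    """Build a binary image from strings of '0' and '1'."""
    return [[int(ch) for ch in row] for row in rows]


# Two squares joined by a one-pixel bridge: opening separates them.
bridged = parse(["000000000",
                 "011101110",
                 "011111110",
                 "011101110",
                 "000000000"])
opened = opening(bridged)

# A square with a one-pixel hole: closing fills it.
ring = parse(["0000000",
              "0000000",
              "0011100",
              "0010100",
              "0011100",
              "0000000",
              "0000000"])
closed = closing(ring)
```

In `opened` the bridge pixel at row 2, column 4 is gone while both squares are intact; in `closed` the hole at row 3, column 3 is filled.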

Spectrogram Filtering
convert -> thresholding -> erosion -> dilation -> ANDing -> convert
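
The middle of this pipeline can be sketched as follows. This is one possible reading, not the presenters' implementation: it assumes the spectrogram is already a 2D grid of magnitudes, that erosion and dilation use a 3x3 square structuring element, and that the ANDing step masks the original spectrogram with the cleaned binary image. The threshold value and sample data are hypothetical, and the two "convert" steps are left out.

```python
def _apply(img, combine):
    """Apply `combine` (all or any) over each pixel's 3x3 neighbourhood."""
    rows, cols = len(img), len(img[0])

    def window(r, c):
        # Pixels outside the image are treated as background (0).
        return [img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
                for rr in (r - 1, r, r + 1) for cc in (c - 1, c, c + 1)]

    return [[int(combine(window(r, c))) for c in range(cols)]
            for r in range(rows)]


def filter_spectrogram(spec, threshold):
    """Threshold, erode, dilate, then mask the original values (assumed ANDing)."""
    binary = [[int(v >= threshold) for v in row] for row in spec]
    eroded = _apply(binary, all)    # removes isolated noisy pixels
    mask = _apply(eroded, any)      # restores the size of surviving regions
    return [[v if m else 0.0 for v, m in zip(row, mask_row)]
            for row, mask_row in zip(spec, mask)]


# Hypothetical magnitudes: a strong 3x3 region plus one isolated noise peak.
spec = [
    [0.1, 0.1, 0.1, 0.1, 0.1, 0.9],
    [0.1, 0.8, 0.9, 0.8, 0.1, 0.1],
    [0.1, 0.9, 1.0, 0.9, 0.1, 0.1],
    [0.1, 0.8, 0.9, 0.8, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
]
filtered = filter_spectrogram(spec, threshold=0.5)
```

The isolated peak in the top-right corner passes the threshold but is removed by the erosion, while the 3x3 region keeps its original magnitudes.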
