Look Whos Talking Now SEM Exchange, Fall 2008

October 9, 2008
1
Montgomery College
Speaker Identification Using Pitch Engineering Expo Banquet 2009
05/08/09
Speaker Identification Using a
Pitch Detection Algorithm

Presenters:
Estefany Carrillo
Roberto M. Meléndez
Komal Syed


Montgomery College
Speech Processing
Center

Faculty Advisor:
Dr. Uchechukwu Abanulo

Presentation Outline

Introduction
Speech Classification Algorithm
Pitch Detection Algorithm
Application and Results
Summary

Objectives

To estimate the pitch contour of a given speech signal using autocorrelation

To determine the effectiveness of pitch for speaker identification


Speech Signals


To understand pitch, one must first understand
some basic concepts of speech signals
[Figure: Energy level waveform of "Shout", 0.8 s at 10 kHz; amplitude (-1 to 1) vs. time (s)]



Voiced vs. Unvoiced Speech
Voiced
Quasi-periodic excitation
Modulation by vocal tract
Production of mainly vowels
High energy

Unvoiced
No periodic vibration of vocal cords
Noise-like nature
Production of most consonants
Low energy



Speech Signals


[Figure: Energy level waveform of "Shout", 0.8 s at 10 kHz; amplitude vs. time (s), in three views: the full signal (0 to 0.8 s), a zoom on 0.34 s to 0.52 s, and a zoom on 0.29 s to 0.34 s]



Pitch Illustration
[Figure: Energy level waveform of "Shout", 0.8 s at 10 kHz, zoomed to 0.34 s to 0.52 s; amplitude vs. time (s)]



Pitch period is the distance in time from one peak to the next
Approximately the same for the same phoneme spoken by the same speaker
[Figure: Energy level waveform of "Shout", 0.8 s at 10 kHz, zoomed to 0.29 s to 0.34 s; amplitude vs. time (s)]



No periodicity, no frequency

How do we measure the pitch period automatically?
Correlation
Measure of similarity between two signals
Two signals are compared by:
Sliding one signal by a certain time lag
Multiplying the overlapping regions sample by sample and adding the products
Repeating the process for each lag until there is no more overlap
Cross-correlation: two different signals are compared
Autocorrelation: a signal is correlated with itself
Results in a maximum peak at zero lag (where we set time = 0); the rest of the autocorrelation tapers off toward zero
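
The slide, multiply, and add steps described on this slide can be sketched in plain Python. This is only an illustrative sketch, not the presenters' implementation; the function names `correlate_at_lag`, `cross_correlation`, and `autocorrelation` and the four-sample test signal are assumptions made for the example.

```python
def correlate_at_lag(x, y, lag):
    """Sum of products of the samples of x and y that overlap when y is shifted by `lag`."""
    # Only index pairs (n, n + lag) that fall inside both signals overlap.
    return sum(x[n] * y[n + lag] for n in range(len(x)) if 0 <= n + lag < len(y))


def cross_correlation(x, y):
    """Correlation of x and y for every non-negative lag that still overlaps."""
    return [correlate_at_lag(x, y, lag) for lag in range(len(y))]


def autocorrelation(x):
    """Correlation of a signal with itself; index 0 is the zero-lag value."""
    return cross_correlation(x, x)


# Hypothetical four-sample signal, chosen so the sums are easy to check by hand.
signal = [1.0, -1.0, 1.0, -1.0]
r = autocorrelation(signal)
print(r)  # the zero-lag value r[0] is the largest in magnitude
```

With fewer overlapping samples at larger lags, the magnitudes shrink, which is the tapering behaviour noted on the slide.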




Rationale for Autocorrelation



1. A periodic (or quasi-periodic) signal will be similar from one period to the next

2. Excluding the zero-lag peak, the maximum peak in the autocorrelation function is expected to occur at a lag equal to the pitch period of each speech frame.
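
A small numerical check of this rationale, assuming a pure sine wave as a stand-in for a voiced frame. The 50-sample period, the 400-sample length, and the `min_lag` cutoff used to skip the zero-lag peak are all hypothetical values chosen for this sketch.

```python
import math


def autocorrelation(x):
    """Autocorrelation of x for lags 0 .. len(x) - 1 (sum of overlapping products)."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(n)]


period = 50  # samples per cycle of the hypothetical test signal
signal = [math.sin(2 * math.pi * i / period) for i in range(400)]
r = autocorrelation(signal)

# Skip the zero-lag peak and the lags right next to it, then take the largest value.
min_lag = 10
best_lag = max(range(min_lag, len(r) // 2), key=lambda lag: r[lag])
print(best_lag)
```

For this signal the largest peak away from zero lag falls at the period; peaks at multiples of the period are lower because fewer samples overlap.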


Speech Classification Algorithm

Speech Classification
1. Start with a normalized speech signal (amplitudes from -1 to 1)

2. Since speech is non-stationary (its characteristics change frequently with time), we first segment this signal into short frames (of about 10 ms)

3. We then compute the average energy of each frame:
[Equation: average energy of a frame]

4. Based on a pre-determined threshold, we classify each frame as voiced, unvoiced, or background
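
The four steps above can be sketched as follows. The energy definition used here (mean of the squared samples in a frame) is an assumption, since the slide's equation is not reproduced in this text. The sketch also uses a single threshold and therefore only two labels, lumping unvoiced and background frames together. Function names, the test signal, and the threshold value are hypothetical.

```python
def frame_signal(samples, frame_len):
    """Split samples into consecutive non-overlapping frames (an incomplete tail is dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def average_energy(frame):
    """Mean of the squared amplitudes in the frame (assumed energy definition)."""
    return sum(s * s for s in frame) / len(frame)


def classify_frames(samples, sample_rate, threshold, frame_ms=10):
    """Label each frame by comparing its average energy with the threshold."""
    frame_len = int(sample_rate * frame_ms / 1000)
    return ["voiced" if average_energy(f) >= threshold else "unvoiced_or_background"
            for f in frame_signal(samples, frame_len)]


sample_rate = 10_000  # 10 kHz, so a 10 ms frame holds 100 samples
loud = [0.5 if i % 2 == 0 else -0.5 for i in range(100)]     # average energy 0.25
quiet = [0.01 if i % 2 == 0 else -0.01 for i in range(100)]  # average energy 0.0001
labels = classify_frames(loud + quiet, sample_rate, threshold=0.005)
print(labels)
```

Separating unvoiced speech from background would need a second threshold or another feature; the slides do not say which was used.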


Pitch Detection Algorithm

Autocorrelation Based PDA
1. First we automatically assign a pitch of zero to every unvoiced or silence frame identified by the speech classification algorithm

2. We then compute the autocorrelation function of each voiced frame

3. We search for the largest peak at lags between 2 ms and 16 ms (that is, a pitch between 62.5 Hz and 500 Hz)

4. The lag of this peak is taken as the pitch period for that frame, and the pitch is computed as the inverse of that lag.
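
A minimal sketch of these four steps for a single frame, assuming the voiced/unvoiced decision is already available as a flag. The function names, the 10 kHz sampling rate, and the synthetic 125 Hz sine frame are assumptions. The example frame is 40 ms long so that lags up to 16 ms fit inside it; that frame length is also an assumption of this sketch.

```python
import math


def autocorrelation(frame, max_lag):
    """Autocorrelation of the frame for lags 0 .. max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_pitch(frame, sample_rate, is_voiced,
                   min_period_s=0.002, max_period_s=0.016):
    """Return the pitch in Hz, or 0.0 for an unvoiced or silence frame."""
    if not is_voiced:
        return 0.0  # step 1: pitch of zero for unvoiced/silence frames
    min_lag = int(round(min_period_s * sample_rate))
    max_lag = min(int(round(max_period_s * sample_rate)), len(frame) - 1)
    r = autocorrelation(frame, max_lag)  # step 2
    # Step 3: largest autocorrelation value inside the 2 ms to 16 ms lag range.
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
    # Step 4: pitch is the inverse of the pitch period (lag / sample_rate).
    return sample_rate / best_lag


sample_rate = 10_000
true_pitch = 125.0  # Hz, i.e. a period of 80 samples at 10 kHz
frame = [math.sin(2 * math.pi * true_pitch * i / sample_rate) for i in range(400)]
pitch = estimate_pitch(frame, sample_rate, is_voiced=True)
print(pitch)
```

Real speech is only quasi-periodic, so a practical detector would likely need extra safeguards (for example against picking a multiple of the true period) that this sketch omits.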


[Figure: Autocorrelation of a frame of speech; amplitude vs. sample number, with annotations "Zero lag" and "Pitch = 0"]

Autocorrelation Based PDA - Illustration
[Figure: Pitch Period of One Frame of Speech; amplitude vs. time (s), with one pitch period of approximately 7.3 ms marked]
[Figure: Autocorrelation of One Frame of Speech; amplitude vs. time (s), with the maximum autocorrelation peak marked]
Estimated pitch period = about 7.3 ms, so pitch = 1/7.3 ms ≈ 137 Hz

Application and Results

Speaker Recognition
[Diagram: Reference Speech → Feature Extraction → Model Building; Test Speech → Feature Extraction → Comparison → Recognition Decision → System Output]

Speaker Identification using PDA
[Diagram: Reference Speech → Pitch Detection → Average Pitch of Signal; each Test Speech → Pitch Detection and average pitch computation → Distance Computation → Speaker = Minimum distance → System Output]

Experiment
Group I: 10 Women
Group II: 10 Men
1. Record each group member twice saying the same phrase
2. Record each group member saying a different phrase

Categories
Case I: Female/Same Phrase
Case II: Male/Same Phrase
Case III: Female/Different Phrase
Case IV: Male/Different Phrase
Case V: Female and Male/Same Phrase
Case VI: Female and Male/Different Phrase

Procedure
1. Select a range of thresholds for unvoiced segments of speech
Range = [0.001:0.0005:0.01] (0.001 to 0.01 in steps of 0.0005)
2. Construct the pitch contour for each of the reference and test speech files for all thresholds
3. Using a minimum distance criterion, determine the test speaker that matches the reference speaker
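
The threshold range and the minimum distance step can be sketched as below. The slides compare speakers by average pitch but do not give the distance formula, so the absolute difference of average pitches used here is an assumption, as is averaging over voiced (non-zero) frames only. The speaker names and pitch contours are made-up numbers for illustration, not experimental data.

```python
# Thresholds from 0.001 to 0.01 in steps of 0.0005 (19 values).
thresholds = [round(0.001 + 0.0005 * k, 4) for k in range(19)]


def average_pitch(pitch_contour):
    """Mean pitch over voiced frames only (frames with pitch 0 are skipped)."""
    voiced = [p for p in pitch_contour if p > 0]
    return sum(voiced) / len(voiced) if voiced else 0.0


def identify_speaker(reference_contour, test_contours):
    """Return the name of the test speaker whose average pitch is closest to the reference."""
    ref = average_pitch(reference_contour)
    return min(test_contours,
               key=lambda name: abs(average_pitch(test_contours[name]) - ref))


# Hypothetical pitch contours in Hz, one value per frame.
reference = [0.0, 210.0, 205.0, 0.0, 215.0]
tests = {
    "speaker_a": [120.0, 0.0, 125.0, 118.0],
    "speaker_b": [0.0, 208.0, 212.0, 0.0],
    "speaker_c": [180.0, 175.0, 0.0, 185.0],
}
match = identify_speaker(reference, tests)
print(match)
```

In the procedure above, this matching would be repeated for the pitch contours produced at each threshold in `thresholds`.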

Pitch Contours
[Figure: Test Speaker and Reference Speaker; speech waveform (amplitude) and pitch contour vs. time]

Pitch Contours
[Figure: Identified Speaker and Matched Test Speaker; speech waveform (amplitude) and pitch contour vs. time]

Best Threshold
[Figure: Accuracy at Different Thresholds; number of matches vs. threshold (1 to 10, ×10^-3)]
3. Select the threshold that gives the maximum number of correctly matched speakers for each category

Noise
4. Add different levels of noise (5 dB to 30 dB) to:
Both reference and test speech files
Only the reference speech file
Only the test speech files
5. Examine the number of matched speakers vs. the signal-to-noise ratio (SNR)
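
One way to add noise at a chosen SNR is sketched below. The slides do not state the noise type or the step between levels, so white Gaussian noise and 5 dB steps are assumptions of this sketch, as are the function names and the sine test signal.

```python
import math
import random


def signal_power(samples):
    """Mean of the squared samples."""
    return sum(s * s for s in samples) / len(samples)


def add_noise(samples, snr_db, rng):
    """Add white Gaussian noise scaled so the result has roughly the requested SNR."""
    noise_power = signal_power(samples) / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


rng = random.Random(0)  # fixed seed so the run is repeatable
clean = [math.sin(2 * math.pi * i / 50) for i in range(5000)]

results = []
for snr_db in range(5, 31, 5):  # 5, 10, ..., 30 dB
    noisy = add_noise(clean, snr_db, rng)
    noise = [n - c for n, c in zip(noisy, clean)]
    measured = 10 * math.log10(signal_power(clean) / signal_power(noise))
    results.append((snr_db, measured))
    print(f"target {snr_db} dB, measured {measured:.1f} dB")
```

The measured SNR differs slightly from the target because the noise power is estimated from a finite number of random samples.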

Female/Same Phrase
[Figure: Number of matches vs. SNR (0 to 30) in three panels: noise added to both reference and test speech files, noise added to the reference file only, and noise added to the test files only]

Male/Same Phrase
[Figure: Number of matches vs. SNR (0 to 30) in three panels: noise added to both reference and test speech files, noise added to the reference file only, and noise added to the test files only]

Female/Different Phrase
[Figure: Number of matches vs. SNR (0 to 30) in three panels: noise added to both reference and test speech files, noise added to the reference speech files only, and noise added to the test speech files only]

Male/Different Phrase
[Figure: Number of matches vs. SNR (0 to 30) in three panels: noise added to both reference and test speech files, noise added to the reference speech file only, and noise added to the test speech files only]

Male and Female/Same Phrase
[Figure: Number of matches vs. SNR (0 to 30) in three panels: noise added to both reference and test speech files, noise added to the reference speech file only, and noise added to the test speech files only]

Male and Female/Different Phrase
[Figure: Number of matches vs. SNR (0 to 30) in three panels: noise added to both reference and test speech files, noise added to the reference speech file only, and noise added to the test speech file only]

Summary

1. Pitch detection algorithms are heavily dependent on speech segmentation accuracy
2. Pitch is somewhat effective as a simple speaker identifier

Results
3. As the signal-to-noise ratio increases, the number of correctly identified speakers increases
4. There seems to be an optimum signal-to-noise ratio that gives the maximum number of correctly matched speakers
