
SPHSC 503 – Speech Signal Processing UW – Summer 2006

Homework 1 – due Wednesday 6/28 in class


Since this is the first homework in this course, you may wonder what to hand in for your
homework. Your homework should include all requested plots, and it should provide answers to
all questions and assignments in the homework. For example, for problem 1.1, you don’t need to
provide anything for part a, but you need to provide a plot for parts b and c, the duration of
the signal in seconds for part d, and estimates at t=0.2 and t=0.4 of the distance in seconds
between peaks (part e, 2 estimates) and of the fundamental frequency (part f, 2 estimates).
You’re free to include any additional information, such as Matlab commands, supporting plots
and comments, but those are not required.

You may either bring a hard copy of your homework to class on Wednesday, or email
me your homework. For the latter option, I recommend collecting all your plots and answers in a
single word-processing document, such as a Microsoft Word file.

Problem 1.1 – Loading and analyzing a speech signal


Download the file ex1_3.wav from the class website, and save it in a folder on your computer
(if you’re in SAV 137, the suggested location for the file is C:\Temp\SPHSC503\ex1_3.wav).
The file contains the spoken word “zero” sampled at 10 kHz. Change the current directory in
Matlab to the directory that contains the saved file.

a. Load the ex1_3.wav file into Matlab. You can either use Matlab’s Import Wizard, by
double-clicking on the filename in Matlab’s current directory window, or use the
wavread command (see help wavread for details).

b. Plot the speech signal against its index (n) by using the interactive tools or by using the
plot command. Label the axes and the plot appropriately.

c. Plot the speech signal as a function of time, and label the axes and the plot appropriately.
Hint: you need to create a time vector t, and then plot the signal with “plot(t,y)”.
You can create the correct time vector by dividing the speech signal’s index vector by the
sampling frequency of the signal, for example:
t = n / fs;

d. What is the duration of the signal in seconds? It may be helpful to use the figure’s zoom,
pan and data cursor tools.
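
The relation between sample index, time vector and duration used in parts b through d can be tried out on a synthetic signal first. The sketch below is only an illustration: the 120 Hz tone and the length of 6000 samples are assumptions made for this example and say nothing about the contents of ex1_3.wav.

```matlab
% Synthetic stand-in for the speech signal (assumption: NOT the real ex1_3.wav).
% With the real file, y and fs would come from wavread as described in part a
% (newer Matlab releases provide audioread for the same purpose).
fs = 10000;                  % sampling frequency in Hz, as stated for ex1_3.wav
N  = 6000;                   % hypothetical number of samples
n  = (1:N)';                 % index vector, as used in part b
y  = sin(2*pi*120*n/fs);     % hypothetical 120 Hz tone in place of speech

t = n / fs;                  % time vector in seconds, as in the hint for part c
duration = N / fs;           % duration in seconds: number of samples over fs

plot(t, y);
xlabel('Time (s)'); ylabel('Amplitude');
title('Synthetic test signal');
fprintf('Duration: %.3f s\n', duration);
```

Because Matlab indices start at 1, this time vector starts at 1/fs rather than at 0; using (0:N-1)'/fs instead would shift the axis by one sample.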

Voiced speech, such as vowels, is characterized by a series of high-energy peaks in the
speech signal. Those peaks are created by the repeated opening and closing of the vocal
cords.

e. Make a rough estimate of the distance in seconds between the peaks in the “zero” speech
signal around t=0.2 and t=0.4 seconds, corresponding to the two vowels. Again, it may be
helpful to use the figure’s zoom, pan and data cursor tools.

f. Convert the measured distances in seconds from part e into an estimate of the
fundamental frequency (in Hz) of the speech signal around t=0.2 and t=0.4 seconds.
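
The conversion asked for in part f is a reciprocal: the fundamental frequency is one over the distance between neighboring peaks. A minimal sketch, in which the two peak times are made-up data-cursor readings and not measurements from ex1_3.wav:

```matlab
% Hypothetical data-cursor readings of two neighboring peaks
% (assumption: illustrative values only, not taken from ex1_3.wav)
t_peak1 = 0.2000;            % time of one peak in seconds
t_peak2 = 0.2080;            % time of the next peak in seconds

T0 = t_peak2 - t_peak1;      % distance between the peaks in seconds
f0 = 1 / T0;                 % fundamental frequency in Hz
fprintf('T0 = %.4f s, f0 = %.1f Hz\n', T0, f0);
```

Measuring the distance across several peaks and dividing by the number of intervals usually gives a steadier estimate than a single pair.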


Problem 1.2 – Measuring fundamental frequency with correlation


In problem 1.1f, you’ve manually found an estimate for the fundamental frequency of the speech
signal. In this problem, we will use a technique called correlation to estimate the fundamental
frequency automatically. Correlation is a measure of the degree to which two sequences are
similar. It is related to convolution, and its mathematical expression looks like the convolution
sum. There are two kinds of correlation: auto-correlation (correlation between a signal and itself)
and cross-correlation (correlation between two different signals). They are defined as follows:

Auto-correlation: $r_{xx}[l] = \sum_{n=-\infty}^{\infty} x[n]\, x[n-l]$

Cross-correlation: $r_{xy}[l] = \sum_{n=-\infty}^{\infty} x[n]\, y[n-l]$
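
For finite sequences, the two sums can be evaluated directly by treating every sample outside the sequence as zero. The sketch below does this with plain loops and no toolbox functions; the variable names and the choice of test sequences are assumptions made for illustration.

```matlab
% Direct evaluation of the correlation sums for short sequences.
% Samples outside 1..numel(x) (or 1..numel(y)) are treated as zero.
x = [1 1 1];  y = [-1 0 1];  lmax = 3;
l = -lmax:lmax;                        % lags to evaluate
rxx = zeros(size(l));                  % auto-correlation of x
rxy = zeros(size(l));                  % cross-correlation of x and y
for k = 1:numel(l)
    for n = 1:numel(x)
        m = n - l(k);                  % index of the shifted sample, n - l
        if m >= 1 && m <= numel(x)
            rxx(k) = rxx(k) + x(n) * x(m);
        end
        if m >= 1 && m <= numel(y)
            rxy(k) = rxy(k) + x(n) * y(m);
        end
    end
end
disp([l; rxx; rxy]);                   % rows: lag, rxx, rxy
```

The double loop is meant for reading rather than speed; for long signals a library routine is the practical choice.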

In Matlab, both types of correlation can be computed using the xcorr function from the Signal
Processing Toolbox. For example,

>> x = [1 1 1]; lmax = 3;
>> [rxx,l] = xcorr(x,lmax);
>> stem(l,rxx); % a triangle

computes and plots the auto-correlation of x, $r_{xx}[l]$, for l=-lmax,…,lmax. Similarly,

>> x = [1 1 1]; y = [-1 0 1]; lmax = 3;
>> [rxy,l] = xcorr(x,y,lmax);
>> stem(l,rxy); % 2 up, 2 down

computes and plots the cross-correlation of x and y, $r_{xy}[l]$, for l=-lmax,…,lmax.

a. Clear the workspace (clear), load the speech signal from problem 1.1 again, and extract
a voiced section of the speech signal using yvoiced1 = y(1900:2300); . Plot the
voiced section against its index, nvoiced1 = 1900:2300; .

b. Compute and plot the autocorrelation of the voiced section for lmax=250. You can
modify the example code above to do this. Label your x-axis as ‘Lag (in samples)’.

Notice two things in your plot of the autocorrelation. First, the highest peak occurs at
zero lag (l=0); this is a property of every autocorrelation. Second, the autocorrelation
has strong positive peaks at equal distances to the left and right of the zero-lag point.

c. Determine the value of the lag for the next strongest peak to the left and right. The zoom
tools of the plot window may be helpful for this.

d. Divide the value of the lag you found in part c by the sampling frequency to get the value
of the lag in seconds.

e. The value of the lag in seconds should correspond more or less to the distance between
the peaks you found in problem 1.1e for t=0.2. Is that the case for you? Convert the value
of the lag in seconds to a frequency in Hz. This value should correspond to the
fundamental frequency from problem 1.1f for t=0.2.

f. Repeat a-e for the second voiced section around t=0.4, nvoiced2 = 3800:4200;.
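
Steps b through e can be rehearsed on a signal whose period is known in advance. The sketch below is built on assumptions: the "voiced" frame is a synthetic pulse train with a period of 80 samples passed through a simple decaying filter, the autocorrelation is computed with conv so that the example runs without the Signal Processing Toolbox (for a real-valued signal, xcorr(x,lmax) should give the same values up to rounding), and the minimum lag lmin is a hypothetical choice.

```matlab
% Synthetic "voiced" frame (assumption: NOT taken from ex1_3.wav)
fs = 10000;                            % sampling frequency in Hz
p  = zeros(401, 1);                    % 401 samples, same length as 1900:2300
p(1:80:end) = 1;                       % one pulse every 80 samples
x  = filter(1, [1 -0.9], p);           % each pulse decays exponentially

% Autocorrelation for lags -lmax..lmax
lmax   = 250;
N      = numel(x);
r_full = conv(x, flipud(x));           % lags -(N-1)..(N-1), zero lag at index N
l      = (-lmax:lmax)';
rxx    = r_full(N + l);

% Strongest peak to the right of zero lag, skipping very small lags
lmin = 20;                             % hypothetical: ignore lags below 2 ms
pos  = find(l >= lmin);
[~, k] = max(rxx(pos));
lag_samples = l(pos(k));               % lag of the peak in samples (part c)
lag_seconds = lag_samples / fs;        % lag in seconds (part d)
f0_est      = 1 / lag_seconds;         % fundamental frequency in Hz (part e)

stem(l, rxx); xlabel('Lag (in samples)'); ylabel('Autocorrelation');
fprintf('lag = %d samples = %.4f s, f0 = %.1f Hz\n', lag_samples, lag_seconds, f0_est);
```

The search skips small lags because the autocorrelation is always largest at and near zero lag; without that restriction the maximum would simply be found at l=0.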


Problem 1.3 – Measuring fundamental frequency with a pitch estimator


It is possible to fully automate the estimation of the fundamental frequency of a speech signal
with the methods used in problems 1.1 and 1.2. In this problem, we will use a pitch estimator to
estimate the pitch of the entire speech signal.

a. Download the file pitchestimate.m from the class website. This m-file contains the
Matlab function pitchestimate, which can be used as follows

% y and fs are the input signal and its sampling frequency
>> [t,p]=pitchestimate(y,fs);
% t contains the times at which the fundamental frequency was estimated
% p contains the estimates of the fundamental frequency
>> plot(t,p)

See help pitchestimate for details.

b. Clear the workspace, load the speech signal again, and estimate and plot its fundamental
frequency using the pitchestimate function.

c. [optional] If you have access to a microphone, it could be interesting to record your own
voice and determine your own fundamental frequency.
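
To give an idea of how the autocorrelation method from problem 1.2 could be automated, the sketch below applies it frame by frame to a synthetic signal. It is not the contents of pitchestimate.m, whose method is not described here; the frame length, hop size, lag range and test signal are all hypothetical choices made for this illustration.

```matlab
% Synthetic test signal (assumption): a pulse train whose period changes halfway
fs = 10000;                            % sampling frequency in Hz
p  = zeros(6000, 1);
p(1:80:3000)    = 1;                   % period of 80 samples (125 Hz) in the first half
p(3001:100:end) = 1;                   % period of 100 samples (100 Hz) in the second half
y  = filter(1, [1 -0.9], p);           % each pulse decays exponentially

frameLen = 400;                        % 40 ms analysis frames (hypothetical)
hop      = 200;                        % 20 ms between frame starts (hypothetical)
lmin     = 20;                         % shortest lag searched: 2 ms (500 Hz)
lmax     = 250;                        % longest lag searched: 25 ms (40 Hz)

starts = 1:hop:(numel(y) - frameLen + 1);
t  = zeros(numel(starts), 1);          % frame center times in seconds
f0 = zeros(numel(starts), 1);          % fundamental frequency estimates in Hz
for i = 1:numel(starts)
    frame = y(starts(i) : starts(i) + frameLen - 1);
    r     = conv(frame, flipud(frame));        % zero lag at index frameLen
    rpos  = r(frameLen + (lmin:lmax));         % autocorrelation at lags lmin..lmax
    [~, k] = max(rpos);                        % strongest peak in the lag range
    f0(i) = fs / (lmin + k - 1);               % convert peak lag to Hz
    t(i)  = (starts(i) - 1 + frameLen/2) / fs;
end

plot(t, f0, 'o-');
xlabel('Time (s)'); ylabel('Estimated fundamental frequency (Hz)');
```

This sketch returns an estimate for every frame, including frames that would be silent or unvoiced in real speech; a practical estimator would also need some decision about whether a frame is voiced at all.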
