
Enhancement of Speech in Noisy Conditions

Final Year Project 2008/2009

Progress Report
Paul Coffey 05678536 Final Year Electronic Engineering

Project Supervisor Dr. Edward Jones

National University of Ireland, Galway

January 2009
Table of Contents:
Chapter 1 Introduction
1.1 Project Specification
1.2 Project Aims
Chapter 2 Speech Enhancement Techniques
2.1 Spectral Subtraction
2.2 Wiener Filtering
2.3 Matlab
Chapter 3 Progress to Date
3.1 Spectral Subtraction
3.1.1 Getting Started
3.1.2 Working in Matlab
3.2 Wiener Filtering
3.3 Project Website
Chapter 4 Work To Be Completed
4.1 Wiener Filtering
4.2 Testing
4.3 Translating to C
4.4 Extend Spectral Subtraction
References


Chapter 1 Introduction
1.1 Project Specification

The overall objective of this project is to implement and compare a number of techniques for the enhancement of speech that has been degraded by noise. Such degradation takes place all around us, for example in mobile environments where the user is in a public place such as a busy street, or in a hands-free situation in a car. Although the human auditory system is very robust, allowing us to hold a conversation even in a noisy atmosphere, high levels of noise can significantly reduce the intelligibility of the degraded speech. Enhancement of speech through noise reduction is therefore often a critical part of speech communication systems. Speech enhancement is also sometimes used to pre-process speech for computer speech recognition systems, since such systems often perform poorly with noisy speech [5].

A number of speech enhancement techniques will be examined in this project. These include a very common technique called Spectral Subtraction, along with Wiener Filtering. The project will also look into extending Spectral Subtraction to include aspects of the operation of the human auditory system, since human hearing is well known to be robust in noise; it allows us, for example, to hear a train announcement in a busy station.

The evaluation of each technique is a very important part of this project. It will be carried out using objective measures such as the Signal to Noise Ratio, and also using subjective tests in which people listen to samples of noisy speech processed by the different techniques and judge whether, and how well, each technique works. Matlab will be used to carry out the simulation and evaluation of the different techniques.

C programming will be used for the implementation of Spectral Subtraction. A speech acquisition circuit will also be used with the C program to carry out a real-time implementation of Spectral Subtraction.

1.2 Project Aims


The main aims of the project are as follows:
- Examine the Spectral Subtraction speech enhancement technique and simulate it in Matlab. Carry out preliminary testing of the algorithm on simple speech samples using objective methods and subjective methods (fellow students listening to samples of speech and judging them). The code must be flexible enough to be tested with different parameters.
- Examine the Wiener Filtering method and simulate it in Matlab. Test the code using objective and subjective methods.
- Extend the Wiener Filter method to simulate iterative Wiener filtering, and evaluate the operation of the algorithm.
- Thoroughly evaluate the algorithms using different noise types, such as white, aircraft and car noise, with subjective and objective methods. This will include listening tests with as many test listeners as possible, mainly classmates. A suitable testing framework will be developed to make the tests as easy as possible for the listeners.


Chapter 2 Speech Enhancement Techniques

Speech enhancement is a very important part of today's world. Noise from cars, trains, planes or even just crowds is present almost everywhere we go, and speech enhancement is used in a number of applications to reduce its effect. Speech enhancement techniques therefore benefit a wide range of applications such as mobile phones, hands-free phones and speech recognition services. There are many different techniques for enhancing the quality of speech for the listener. The two discussed in this project are Spectral Subtraction and Wiener Filtering.

2.1 What is Spectral Subtraction?


Spectral Subtraction is a method for enhancing the quality of speech that has been degraded by additive noise. The method assumes that the speech and noise signals are uncorrelated and that the noise is stationary [1]. Because the noise is assumed to be stationary, it can be estimated from the pauses in the speech signal, since speech in general contains many pauses, such as between words or before the next person talks. Noisy or degraded speech can be expressed as

$$s(n) = x(n) + p(n),$$

where $s(n)$ is the noisy speech, $x(n)$ is the clean speech and $p(n)$ is the noise. This equation is in the time domain; Spectral Subtraction operates in the frequency domain, where the equation can be rearranged as

$$X(f) = S(f) - P(f).$$

Since the speech and noise are assumed to be uncorrelated, the cross terms vanish on average and the power spectra can be written as

$$|X(f)|^2 = |S(f)|^2 - |P(f)|^2.$$

Generalising the exponent gives

$$|X(f)|^a = |S(f)|^a - |P(f)|^a,$$

where $a = 1$ corresponds to the magnitude spectrum and $a = 2$ to the power spectrum. From this, a general equation for the estimated speech can be expressed as

$$\hat{X}(f) = \left( \max\left( |S(f)|^a - q\,|P(f)|^a,\ 0 \right) \right)^{1/a} e^{j\theta_S(f)},$$

where $\theta_S(f)$ is the phase of the noisy speech and $q > 1$ is used to overestimate the noise. The term $|S(f)|^a - q\,|P(f)|^a$ is limited to non-negative values because the overestimated noise could be larger than the signal itself [4].
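The generalised subtraction rule can be illustrated on a handful of frequency bins. The following is only a minimal sketch, not the project code: the helper name `specsub` and all numeric values are assumptions chosen for illustration, and the noise magnitudes are assumed to be known already.

```matlab
% Generalised spectral subtraction on a few frequency bins.
% a = 1 works on magnitudes, a = 2 on power; q > 1 overestimates the noise.
% specsub is a hypothetical helper name, not part of the project code.
specsub = @(MagS, MagP, a, q) max(MagS.^a - q .* MagP.^a, 0) .^ (1/a);

MagS   = [4; 3; 1; 0.5];        % hypothetical noisy-speech magnitudes |S(f)|
MagP   = [1; 1; 1; 1];          % hypothetical noise magnitudes |P(f)|
PhaseS = [0; pi/4; pi/2; pi];   % phase of the noisy speech, reused unchanged

MagX = specsub(MagS, MagP, 1, 1.2);   % magnitude subtraction with q = 1.2
Xhat = MagX .* exp(1i .* PhaseS);     % reattach the noisy phase
PowX = specsub(MagS, MagP, 2, 1.2);   % same bins using power subtraction

disp([MagX PowX]);
```

Bins in which the scaled noise exceeds the noisy magnitude (the last two here) come out as zero, which is the clamping described above.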

Noisy Signal -> Analysis Window -> Fourier Transform -> Spectral Modification -> Inverse Fourier Transform -> Synthesis Window -> Enhanced Signal

Figure 1: Block diagram of the Spectral Subtraction enhancement method [1].
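The spectral modification stage needs an estimate of the noise spectrum. One way this might be obtained from a speech pause is sketched below, under the assumption that the first few frames of the recording contain noise only. The stand-in signal `noisy`, the frame count `nNoiseFrames` and the variable names are hypothetical and are not taken from the project code.

```matlab
% Estimate the noise magnitude spectrum by averaging noise-only frames.
N = 128;                                  % frame length in samples
overlap = N/2;                            % 50% frame overlap
nNoiseFrames = 8;                         % ASSUMPTION: leading frames hold no speech
win = 0.5*(1 - cos(2*pi*(1:N)'/(N+1)));   % same values as hanning(N)

noisy = 0.1*randn(4000, 1);               % stand-in recording (noise only)

MagP = zeros(N, 1);
for m = 0:nNoiseFrames-1
    k = m*overlap + 1;                    % first sample of frame m
    frame = noisy(k:k+N-1) .* win;        % window the frame as in the analysis stage
    MagP = MagP + abs(fft(frame));        % accumulate the magnitude spectra
end
MagP = MagP / nNoiseFrames;               % average over the noise-only frames
```

Averaging over several frames smooths the estimate, which relies on the stationarity assumption stated earlier.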


2.2 What is Wiener Filtering?


The Wiener filter is similar to Spectral Subtraction in the way it is derived, and it attempts to minimise the mean-square error in the frequency domain [4]. Wiener Filtering was proposed by Norbert Wiener in the 1940s and published in 1949. It is typically used in the estimation or prediction of a signal observed in noise. The filter can be used to enhance the quality of speech by removing unwanted noise, and it can also be used to de-blur distorted images. It was also used to predict the trajectory of projectiles during World War 2. For this project, only its use for filtering the noise out of a noisy speech signal is considered [5]. A noisy signal $s(n)$ can be expressed as

$$s(n) = x(n) + y(n),$$

where $x(n)$ is the clean original signal and $y(n)$ is the additive noise. In the frequency domain this becomes

$$S(f) = X(f) + Y(f),$$

where $X(f)$ is the signal spectrum and $Y(f)$ is the noise spectrum. The Wiener filter is written as

$$W(f) = \frac{P_{XX}(f)}{P_{XX}(f) + P_{YY}(f)},$$

where $P_{XX}(f)$ is the signal power spectrum and $P_{YY}(f)$ is the noise power spectrum. Dividing top and bottom by $P_{YY}(f)$ and letting $SNR(f) = P_{XX}(f) / P_{YY}(f)$ gives

$$W(f) = \frac{SNR(f)}{SNR(f) + 1}.$$

This equation gives an important insight into how noise reduction systems work: they use a function of the estimated SNR to scale the spectral amplitudes of signals corrupted by noise [5]. Where the SNR is high, $W(f)$ is close to 1 and the component passes almost unchanged; where the SNR is low, $W(f)$ is close to 0 and the component is attenuated.
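The behaviour of the Wiener gain can be checked numerically. The sketch below is illustrative only: the power spectrum values and the noisy spectrum `S` are made-up numbers, and in practice $P_{XX}(f)$ and $P_{YY}(f)$ would have to be estimated from the noisy signal.

```matlab
% Wiener gain computed from the per-bin SNR (all values are hypothetical).
Pxx = [10; 1; 0.1; 0];             % assumed speech power spectrum P_XX(f)
Pyy = [1; 1; 1; 1];                % assumed noise power spectrum P_YY(f)

SNR = Pxx ./ Pyy;                  % per-bin signal-to-noise ratio
W   = SNR ./ (SNR + 1);            % identical to Pxx ./ (Pxx + Pyy)

S    = [2+1i; 1-1i; 0.5; 0.2i];    % assumed noisy spectrum S(f)
Xhat = W .* S;                     % scale each bin according to its SNR

disp([SNR W]);
```

The gain falls from about 0.91 at an SNR of 10 to 0.5 at an SNR of 1, and to 0 in the bin that contains no speech power.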



Noisy Signal -> Speech and noise power spectrum estimation, $P_{XX}(f)$ and $P_{YY}(f)$ -> Wiener Filtering, $W(f) = P_{XX}(f) / (P_{XX}(f) + P_{YY}(f))$ -> Enhanced Signal

Figure 2: Block diagram of Wiener Filtering for speech enhancement [5].


2.3 What is Matlab?

Matlab [2] is an interactive environment and high-level programming language that allows computationally intensive tasks to be implemented faster than with traditional programming languages such as C, C++ and FORTRAN. It offers easy matrix manipulation, plotting of data and functions, implementation of algorithms and creation of user interfaces, and it can also interface with other languages [3]. This makes Matlab a very useful tool for this project.

Matlab is where the majority of the work in this project will be carried out. The different algorithms will be simulated and evaluated within Matlab. It is very useful for this project since it can plot the outcomes, showing on a graph how well the different speech enhancement techniques are working. Matlab can also play audio, so listening tests can be carried out using it.


Chapter 3 Progress to Date


This chapter details the work completed on this project to date. It can be divided into three main categories:
- Spectral Subtraction
- Wiener Filtering
- Project Website

3.1 Spectral Subtraction


3.1.1 Getting Started

To start off this project, as much relevant information about Spectral Subtraction as possible was gathered. This took some time, as there is a great deal of information on Spectral Subtraction available. The project supervisor, Dr. Edward Jones, helped to get the project started by providing a couple of very useful articles. Reading these articles made it clear what was expected from this project. Using them as a basis, many more books and online sources on speech enhancement using Spectral Subtraction were consulted to gain a clear understanding of the method. Many sources had to be ruled out, either because they were not directly relevant to this project or because they were very similar to articles already gathered.

When sufficient information had been gathered on Spectral Subtraction and how it works, work began in Matlab to simulate and evaluate it. Matlab was found to be quite difficult to use at first. It was very time consuming, since there was no previous experience of coding in Matlab and the work started from scratch. Even though a lot of information about Matlab is available, it takes some time to get used to its features, because much of that information is hard to understand and seemed very complex for the purpose of this project. With the help of the project supervisor and online resources, development of the Spectral Subtraction algorithm began after a short period of time.

3.1.2 Working in Matlab


The code started off in Matlab by using the Matlab function fft to get the Fourier Transform of the signal. From this, the magnitude and phase of the signal were calculated using the Matlab functions abs and angle. To get used to Matlab, something simple, namely doubling the magnitude of the signal, was tried before attempting Spectral Subtraction itself. Here is the code, followed by a graph of the result:
% This is used to read in the signal
infile = fopen('G:\FYP\C0R0.END','rb');
digit = fread(infile,'short');
fclose(infile);

x = digit(1501:1628);   % Here the signal of 128 samples is being used
X = fft(x);             % This is the FFT of the samples
MagX = abs(X);          % This is the Magnitude of the samples
PhaseX = angle(X);      % This is the Phase of the samples

% Here I did something trivial with the magnitude like doubling it
MagY = 2*MagX;

% This generates the FT of the output from the modified magnitude and
% original Phase
Y = MagY.*exp(sqrt(-1).*PhaseX);

% ifft (Inverse FT function) to get the time-domain signal again
y = real(ifft(Y));

% Plots the results
plot(x)
hold on;
grid on;
plot(y,'r');


Figure 3: The original sample in blue and the modified sample in red.

In Figure 3 the modified sample can be seen to be around twice as large as the original sample. The above code was then developed further to analyse and resynthesise the entire signal instead of only samples 1501 to 1628. For this the signal had to be split into frames 128 samples long, with a for loop going through each frame individually. It was also necessary to overlap the frames in order to obtain a smooth, continuous signal; in this case consecutive frames overlap by 50%. Each frame also needs to be multiplied by a Hanning window before its FFT is taken. Here is the previous code adjusted to implement these items, followed by a graph of it working:
% some constants being used
nsamp = length(digit);
N = 128;
overlap = N/2;
win = hanning(N);        % Hanning window
u = zeros(nsamp,1);      % Initialise u vector
nn = randn(N,1);         % Random noise

for k = 1:overlap:nsamp-N,   % For loop to go through each frame
    a = digit(k:k+N-1);      % Split into frames
    b = (a + (100*nn));      % add random noise
    x = (b.*win);            % multiply signal by hanning window

    % get the signal spectrum
    X = fft(x);
    MagX = abs(X);
    PhaseX = angle(X);

    % do something to the magnitude, in this case multiply by 2
    MagY = 2*MagX;

    % ifft (Inverse FT function) to get the time-domain signal again
    Y = MagY.*exp(sqrt(-1).*PhaseX);
    y = real(ifft(Y));

    % overlap-add on output
    u(k:k+N-1) = u(k:k+N-1) + y;
end;

% Plots results onto graph
figure(1);
subplot(2,1,1), plot(digit); grid on;
subplot(2,1,2), plot(u,'r'); grid on;


Figure 4: Original signal on top (blue) and modified signal on the bottom (red).

In Figure 4 the code is still doing the same as in the previous example (multiplying the magnitude by 2), but it now loops through the entire signal or utterance (Zero).
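The 50% overlap and the Hanning window work together: when the windowed frames are overlap-added, the window contributions sum to roughly 1, so the resynthesised signal keeps its level apart from the deliberate doubling. The sketch below checks this numerically. It is an illustration rather than project code; the window is written out explicitly so that it runs without any toolbox, and the signal length `nsamp` is an arbitrary assumption.

```matlab
% Check that 50%-overlapped Hanning windows add up to roughly unity gain.
N = 128;
overlap = N/2;
win = 0.5*(1 - cos(2*pi*(1:N)'/(N+1)));   % same values as hanning(N)

nsamp = 10*N;                             % arbitrary length for the check
wsum = zeros(nsamp, 1);
for k = 1:overlap:nsamp-N+1
    wsum(k:k+N-1) = wsum(k:k+N-1) + win;  % overlap-add the window itself
end

steady = wsum(overlap+1:nsamp-overlap);   % samples covered by two frames
ripple = max(abs(steady - 1));
fprintf('Maximum deviation from unity gain: %.4f\n', ripple);
```

Only the first and last half-frames, which are covered by a single window, fall outside this range; elsewhere the deviation stays within a few percent.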

Further development of the code was needed to implement Spectral Subtraction. To do this it is necessary to obtain the noise spectrum along with the signal spectrum, and then subtract the noise from the signal to get the cleaned speech signal. However, a number of other steps are necessary to make this work properly. Sometimes the estimated noise is larger than the current signal, which gives a negative magnitude. This would lead to poor sound quality, so the result is limited to positive values, which also reduces musical noise [4]. It was also necessary to keep the code flexible so that a range of values could be tested for the different parameters: a (to switch between magnitude and power spectrum), q (used to overestimate the noise) and D (the positive value that replaces negative spectrum values). The code was also set up to play the signal, so that people can listen to the output of Spectral Subtraction and judge how well it is or isn't working. Here is the code, followed by graphs showing it working:
infile = fopen('G:\FYP\MESS10.8K','rb');
digit = fread(infile,'short');
fclose(infile);

% some constants
nsamp = length(digit);
N = 128;
overlap = N/2;
w = 4;                              % Pause delay

% Plays original clean sample
soundsc(digit,8000);
pause(w);

win = hanning(N);                   % Initialise Hanning window
digit = digit ./ max(abs(digit));   % Normalization of sample vector
noise = 0.1*randn(nsamp,1);         % Noise
digit2 = digit + noise;             % Noisy signal, added noise

% Plays Noisy sample
soundsc(digit2,8000);

% loops to go through range of different variables
for a = (1:2)                   % Used to change between magnitude and power spectrum
  for q = (1:0.2:1.4)           % Used to overestimate the noise
    for D = (0.06:0.02:0.12)    % Used to set negative spectrum values to positive ones
      u = zeros(nsamp,1);       % Reset the output vector for each parameter set

      % For loop to go through all the frames of 128 samples
      for k = 1:overlap:nsamp-N,
        % e is a frame of size 128 samples from k to k+N-1
        e = digit2(k:k+N-1);
        x = (e.*win);           % multiply signal by hanning window

        % get the signal spectrum
        X = fft(x);
        MagX = abs(X).^a;
        PhaseX = angle(X);

        % get the noise spectrum (the added noise is known here and is
        % windowed in the same way as the signal frame)
        noisevec = noise(k:k+N-1).*win;
        NN = fft(noisevec);
        MagNN = q.*abs(NN).^a;

        % subtract the noise spectrum from the signal spectrum
        MagY = MagX - MagNN;

        % find samples less than zero then set them to D
        z = find(MagY < 0);
        MagY(z) = D;

        % undo the exponent a, then ifft to get the time-domain signal again
        Y = (MagY.^(1/a)).*exp(sqrt(-1).*PhaseX);
        y = real(ifft(Y));

        % overlap-add on output
        u(k:k+N-1) = u(k:k+N-1) + y;
      end;

      % play back of filtered sound
      pause(w);
      soundsc(u,8000);
    end;
  end;
end;
disp('Finished')   % Simply displays finished when the code runs through


Figure 5: Clean original speech.

Figure 6: Noisy signal to be filtered.

Figure 7: Filtered speech.

Figure 8: The clean signal (red), noisy signal (blue) and filtered signal (green), for easier comparison.

It can be clearly seen from Figures 5 to 8 that the Spectral Subtraction code is working: it filters out a lot of the noise added to the original clean speech. The code was tested with many different parameter values to see how well it did or didn't work. This section is not yet complete, however, as more testing is necessary, such as listening tests by fellow students. These have not been carried out yet because it is planned to finish Wiener Filtering first and then test both methods at once in order to speed up the testing process.


3.2 Wiener Filtering

Work on Wiener Filtering started in much the same way as for Spectral Subtraction. The project supervisor provided a useful article to get started with this section of the project. From there, relevant information was gathered from a number of further articles. Research on Wiener Filtering is ongoing. Work has started on writing the code for Wiener Filtering in Matlab, and it is hoped to have it up and running very soon.

3.3 Project Website

A website was also set up for this project. It contains the project abstract and the list of goals for the project. It is being kept up to date with the latest work being carried out and has links to sites with more information regarding this project. The site can be found at http://ohm.nuigalway.ie/0809/pcoffey/ .


Chapter 4 Work To Be Completed


The work still to be completed in this project is detailed below:

4.1 Wiener Filtering


As stated at the end of the previous chapter, development of the code for Wiener Filtering is in progress, and it is hoped to have it set up and running very soon.

4.2 Testing
When Wiener Filtering is fully completed, a thorough evaluation of Spectral Subtraction and Wiener Filtering will be carried out using a wide range of noise types, such as white and aircraft noise. A number of subjective tests will be carried out with many of the students in the class. A suitable framework will need to be developed so that these tests run as easily and smoothly as possible.
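Alongside the listening tests, an objective measure such as the Signal to Noise Ratio mentioned in the project specification could be computed when the clean reference is available. The following is a minimal sketch under that assumption. The helper name `snr_db` and the deterministic test signals are hypothetical stand-ins for a real utterance, real noise and a real enhanced output.

```matlab
% SNR in dB of a test signal measured against the clean reference.
% snr_db is a hypothetical helper name, not part of the project code.
snr_db = @(clean, test) 10*log10(sum(clean.^2) / sum((clean - test).^2));

n = (0:7999)';                                    % one second at 8 kHz
clean    = sin(2*pi*440*n/8000);                  % stand-in for clean speech
noisy    = clean + 0.10*cos(2*pi*1000*n/8000);    % stand-in for noisy speech
enhanced = clean + 0.05*cos(2*pi*1000*n/8000);    % stand-in for enhanced speech

snrBefore = snr_db(clean, noisy);
snrAfter  = snr_db(clean, enhanced);
fprintf('SNR before: %.1f dB, after: %.1f dB\n', snrBefore, snrAfter);
```

In this toy case the residual noise amplitude is halved, so the SNR improves by about 6 dB. With real recordings the same function would be applied to the clean file, the noisy file and the output of each enhancement technique.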

4.3 Translating to C
The Spectral Subtraction method developed will be translated to C and tested. A circuit for speech acquisition will have to be constructed and then interfaced to a PC, possibly using a NIDAQ card. The complete system will then be tested to make sure it is operating correctly.

4.4 Extend Spectral Subtraction


The Spectral Subtraction method will be extended to include elements of the auditory system. Masking is the particular behaviour that will be focused on. This will be simulated and evaluated in Matlab.

References:
[1]: http://www.assta.org/sst/2006/sst2006-45.pdf
[2]: http://www.mathworks.com
[3]: http://en.wikipedia.org/wiki/MATLAB
[4]: http://www-mmsp.ece.mcgill.ca/MMSP/Theses/2001/ThiemannT2001.pdf
[5]: Saeed V. Vaseghi, Advanced Digital Signal Processing and Noise Reduction
