
R.V. COLLEGE OF ENGINEERING, Bangalore - 560059
(Autonomous Institution Affiliated to VTU, Belgaum)

SOUND SOURCE LOCALIZATION USING LabVIEW


PROJECT REPORT
2011-12

Submitted by
1. JAGRITI R (1RV08IT058)
2. SHREE VARDHAN SARAF (1RV08IT061)

Under the Guidance of

Mr. HARSHA HERLE
Assistant Professor
Department of Instrumentation Technology, RVCE

In partial fulfillment for the award of the degree of

Bachelor of Engineering
in

INSTRUMENTATION TECHNOLOGY

R.V. COLLEGE OF ENGINEERING, BANGALORE - 560059
(Autonomous Institution Affiliated to VTU, Belgaum)
DEPARTMENT OF INSTRUMENTATION TECHNOLOGY

CERTIFICATE
Certified that the project work titled "Sound Source Localization using LabVIEW" is carried out by Jagriti R (1RV08IT058) and Shree Vardhan Saraf (1RV08IT061), who are bonafide students of R.V. College of Engineering, Bangalore, in partial fulfillment for the award of the degree of Bachelor of Engineering in Instrumentation Technology of the Visvesvaraya Technological University, Belgaum, during the year 2011-2012. It is certified that all corrections/suggestions indicated for the Internal Assessment have been incorporated in the report deposited in the departmental library. The project report has been approved as it satisfies the academic requirements in respect of project work prescribed by the institution for the said degree.

Signature of Guide:
Signature of Head of Department:
Signature of Principal:

External Viva

Name of Examiners            Signature with date
1.
2.

R.V. COLLEGE OF ENGINEERING, BANGALORE - 560059 (Autonomous Institution Affiliated to VTU, Belgaum) DEPARTMENT OF INSTRUMENTATION TECHNOLOGY

DECLARATION
We, Jagriti R (1RV08IT058) and Shree Vardhan Saraf (1RV08IT061), students of eighth semester B.E., Instrumentation Technology, hereby declare that the project titled "Sound Source Localization using LabVIEW" has been carried out by us and submitted in partial fulfillment for the award of the degree of Bachelor of Engineering in Instrumentation Technology. We further declare that this work has not been submitted by any other student for the award of a degree in any other branch.

Place: Bangalore Date:

Names 1. Jagriti R 2. Shree Vardhan

Signature

ACKNOWLEDGEMENT
The satisfaction that accompanies the successful completion of any endeavour would be incomplete without the mention of the people who made it possible and whose constant support, encouragement and guidance have been a source of inspiration throughout the course of this project.

We thank our internal guide Mr. Harsha Herle, Assistant Professor, Instrumentation Technology, R.V. College of Engineering, for his invaluable help and guidance.

We express our heartfelt gratitude to Mr. Rohit Pannikar, Manager, Applications Engineering Division, and Mr. Rajshekhar, Staff Applications Engineer, National Instruments India, for providing a very congenial work environment and for their expert supervision that enabled us to complete this project successfully in the given duration.

We would like to thank Dr. Prasanna Kumar S.C., Professor and Head of Department of Instrumentation Technology, R.V College of Engineering, Bangalore for his encouragement and support.

We would like to thank Prof. B.S. Satyanarayana, Principal, R.V College of Engineering for his constant support.

Finally, we thank one and all involved directly or indirectly in successful completion of the project.

ABSTRACT
The problem of locating a sound source in space has received growing interest. The human auditory system uses several cues for sound source localization, including time and level differences between the ears, spectral information, timing analysis, correlation analysis, and pattern matching. Similarly, a biologically inspired sound localization system can be built by making use of an array of microphones hooked up to a computer.

Methods for determining the direction of incidence based on sound intensity, the phase of cross-spectral functions, cross-correlation functions, and frequency selection algorithms are available. Sound source localization finds applications in the military, camera pointing in video-conferencing environments, beamformer steering for robust speech recognition systems, etc.

There is no universal solution for accurate sound source localization. Depending on the object under study and the noise problem, the most appropriate technique has to be selected. In this project we attempt to localize a single sound source by using four microphones. One of the most practical acoustic source localization schemes is based on time delay of arrival (TDOA) estimation. We implement generalized cross correlation to find the time delay of arrival between microphone pairs. TDOA estimation using microphone arrays exploits the phase information present in signals from microphones that are spatially separated: the phase difference between the Fourier-transformed signals is used to estimate the TDOA. The scheme is implemented using a 4-element tetrahedron-shaped microphone array.

Once TDOA estimation is performed, it is possible to find the position of the source through geometrical calculations, deriving the source location by solving a set of non-linear least squares equations. The experimental results showed that the direction of the source was estimated with high accuracy, while the range of the source was estimated with moderate accuracy.


CONTENTS

Abstract
List of Figures
List of Tables
List of Symbols, Abbreviations and Nomenclature
1. Chapter 1: Introduction
   1.1 Sound localization in Biology
   1.2 Sound localization: a signal processing view
   1.3 Problem statement
   1.4 Objective
   1.5 Overview of the Project
   1.6 Organization of Report
   1.7 Block Diagram and Description
2. Chapter 2: Theoretical Background
   2.1 Nature of Sound
   2.2 Microphone
      2.2.1 Types of microphone
   2.3 Microphone array
   2.4 Various Coherence Measures
3. Chapter 3: Design and Methodology
   3.1 Scenario
   3.2 Direction of Arrival Estimation
      3.2.1 The Geometry of the Problem
      3.2.2 Microphone array structure
      3.2.3 Time Delay of Arrival (TDOA)
      3.2.4 Algorithm to find Time Delay of Arrival
   3.3 Distance Estimation
      3.3.1 Source Localization in 2-Dimensional Space
      3.3.2 Hyperbolic position location
         3.3.2.1 General Model
         3.3.2.2 Position Estimation
   3.4 Hardware Design
      3.4.1 cDAQ-9172
      3.4.2 Analog input module - NI 9234
      3.4.3 Digital output module - NI 9472
   3.5 Assumptions and Limitations
4. Chapter 4: Implementation Overview
   4.1 Hardware and interfacing
   4.2 Overview of LabVIEW
      4.2.1 Front panel
      4.2.2 Block diagram
   4.3 Programming using LabVIEW 11
      4.3.1 Microphone signal interface using NI DAQ Assistant
      4.3.2 Threshold detection of each signal
      4.3.3 Finding time delay of arrival
      4.3.4 Direction and distance estimation
      4.3.5 Servo control
   4.4 System Hardware
      4.4.1 Microphone
      4.4.2 The microphone array
      4.4.3 Data Acquisition
         4.4.3.1 Modules
   4.5 Flow Chart
5. Chapter 5: Results and Discussion
   5.1 Experimental Setup
   5.2 Experiment 1: Time delay of arrival
   5.3 Experiment 2: Direction of arrival
   5.4 Experiment 3: Distance estimation
6. Chapter 6: Conclusion and future work
   6.1 Conclusion
   6.2 Future work
7. Chapter 7: Appendix
   7.1 Bibliography
   7.2 Snapshots of working model
   7.3 Datasheets

LIST OF FIGURES

Fig 1.1  Block diagram of the System
Fig 2.1  Diagram of sound wave vibrations
Fig 2.2  Types of Microphones: (a) free-field microphone, (b) pressure-field microphone, (c) random-incident microphone
Fig 2.3  Condenser Microphone
Fig 2.4  Electret Microphone
Fig 2.5  Dynamic Microphone
Fig 2.6  Piezo Microphone
Fig 2.7  Generalized Microphone Array
Fig 3.1  Microphone Array
Fig 3.2  Conceptual Diagram for TDOA
Fig 3.3  Two-microphone array with a source in the far field
Fig 3.4  Position Estimation of the Sound Source
Fig 4.1  Data Flow between major hardware components
Fig 4.2  Opening the DAQ Assistant Panel
Fig 4.3  NI DAQ Assistant settings
Fig 4.4  NI DAQ Assistant Configuration
Fig 4.5  DAQ Assistant on Block Diagram
Fig 4.6  Threshold Detection
Fig 4.7  Waveforms generated on the front panel
Fig 4.8  Observed time delay of arrival between two microphones
Fig 4.9  Generalized Cross Correlation
Fig 4.10 Direction Estimation
Fig 4.11 Distance Estimation
Fig 4.12 Servo Control
Fig 4.13 Setup
Fig 4.14 Panasonic Dynamic Microphone
Fig 4.15 Picture of the Microphone Array
Fig 4.16 Microphone Coordinates
Fig 4.17 Front Panel
Fig 4.18 Hardware Setup
Fig 4.19 Direction Indicator
Fig 5.1  Array Structure
Fig 5.2  Time Delay of Arrival between mics 1, 2, 3
Fig 5.3  Direction Error Graph

LIST OF TABLES

Table 5.1  Direction Error Estimation
Table 5.2  Distance Error Estimation

List of Symbols, Abbreviations and Nomenclature

DOA - Direction of Arrival
GCC - Generalized Cross Correlation
GCC-PHAT - GCC Phase Transform
SSL - Sound Source Localization
TDE - Time Delay Estimation
TDOA - Time Difference of Arrival
PL - Position Location
2-D - Two-dimensional
3-D - Three-dimensional
DFT - Discrete Fourier Transform
FFT - Fast Fourier Transform


Chapter 1 INTRODUCTION

1.1 Sound localization in Biology

For many species, such as the barn owl, sound localization is a matter of survival. The natural ability of humans and animals to localize sound has intrigued researchers for many years. Numerous studies have attempted to determine the processes and mechanisms used by humans or animals to achieve spatial hearing.

One of the first steps in understanding nature's way of solving this problem is to understand how information is processed in the ear. A number of models for the ear have been suggested by researchers [11]. These studies suggest that the cochlea effectively extracts the spectral information from the sound wave impinging on the ear drums and converts it into electrical signals. The cochlear output is in the form of electrical signals at different neuron points along the basilar membrane of the cochlea. The electrical signals then travel up to the brain for further processing.

Many researchers have come up with different models of how the brain processes these electrical signals for sound localization, to support the experimental data from various neurophysiological studies. All these models agree on the fundamental view that the direction of the sound is determined by two important binaural cues: the interaural time difference and the interaural level difference. These binaural cues arise from the differences in the sound waveforms entering the two ears. The interaural time difference is the temporal difference in the waveforms due to the delay in reaching the ear farther away from the sound source. The interaural level difference is the difference in the intensity of the sound reaching the two ears. In general, the ear which is farther away from the source will receive a fainter sound than the ear which is relatively closer to the source, due to the attenuation effect of the head and surroundings. The phenomena of time delay and intensity difference can be integrated into the notion of the interaural transfer function, which represents the transfer function between the two ears.

In general, the task that the human auditory system performs in order to detect, localize, recognize and emphasize different sound sources is referred to as auditory scene analysis (ASA). An auditory scene denotes the listener and his/her physical surroundings, including sound sources. It is generally accepted that cross-correlation based computational models for binaural processing provide excellent qualitative and quantitative accounts of experimental studies. The output patterns obtained from the cross-correlation operations reflect the binaural information, which can be refined further and interpreted to determine the direction of the source.

1.2 Sound localization: a signal processing view


In the signal processing community, the more commonly used term for this problem is direction-of-arrival (DOA) estimation. Time Delay Estimation between replicas of signals is intrinsic to many signal processing applications; depending on the type of signals acquired, ranging from human hearing to radar, various Time Delay Estimation methods have been described in the literature [9]. Sound source localization (SSL) systems estimate the location of audio sources based on signals received by an array of microphones. With proper microphone geometry, SSL systems can also provide 3D location information. Another example of SSL is locating sound sources so that a robot can interact with detected objects. Rotating a microphone in a conference room to isolate and process a particular speaker is another example where SSL systems can be implemented.

In general, there are three categories of techniques for sound source localization, i.e. steered-beamformer based, high resolution spectral estimation based, and time delay of arrival (TDOA) based [10].

The direction of the sound source can be obtained by estimating the relative Time Delay of Arrival (TDOA) between two microphones. Peak levels for each microphone signal are analyzed, from which a time delay between signals can be found. The location of the source relative to the microphone array is calculated using this delay, and this location is displayed on the computer screen.

1.3 Problem statement

A sound source localization system determines the location of audio sources based on the audio signals received by an array of microphones at different positions in the environment.

Sound source localization is a complex and cumbersome task. The toughest challenge facing any acoustics engineer is to figure out where a sound originates, especially when there is considerable interference and reverberation. Even though a number of basic techniques exist and have undergone constant improvement, the problem remains that there is no magical sound source localization technique that prevails over the others. Depending on the test object, the nature of the sound and the actual environment, engineers have to select one method or another [8].

1.4 Objective

In this project we have studied the available techniques and eventually developed an algorithm to localize a sound source. Using an array of 5 microphones, the direction of the sound source as well as the distance to the sound source is estimated.

The first step computes TDOA for each microphone pair, and the second step combines these estimates using a set of equations to obtain the direction vector and distance coordinates.

1.5 Overview of the Project


The procedure for localization of multiple sound sources by the TDOA method is:

1. Estimation of delays of arrival [2]
2. Localization by clustering delays of arrival [3]
3. Display of the sound location.

A simple method of localization is to estimate the time delay of arrival (TDOA) of a sound signal between the two microphones. This TDOA estimate is then used to calculate the Angle of Arrival (AoA). The most commonly used TDOA estimation method is generalized cross correlation (GCC). The TDOA estimate can be calculated by applying the cross correlation equation. The sample corresponding to the maximum coefficient denotes the time delay in number of samples.
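To make the idea concrete, here is a minimal offline sketch in Python (an illustration with synthetic data, not part of the LabVIEW implementation; the 25.6 kHz rate matches the one used later in this report):

```python
import numpy as np

def tdoa_samples(ref, sig):
    """Estimate the delay of `sig` relative to `ref`, in samples,
    as the lag of the cross-correlation peak."""
    corr = np.correlate(sig, ref, mode="full")
    return np.argmax(corr) - (len(ref) - 1)

fs = 25600                       # sampling rate (Hz)
true_delay = 13                  # delay in samples
ref = np.random.randn(1024)      # stand-in for the microphone 1 signal
sig = np.roll(ref, true_delay)   # microphone 2 hears the same signal later
print(tdoa_samples(ref, sig))        # -> 13 samples
print(tdoa_samples(ref, sig) / fs)   # -> ~0.51 ms
```

Note that `np.roll` wraps the signal around, which is acceptable for this illustration but differs slightly from a true acoustic delay.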

Combining the data from two microphone pairs and using the process of hyperbolic position estimation, we compute the distance of the sound source from the microphone array.

A microphone cluster was set up in a lab with a fixed coordinate system, the microphones being placed at the corners of a known array geometry. A sound localization experiment was done whose results demonstrated the success of the sound localization routine implemented in LabVIEW.

1.6 Organization of Report

Chapter 1 introduces a brief literature survey along with the latest applications of the project; the block diagram explains the working. Chapter 2 deals with the theory behind the nature of sound and its acquisition using different types of microphones. Chapter 3 describes the methodology along with the algorithms used. Chapter 4 explains the system software and hardware implementation. In Chapter 5, the readings are tabulated and the discrepancies are discussed. Chapter 6 concludes the work and discusses the future scope of the project. Finally, the snapshots of the working model along with the datasheets are presented in Chapter 7.

1.7 Block Diagram and Description

A simplified block diagram showing the process involved, using three microphones, is shown below in Fig 1.1.

Fig. 1.1 Block diagram

The microphones are used to capture the sound signal; each microphone receives it at a slightly different time. These signals are interfaced to the PC using an analog input module and are further processed in LabVIEW. Cross correlation is done to estimate the time delay of arrival. The correlation is always done with respect to a reference signal; in this case, the signal from microphone 1 is taken as the reference. So the signals received at microphones 2 and 3 are cross correlated with the signal at microphone 1. By doing this, the extra time taken by the sound signal to reach microphones 2 and 3 is computed. Using the time delay estimates, suitable direction and distance algorithms are implemented. Once the direction is found, it is given to the digital output module in terms of a duty cycle, which drives a servo motor housing a pointer. Thus the direction and distance are indicated on the front panel, and the pointer gives a visual indication of the direction of the sound source. In the following chapter each of the blocks is explained in detail.


Chapter 2 THEORETICAL BACKGROUND


Many audio processing applications can obtain substantial benefits from the knowledge of the spatial position of the source which is emitting the signal under process. For this reason many efforts have been devoted to investigating this research area and several alternative approaches have been proposed over the years. [1]

Microphone arrays have been implemented in many applications, including teleconferencing, speech recognition, and position location of the dominant speaker in an auditorium. Direction of arrival estimation of acoustic signals using a set of spatially separated microphones has many practical applications in everyday life. DOA estimates from the set of microphones can be used to automatically steer cameras toward the speaker in a conference room.

Techniques such as the generalized cross correlation (GCC) method and its phase transform variant (GCC-PHAT) are widely used for DOA estimation [9].

The accuracy of the system depends on various factors: the hardware used for data acquisition, the sampling frequency, the number of microphones used, and the noise present in the captured signals. Increasing the number of microphones improves the performance of source location estimation.

2.1 Nature of Sound


Sound is a variation in the pressure of the air of a type which has an effect on our ears and brain. These pressure variations transfer energy from a source of vibration that can be naturally occurring, such as the wind, or produced by humans, such as speech. Sound in the air can be caused by a variety of vibrations, such as the following.

Moving objects: examples include loudspeakers, guitar strings, vibrating walls and human vocal cords.


Moving air: examples include horns, organ pipes, mechanical fans and jet engines.

A vibrating object compresses adjacent particles of air as it moves in one direction and leaves the particles of air spread out as it moves in the other direction. The displaced particles pass on their extra energy, and a pattern of compressions and rarefactions travels out from the source while the individual particles return to their original positions. Fig 2.1 shows how the amplitude changes with the loudness of the signal.

Fig 2.1 Diagram of sound wave vibrations

Wavelength (λ) is the distance between any two repeating points on a wave. The unit is the metre (m).
Frequency (f) is the number of cycles of vibration per second. The unit is the hertz (Hz).
Velocity (v) is the distance moved per second in a fixed direction. The unit is metres per second (m/s).

For every vibration of the sound source the wave moves forward by one wavelength. The length of one wavelength multiplied by the number of vibrations per second therefore gives the total length the wave motion moves in 1 second; this total length per second is the velocity, i.e. v = f × λ. This relationship between velocity, frequency and wavelength is true for all wave motions. A sound wave travels away from its source with a speed of 344 m/s (770 miles per hour) when measured in dry air at 20 °C (68 °F). If an object that produces sound waves vibrates 100 times a second, for example, then the frequency of that sound wave will be 100 Hz.
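As a quick worked example of v = f × λ (a minimal sketch using the 344 m/s figure quoted above):

```python
# wavelength = v / f, rearranging v = f * wavelength
v = 344.0          # speed of sound in dry air at 20 °C (m/s)
f = 100.0          # vibration frequency (Hz)
wavelength = v / f
print(wavelength)  # 3.44 m of air per cycle
```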


2.2 Microphone

A microphone is an acoustic-to-electric transducer or sensor that converts sound into an electrical signal. Most microphones today use electromagnetic induction (dynamic microphone), capacitance change (condenser microphone), piezoelectric generation, or light modulation to produce an electrical voltage signal from mechanical vibration.

They can be classified depending on the type of field: free-field, pressure-field, and random-incident (diffuse) field. As shown in Fig 2.2(a), free-field microphones are intended for measuring sound pressure variations that radiate freely through a continuous medium, such as air, from a single source without any interference. The microphone is typically pointed directly at the sound source (0° incidence angle). Free-field microphones measure the sound pressure at the diaphragm; however, the sound pressure may be altered from the true value when the wavelength of a particular frequency approaches the dimensions of the microphone. Consequently, correction factors are usually added to the microphone's calibration curves to compensate for any changes in pressure at its diaphragm due to its own presence in the pressure field. These microphones work best in anechoic chambers or large open areas where hard or reflective surfaces are absent.

(a)

(b)

(c)

Fig 2.2 Types of Microphones (a) Free-field microphone, (b) pressure-field microphone, (c) random-incident microphone

The second type is called a pressure-field microphone (Fig 2.2(b)). These measure sounds from a single source within a pressure field that has the same magnitude and phase at any location. In order to simulate a uniform pressure field, they are usually calibrated in enclosures or cavities which are small compared to the wavelength. This minimizes any alterations in measurements due to the presence of the microphone in the sound field. They are also supplied with a pressure versus frequency-response curve. Such microphones measure the pressure exerted on walls, airplane wings, or inside structures such as tubes, housings, and cavities.

The third type is called a random-incident or diffuse-field microphone. Shown in Fig 2.2(c), they are omnidirectional and measure sound pressure from multiple directions and sources, including reflections. They come with typical frequency response curves for different angles of incidence and compensate for the effect of their own presence in the field. An appropriate application for this type of microphone is measuring sound in a building with hard, reflective walls, such as a church.

2.2.1 Types of microphone

The condenser microphone (Fig. 2.3) is also called a capacitor microphone or electrostatic microphone. Here, the diaphragm acts as one plate of a capacitor, and the vibrations produce changes in the distance between the plates. Condenser microphones span the range from telephone transmitters through inexpensive karaoke microphones to high-fidelity recording microphones.

Fig. 2.3 Condenser microphone

An electret microphone is a type of capacitor microphone. The externally applied charge described above under condenser microphones is replaced by a permanent charge in an electret material. An electret is a dielectric material that has been permanently electrically charged or polarized.


Fig 2.4 Electret microphone

Dynamic microphones work via electromagnetic induction. They are robust, relatively inexpensive and resistant to moisture. Moving-coil microphones use the same dynamic principle as in a loudspeaker, only reversed. A small movable induction coil, positioned in the magnetic field of a permanent magnet, is attached to the diaphragm. When sound enters through the windscreen of the microphone, the sound wave moves the diaphragm. When the diaphragm vibrates, the coil moves in the magnetic field, producing a varying current in the coil through electromagnetic induction.

Fig. 2.5 Dynamic microphone

A crystal microphone or piezo microphone uses the phenomenon of piezoelectricity (the ability of some materials to produce a voltage when subjected to pressure) to convert vibrations into an electrical signal.


Fig 2.6 Piezo microphone

2.3 Microphone array

A microphone array is any number of microphones operating in tandem. Microphone arrays consist of multiple microphones functioning as a single directional input device: essentially, an acoustic antenna. Using sound propagation principles, the principal sound sources in an environment can be spatially located and distinguished from each other. Distinguishing sounds based on the spatial location of their source is achieved by filtering and combining the individual microphone signals. The location of the principal sound sources may be determined dynamically by analyzing peaks in the correlation function between different microphone channels.

Fig 2.7 Microphone array

There are many applications:

Systems for extracting voice input from ambient noise (notably telephones, speech recognition systems, hearing aids)

Surround sound and related technologies


Locating objects by sound: acoustic source localization, e.g., military use to locate the source(s) of artillery fire. Aircraft location and tracking.

High fidelity original recordings

2.4 Various Coherence Measures

Various coherence measures are required to find the time delay. Given the signals acquired by a pair of microphones, a coherence measure can be defined as a function that indicates the degree of similarity between the two signals realigned according to a given time lag. Coherence measures can hence be used to estimate the time delay between two signals. For example, cross-correlation is the most straightforward coherence measure [9].

Another approach adopted in the sound source localization community to compute a coherence measure is the use of GCC-PHAT. Considering two digital signals x1(n) and x2(n) acquired by a pair of microphones, GCC-PHAT is defined as follows:

GCC-PHAT(d) = IFFT( X1(f) · X2*(f) / ( |X1(f)| · |X2(f)| ) )

where d is a time lag, subject to |d| ≤ d_max, X1 and X2 are the DFTs of x1 and x2 respectively, and * denotes complex conjugation. The inter-microphone distance determines the maximum valid time delay d_max. It has been shown that, in ideal conditions, GCC-PHAT presents a prominent peak in correspondence with the actual TDOA. On the other hand, reverberation introduces spurious peaks which may lead to wrong TDOA estimates.
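A minimal sketch of this computation in Python (an illustration, not the report's LabVIEW code; the small epsilon added to the denominator is an assumption to avoid division by zero):

```python
import numpy as np

def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the TDOA of x2 relative to x1 using GCC-PHAT.
    Returns the delay in seconds (positive if x2 lags x1)."""
    n = len(x1) + len(x2)               # zero-pad to avoid circular wrap
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    R = X2 * np.conj(X1)
    R /= np.abs(R) + 1e-12              # PHAT weighting: keep only phase
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    if max_tau is not None:             # d_max from inter-mic distance
        max_shift = min(int(fs * max_tau), max_shift)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs
```

In practice `max_tau` would be set to the microphone spacing divided by the speed of sound, since no physical delay can exceed that value.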

An alternative way to obtain a coherence measure is offered by AED, which is able to provide a rough estimation of the impulse responses that describe the wave propagation from one acoustic source to two microphones. Under the assumption that the main peak of each impulse response identifies the direct path between the source and the microphone, the TDOA can be estimated as the time difference between the two main peaks. Denoting the two impulse responses by h1 and h2, in ideal conditions, i.e. without noise, the following equation holds:

h2 * x1(n) = h2 * h1 * s(n) = h1 * x2(n)

where * denotes convolution.


Chapter - 3 DESIGN AND METHODOLOGY


3.1 Scenario

Given a set of M acoustic sensors (microphones) in known locations, our goal is to estimate the two- or three-dimensional coordinates of the acoustic sound source. We assume that the source is present in a defined coordinate system, that we know the number of sensors present, and that a single sound source is present in the system. The sound source is excited using a broadband signal with a defined bandwidth, and the signal is captured by each of the acoustic sensors. The TDOA is estimated from the captured audio signals. The TDOA for a given pair of microphones and speaker is defined as the difference in the time taken by the acoustic signal to travel from the speaker to the microphones. We assume that the signal emitted from the speaker does not interfere with the noise sources. Computation of the time delay between signals from any pair of microphones can be performed by first computing the cross-correlation function of the two signals. The lag at which the cross-correlation function has its maximum is taken as the time delay between the two signals.

3.2 Direction of Arrival Estimation

3.2.1 The Geometry of the Problem

Analyzing the geometry of the problem is important because it allows us to address the following issues. Any source localization system can be prone to confusion regarding the location of the source due to aliases. Aliases arise when we do not have enough sensors or when the geometric placement of the sensors makes some of them redundant. The problem of aliases can be solved by adding more sensors to our localization system. However, consideration of the geometry of the problem allows us to add microphones economically; that is, we can determine the minimum number of microphones needed for any given situation. In some cases, we may need to constrain the degrees of freedom of the source of sound. This allows us to do simple experiments using an even smaller number of microphones than we would need to localize a source in 3D. It is evident that no source localization can be achieved using one microphone. So, we start by looking at how we need to constrain our source of sound, assuming that we only have two microphones. The only way that a source can be localized to a point using two microphones is if the source is constrained to a line that passes through the two microphones. If this constraint is lifted, the precision of the system degrades. We consider adding a third microphone to our system because we want to ameliorate the constraints placed on the source; such constraints were imposed to make the system precise to a point. Adding a third microphone on a line that passes through the previous two microphones results in redundancy. Thus, the three microphones should not all be on a single line. Three microphones placed at the corners of a triangle may seem to be adequate to localize a source to a point in a plane.

3.2.2 Microphone array structure

Preliminary experiments were done using a three-element, two-dimensional microphone array for Direction of Arrival (DOA) estimation. The array consists of three microphones arranged in a 2-dimensional plane. As shown in Fig. 3.1, the microphones M3-M1-M2 form the array, with M1 being the center microphone. M1 is at the origin of the coordinate axes, with the x axis and y axis as shown. The angle of arrival θ is measured in the clockwise direction w.r.t. the x axis. This convention is chosen for experimental convenience.

[Figure: M1 at the origin, with M2 and M3 placed about it in the X-Y plane]

Fig 3.1 Microphone Array

In order to implement the same in 3D, another microphone is added on top to form a tetrahedron, thereby adding a third dimension.

3.2.3 Time Delay of Arrival (TDOA)

Let m_i, for i = 1, ..., M, be the three-dimensional vectors representing the spatial coordinates of the i-th microphone, and s the spatial coordinates of the sound source. We excite the source s and measure the time differences of arrival. Let c be the speed of sound in the acoustic medium (air) and ||·|| the Euclidean norm. The TDOA for a given pair of microphones and the source is defined as the time difference between the signals received by the two microphones. Let TDOA_ij be the TDOA between the i-th and j-th microphones when the source is excited. It is given by equation 3.1:

TDOA_ij = ( ||m_i - s|| - ||m_j - s|| ) / c        (3.1)

TDOAs are then converted to time delay estimations (TDEs) and path differences. This is depicted in Fig. 3.2.

Fig 3.2 Conceptual diagram for TDOA

In order to compute the TDOA between the reference channel and any other channel for any given segment, it is usual to estimate it as the delay that maximizes the cross-correlation between the two signal segments. In order to improve robustness against reverberation, Generalized Cross Correlation (GCC) is used.

Given two signals x_i(n) and x_j(n), the GCC is defined as

G(f) = X_i(f) · [X_j(f)]*        (3.2)

where X_i(f) and X_j(f) are the Fourier transforms of the two signals and * denotes the complex conjugate. The TDOA for these two microphones is estimated as:

D(i, j) = argmax_d R(d)        (3.3)

In equation 3.3, R(d) is the inverse Fourier transform of Eq. (3.2). The maximum of R(d) corresponds to the estimated TDOA for that particular segment.

3.2.4 Algorithm to find Direction of Arrival

The lag at which the cross-correlation function has its maximum is taken as the time delay between the two signals. Once TDOA estimation is performed, it is possible to compute the position of the source through geometrical calculations. One technique is based on a linear equation system, but sometimes, depending on the signals, the system is ill-conditioned and unstable. For that reason, a simpler model based on a far field assumption is used [1]. Fig. 3.3 illustrates the case of a 2-microphone array with a source in the far field.

Consider two microphones i and j placed at a distance X_ij from each other. T_ij is the time delay of arrival found using the above method. Multiplying it by the speed of sound c gives the extra distance the sound has to travel to reach microphone i. On dropping a perpendicular, two angles φ and θ are subtended. u is a unit vector in the direction of the sound source.

Fig. 3.3 Two-microphone array with a source in the far field


From the geometry of Fig. 3.3,

X_ij · u = ||X_ij|| cos(θ)        (3.4)

where X_ij is the vector that goes from microphone i to microphone j and u is a unit vector pointing in the direction of the source. From the same figure, it can be stated that:

cos(θ) = c · T_ij / ||X_ij||        (3.5)

where c is the speed of sound. When combining equations (3.4) and (3.5), we obtain:

X_ij · u = c · T_ij

which can be re-written as:

(x_j - x_i) u_x + (y_j - y_i) u_y + (z_j - z_i) u_z = c · T_ij

the position of microphone i being (x_i, y_i, z_i). Considering N microphones and taking microphone 1 as the reference, we obtain a system of N - 1 such equations, one for each pair (1, i) with i = 2, ..., N.

Therefore, using 3 microphones, two time delays are estimated and the direction is indicated in 2D. Similarly, using the equations above, if 4 microphones are used (with one microphone located in another plane), 3 time delays can be estimated and the direction vector in 3D is located.
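A minimal sketch of solving this small linear system in Python (hypothetical coordinates and delays, chosen only for illustration; the report's actual implementation is the LabVIEW block diagram of Chapter 4):

```python
import numpy as np

C = 344.0  # speed of sound (m/s)

def direction_vector(mics, tdoas):
    """Solve the N-1 equations X_1i . u = c * T_1i for the direction
    vector u, taking microphone 1 (row 0) as the reference."""
    A = mics[1:] - mics[0]           # rows: vectors from mic 1 to mic i
    b = C * np.asarray(tdoas)        # path differences (m)
    u, *_ = np.linalg.lstsq(A, b, rcond=None)
    return u / np.linalg.norm(u)     # normalize to a unit vector

# hypothetical 4-microphone tetrahedral layout (metres)
mics = np.array([[0.0,  0.00, 0.00],
                 [0.5, -0.35, 0.00],
                 [-0.5, -0.35, 0.00],
                 [0.0, -0.30, 0.35]])
tdoas = [2.1e-4, -1.3e-4, 0.8e-4]    # T_12, T_13, T_14 (seconds)
print(direction_vector(mics, tdoas))
```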


3.3 Distance Estimation


3.3.1 Source Localization in 2-Dimensional Space

Sound source localization is a two-step problem. First, the signals received by several microphones are processed to obtain information about the time delay between pairs of microphones; we use the GCC-PHAT method for estimating the time delay. The estimated time delays for pairs of microphones can then be used to obtain the location of the sound source.

3.3.2 Hyperbolic position location

By definition, a hyperbola is the set of all points in the plane whose location is characterized by the fact that the difference of their distances to two fixed points is a constant. The two fixed points are called the foci; in our case the foci are the microphones. Each hyperbola consists of two branches, and the emitter is located on one of the branches. The line segment which connects the two foci intersects the hyperbola in two points, called the vertices. The line segment which ends at these vertices is called the transverse axis, and the midpoint of this line is called the center of the hyperbola [4].

The time delay of the sound arrival gives us the path difference that defines a hyperbola, on one branch of which the emitter must be located. At this point, we have an infinity of solutions, since we have a single piece of information for a problem that has two degrees of freedom.

We need a third microphone which, when coupled with one of the previously installed microphones, gives a second hyperbola. The intersection of one branch of each hyperbola gives one or two solutions, with at most four solutions being possible. Since we know the sign of the angles of arrival, we can remove the ambiguity.

Hyperbolic position location (PL) estimation is accomplished in two stages. The first stage involves estimation of the time difference of arrival (TDOA) between the sensors (microphones) through the use of time-delay estimation techniques. The estimated TDOAs are then utilized to make range difference measurements. This results in a set of nonlinear hyperbolic range difference equations.

When the microphones are arranged in a non-collinear fashion, the position of a sound source is determined from the intersection of hyperbolic curves produced from the TDOA estimates. The set of equations that describe these hyperbolic curves are non-linear and are not easily solvable. If the number of nonlinear hyperbolic equations equals the number of unknown coordinates of the source, then the system is consistent and a unique solution can be determined by iterative techniques. For an inconsistent system, the problem of solving for the position of the sound source becomes more difficult due to the non-existence of a unique solution.

Fig 3.4 Position Estimation of the Sound Source

3.3.2.1 General Model

A general model for the two-dimensional (2-D) position location estimation of a source using three microphones is developed. All TDOAs are measured with respect to the center microphone M1; let the index be i = 2, 3, with i = 1 representing microphone M1. Let (x, y) be the source location and (X_i, Y_i) the known location of the i-th microphone. The range between the source S and the i-th microphone is given as

R_i = sqrt( (X_i - x)^2 + (Y_i - y)^2 )        (3.6)
    = sqrt( X_i^2 + Y_i^2 - 2 X_i x - 2 Y_i y + x^2 + y^2 )

Using equation (3.6), the range difference between the center microphone M1 and the i-th microphone is

R_i,1 = c τ_i,1 = R_i - R_1
      = sqrt( (X_i - x)^2 + (Y_i - y)^2 ) - sqrt( (X_1 - x)^2 + (Y_1 - y)^2 )        (3.7)

where c is the velocity of sound, R_i,1 is the range difference between M1 and the i-th microphone, R_1 is the distance between M1 and the sound source, and τ_i,1 is the estimated TDOA between M1 and the i-th microphone. This defines the set of nonlinear hyperbolic equations whose solution gives the 2-D coordinates of the source.

3.3.2.2 Position Estimation

To localize the source, we first estimate the TDOAs of the signals received by the sensors using the cross-correlation techniques described earlier. The technique measures TDOAs w.r.t. the first receiver, d_i,1 = d_i - d_1 for i = 2, 3, ..., M. The TDOA between receivers i and j is computed from d_i,j = d_i,1 - d_j,1, where i, j = 2, 3, ..., M. Let the source be at an unknown position (x, y) and the sensors at known locations (x_i, y_i). The squared distance between the source and sensor i is

r_i^2 = (x_i - x)^2 + (y_i - y)^2 = K_i - 2 x_i x - 2 y_i y + x^2 + y^2,   i = 1, 2, ..., M

where K_i = x_i^2 + y_i^2. If c is the speed of sound propagation, then the relations

r_i,1 = c d_i,1 = r_i - r_1

define a set of nonlinear equations whose solution gives (x, y).
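A minimal sketch of solving these nonlinear range-difference equations numerically (a hypothetical illustration using SciPy's least-squares solver, standing in for the iterative technique the text leaves unspecified):

```python
import numpy as np
from scipy.optimize import least_squares

C = 344.0  # speed of sound (m/s)

def locate_2d(mics, d_i1, guess=(1.0, 1.0)):
    """Solve r_i - r_1 = c * d_{i,1} for the source position (x, y).
    mics: (M, 2) known sensor positions; d_i1: TDOAs w.r.t. sensor 1."""
    def residuals(p):
        r = np.linalg.norm(mics - p, axis=1)   # distances to each sensor
        return (r[1:] - r[0]) - C * d_i1       # hyperbolic constraints
    return least_squares(residuals, x0=np.asarray(guess)).x

# hypothetical layout (metres) and a source at (2, 3) for a sanity check
mics = np.array([[0.0, 0.0], [0.5, -0.35], [-0.5, -0.35], [1.0, -0.35]])
true = np.array([2.0, 3.0])
d_i1 = (np.linalg.norm(mics[1:] - true, axis=1)
        - np.linalg.norm(mics[0] - true)) / C
print(locate_2d(mics, d_i1))   # -> approximately [2. 3.]
```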


3.4 Hardware Design

3.4.1 cDAQ-9172

The cDAQ-9172 is an eight-slot NI CompactDAQ chassis that can hold up to eight C Series I/O modules. This USB 2.0-compliant chassis operates on 11 to 30 VDC and includes an AC/DC power converter and a 1.8 m USB cable.

The cDAQ-9172 has two 32-bit counter/timer chips built into the chassis. With a correlated digital I/O module installed in slot 5 or 6 of the chassis, you can access all the functionality of the counter/timer chip including event counting, pulse-wave generation or measurement, and quadrature encoders.

3.4.2 Analog input module - NI 9234

The NI 9234 is a four-channel C Series dynamic signal acquisition module for making high-accuracy audio frequency measurements from integrated electronic piezoelectric (IEPE) and non-IEPE sensors with the NI CompactDAQ system. The NI 9234 delivers 102 dB of dynamic range and incorporates software-selectable AC/DC coupling and IEPE signal conditioning for accelerometers and microphones. The four input channels simultaneously digitize signals at rates up to 51.2 kHz per channel, with built-in antialiasing filters that automatically adjust to the sampling rate.

3.4.3 Digital output module - NI 9472

The National Instruments NI 9472 is an 8-channel, 100 µs sourcing digital output module for any NI CompactDAQ or CompactRIO chassis. Each channel is compatible with 6 to 30 V signals and features 2,300 Vrms of transient overvoltage protection between the output channels and the backplane. Each channel also has an LED that indicates the state of that channel. With the NI 9472, you can connect directly to a variety of industrial devices such as motors, actuators, and relays.


3.5 Assumptions and Limitations

We assume the following conditions under which the location of the sound source is estimated:

- A single, infinitesimally small, omnidirectional sound source.
- Reflections from the bottom of the plane and from the surrounding objects are negligible.
- No disturbing noise sources contributing to the sound field.
- The sound source to be located is stationary during the data acquisition period.
- The microphones are both phase and amplitude matched and without self-noise.
- The change in sound velocity due to changes in pressure and temperature is neglected; the velocity of sound in air is taken as 330 m/s.
- The positions of the acoustic receivers are known, and the receivers are perfectly aligned as prescribed by the processing techniques.

Perfect solutions are not possible, since the accuracy depends on the following factors:

- Geometry of microphone and source.
- Accuracy of the microphone setup.
- Uncertainties in the locations of the microphones.
- Lack of synchronization of the microphones.
- Inexact propagation delays.
- Bandwidth of the emitted pulses.
- Presence of noise sources.
- Numerical round-off errors.


Chapter - 4 IMPLEMENTATION OVERVIEW


4.1 Hardware and interfacing
In order to develop and evaluate the localization algorithms, it was necessary to first test the hardware and write the required software interfaces. The hardware used consisted of five unidirectional microphones mounted on an array, an NI CompactDAQ chassis along with three modules, and a servo motor which carries an indicator.

LabVIEW was used for interfacing with the hardware, as it provides a rich data access toolbox. This also meant that no data conversion was required, as the localization algorithms were implemented in LabVIEW.

[Figure: sound source → array of microphones (sensing the sound signal) → NI 9234 analog I/P module → NI cDAQ 9172 → NI 9472 digital O/P module → pointer mounted on a servomotor (indicating the direction of sound)]

Fig. 4.1 Data flow between major hardware components

4.2. Overview of LabVIEW


The programming language used is LabVIEW, a data-flow programming language. Execution is determined by the structure of a graphical block diagram on which the programmer connects different function nodes by drawing wires. These wires propagate variables, and any node can execute as soon as all its input data become available. Multiprocessing and multithreading hardware is automatically exploited by the built-in scheduler, which multiplexes multiple OS threads over the nodes ready for execution.

NI LabVIEW is a graphical programming environment used on campuses all over the world to deliver project-based learning to the classroom, enhance research applications, and foster the next generation of innovators. With the intuitive nature of graphical system design, educators and researchers can design, prototype and deploy their applications.

LabVIEW programs/subroutines are called virtual instruments (VIs). Each VI has three components: a block diagram, a front panel, and a connector pane. The last is used to represent the VI in the block diagrams of other, calling VIs. Controls and indicators on the front panel allow an operator to input data into or extract data from a running virtual instrument. However, the front panel can also serve as a programming interface. Thus a virtual instrument can either run as a program, with the front panel serving as a user interface, or, when dropped as a node onto the block diagram, the front panel defines the inputs and outputs for the given node through the connector pane. This implies that each VI can be easily tested before being embedded as a subroutine into a larger program.

4.2.1 Front panel

Every user created VI has a front panel that contains the graphical interface with which a user interacts. The front panel can house various graphical objects ranging from simple buttons to complex graphs. Various options are available for changing the look and feel of the objects on the front panel to match the application needs.

4.2.2 Block diagram

Nearly every VI has a block diagram containing some kind of program logic that serves to modify data as it flows from sources to sinks. The block diagram houses a pipeline structure of sources, sinks, VIs, and structures wired together in order to define this program logic. Most importantly, every data source and sink from the front panel has its analogue on the block diagram. This representation allows the input values from the user to be accessed from the block diagram. Likewise, new output values can be shown on the front panel by code executed in the block diagram.

4.3 Programming using LabVIEW 11

There are 5 parts of graphical code in the program that make up the VI and enable it to localize the sound source efficiently. They are:

4.3.1 Microphone signal interface using NI DAQ Assistant
4.3.2 Threshold detection of each signal
4.3.3 Finding time delay of arrival
4.3.4 Direction and distance estimation
4.3.5 Servo control

Joining and grouping these blocks appropriately, and running them continuously, results in the required software for the process.

4.3.1 Microphone signal interface using NI DAQ Assistant

The microphone output signals are connected to the channels of the NI 9234 modules. These channels must be configured, which can be done by the following steps:

Blank VI > Block diagram > Input > DAQ Assistant

DAQ Assistant is a graphical interface for building and configuring measurement channels and tasks. The signal properties such as INPUT, ANALOG, VOLTAGE, CHANNELS, SIGNAL RANGES, ACQUISITION MODE and SAMPLING RATE are to be properly selected as shown in the figures below.


Figure 4.2 shows the DAQ Assistant VI on the Functions palette.

Fig 4.2 Opening the DAQ assistant Palette

Fig 4.3 NI DAQ Assistant settings

Figure 4.3 indicates the process for selection of each channel. This is the Create New Express Task dialog box. Here you can choose which data acquisition device to use, as well as specify what type of data you want to acquire or generate. Select Acquire Signals » Analog Input » Voltage to specify that you want to measure the voltage of a signal.

The DAQ Assistant dialog box allows you to edit the configuration used to measure and read the voltage. In figure 4.4, by clicking on the Add Channels option, the five channels connected to the microphones can be added. The sampling rate can be set as per requirement.

Fig 4.4 NI DAQ assistant configuration


Given below (Fig 4.5) is a snapshot of the block diagram containing the DAQ Assistant.

Fig. 4.5 DAQ Assistant on the block diagram

The sampling rate and the acquisition speed can be specified as per requirement. All five signals arrive as one from the data output of the DAQ Assistant; however, each signal needs to be processed separately, so a Split Signals function is used to do the same. The five signals are then processed in the remaining portion of the code.

4.3.2 Threshold detection of each signal

To avoid unwanted results, a threshold must be set; i.e. we want the correlation to start only when the desired sound is made, and not the moment the program starts to run. To ensure this, a threshold of 0.8 mV is set. If any of the five microphones picks up a signal greater than 0.8 mV, the generalized cross correlation begins with respect to the reference microphone. The set threshold can be seen in Fig 4.7: when a signal crosses the threshold, the correlation starts.

Given below (Fig 4.6) is a snapshot of the threshold detection portion of the code.

Fig. 4.6 Threshold detection

Fig 4.7 Waveforms generated on the front panel


4.3.3 Finding time delay of arrival

Shown in Fig 4.8 is the sound signal received at two microphones simultaneously. On close observation it can be seen that the signal is received at slightly different times. This is the time delay of arrival, and it has to be calculated. As the time difference is very small, time-stamping the reception of the signals at the microphones will not give accurate results. Instead, generalized cross correlation is performed as explained in the design. By cross-correlating the signals, the level of similarity of two waveforms as a function of a time lag applied to one of them is found. As an example, consider two real-valued functions f and g differing only by an unknown shift along the x-axis. One can use the cross-correlation to find how much g must be shifted along the x-axis to make it identical to f: the formula essentially slides the g function along the x-axis, calculating the integral of their product at each position. When the functions match, the value of (f*g) is maximized.
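To make the sliding-and-matching idea concrete, here is a small worked example (hypothetical values; a continuation of the NumPy sketch from Chapter 1):

```python
import numpy as np

fs = 25600                        # sampling rate used in this project (Hz)
t = np.arange(1024) / fs
f = np.sin(2 * np.pi * 440 * t) * np.exp(-200 * t)  # decaying 440 Hz burst
shift = 25                        # g is f delayed by 25 samples
g = np.concatenate((np.zeros(shift), f[:-shift]))

corr = np.correlate(g, f, mode="full")   # slide g along f
lag = np.argmax(corr) - (len(f) - 1)     # lag of the correlation peak
print(lag, lag / fs)                     # -> 25 samples, ~0.98 ms
```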

Fig. 4.8 Observed time delay of arrival between two microphones


Given below (Fig. 4.9) is a snapshot of the cross correlation performed. The signal received at each microphone is cross correlated with the reference microphone (microphone 1, at coordinate (0,0,0)).

Fig. 4.9 Generalized cross correlation


4.3.4 Direction and distance estimation

Using the estimated time delays of arrival, specific algorithms are implemented to estimate the position of the sound source. Fig 4.10 shows the direction estimation in 2D and 3D; the numbered portions of the block diagram are:

1. Coordinates of all five microphones
2. Solving the linear equation
3. Extracting all three elements of the direction vector and indicating the same on a 3D graph
4. Extracting the first two elements of the direction vector and finding the direction in 2D

Fig. 4.10 Direction estimation

In figure 4.10, (1) holds the coordinates of microphones 2, 3 and 4. The coordinates, along with the differences in distance, are solved as a linear equation for the unknown matrix, which is the direction vector indicating the direction of the sound source. In (3), all three components of the direction vector are extracted to indicate the direction in 3-D; the same is plotted on a 3-D graph. In (4), only two components of the direction vector are extracted to indicate the direction in 2-D. The value obtained in radians is converted to degrees.
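The 2-D direction extraction and the radians-to-degrees conversion reduce to a two-argument arctangent; a short sketch (with a hypothetical direction vector, mirroring step 4 above):

```python
import math

u = (0.64, 0.77, 0.05)                 # example unit direction vector
azimuth_rad = math.atan2(u[1], u[0])   # angle of (u_x, u_y) from the x axis
print(round(math.degrees(azimuth_rad), 1))  # -> 50.3 degrees
```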


As shown in Fig 4.11, using the algorithm described in the design, the distance to the sound source is also estimated, using hyperbolic position estimation. It again employs the time delay estimates to formulate the equations.

Fig. 4.11 Distance estimation

4.3.5 Servo control

Once the direction is found, a servo motor is used to indicate it. The direction values are interpolated to servo duty cycles, specifying the rotation of the servo motor for every degree (Fig 4.12).

Fig. 4.12 Servo control
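A sketch of this interpolation (the duty-cycle endpoints are assumptions based on typical hobby servos, which map roughly 1-2 ms pulses within a 20 ms period to their travel range; they are not values taken from this report's hardware):

```python
def angle_to_duty(angle_deg, duty_min=0.05, duty_max=0.10):
    """Linearly interpolate a 0-180 degree direction to a PWM duty
    cycle (5% = 1 ms and 10% = 2 ms of a 20 ms servo period)."""
    angle_deg = max(0.0, min(180.0, angle_deg))  # clamp to servo range
    return duty_min + (angle_deg / 180.0) * (duty_max - duty_min)

print(angle_to_duty(0))    # 0.05
print(angle_to_duty(90))   # 0.075
print(angle_to_duty(180))  # 0.1
```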

4.4 System Hardware


Figure 4.13 shows the major components in the physical set up of our system. The microphones are mounted on the array structure to collect the sound signals. These signals are sent to the PC via the CompactDAQ. In the PC, the program is run in LabVIEW, which does the processing and computation to obtain the direction of and distance to the sound source. We will now describe each of the components in greater detail.

Fig. 4.13 Set up

4.4.1 Microphone

The Panasonic RP-VK21 microphone (Fig 4.14) is a dynamic type, unidirectional microphone. The microphone features an 80 Hz - 12 kHz frequency response and a -55 dB/mW sensitivity, which ensures that the sound is clear. It comes with a built-in on/off switch that is easy to operate and an O.F.C. output cable that measures 3 metres in length.

Fig 4.14 Dynamic microphone

4.4.2 The microphone array

A stand for the microphones (Fig 4.15) was constructed as per specifications; it enables the height of the entire array to be adjusted from 1.5 to 2 metres. For the purposes of this project, a baseline height of 1.5 metres was used. The servo was mounted below the microphones on the central axis.


The purpose of the servo motor was to indicate the direction of the sound source over one half of the 2-D plane, i.e. 0-180 degrees.

Fig. 4.15 Microphone array

The coordinates of each microphone were fixed as shown in the figure below (Fig 4.16).

[Figure: microphone coordinates (cm): (0,0,0), (50,-35,0), (-50,-35,0), (100,-35,0), (0,-30,35)]

Fig. 4.16 Microphone coordinates

4.4.3 Data Acquisition

The cDAQ-9172 is an eight-slot NI CompactDAQ chassis that can hold up to eight C Series I/O modules. It is connected to the Windows host computer over USB. NI CompactDAQ serves as a flexible, expandable platform to meet the needs of any electrical or sensor measurement system.

By placing instrumentation close to the test subject, electrical noise from the surroundings can be minimized; the digital signals used by USB are significantly less susceptible to electromagnetic interference. Since the NI CompactDAQ is a small, rugged package, it can easily be placed close to the unit under test.

4.4.3.1 Modules

Analog input module - Of the chassis's eight slots, we have utilized three (slots 1, 2 and 5), as shown in Fig. 4.18. Slots 1 and 2 are occupied by two NI 9234 modules, analog input modules capable of simultaneous acquisition. The five microphones were connected to five channels using BNC connectors; the required signal conditioning is done within the modules themselves. The maximum allowable sampling rate is 51.2 kHz per channel. We set our sampling rate to half of that, i.e., 25.6 kHz per channel, since during testing our maximum frequency component did not exceed 1000 Hz; 25.6 kHz was found to be more than sufficient, as oversampling resulted in excess data and hence slower processing. After the signals are received by the analog input modules, they are sent to LabVIEW for further processing, where the algorithms already discussed are implemented. Once the direction and distance have been found, they are displayed on the front panel as shown in Fig. 4.17.

Fig. 4.17 Front panel
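For reference, the acquisition configuration just described can be sketched with the nidaqmx Python API. This is a simplified sketch, not the project's LabVIEW code: the device names are assumptions, and the NI 9234's microphone-specific channel configuration is reduced to a plain voltage channel.

import nidaqmx
from nidaqmx.constants import AcquisitionType

FS = 25600      # 25.6 kHz per channel, as used in the project
BLOCK = 10240   # roughly 0.4 s of data per read

with nidaqmx.Task() as task:
    # Mics 1-4 on the first NI 9234, mic 5 on the second (module names assumed)
    task.ai_channels.add_ai_voltage_chan("cDAQ1Mod1/ai0:3")
    task.ai_channels.add_ai_voltage_chan("cDAQ1Mod2/ai0")
    task.timing.cfg_samp_clk_timing(rate=FS, sample_mode=AcquisitionType.CONTINUOUS)
    data = task.read(number_of_samples_per_channel=BLOCK)  # 5 channels x BLOCK samples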

Microphones 1, 2, 3, 4 and 5 are connected to channels 0 to 3 of the first NI 9234 module and channel 0 of the second, using standard BNC connectors. The NI 9274 in slot 5 is connected to counter 0 of the cDAQ; the PWM output to the servo motor is given from channel 3 of the DO module, and channels 8 and 9 are used to supply a Vsup of +5 V.

Fig. 4.18 Hardware set up (NI cDAQ-9172 with NI 9234 and NI 9274 modules)

Digital output module - Once the direction of the sound source is found, it is indicated visually by a pointer mounted on a servo motor (Fig. 4.19). As shown in Fig. 4.18, the digital module NI 9274 is placed in slot 5 of the cDAQ chassis (slots 5 and 6 are the counter slots). The direction of the sound source is passed to the digital module, which in turn drives the servo motor with a duty-cycle input.

Fig. 4.19 Indicator (pointer mounted on the servo motor)
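A counter-based PWM output of the kind described above might look as follows with the nidaqmx Python API. The counter name, the direction value and the duty-cycle mapping (taken from the interpolation sketch in Section 4.3.5) are assumptions for illustration.

import nidaqmx
from nidaqmx.constants import AcquisitionType

direction_deg = 90.0                             # hypothetical direction estimate
duty = 0.025 + (direction_deg / 180.0) * 0.1     # as in the earlier sketch

with nidaqmx.Task() as task:
    task.co_channels.add_co_pulse_chan_freq(
        "cDAQ1Mod5/ctr0",     # counter 0 via the slot-5 module (assumed name)
        freq=50.0,            # standard 50 Hz servo frame
        duty_cycle=duty)
    task.timing.cfg_implicit_timing(sample_mode=AcquisitionType.CONTINUOUS)
    task.start()
    input("Pointer holding position; press Enter to release...")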


4.5 Flow Chart

The overall processing flow is:

1. Acquire signals from the five-microphone array.
2. Compute the generalized cross-correlation and pick a peak.
3. Estimate the TDOA.
4. Calculate the path differences.
5. Estimate the position from the estimated path differences, the microphone locations and the dimensions of the coordinate system.
6. Report the estimated source location.



Chapter - 5 RESULTS AND DISCUSSION


Experiments were done using the algorithms described in the previous chapter in order to gain insight into the operation of the system. The localization error for each scenario was measured as the difference between the true angle, calculated from the center of the array to the primary source, and the angle estimated from the time delays. For this, it was assumed that the source was far away compared to the size of the array, so that it could be taken to lie along a straight line from the array. This assumption was made, and the errors calculated, for both the azimuthal and altitudinal angles of incidence and for each time-delay estimation routine implemented. By definition, the altitudinal angle may vary from +90 to -90 degrees, and the azimuthal angle from 0 to 180 degrees.

5.1 Experimental set up

The source localization routine was tested by sound recording experiments done in a laboratory. We set up a fixed coordinate system in the laboratory. Four microphones were placed at the tips of an imaginary tetrahedron whose sides are about 40 cm long. A fifth microphone was placed on an extended arm from one of the microphones (Fig. 5.1). The microphones were hooked up to a computer running a LabVIEW program, which saved the five signals from the microphones. Several sound recording experiments were done by placing a sound source at various locations in the laboratory.

Fig. 5.1 Array structure


We take both correlated noise and reverberation into account when generating our test data. By setting a threshold, we eliminate the inherent noise and pick up the most dominant sound in the room. The set-up corresponds to a 6 m x 7 m x 2.5 m room, with the five microphones placed at a distance from each other, 1 m from the floor and 1 m from the 6 m wall (in relation to which they are centered). The sound source is generated from different positions.

The sampling frequency is 25.6 kHz, and blocks of about 10 k samples are acquired at a time, i.e., every 0.4 seconds. The sound source is generated using an air gun whose frequency content lies within 500-1000 Hz, so a 25.6 kHz sampling rate is sufficient.

A number of complications limit the potential accuracy of the system. Some are due to physical phenomena that can never be corrected, and others are due to errors inherent in the processing, built into the design of the system. As mentioned in the introduction, complications in locating the sound source exist outside of perfect conditions.

5.2 Experiment 1: Time delay of arrival

By measuring the sharp peak created by cross-correlating microphone pairs, the time delay of arrival can be found. Figure 5.2 shows the time delay of arrival between microphones 1, 2 and 3. It can be seen that t1 is the extra time taken by the sound signal to reach microphone 2, and similarly t2 is the extra time taken to reach microphone 3. Since the microphones are placed in a collinear fashion, multiplying this time delay by the speed of sound yields the distance between the microphones. We obtained this using generalized cross-correlation, and it was found to be highly accurate, to within about +/-3 cm.


Fig. 5.2 Time delay of arrival between microphones 1, 2 and 3
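For illustration, a minimal NumPy sketch of generalized cross-correlation is given below. It uses the common PHAT weighting, which is an assumption: the report does not state which GCC weighting the LabVIEW implementation uses.

import numpy as np

def gcc_phat(sig, ref, fs):
    """TDOA of `sig` relative to `ref` via GCC with PHAT weighting."""
    n = len(sig) + len(ref)                  # zero-pad to avoid circular wrap-around
    R = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    R /= np.abs(R) + 1e-12                   # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))  # center zero lag
    shift = np.argmax(np.abs(cc)) - max_shift                   # lag of the sharpest peak
    return shift / fs                        # delay in seconds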

5.3 Experiment 2: Direction of arrival

Once the time delay is estimated, it is used in the algorithm explained in the previous chapters to find the direction of the sound source. For direction in 2-D, consider the plane formed by microphones 1, 2 and 3 in figure 5.1. Table 5.1 shows the estimated direction of the source and the error measured in 2-D. To normalize the error on both sides, instead of expressing the direction of the sound source as 0-180 degrees, it is expressed as 0 to +90 degrees on one side and 0 to -90 degrees on the other. The same data are plotted as a graph in Figure 5.3.

Actual direction (deg)    Estimated direction (deg)    % Error
 10                        12                           20
 20                        23                           15
 30                        35                           10
 50                        46                            6
 80                        79                            3
 90                        90                            0
-80                       -75                            4
-50                       -45                            8
-30                       -37                           11
-20                       -22                           14
-10                       -16                           19

Table 5.1 Actual vs. estimated direction of arrival

Figure 5.3 Error percentage plotted against the actual direction


On observation, it can be seen that direction finding is most accurate in the range of 80-100 degrees. Towards the extremes the accuracy falls, because the microphones are unidirectional in nature: signals are not picked up at their best when they arrive from the side. For best results, the sound source should be located directly in front of the microphone array. With omnidirectional microphones this constraint could be removed, but keeping cost and availability in consideration, we decided on unidirectional microphones.

5.4 Experiment 3: Distance estimation

For distance finding in 2-D, the microphone array consists of three microphones. We conducted preliminary experiments with this three-element array. The experiments involved acquiring signals from a sound source triggered by a suitable mechanism. The source is located in a plane, and its location is estimated using the planar three-microphone array.

The source was positioned at various places in 2-D space. Table 5.2 gives the true and estimated locations of the sound source. As mentioned in Chapter 3, Chan and Ho's linear array optimization method is utilized for solving the nonlinear equations.

True distance (cm)    Indicated distance (cm)    % Error
 10                    13                         30
 50                    55                         10
 75                    79                          5.3
100                   104                          4
120                   125                          4.1
150                   157                          4.6
180                   189                          5
210                   221                          5.23
250                   275                         10

Table 5.2 True vs. indicated distance to the sound source


Distance to the sound source was found in 2-D; Table 5.2 tabulates the readings obtained. Varying levels of accuracy can be seen: the percentage error is largest when the sound source is placed very close to the microphone array or beyond 2 meters. A safe working range of 0.25 to 2 meters can therefore be set.

The reason for this discrepancy is that when the sound source is placed too close to the microphone array, the wavefront reaching the array is strongly spherical, whereas our project assumes that the sound travels in a straight, planar manner, i.e., the spherical nature of the sound signal is not taken into account. Secondly, when the sound source is placed far away, the sound reaches the microphones almost in parallel, so the small time delay of arrival cannot be resolved.
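The set of nonlinear range-difference equations described above can also be cast as a small least-squares problem. The Python sketch below is an illustrative substitute that uses SciPy's iterative solver rather than Chan and Ho's closed-form estimator; the speed of sound, the units and the starting guess are assumptions.

import numpy as np
from scipy.optimize import least_squares

C = 34300.0  # assumed speed of sound, in cm/s to match the table above

def locate_2d(mics, tdoas):
    # mics:  (N, 2) array of microphone coordinates (cm)
    # tdoas: (N-1,) delays of mics 1..N-1 relative to mic 0 (s)
    d = C * np.asarray(tdoas)                  # range differences (cm)

    def residuals(p):
        r = np.linalg.norm(mics - p, axis=1)   # distance from the guess p to each mic
        return (r[1:] - r[0]) - d              # hyperbolic range-difference error

    x0 = mics.mean(axis=0) + np.array([0.0, 100.0])  # start in front of the array
    return least_squares(residuals, x0).x      # estimated (x, y) of the source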



Chapter 6 CONCLUSIONS AND FUTURE WORK

6.1 Conclusion

In this report we have presented an implementation of a sound-based localization technique and introduced the platform used in our lab. The report summarizes the basics of sound-based localization as discussed in the literature, and the process of time-delay-of-arrival estimation is explained. The design is then described, including the algorithms, the hardware, and the assumptions and limitations, and the implementation of the concept is explained in detail. Finally, a comprehensive set of experimental results is offered.

We find that in our current hardware deployment there are still many unavoidable errors in the time-delay calculation. We proposed an algorithm that uses a peak-weighted goal function to detect the sound source location in real time.

6.2 Future work


There are multiple factors which contribute towards errors in the sound-based localization implementation. Future work will address reducing the impact of these factors. These can be identified as follows:

(i) Different materials exhibit different reflection and absorption coefficients. It has been observed that the material of the floor between the microphone pair and the sound source affects the phase as well as the amplitude of the received signal.
(ii) As the distance between the microphone pair and the sound source decreases, the DOA estimates become coarser.
(iii) The position of sources of ambient noise in the room is important; it affects the nature of the percentage-abnormality plot, causing it to become non-symmetric.
(iv) The position of reflective surfaces around the experimental set-up contributes to the fluctuations.


(v) Physical parameters such as speaker width and the sensitivity of the microphones contribute to measurement errors.
(vi) The frequency response of the microphone elements also affects the fidelity of the captured signal.
(vii) The accuracy of the experimental set-up, and errors due to the elevation of the microphones and the sound source, are further possible sources of error.

The hyperbolic position location techniques presented in this report provide a general overview of the capabilities of the system. Further research is needed to evaluate the dominant linear array algorithm for the hyperbolic position location system. If improved TDOAs could be measured, the source position could be estimated very accurately; improving the performance of the TDOA measurement algorithm reduces the TDOA errors. The algorithm discussed for TDOA measurement is in its simplest form.

Experiments were performed assuming that the source is stationary until all the microphones have finished sampling the signals; sophisticated multi-channel sampling devices could be used to remove this stationarity constraint. While the accuracy of the TDOA estimate appears to be a major limiting factor in the performance of the hyperbolic position location system, the performance of the position location algorithm itself is equally important. Position location algorithms that are robust against TDOA noise and provide an unambiguous solution to the set of nonlinear range-difference equations are desirable. For real-time implementations of source localization, closed-form solutions or iterative techniques with fast convergence could be used. A trade-off between computational complexity and accuracy exists for all position location algorithms; a trade-off analysis through performance comparison of the closed-form and iterative algorithms can be performed.

To find the time delay, only the most dominant peak of the correlation is considered. Exploring the possibility of taking advantage of the second peak, together with particle filtering, should be pursued in order to obtain more reported sound-source location data.


Chapter 7 APPENDIX


BIBLIOGRAPHY
[1] Byoungho Kwon, Gyeongho Kim and Youngjin Park, "Sound Source Localization Methods with Considering of Microphone Placement in Robot Platform", Proc. 16th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN 2007), 26-29 Aug. 2007.
[2] Jean-Marc Valin, Francois Michaud, Jean Rouat and Dominic Letourneau, "Robust Sound Source Localization Using a Microphone Array on a Mobile Robot".
[3] Y.T. Chan and K.C. Ho, "A Simple and Efficient Estimator for Hyperbolic Location", IEEE Transactions on Signal Processing, Vol. 42, No. 8, Aug. 1994.
[4] Ralph Bucher and D. Misra, "A Synthesizable VHDL Model of the Exact Solution for Three-Dimensional Hyperbolic Positioning System", VLSI Design, Vol. 15 (2002), Issue 2, pp. 507-520.
[5] Don H. Johnson, Array Signal Processing: Concepts and Techniques.
[6] Lorraine Green Mazerolle and James Frank, "A Field Evaluation of the ShotSpotter Gunshot Location System".
[7] H. Wang and P. Chu, "Voice Source Localization for Automatic Camera Pointing System in Videoconferencing", Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Mohonk, New Paltz, NY, USA, 1997.
[8] Biniyam Tesfaye Taddese, "Sound Source Separation and Localization", Honors Thesis in Computer Science, Macalester College, May 1, 2006.
[9] Alessio Brutti, Maurizio Omologo and Piergiorgio Svaizer, "Comparison Between Different Sound Source Localization Techniques Based on a Real Data Collection", Proc. IEEE HSCMA 2008.
[10] M. Brandstein and H. Silverman, "A Practical Methodology for Speech Localization with Microphone Arrays", Technical Report, Brown University, November 13, 1996.
[11] J.O. Pickles, An Introduction to the Physiology of Hearing, Academic Press, London, 1982.

