Gerben Zeilstra 20090827 Speech Intelligibility

Faculty of Applied Sciences
Speech Intelligibility in Classrooms
A new measurement method
Master Thesis Project
Name: G.J. Zeilstra

Student number: 1053876
Programme: Master Applied Physics
End date: 27/Aug/2009
Supervising Tutor: D de Vries (dr ir. D.)
Research Group: Acoustics
2nd reviewer: Gisolf (prof. dr ir A.)
3rd reviewer: Koen van Dongen (dr K.W.A.)
0 Abstract
Why is it so important that the acoustical surroundings in classrooms are good?

That question is the foundation of this research. Imagine that in a poor acoustical
environment only half of the information from the teacher is understood by the students,
and that with that half the children must learn reading, writing, etc.
As people grow older these problems decrease, since adults are able to understand the
context of a sentence and use it to compensate for the words they cannot hear. For
children, poor acoustics can lead to learning disorders and undesired behavior. This is
especially a problem for mentally handicapped children: because they cannot understand
the teacher, they may become scared and act in ways that others interpret as undesired
behavior [xl].
Besides the problem of understanding the teacher, noise in classrooms and at home can
have further consequences. One of the most striking studies on this was
performed at Cornell University [i], which showed that children not only had
lower reading scores after the airport was built but also had higher blood pressure,
higher epinephrine (adrenaline; see footnote 1) levels and higher norepinephrine levels.
Similar results were obtained in a Canadian study [ii]. Effects such as sleeping disorders
are also known to be caused by noise in communities. Besides these effects on health,
effects on reading ability have been demonstrated as well, for example in a study by
Evans [iii].
Figure 0-1 Effects of noise on grade equivalent scores, source [iv]
Figure 0-1 depicts the effects of noise on children’s performance. It shows both
that more noise results in lower grades and that older children are
less affected by noise. These results have been reproduced on many occasions by several
researchers, for example in [iv, v, vi, vii].
Footnote 1: http://en.wikipedia.org/wiki/Adrenaline
In this research we do not try to prove that these effects exist, but assume that a good
acoustical environment is necessary for children. Based on the studies cited above we can
safely assume this is true. This assumption is supported by the many regulations found at
both national and international levels, see appendix 13.4. The Reverberation Time (RT) and
Signal to Noise Ratio (SNR) are commonly regulated parameters. The prescribed values range
from 0.4 to 1.0 s for the RT and from 10 to 20 dB for the SNR.
The goal of this research is to investigate a new measurement method for Speech
Intelligibility (SI). First the acoustical parameters that influence the SI are investigated,
followed by the common figures that are used to represent the SI. This is done by
showing the influence of the SNR and the Direct-Reverberant-Ratio (DRR) on these figures
and discussing their advantages and disadvantages. The most striking shortcomings of
these measurements are that none of them can be performed in an actual classroom
situation (during a lesson, using the speech of the teacher) and that none takes into
account the possible hearing loss of the child. The new measurement method tries to improve
exactly those two points without compromising too much on other factors. The effectiveness
and quality of this new measurement method are verified by comparing it to STI measurements,
a method that has proven its worth and its capability of representing the SI.
0.1 Acknowledgements
This thesis could not have been completed without the help of several people, whom
I would like to thank explicitly.
First of all my father and mother, who have supported and motivated me
throughout my entire life to strive for this and many other goals; Sabrine, who was
one of the people responsible for the very much needed push to get going again
and who supported me in my efforts; Diemer de Vries, who has guided me through this long
process; and finally Lau Nijs, for providing me with the equipment, programs and
support to perform the STI and RT measurements.
A special note goes to Frans Coninx, who gave me the chance to work on this
project and provided me with the measurement setup to perform the new
measurement method.
0 Abstract
0.1 Acknowledgements
1 Introduction
2 Room Acoustics
2.1 Impulse response
3 Acoustics and Speech Intelligibility
3.1 Signal to Noise Ratio
3.2 Direct to Reverberant Ratio
3.3 Combined effects
4 Measurement methods of Speech Intelligibility
4.1 Speech Transmission Index
4.2 Energy ratios
4.3 Articulation loss of Consonants
4.4 Articulation Index
5 Speech Transmission Index
5.1 Calculating the STI
6 Interpretation of the STI
6.1 Relating STI to subjective measures
6.2 Predicting STI values
7 Hearing Impaired persons
7.1 Speech perception for the hearing impaired
7.2 Speech perception of children compared to adults
8 Coninx method
8.1 The SRT measurement
8.2 The SNR measurement
8.3 The differences compared to STI
9 Simulations
9.1 Simulations setup
9.2 Simulation results
9.2.1 Reverberation time
9.2.2 Signal to Noise Ratio
9.2.3 Speech Transmission Index
9.2.4 C-50
9.3 Simulation Conclusions
10 Case Signis
11 Measurements and Results
11.1 Measurement method
11.2 Measurement results
11.2.1 STI Results
11.2.2 Coninx method results
11.2.3 Comparing STI to Coninx method
11.3 Measurement results summary
11.3.1 STI Measurement
11.3.2 Coninx method results summary
12 Discussion
12.1 Coninx-Zeilstra method
12.1.1 Measurement protocol
12.1.2 Calculating STI from the SNR and RT
13 Appendix
13.1 Case Signis
13.1.1 The Problem
13.1.2 The Classroom
13.1.3 The Analysis
13.1.4 The solution
13.1.5 Advise
13.1.6 Result
13.2 From SPL to SNR
13.3 Audio fragments analysis
13.3.1 Analysis of the test signal
13.3.2 Analysis of audio fragment
13.3.3 Reproducibility test for audio fragment analysis
13.4 International regulation
13.5 Dutch Noise Regulation
13.6 European Guidelines
13.6.1 Reverberation Time
13.6.2 Signal to Noise Ratio
13.7 ANSI Classroom Requirements
14 Bibliography
1 Introduction
The Speech Intelligibility (SI) describes the quality of the signal the listener
receives. This quality depends mainly on the Direct-Reverberant-Ratio (influenced by
the Reverberation Time) and the Signal-to-Noise-Ratio (influenced by the source’s sound
level, the background noise and the distance from the source to the receiver). The
Reverberation Time in turn depends on the dimensions of the room and the absorbing
qualities of the materials in it, as shown in Equation 3-4. Some research on this topic has
been done by C. Crandell and J. Smaldino in their work on classroom acoustics, see [viii].
They concluded that the teacher’s voice level and the distance from the teacher to the
student have a significant influence on the SI as well.
In order to investigate the SI, the physical parameters that influence the intelligibility are
explained first, followed by the different methods of measuring the SI and how they use those
physical parameters. Even though the SI is especially important for children in a learning
environment, and even more so for children with a hearing impairment, there is no
specific measurement method to measure the SI for one specific child with a hearing
impairment, nor is there a way of measuring the SI during a normal classroom lesson.
In this research a new method is investigated that should be able to do both. The
method is compared to the other methods of measuring the SI by
investigating the difference in dependency on the SNR and DRR, by comparing actual
measurement results and by doing extensive simulations in order to understand the
physical background of the outcome. This work is done in the scope of a Master Thesis
at the TU Delft in cooperation with the Audio Pedagogic Institute Solingen, led by Prof.
Dr. Ir. F. Coninx, and the Institute Viataal in the Netherlands.
The measurements have been performed at schools connected to this institute in the
Netherlands and are compared to measurement results based on an STI test developed by
Ir. L. Nijs of the Faculty of Architecture of the TU Delft. The simulations are
performed in CATT Acoustic, an acoustical simulation program developed especially for
simulating acoustical parameters in buildings.
2 Room Acoustics
The quality of the speech at the receiver’s position can be expressed as the Speech
Intelligibility, or SI. This measure is influenced by two acoustic factors: the
Signal to Noise Ratio, or SNR, and the Direct to Reverberant Ratio, or DRR.
The SNR is the ratio of the energy of the source signal to the energy of the noise at a
specific place in the room. The DRR is the ratio of the direct sound energy to the energy
of the reverberated, or non-direct, sound from the source. The reverberant sound is thus
sound from the source that does not reach the receiver directly, but is reflected by one or
more surfaces in the room. The two factors are discussed in detail in chapter 3.
In this thesis it is assumed that the reverberant sound in the room acts as a
diffuse field, meaning that the sound appears to come from all directions and has been
reflected many times. This is not strictly the case in a normal room, but the formulas
derived from this assumption give a good indication of the actual acoustics of the room.
2.1 Impulse response
The Dirac delta impulse is an idealized, infinitely short (in time) impulse with an
area of one (in sound energy), which is used in acoustics on many occasions. It is
used here to explain the impact of sound reflections on the SI. The impulse
response is the output of a system when an impulse is used as input. When the Dirac
delta impulse is transformed to the frequency domain, a flat spectrum is observed for all
frequencies. Since such an impulse cannot be produced in real life, a band-limited sine
sweep is usually used instead. An example of the Dirac delta impulse, an impulse response
and a portion of a sine sweep are shown in the figure below.
Figure 2-1 Dirac delta impulse, impulse response and Sine Sweep
In most measurements the sine sweep is produced by a loudspeaker and then recorded at
the output of the system. If the system is the room itself this can be done using a
microphone. The recorded signal is then converted back to the time domain to obtain the
impulse response. With the impulse response in the time domain a distinction can be
made between different arrival times at the receiver. The time it takes a sound reflection
to arrive at the receiver position, relative to the arrival of the direct sound, is the key
factor in dividing the sound into useful and disturbing sound. The different time ranges
are explained below:
• Direct sound:
The sound that arrives at the receiver directly from the source, with no reflections, is
called the direct sound. This sound is considered to be the most useful for the SI; the
more direct sound there is the better the SI will be, if all other variables remain the same.
The direct sound energy could be zero if there is no free path between the source and the
receiver and thus the sound is always reflected at least once before it reaches the receiver.
• Pseudo-direct sound:
This is the sound that reaches the receiver in the first 20 ms after the direct sound. These
sounds have usually been reflected once by the ceiling, the floor, a nearby wall or other
objects in the room. These reflections improve the SI for a human listener, since the
brain integrates the first 20 ms of sound, thereby increasing the apparent sound energy.
• Early reflections:
This is the sound that arrives at the receiver from 20 ms to 50 or 80 ms after the direct
sound. These sound waves have been reflected once or twice, usually by a wall
on the opposite side of the room or by one of the side walls. Many studies have shown that
for speech the first 50 ms are useful and that an increase in the energy in this time frame
improves the SI. For music the first 80 ms are considered useful. Since this thesis
is about speech, we consider the first 50 ms to be early (useful).
• Late reflections:
The sound arriving at the receiver more than 50 ms after the direct sound is considered
late sound and may have been reflected by surfaces in the room many times. In the diffuse
field model the sound must in fact have been reflected so many times that the position in
the room is no longer of importance.
The DRR mentioned earlier is the energy ratio of the direct sound
to all later arriving sound, while energy ratio measures such as D50 compare the energy of
the first 50 ms (since the early reflections improve the SI, as explained above) to
the total sound energy that reaches the receiver.
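As a minimal sketch of the D50 energy ratio described above, the following Python function splits a sampled impulse response at 50 ms after the direct sound. The function name, the choice of the strongest sample as the direct-sound arrival and the example impulse response are assumptions made for illustration; they are not taken from the measurement software used in this thesis.

```python
def d50(impulse_response, sample_rate_hz, early_window_s=0.050):
    """Fraction of the sound energy arriving within the early window.

    Assumptions: the strongest sample marks the arrival of the direct sound,
    and energy is taken as the squared sample value.
    """
    energies = [sample * sample for sample in impulse_response]
    if not energies or max(energies) == 0.0:
        raise ValueError("impulse response contains no energy")
    onset = energies.index(max(energies))      # direct-sound arrival
    split = onset + int(round(early_window_s * sample_rate_hz))
    early = sum(energies[onset:split])         # direct sound plus early reflections
    total = sum(energies[onset:])              # all sound from the direct arrival on
    return early / total


# Hypothetical impulse response sampled at 1 kHz: direct sound at 0 ms,
# one reflection at 10 ms and one at 60 ms.
h = [0.0] * 100
h[0], h[10], h[60] = 1.0, 0.5, 0.5
print(round(d50(h, 1000), 3))
```

In this example only the reflection at 60 ms falls outside the early window, so it is the only contribution counted as disturbing sound.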
At a certain distance from the source the sound level of the direct sound will be
equal to the sound level of the reverberated sound. This distance is called the critical
distance, given by:
$$D_c = 0.057\sqrt{\frac{VQ}{T_{60}}}$$
Equation 2-1
Where:
• Q is the directivity of the source
• V is the volume of the room
• T60 is the reverberation time for a decay of 60 dB in sound level
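The critical distance of Equation 2-1 can be evaluated with a few lines of Python. This is only a sketch: the function name and the room values in the example are hypothetical and do not come from the measured classrooms.

```python
import math


def critical_distance(volume_m3, t60_s, directivity=1.0):
    """Critical distance in metres following Equation 2-1."""
    if volume_m3 <= 0 or t60_s <= 0 or directivity <= 0:
        raise ValueError("volume, T60 and directivity must be positive")
    return 0.057 * math.sqrt(volume_m3 * directivity / t60_s)


# Hypothetical classroom: V = 200 m3, T60 = 0.5 s, source directivity Q = 2.
print(f"{critical_distance(200.0, 0.5, 2.0):.2f} m")
```

Because the reverberation time appears in the denominator, a room with a longer T60 has a shorter critical distance for the same volume and source.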
3 Acoustics and Speech Intelligibility
The speech intelligibility is influenced by two quantities, the Signal to Noise Ratio which
is determined by the noise level for a given speech signal and the Direct to Reverberant
Ratio which is influenced by the reverberated sound level for a given speech signal.
These two factors will be discussed in sections 3.1 and 3.2.
3.1 Signal to Noise Ratio
The Signal to Noise Ratio, or SNR, is influenced by several parameters in the
room. The SNR is defined as the ratio between the levels of the useful and the disturbing
signal and is expressed by:
$$SNR(r,t) = 10\log\left(\frac{p^2_{rms,signal}(r,t)}{p^2_{rms,noise}(r,t)}\right)\ [\mathrm{dB}]$$
Equation 3-1
From the equation we see that the SNR is a function of time. The time dependency can be
eliminated by averaging over a fixed period; the rms operation itself only smooths the
variation in time and does not eliminate it completely.
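To make Equation 3-1 concrete, the sketch below computes the SNR from two lists of sound pressure samples taken over the same averaging period. The function names and the sample values are assumptions for illustration only.

```python
import math


def mean_square(samples):
    """Mean squared sound pressure, i.e. the square of the rms value."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """SNR in dB from pressure samples taken over the same averaging period."""
    return 10.0 * math.log10(mean_square(signal) / mean_square(noise))


# Hypothetical samples: the noise has one tenth of the signal amplitude.
signal = [1.0, -1.0, 1.0, -1.0]
noise = [0.1, -0.1, 0.1, -0.1]
print(round(snr_db(signal, noise), 1))
```

A factor of ten in pressure amplitude corresponds to a factor of one hundred in energy, which is 20 dB.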
The useful source signal level is influenced by the distance from the source to the
receiver. The importance of the SNR is illustrated by the fact that even at low noise levels
speech can be poorly intelligible if the sound level of the source is low as well. Therefore
the SNR gives more information than the background noise level alone. The clearest example
of a situation in which noise alone influences the SI is an anechoic room or a free field
(such as at sea). In this situation no sound is reflected by surfaces, so only the noise
level reduces the SNR and thereby the SI.
The SNR is usually calculated as a broadband parameter; however, it can be
refined by computing the SNR in several frequency bands. This is useful when
the background noise and/or the signal have a specific frequency spectrum. This is done
in the Speech Transmission Index (STI) and the Articulation Index (AI), which use the SNR in
several octave bands to arrive at one final figure. These methods are explained in Chapter
4.
To obtain specific information on the noise spectrum in a room sometimes the noise
criteria (NC) curves as explained in [ix] are used. This method measures the noise in
octaves or 1/3 octave bands from 63 to 8000 Hz and compares it to spectral curves as
shown in Figure 3-1. The NC value that characterizes the background noise in a room is
determined by the highest octave band sound pressure level that intersects the NC family
of curves. In the figure three noise curves are plotted each resulting in a NC of 40, but it
can be seen that the main disturbing frequency is different for all three curves. The NC
rating is generally 8–10 dB below the noise level of that room.
Figure 3-1 Example of a Noise Criteria measurement
Since the NC measurement indicates the dominant frequency of the noise, more targeted
measures can be taken to reduce the noise level, by better isolation of the noise source or
by placing absorbing material in the room that is specially designed to absorb that
specific frequency band.
The frequency spectrum of the noise largely determines the hindrance that the listener
experiences from it. Spectra that are more aligned with the spectrum of the source reduce
the SI more, since they reduce the SNR more in the important frequency bands. Also,
lower frequency noise disturbs the speech intelligibility more than higher
frequency noise does, due to the upward spread of masking. This means that low
frequency noise masks not only low frequency sounds, but high frequency sounds as well.
Masking means that the hearing threshold becomes higher due to the presence of another
sound.
3.2 Direct to Reverberant Ratio
The Direct to Reverberant Ratio, or DRR, is the ratio between the direct and
reverberant sound levels and is expressed by:
$$DRR(r) = 10\log\left(\frac{[p_{rms,direct}(r)]^2}{[p_{rms,reverberant}(r)]^2}\right)\ [\mathrm{dB}]$$
Equation 3-2
The DRR is a function of the distance from the source, but not of time, since the source
signal for a DRR measurement is a Dirac delta impulse. The distance to the source, the
amount of absorption present in the room, the dimensions and shape of the room and the
source directivity determine the level of the reverberated sound. The level of the direct
sound is determined by the distance from the source to the receiver and the source
directivity. An example of the relation between the direct and reverberant sound as a
function of the distance to the source is shown in the figure below.
Figure 3-2 Direct and reverberant sound as a function of the distance to the source
In a noise-free (reverberation) chamber only the DRR influences
the SI; a higher DRR improves the SI and vice versa.
The effect of the reverberant sound is that the direct sound is masked and
sometimes words even overlap. This is especially the case for vowels, since they have
more “power”. Because of this overlap the reverberant sound fills the temporal pauses
between words and thus reduces the speech intelligibility. In a diffuse field the
reverberant sound pressure level is generally about equal for all receiver positions in the
room, but it can differ significantly in a non-diffuse field.
The direct sound pressure level is mainly influenced by the distance from the source
to receiver and is reduced by 6 dB every time the distance to the source doubles as
indicated in the equation below:
$$I(r) = \frac{p_{rms}^2}{4\pi \cdot r^2}$$
Equation 3-3
The reason for this is that the direct sound of an omnidirectional source in a
free field expands like a sphere, so the surface of the sphere quadruples every
time the distance to the source doubles. A quarter of the sound energy per m² corresponds
to a 6 dB decrease in sound level. For sound waves that do not expand spherically this
decrease is smaller, but only for a plane wave does the sound level not decrease at all due
to this phenomenon. This means that the further away a receiver is, the lower the sound
level of the direct sound will be at that position.
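The 6 dB decrease per doubling of distance can be checked numerically. The sketch below assumes spherical spreading in a free field, as described above; the function name and the listed distances are illustrative choices.

```python
import math


def direct_level_change_db(r_ref_m, r_m):
    """Change in direct sound level when moving from r_ref_m to r_m.

    Assumes spherical spreading in a free field (intensity ~ 1 / r^2).
    """
    if r_ref_m <= 0 or r_m <= 0:
        raise ValueError("distances must be positive")
    return 10.0 * math.log10((r_ref_m / r_m) ** 2)


# Level relative to the level at 1 m for successive doublings of the distance.
for r in (1.0, 2.0, 4.0, 8.0):
    print(r, round(direct_level_change_db(1.0, r), 1))
```

Each doubling divides the intensity by four, and 10 log(1/4) is approximately -6.02 dB, which is where the rounded figure of 6 dB comes from.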
The distance at which the sound pressure of the direct field is equal to that of the
reverberant field is called the critical distance, as given by Equation 2-1 and shown as
rc in Figure 3-2. When the receiver is within the critical distance, reverberation has
minimal effect on speech perception. Beyond the critical distance, however, reflections
can significantly reduce speech perception, particularly if a spectral or intensity
change in the reflected sound interferes with the perception of the direct sound. This
reduction in the modulation of the original sound is used in the STI method as explained
in section 4.1. If there is no such change in the reflections, the early reflections can
improve the speech perception as mentioned earlier; this is especially the case for the
first 50 ms. Speech perception scores decrease with the distance to the source up to the
critical distance of the room, as has been shown in [x] and [xi]. Beyond the critical
distance, perception ability tends to remain essentially constant in the room. This can be
explained by the fact that the reverberant field is assumed to be equal throughout the
room, and thus the relation between the early and late reflections is equal as well. Since
the direct sound is no longer of significant importance beyond the critical distance, the
SI will also be equal. This is shown by r1 and r2 in Figure 3-2, where the direct sound is
no longer significant and the reverberant sound pressure is equal at both points.
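Under the diffuse field assumption the relations above can be combined into a simple expression: if the direct sound falls by 6 dB per doubling of distance and the reverberant level is constant, the DRR is 0 dB at the critical distance and equals 20 log(rc / r) elsewhere. This expression is a consequence of those two assumptions rather than a formula stated in the thesis, and the function name and example distances below are hypothetical.

```python
import math


def drr_diffuse_db(distance_m, critical_distance_m):
    """DRR in dB for a constant reverberant level and a 1/r^2 direct sound."""
    if distance_m <= 0 or critical_distance_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(critical_distance_m / distance_m)


# Hypothetical room with a critical distance of 1.5 m.
for r in (0.75, 1.5, 3.0, 6.0):
    print(r, round(drr_diffuse_db(r, 1.5), 1))
```

Inside the critical distance the DRR is positive (direct sound dominates); beyond it the DRR is negative and the reverberant field dominates.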
The DRR also holds information on the RT, which can be explained as follows.
Both the DRR and the RT depend strongly on the amount of absorption present in
the room. There is no one-to-one relation between the two, since in a diffuse
reverberant field the RT does not depend on the distance from the source, whereas the DRR
does. But we can say that if there is more reverberant sound, the DRR will be lower and the
RT will be higher. If there is no reverberation at all, the DRR is infinite and the RT is
zero. This means there is a strong inverse relation between the DRR and the RT.
If the RT is longer, the sound takes longer to decay and thus reflects more often off
the walls. This is best explained by a steady state model, in which the source continuously
produces the same sound. If there are reflections, the sound level continues to rise until
a steady level has been reached; at this level the sound energy produced by the source is
identical to the sound energy absorbed by the materials in the room. When there is less
absorption (smaller alphas in Equation 3-4, resulting in a higher RT), the reverberant sound
level will be higher. Since the direct sound does not change, the DRR will
decrease. The same relations hold in non-steady-state situations. Sabine [xii] developed
a formula to calculate the T60 (the time it takes for a sound to lose 60 dB of its sound
level after the source has been turned off) from the volume V (m³), the surface areas
Si (m²) and the absorption coefficients αi of the surfaces, with i indicating the different
surfaces in the room. This formula is known as Sabine’s law, Equation 3-4.
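$$T_{60} = \frac{0.161\,V}{\sum_i S_i \alpha_i} \Rightarrow \frac{0.161\,V}{S\bar{\alpha}}\ [\mathrm{s}]$$
Equation 3-4
With $\bar{\alpha}$ the weighted average absorption coefficient of all surfaces and S the total surface area.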
From this formula we can conclude that a larger room yields a longer RT and a
higher absorption coefficient yields a shorter RT. Usually the RT is measured in situ in
single or 1/3 octave bands from 125 to 8000 Hz. Since the RT does not depend on the
distance from the source to the receiver, it is often used instead of the DRR.
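Sabine’s law in Equation 3-4 is straightforward to evaluate. The sketch below sums the absorption over a list of surfaces; the function name, the room dimensions and the absorption coefficients are hypothetical values chosen only to illustrate the calculation.

```python
def sabine_t60(volume_m3, surfaces):
    """T60 in seconds from Sabine's law (Equation 3-4).

    surfaces is an iterable of (area_m2, absorption_coefficient) pairs.
    """
    absorption = sum(area * alpha for area, alpha in surfaces)
    if volume_m3 <= 0 or absorption <= 0:
        raise ValueError("volume and total absorption must be positive")
    return 0.161 * volume_m3 / absorption


# Hypothetical classroom of 8 m x 7 m x 3 m with assumed coefficients.
room = [
    (56.0, 0.05),  # floor
    (56.0, 0.60),  # absorbing ceiling
    (90.0, 0.10),  # four walls combined
]
print(f"{sabine_t60(8 * 7 * 3, room):.2f} s")
```

With these assumed values the result is roughly 0.6 s. Replacing the absorbing ceiling by a reflective one (a smaller alpha) lowers the total absorption and lengthens the T60, in line with the conclusion drawn from the formula.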
3.3 Combined effects
Two factors determine how much the SI is disturbed: the SNR (in dB) and the DRR (in dB).
In general both effects are present simultaneously; only in the extreme circumstances
described in the previous sections (anechoic room, noise-free reverberation chamber)
is just one of them present.
The combined effect of these two disturbing factors is
generally bigger than the sum of the individual effects. That is, the interaction of noise
and reverberation adversely affects speech perception to a greater extent than the sum of
both effects taken independently, as is shown in [xiii]. This increased combined effect
appears to occur because, when noise and reverberation are combined, reflections fill in the
temporal gaps in the noise, making it more steady-state in nature; this is illustrated in
Figure 3-3.
Figure 3-3 Example of combined effects of noise and reverberation
To illustrate, from [xiii] we learn that if an individual is listening to speech in a
quiet room, the addition of a specific noise (e.g., an air conditioner starting up) might
reduce the SI by 10%. In another quiet room, the presence of some reflective surfaces,
and thus reverberation, might also reduce the SI by 10%. However, if both noise and
reverberation were present in a room, their combined effect
might actually amount to a 40% to 50% reduction in speech perception. From this it can
also be concluded that measurements that take only one of these factors into account are
not sufficient to give an adequate view of the SI in that room.
Figure 3-4 shows the influence of increasing distance to the source in an acoustical
environment with an SNR of +6 dB and an RT of 0.6 s. It can be
concluded that speech perception decreases with increasing distance. This effect is
also visible in Figure 3-5, which shows that the direct sound level decreases with distance
while the noise and reverberant sound levels are equal throughout the room, again under the
assumption of a diffuse field. This figure also shows that when noise is added,
the distance at which the direct sound is equal to the disturbing sound (now
reverberation plus noise) decreases.
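The shift of this equal-level distance can be sketched numerically. The code below assumes a diffuse reverberant field, a direct sound that follows the inverse square law, and uncorrelated noise whose energy simply adds to the reverberant energy. Under those assumptions the distance shrinks from rc to rc divided by the square root of (1 + 10^(ΔL/10)), where ΔL is the noise level minus the reverberant level in dB. The function name and example values are hypothetical.

```python
import math


def equal_level_distance(critical_distance_m, noise_minus_reverb_db):
    """Distance at which direct sound equals reverberant sound plus noise.

    Assumptions: diffuse reverberant field, 1/r^2 direct sound, and
    uncorrelated noise whose energy adds to the reverberant energy.
    """
    if critical_distance_m <= 0:
        raise ValueError("critical distance must be positive")
    disturbance_ratio = 1.0 + 10.0 ** (noise_minus_reverb_db / 10.0)
    return critical_distance_m / math.sqrt(disturbance_ratio)


# Hypothetical room with rc = 2.0 m and noise at three levels relative
# to the reverberant sound level.
for delta_db in (-20.0, 0.0, 10.0):
    print(delta_db, round(equal_level_distance(2.0, delta_db), 2))
```

When the noise is as loud as the reverberant sound, the disturbing energy doubles and the distance shrinks by a factor of the square root of two.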
The goal is to find a single number that can be used to represent the SI in which
both the SNR and the DRR are represented. In Chapter 4 several of these parameters will
be discussed.
Figure 3-4 Mean speech recognition scores (in % correct) of
children with normal hearing in a “typical” classroom environment
(signal-to-noise ratio = +6 dB, RT = 0.6 seconds) as a function of
speaker-to-listener distance. Figure adapted from [xiv].
Figure 3-5 Relation between distance from source (r) and sound pressure from source and
noise/reverberation.
4 Measurement methods of Speech Intelligibility
The Speech Intelligibility can be determined by several methods. There are
several subjective speech intelligibility tests in which an expert speaker reads words that
the listeners then write down. Examples of the words used in such tests are phonetically
balanced words and consonant-vowel-consonant words, but there are many other variants.
The score is determined in the same way for all these methods: the percentage of correct
answers is the Speech Intelligibility. This approach is quite accurate but also quite
cumbersome, since many people are involved and it is very time consuming. Subjective
intelligibility tests are explained in Appendix B of [xxxvi].
Measurement methods based on physical parameters, which are much easier to
perform, were developed to be able to measure the SI objectively. The SI is influenced by
the SNR and the DRR as explained above, but the different measurement methods use
these parameters in a different way. In general a higher SNR and DRR improve the SI. In
the following paragraphs the different objective methods will be discussed.
4.1 Speech Transmission Index
The Speech Transmission Index (STI) method assumes that the intelligibility of
a transmitted speech signal is related to the preservation of the modulation in the original
signal. The underlying idea is that speech can be seen as fully modulated noise, and that
the SI at the receiver is related to the reduction of this modulation. The modulation may be
reduced by band-pass limiting, masking noise, temporal distortion (reverberation) and
non-linear distortion. The reduction of the modulation can be quantified by an effective
signal-to-noise ratio obtained for a number of frequency bands. Human-related
hearing aspects such as the reception threshold, hearing disorders and non-native speakers
and listeners may also reduce the effective signal-to-noise ratio. This is implemented in the
STI by the adapted modulation index, which can be reduced by introducing, for example, a
reception threshold. This can be seen as increasing the noise level and thereby
decreasing the modulation depth. The effect of the noise level on the modulation is
shown in Figure 4-2. The effective signal-to-noise ratio, evaluated from the modulation
index in seven relevant frequency bands (octave bands with centre frequencies ranging from
125 Hz to 8 kHz), determines the STI. This SNR is then converted to a transmission index
between zero and one. Summation of the weighted contributions of the transmission
index for the seven octave bands results in a single index, the STIr. At first it may seem as
if only the SNR is represented in this measurement, but the DRR is present as well: a
higher reverberant sound level reduces the modulation depth, because the peaks become
lower and, above all, the troughs become higher, so the modulation index is lower. This is
shown schematically in Figure 4-1.
Figure 4-1 Modulation reduction due to reverberation
Figure 4-2 Modulation Index
Figure 4-3 Relation of the RT and the modulation (reduction) factor

If only noise were present and no reverberation, the modulation factor (m)
would depend on the SNR according to the following equation:
\[ m(F) = \left( 1 + 10^{-SNR/10} \right)^{-1} \]
Equation 4-1
where F is the modulation frequency at which the modulation factor is measured. If the SNR
is 0 dB (signal level equal to the noise level), the modulation factor is ½. An
increasing SNR increases the modulation factor. If there is no noise at all
and only reverberation, the modulation factor is calculated with the following formula:
\[ m(F) = \left( 1 + \left( \frac{2\pi F \cdot RT}{13.8} \right)^{2} \right)^{-1/2} \]
Equation 4-2
The result of this formula is also shown in Figure 4-3. For RT equal to zero the
modulation factor equals one, since there is no disturbing factor at all. Increasing the RT
decreases the modulation factor. The impact is largest for higher modulation frequencies,
since the temporal gaps are then shorter and thus more easily filled with reverberant sound.
The STI thus takes both reverberation and noise into account. The detailed mathematical
explanation of the STI is given in Chapter 5.
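As a minimal sketch, Equation 4-1 and Equation 4-2 can be evaluated directly. The function names are our own, and the example reverberation time of 0.6 s is only an illustrative value.

```python
import math

def m_noise(snr_db):
    """Equation 4-1: modulation factor with noise only."""
    return 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))

def m_reverb(mod_freq_hz, rt_s):
    """Equation 4-2: modulation factor with reverberation only."""
    return 1.0 / math.sqrt(1.0 + (2.0 * math.pi * mod_freq_hz * rt_s / 13.8) ** 2)

# Modulation reduction by reverberation for a low, middle and high
# modulation frequency at an (assumed) RT of 0.6 s.
for mod_freq in (0.63, 2.0, 12.5):
    print(mod_freq, round(m_reverb(mod_freq, 0.6), 3))

# An SNR of 0 dB halves the modulation.
print(m_noise(0.0))
```

Evaluating the loop shows the behaviour described above: the higher the modulation frequency, the stronger the reduction for a given RT.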
4.2 Energy ratios
There are several measures that use energy ratios: the ratio of the direct and
early sound energy (useful for the SI) to the late sound energy (detrimental for the SI)
arriving at the receiver’s position. This is very closely related to the DRR, with the
difference that the direct sound energy can be zero (when there is no direct path from the
source to the receiver), while the early sound energy, which is also interpreted as useful
here, will never be zero. The boundary time normally used is 50 ms (for speech) or
80 ms (for music).
Measurement methods based on energy ratios thus compare the useful to the
detrimental sound energy (useful-to-detrimental sound ratio (U)). Bradley showed in
[xvii] that the ratio using 50 ms as the boundary between early and late energy results in
an accurate prediction of the SI.
To measure the sound energies the impulse response is used, because the source
shouldn’t produce any sound after the measurement started since that would influence the
sound energy of the late fraction. The two ways normally used to compare the energy
ratios are:
• Comparing the energy arriving at the receiver in the first 50ms to the energy from
50ms to infinity.
• Comparing the energy arriving at the receiver in the first 50ms to all the energy
arriving at the receiver position. The resulting parameter is known as the
“Deutlichkeit” [xv] or “Definition”.
These parameters are given in Equation 4-3 and Equation 4-4. Equation 4-5 expresses the
first ratio in decibels (known as the Clarity). Clarity is often used alongside the SNR,
since if half of the total energy arrives in the first 50 ms the Clarity becomes 0 dB,
which is in line with an SNR of 0 dB.
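\[ U_{50} = \frac{\int_{0}^{50\,ms} p^{2}(t)\,dt}{\int_{50\,ms}^{\infty} p^{2}(t)\,dt} \]
Equation 4-3
\[ D_{50} = \frac{\int_{0}^{50\,ms} p^{2}(t)\,dt}{\int_{0}^{\infty} p^{2}(t)\,dt} \]
Equation 4-4
\[ C_{50} = 10 \log \left( \frac{\int_{0}^{50\,ms} p^{2}(t)\,dt}{\int_{50\,ms}^{\infty} p^{2}(t)\,dt} \right) = 10 \log \left( \frac{D_{50}}{1 - D_{50}} \right) \]
Equation 4-5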
Background noise is not taken into account in these parameters, since the energy
from 50 ms to infinity would then become infinitely large. This means the SNR is not used
in these measures, which therefore overestimate the SI when a high noise level is
present.
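A minimal sketch of how Equation 4-3 to Equation 4-5 could be evaluated for a sampled impulse response is given here. The function name, the sample rate and the synthetic impulse response (an ideal exponential decay with RT = 0.6 s) are assumptions for illustration only; a measured impulse response would also need its onset aligned to t = 0.

```python
import math

def energy_ratios(impulse_response, sample_rate_hz, boundary_s=0.050):
    """Return (U, D, C in dB) for a sampled impulse response p(t).

    U: early / late energy   (Equation 4-3)
    D: early / total energy  (Equation 4-4)
    C: 10*log10(U)           (Equation 4-5)
    """
    split = int(round(boundary_s * sample_rate_hz))
    early = sum(p * p for p in impulse_response[:split])
    late = sum(p * p for p in impulse_response[split:])
    u = early / late
    d = early / (early + late)
    return u, d, 10.0 * math.log10(u)

# Hypothetical test signal: the energy p^2(t) decays as exp(-13.8 t / RT),
# i.e. by roughly 60 dB within RT = 0.6 s.
fs, rt = 8000, 0.6
ir = [math.exp(-6.9 * n / fs / rt) for n in range(int(2.0 * fs))]
u50, d50, c50 = energy_ratios(ir, fs)
print(round(d50, 3), round(c50, 2))
```

Because no noise term appears anywhere in this calculation, the sketch also makes visible why these measures cannot react to background noise.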
4.3 Articulation loss of Consonants
Peutz and Klein of Holland first proposed the concept of the percentage loss of
consonants in 1971 [xvi]. The main findings were that the loss of intelligibility depended
on the reverberation time of the room, the room's volume, and the distance between the
listener and the talker. Peutz also found that there was a limiting distance beyond which
effectively no further loss of intelligibility occurred. He noted that it was the
loss of consonants, not vowels, that most reduced speech intelligibility. After
modification by Klein, the familiar form of the ALcons equation was established. The loss
of consonants is the reduced ability to hear which consonant was said, meaning for example
that a “p” is heard while a “b” was said. If only vowels are heard the speech
appears to consist of just sounds, while if only consonants are heard the vowels can
usually be inferred from the consonants.
A measurement that correlated with ALcons was not developed until 1986, using the Techron
TEF (Time/Energy/Frequency) analyzer.
The direct-to-reverberant ratio of the sound system’s transmitted acoustic signal is
measured together with the early decay time, which is derived from the first 10 dB of the
reverberation decay curve. From these parameters, the TEF computes the ALcons score based
on a set of correlations in three different acoustic environments with a total listening
panel size of almost 100. While the TEF ALcons method allows the impulse response and the
Energy-Time curve to be seen and room reflections to be evaluated, there are drawbacks.
Although the process is semi-automated, the algorithm can easily be fooled, producing
misleading results that are not easily identified by inexperienced users. The
ALcons measurement is based only upon the 1/3 octave band centered at 2 kHz (1.8 kHz to
2.2 kHz), so the system’s frequency response must be verified in some other way for
the ALcons score to be meaningful. If the frequency response is similar at the other
frequencies as well, the ALcons is a good estimate of the SI. In general, measurements at
just one frequency produce misleading and overly optimistic results.
Finally, the ALcons method does not take into account factors like background noise and
the S/N ratio; the frequency spectrum of background noise; the sound system frequency
response, bandwidth and equalization; and late, discrete (isolated) reflections and echoes.
The ALcons concept is mainly used to measure the quality of sound amplification
systems. The method is less accurate than measurements based on energy ratios, as
Bradley shows in [xvii]. The ALcons can also be derived mathematically using
Equation 4-6. This equation is valid when the DRR > -11 dB. In a free field this is the
case when r (distance from the source) < 3.5·Dc (critical distance). With the Dc value
calculated using Equation 2-1 (Dc of 3 meters), the equation is valid up to about 10
meters from the source; Peutz also noted that the intelligibility does not decrease
further beyond this distance. The equation using one source becomes:
\[ AL_{cons} = \frac{400 \, r^{2} \, RT^{2}}{V \, Q \, M} + K \quad [\%] \]
Equation 4-6
The parameters used are:
• r = distance from the source to the receiver
• RT = Reverberation time
• V = Volume of the room
• Q = Directivity of the source
• M = Acoustic modifier for reverberant power, 1 is a conservative assumption
• K = listener factor ≈ 2 % for a good listener
The values for the acoustic modifier and the listener factor are derived from extensive
testing [xvii], by measuring the ALcons in practice and comparing it to the outcome of the
formula. The factors are chosen such that the correlation between the test and the formula
is maximized. If more or less reverberation is present, or if it is known that the listener
is not a good listener, the factors M and K can be adjusted to represent these
situations accurately.
The RT is present in this equation since the reverberant sound causes one syllable to
mask the next, making speech harder to understand. The distance from the source to the
receiver influences the DRR. If either of those increases, the ALcons increases, which
means a reduction in the intelligibility.
An increased volume of the room or a higher directivity of the source will reduce
the ALcons, since if the room volume increases the reflections will arrive at the receiver
later and thus will have a lower level and mask less of the useful sound. An increased
directivity means the useful sound is better directed at the receiver, while the reverberant
field is not directed at all resulting in a higher DRR.
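The dependencies described above can be checked with a minimal sketch of Equation 4-6 for a single source. The function name and the example room (200 m³, RT = 0.6 s, Q = 1) are hypothetical; the leading coefficient is kept as a parameter with the value printed in Equation 4-6 as its default, and the validity limit r < 3.5·Dc is not enforced.

```python
def alcons_percent(r_m, rt_s, volume_m3, q=1.0, m=1.0, k_percent=2.0,
                   coefficient=400.0):
    """Equation 4-6 for a single source; returns ALcons in percent.

    r_m: source-receiver distance, rt_s: reverberation time,
    volume_m3: room volume, q: directivity, m: acoustic modifier,
    k_percent: listener factor (about 2 % for a good listener).
    """
    return coefficient * r_m ** 2 * rt_s ** 2 / (volume_m3 * q * m) + k_percent

# Hypothetical classroom; ALcons grows with the square of the distance.
for r in (1.0, 2.0, 4.0):
    print(r, round(alcons_percent(r, 0.6, 200.0), 2))
```

Setting rt_s to zero returns the listener factor K alone, whatever the noise level, which is the weakness of this parameter discussed in the text.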
In conclusion, since the SNR is not present in this parameter, it is less complete
than the STI. This can be shown by evaluating the formula for RT = 0: this gives
ALcons = 2%, while the noise may be louder than the signal, in which case the SI will not
be good at all. If the SNR is at a normal level the ALcons can still be considered a good
estimate of the SI. In fact a high correlation has been shown between the ALcons and the
STI [xviii], which results in the following equation:
\[ AL_{cons} = 10^{\frac{1 - STI}{0.45}} \quad [\%] \]
Equation 4-7
This is shown in the figure below:
Figure 4-4 Relation ALcons and STI
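Equation 4-7 and its inverse can be written as a short sketch; the function names are our own, and the relation is only an empirical correlation, so the conversion should be read as an estimate.

```python
import math

def alcons_from_sti(sti):
    """Equation 4-7: ALcons in percent from an STI between 0 and 1."""
    return 10.0 ** ((1.0 - sti) / 0.45)

def sti_from_alcons(alcons_percent):
    """Inverse of Equation 4-7 (ALcons in percent, must be positive)."""
    return 1.0 - 0.45 * math.log10(alcons_percent)

for sti in (0.3, 0.45, 0.6, 0.75, 1.0):
    print(sti, round(alcons_from_sti(sti), 1))
```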
4.4 Articulation Index
The Articulation Index (AI) was first described by French and Steinberg in 1947
[xix] as a way to express the amount of average speech information that is available to
patients with various amounts of hearing loss. It is usually described as a number between
0 and 1.0 or as a percentage, 0% to 100%. The AI can be calculated by dividing the
average speech signal into several bands and obtaining an importance weighting for each
band. Based on the amount of information that is audible to a patient in each band and the
importance of that band for speech intelligibility, the AI can be computed.
For all of these octave bands an adaptation can be made based on the RT for that
frequency and based on the hearing loss for the specific person at that frequency,
generally the AI is lowered by 0.1 for each second of RT. When comparing this to the
STI we see that the STI uses a reduction in the modulation and not just the SNR, which is
more accurate.
It must be noted that measuring all the necessary values at all frequencies can
be complicated, whereas the STI uses a specific source and measures the signal at the
receiver as signal and noise together. Once all the values are known, however, the
calculation is quite simple, since the AI values for the bands only need to be
multiplied by their weights and added. This is done using Equation 4-8.
\[ AI = \sum_{i} w_{i} \, AI_{i} \]
Equation 4-8
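A minimal sketch of Equation 4-8 follows. The band weights and band AI values are purely illustrative, and applying the reduction of 0.1 per second of RT to each band before weighting is our assumption about how the correction described above would be applied.

```python
def articulation_index(band_ai, weights, band_rt_s=None, penalty_per_s=0.1):
    """Equation 4-8 with an optional per-band reverberation correction.

    Assumption: each band AI is lowered by penalty_per_s for every second
    of reverberation time in that band, and never drops below zero.
    """
    if band_rt_s is None:
        band_rt_s = [0.0] * len(band_ai)
    if not (len(band_ai) == len(weights) == len(band_rt_s)):
        raise ValueError("all inputs need one entry per band")
    corrected = [max(0.0, ai - penalty_per_s * rt)
                 for ai, rt in zip(band_ai, band_rt_s)]
    return sum(w * ai for w, ai in zip(weights, corrected))

# Hypothetical five-band example; the weights are illustrative and sum to 1.
weights = [0.1, 0.2, 0.3, 0.3, 0.1]
band_ai = [0.9, 0.8, 0.7, 0.6, 0.5]
ai_dry = articulation_index(band_ai, weights)
ai_reverberant = articulation_index(band_ai, weights, band_rt_s=[1.0] * 5)
print(round(ai_dry, 2), round(ai_reverberant, 2))
```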
This method uses the SNR and the RT separately and does not take into account
combined effects when both are present. When the reverberant sound fills the temporal
gaps in the speech, so that there are no longer silences to separate the words and the
speech becomes a mumble, the SI is usually overestimated by this method.
From these four paragraphs it can be concluded that when using the STI the best
estimate of the SI will be obtained. Therefore this method will be analyzed to some more
extent in the next chapter.
5 Speech Transmission Index
5.1 Calculating the STI
The Speech Transmission Index or STI was introduced by Steeneken and Houtgast
in 1971 [xx], where they proposed the use of an artificial signal to measure the Speech
Intelligibility. There was a great need for such a method, since the subjective methods
(like CVC word score) that were used at that time were very time consuming and difficult
to perform. Since then the method evolved and has been revised to what is now known as
the STIr.
The STI predicts the speech intelligibility by measuring the reduction in the
modulation depth at the receiver for seven octave bands with fourteen modulation
frequencies (0.625 Hz to 12.5 Hz in 1/3 octave steps). All ninety-eight (7 × 14)
modulation factors contribute to the final STI score. The adapted modulation factor
(m’kf) for octave band (k) and modulation frequency (f) is evaluated as:
\[ m'_{kf} = m_{kf} \cdot \frac{I_{k}}{I_{k} + I_{am,k} + I_{rs,k}} \]
Equation 5-1
where mkf stands for the calculated modulation factor for octave band (k) and modulation
frequency (f). Ik, Iam,k and Irs,k stand for the intensity of the octave band, of the
auditory masking signal and of the reception threshold respectively. This means that if any
masking or elevated reception threshold is present, the calculated modulation index can be
adapted using these values, resulting in a lower modulation factor. The absolute reception
thresholds for all seven octave bands are shown in Table 5-2. Auditory masking means that
the modulation index is reduced by the sound of lower octaves. The slope of this masking
is not equal for all sound intensities and is given in Table 5-1.
Table 5-1 Slope of masking as a function of the Intensity

Octave level [dB]              46-55   56-65   66-75   76-85   86-95   >95
Slope of masking [dB/octave]    -40     -35     -30     -25     -15    -10
Figure 5-1 shows the auditory masking due to a lower octave band for a slope
of -35 dB.
Figure 5-1 Auditory masking from band k-1 to band k
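Equation 5-1 can be sketched as follows. The slope values are those printed in Table 5-1. Two points are our assumptions rather than statements from the text: the masking level in band k is taken as the level of band k-1 plus the slope (the bands being one octave apart), and levels below 46 dB are given the steepest slope. All names are hypothetical.

```python
# (upper octave level in dB, slope in dB/octave), as printed in Table 5-1.
MASKING_SLOPES_DB = [(55, -40), (65, -35), (75, -30), (85, -25), (95, -15)]

def masking_slope_db(level_db):
    """Slope of masking for the level of the masking (lower) octave band."""
    for upper_level, slope in MASKING_SLOPES_DB:
        if level_db <= upper_level:
            return slope
    return -10

def adapted_modulation(m, level_k_db, level_below_db=None, threshold_db=None):
    """Equation 5-1: m' = m * I_k / (I_k + I_am,k + I_rs,k)."""
    i_k = 10.0 ** (level_k_db / 10.0)
    i_am = 0.0
    if level_below_db is not None:
        # Assumption: masking level = level of band k-1 plus the slope.
        masking_db = level_below_db + masking_slope_db(level_below_db)
        i_am = 10.0 ** (masking_db / 10.0)
    i_rs = 0.0 if threshold_db is None else 10.0 ** (threshold_db / 10.0)
    return m * i_k / (i_k + i_am + i_rs)

# A 60 dB band masked by a 60 dB lower band, with a 12 dB reception threshold.
print(round(adapted_modulation(0.8, 60.0, level_below_db=60.0, threshold_db=12.0), 4))
```

In this example the correction is small; it becomes significant when the band level approaches the reception threshold or when the lower octave band is much louder.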
The effective SNR, in which both the DRR and the SNR are present as explained
in paragraph 4.1, for the octave band k and modulation frequency (f) now becomes:
\[ SNR_{k,f} = 10 \log \frac{m'_{kf}}{1 - m'_{kf}} \]
Equation 5-2
From this effective SNR a transmission index (TIk,f) is calculated using:

\[ TI_{k,f} = \frac{SNR_{k,f} + 15}{30} \]
Equation 5-3
where the transmission index is bounded by 0 ≤ TIk,f ≤ 1. According to Steeneken
and Houtgast an SNR from -15 to 15 dB is linearly related to a contribution to
intelligibility from 0 to 1; thus an SNR of 0 dB results in a TI of ½.
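Equation 5-2 and Equation 5-3 translate into two small functions. This is a sketch with our own function names; the input of the first function must lie strictly between 0 and 1, otherwise the logarithm is undefined.

```python
import math

def effective_snr_db(m_adapted):
    """Equation 5-2; m_adapted must lie strictly between 0 and 1."""
    return 10.0 * math.log10(m_adapted / (1.0 - m_adapted))

def transmission_index(snr_db):
    """Equation 5-3, clipped to the range 0..1."""
    return min(1.0, max(0.0, (snr_db + 15.0) / 30.0))

for m in (0.1, 0.5, 0.9):
    snr = effective_snr_db(m)
    print(m, round(snr, 2), round(transmission_index(snr), 3))
```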
These TIs must now be averaged over all fourteen modulation frequencies in order
to get the modulation transfer index for each octave band (MTIk). This is done via:
\[ MTI_{k} = \frac{1}{14} \sum_{f=1}^{14} TI_{k,f} \]
Equation 5-4
All modulation frequencies contribute equally to the MTI of each octave
band. From the MTIs the STI and the revised STI (STIr) [xxi] are calculated using:
\[ STI = \sum_{k=1}^{7} \alpha_{k} \cdot MTI_{k} \]
Equation 5-5
\[ STI_{r} = \sum_{k=1}^{7} \alpha_{k} \cdot MTI_{k} - \sum_{k=1}^{6} \beta_{k} \cdot MTI_{k} \cdot MTI_{k+1} \]
Equation 5-6
The formula for the revised STI differs from the original one in its second term,
which was not present in the original STI. The β represents the
redundancy correction due to the correlation of two adjacent frequency bands. The MTI
weights, α and β, must satisfy the following equation:
\[ \sum_{k=1}^{7} \alpha_{k} - \sum_{k=1}^{6} \beta_{k} = 1 \]
Equation 5-7
The factors αk and βk represent the octave weighting and redundancy factor
respectively. These factors differ for male or female speech as is shown in Table 5-2.
Table 5-2 STIr octave band specific male and female weighting factors and the absolute reception
threshold in decibel, from [xxii]

Octave band (Hz)                           125    250    500    1000   2000   4000   8000
Males    α                                 0.085  0.127  0.230  0.233  0.309  0.224  0.173
         β                                 0.085  0.078  0.065  0.011  0.047  0.095  -
Females  α                                 -      0.117  0.223  0.216  0.328  0.250  0.194
         β                                 -      0.099  0.066  0.062  0.025  0.076  -
Absolute reception threshold Lrs,k [dB]    46     27     12     6.5    7.5    8      12
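A minimal sketch of Equation 5-4 and Equation 5-6 with the male weighting factors of Table 5-2 is given here. The redundancy term is implemented as printed in Equation 5-6; the optional flag for a square root over the product MTIk·MTIk+1 reflects how this term is commonly written in IEC 60268-16 formulations and is included as an assumption, not as a statement about this derivation.

```python
import math

# Male octave weights (alpha) and redundancy factors (beta) from Table 5-2.
ALPHA_MALE = [0.085, 0.127, 0.230, 0.233, 0.309, 0.224, 0.173]
BETA_MALE = [0.085, 0.078, 0.065, 0.011, 0.047, 0.095]

def mti(ti_per_mod_freq):
    """Equation 5-4: mean TI over the modulation frequencies of one band."""
    return sum(ti_per_mod_freq) / len(ti_per_mod_freq)

def sti_revised(mti_per_band, alpha=ALPHA_MALE, beta=BETA_MALE, use_sqrt=False):
    """Equation 5-6: weighted MTI sum minus the redundancy correction."""
    weighted = sum(a * m for a, m in zip(alpha, mti_per_band))
    redundancy = 0.0
    for k, b in enumerate(beta):
        product = mti_per_band[k] * mti_per_band[k + 1]
        redundancy += b * (math.sqrt(product) if use_sqrt else product)
    return weighted - redundancy

# Equation 5-7: the weights are normalised so that perfect transmission gives 1.
print(round(sti_revised([1.0] * 7), 3))
print(round(sti_revised([0.5] * 7), 3))
```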
Next to this standard STIr, several other STI variants have been developed to
obtain the STI in specific circumstances in less time. The most
important examples are the Speech Transmission Index for Public Address
systems (STIPA) and the Room Acoustical Speech Transmission Index (RASTI, sometimes
referred to as RApid Speech Transmission Index). The STIPA is a stripped version of the
standard STI; it has robust coverage of distortions in the time domain and of limitations
in the frequency domain, but only limited coverage of non-linear distortions. The biggest
gain of this method is the speed at which it can be obtained: 15 s, against 15 min for the
standard STI.
The RASTI has the advantage that it is only calculated for two octave bands, but
this has a drawback as well. The method has no coverage for band-pass limiting or for
spiked or unsmooth noise spectra: if the noise that is present does not fall within the
octave bands at which the RASTI is calculated, the modulation index is not influenced,
resulting in an overestimate of the SI. The method is developed for person-to-person
communications in a room acoustical environment and does account for distortion in the
time domain, which is usually only present in electronic systems.
Table 5-3 Overview of the measuring procedures, the application, and the corresponding test signals
from [xxii]

Application                     Band-pass  Non-linear  Reverberation,  Test signal types        Measuring
                                limiting   distortion  echoes                                   time
STI-14                          Yes        Yes         Yes             Male, female             15 min
(7 octaves, 14 fmod)
STI-3                           Yes        Yes         Condition       Male, female             4 min
(7 octaves, 3 fmod)                                    dependent
STITEL                          Yes        Condition   Condition       Male, female, original,  15 s
(7 octaves, 7 octave                       dependent   dependent       phoneme groups
related fmod)
STIPA                           Yes        Condition   Yes             Male, female             15 s
(7 octaves, 14 octave                      dependent
related fmod)
RASTI                           No         No          Yes             Original                 15 s
(2 octaves, 4-5 fmod)
A brief overview is given in Table 5-3; a full overview of all STI methods
and developments is given in [xxii].
Without specific corrections, the STI method is not a reliable predictor
of the intelligibility of speech for hearing-impaired listeners [xxiii] or for wearers of
ear defenders. This is because ear defenders and hearing aids introduce
distortions in the received signal, and because the specific hearing problem of the
listener should be taken into account when calculating the STI.
6 Interpretation of the STI
The STI value tells us something about the Speech Intelligibility of the room and
can thus be related to subjective values. There are several subjective measures to indicate
the intelligibility of speech; these measures use different sounds, words or letters to
evaluate the SI. The most commonly known are: CVC words, PB words, fricatives,
plosives, vowel-like consonants and vowels. In general the following table from [xxiv] is
adhered to:
Table 6-1 STI in relation to intelligibility

STI [%]           0 - 30          30 - 45   45 - 60   60 - 75   75 - 100
Intelligibility   unintelligible  poor      fair      good      excellent
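The qualification of Table 6-1 can be expressed as a small lookup. This is only a sketch: the function name is our own, and since the table gives shared boundaries (for example 30 for both "unintelligible" and "poor"), assigning a boundary value to the higher class is an assumption.

```python
def intelligibility_label(sti_percent):
    """Map an STI in percent to the qualification of Table 6-1."""
    if not 0.0 <= sti_percent <= 100.0:
        raise ValueError("STI must lie between 0 and 100 percent")
    limits = ((30, "unintelligible"), (45, "poor"), (60, "fair"), (75, "good"))
    for upper, label in limits:
        if sti_percent < upper:   # boundary values fall into the higher class
            return label
    return "excellent"

for value in (10, 30, 50, 70, 90):
    print(value, intelligibility_label(value))
```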
6.1 Relating STI to subjective measures
The relation between three of those subjective tests and the STI is shown in Figure
6-1.
Figure 6-1 Qualification of the STI and relation with subjective intelligibility measures, from [xxv]
The score for a subjective method is given by the percentage of correctly heard
words. For CVC these are words with uniformly distributed phonemes; PB words are
phonetically balanced nonsense words (words chosen so that they approximate the relative
frequency of phoneme occurrence in the language). All these scores can also be
estimated for a given STI value, as is shown in [xxii]. Table 6-2 combined with Equation
6-1 shows how to estimate these subjective values from a known STI value using the
factors A, B and C from the table.
Table 6-2 Relation between the STIr, the CVC-word score, and phoneme-group scores for male and
female speech. Source [xxii]

Word or phoneme type     Male                       Female
                         A         B      C         A         B      C
CVC words                -1.5301   -2.0   1.15      -1.7584   -1.5   1.37
Fricatives               -0.9000   -4.2   0.90      -0.9466   -4.1   0.90
Plosives                 -1.1531   -4.1   1.01      -1.1256   -6.0   0.95
Vowel-like consonants    -1.4602   -4.2   1.05      -1.3216   -4.0   1.09
Vowels                   -0.9976   -2.9   1.03      -1.2057   -3.1   1.04
\[ predicted\_score = \left\{ A \cdot e^{(B \cdot STI)} + C \right\} \cdot 100 \; (\%) \]
Equation 6-1
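Equation 6-1 with the factors of Table 6-2 can be sketched as follows. The dictionary layout and function name are our own. For very low STI values the formula yields negative numbers, so clipping the result to the range 0 to 100 % is an assumption added here and not part of Equation 6-1.

```python
import math

# (A, B, C) per word or phoneme type, copied from Table 6-2.
SCORE_FACTORS = {
    "male": {
        "CVC words": (-1.5301, -2.0, 1.15),
        "Fricatives": (-0.9000, -4.2, 0.90),
        "Plosives": (-1.1531, -4.1, 1.01),
        "Vowel-like consonants": (-1.4602, -4.2, 1.05),
        "Vowels": (-0.9976, -2.9, 1.03),
    },
    "female": {
        "CVC words": (-1.7584, -1.5, 1.37),
        "Fricatives": (-0.9466, -4.1, 0.90),
        "Plosives": (-1.1256, -6.0, 0.95),
        "Vowel-like consonants": (-1.3216, -4.0, 1.09),
        "Vowels": (-1.2057, -3.1, 1.04),
    },
}

def predicted_score(sti, kind="CVC words", voice="male"):
    """Equation 6-1: predicted score in percent for an STI between 0 and 1."""
    a, b, c = SCORE_FACTORS[voice][kind]
    score = (a * math.exp(b * sti) + c) * 100.0
    return min(100.0, max(0.0, score))  # assumption: clip to a valid percentage

for sti in (0.3, 0.45, 0.6, 0.75):
    print(sti, round(predicted_score(sti), 1), round(predicted_score(sti, "Vowels"), 1))
```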
Figure 6-1 gives a good impression of the STI in relation to
subjective measures, but it holds no information on the reason for the reduced modulation
depth at the receiver. However, some effects cause a distinct pattern in the
measured modulation indices. The IEC_60268-16_2003 standard [xxv] states that: “As a
rule, the values in each octave-band column should decrease with increasing modulation
frequency. Constant or slightly reducing values indicate the presence of noise. Large
reductions indicate that reverberation is the main effect. Values that first reduce and then
increase with modulation frequency indicate the presence of periodic or strong
reflections, which may produce an over-optimistic conclusion. It is recommended that if
this effect is detected, it should be reported with the results and an estimated correction
applied.”
These rules can be explained as follows:
• An increasing modulation frequency means shorter quiet periods, which are more easily
disturbed by noise or reverberation. Hence the values reduce with increasing modulation
frequency.
• The impact of noise on the modulation reduction depends less on the modulation
frequency, since noise is not time dependent; noise thus reduces the
modulation independently of the modulation frequency. Disturbance due to
reverberation, however, decays over time, so when the quiet period between
the modulated signal becomes shorter, the signal becomes less modulated. Large
decreases in modulation with increasing modulation frequency are therefore caused
by reverberation.
A note must be made on the precision of the STI method. The IEC standard [xxv]
states that with a measuring time of 10 s the results have a standard deviation of
0.02 for each modulation index. This standard deviation is observed in the presence of
stationary noise interference. When fluctuating noises are present this deviation
may increase, together with a systematic error. The systematic error can be found by
performing the measurement in the absence of the test signal, which should result in an
STI of less than 0.20. Accurate standard deviations can be obtained by repeating the
measurement.
Since the IEC 60268-16_2003 [xxv] standard fully describes the current
measurement methods and calculation methods that are to be applied to obtain the correct
STI score, this is left out of the discussion here. The formulas that are to be used are
explained in paragraph 5.1. The standard not only describes the measurement method for
the STI, but also describes the RASTI, STIPA and STITEL measurement/calculation
methods.
6.2 Predicting STI values
In order to compare the results of an SNR and RT measurement to an STI measurement
we would like to be able to predict the STI values from SNR and RT measurements. The
SNR can readily be evaluated with an “in situ” measurement, for example by
calculating the ratio of the sound level when the signal is on to the level when the
signal is off. The RT can be measured “in situ” as well; the environment should be
identical to the normal situation, since everything and everybody present in the room
absorbs sound energy and thus reduces the RT. In [xxvi] we find a figure showing the
relation between the STI and the RT for several given SNRs; this is shown in Figure 6-2.
Figure 6-2 Schematic graph to estimate STI, for known signal-to-noise ratio (LSN)
and reverberation time (T). Note: for LSN = -15 dB, STI =0 for all T.
From this we see that when the STI cannot be measured, or when we want to use in situ
measurements, it can be estimated when the SNR and RT are known. This may also offer
an advantage when these figures are known for a room and we need to validate the STI.
The modulation index, which is measured directly in the STI method, must now be
calculated from the SNR and the RT. This is done by combining Equation 4-1 and
Equation 4-2, which give the relation of the SNR and the RT to the modulation index:
\[ m(F, f) = \frac{1}{\sqrt{1 + \left( \frac{2\pi F \cdot RT(f)}{13.8} \right)^{2}}} \cdot \frac{1}{1 + 10^{-SNR(f)/10}} \]
Equation 6-2
Where RT(f) is the local reverberation time for the given frequency. From here the
derivation of the STI is identical to when the modulation index is measured. This means
that if the SNR and RT are measured in situ the STI can be estimated in the actual
environment with the actual source sound level and actual noise present.
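The whole prediction chain can be sketched in a few lines: Equation 6-2 for the modulation index, Equations 5-2 to 5-4 for the MTI, and Equation 5-6 with the male weights of Table 5-2. The sketch makes several simplifying assumptions: auditory masking and the reception threshold of Equation 5-1 are ignored, the fourteen modulation frequencies are generated as exact 1/3 octave steps from 0.625 Hz, the reverberation term uses the square root form of Equation 4-2, and the redundancy term is taken as printed in Equation 5-6. All names are hypothetical.

```python
import math

ALPHA_MALE = [0.085, 0.127, 0.230, 0.233, 0.309, 0.224, 0.173]
BETA_MALE = [0.085, 0.078, 0.065, 0.011, 0.047, 0.095]
# Fourteen modulation frequencies from 0.625 Hz in 1/3 octave steps.
MOD_FREQS_HZ = [0.625 * 2.0 ** (i / 3.0) for i in range(14)]

def modulation_index(mod_freq_hz, rt_s, snr_db):
    """Equation 6-2: reverberation term times noise term."""
    reverb = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * mod_freq_hz * rt_s / 13.8) ** 2)
    noise = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb * noise

def transmission_index(m):
    """Equations 5-2 and 5-3, clipped to the range 0..1."""
    if m <= 0.0:
        return 0.0
    if m >= 1.0:
        return 1.0
    snr_eff = 10.0 * math.log10(m / (1.0 - m))
    return min(1.0, max(0.0, (snr_eff + 15.0) / 30.0))

def predict_sti(rt_per_band_s, snr_per_band_db, alpha=ALPHA_MALE, beta=BETA_MALE):
    """Estimate the STIr from per-octave-band RT and SNR values (7 bands)."""
    mti = []
    for rt, snr in zip(rt_per_band_s, snr_per_band_db):
        tis = [transmission_index(modulation_index(F, rt, snr)) for F in MOD_FREQS_HZ]
        mti.append(sum(tis) / len(tis))          # Equation 5-4
    weighted = sum(a * m for a, m in zip(alpha, mti))
    redundancy = sum(b * mti[k] * mti[k + 1] for k, b in enumerate(beta))
    return weighted - redundancy                 # Equation 5-6

# Same RT and SNR in every band, for a few illustrative combinations.
for rt, snr in ((0.4, 12.0), (0.6, 6.0), (1.2, 6.0)):
    print(rt, snr, round(predict_sti([rt] * 7, [snr] * 7), 2))
```

For an SNR of -15 dB the sketch returns an STI of zero for any non-zero RT, in line with the note under Figure 6-2.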
7 Hearing Impaired persons
From the previous chapter it was concluded that the STI is not directly applicable
for measuring the SI for hearing impaired persons, since their hearing loss should be
incorporated by the test. The difference in SNR (when there is no reverberation) when the
hearing impaired person understands 50% of the speech (also known as the speech
reception threshold, SRT) compared to the SRT of a non hearing impaired person is
needed to make an adaptation in the STI.
7.1 Speech perception for the hearing impaired
To investigate the difference in speech perception between persons with normal
hearing and persons with a hearing impairment, tests were done while varying the SNR
and RT, see [xxvii]. The results are displayed in Table 7-1.
Table 7-1 Mean speech recognition scores (in % correct) by children with normal hearing (n = 12)
and children with sensorineural hearing loss (n = 12) for monosyllabic words across various signal-
to-noise ratios and reverberation times.
• These data indicate that the children with a hearing impairment performed
significantly poorer than children with normal hearing in all listening conditions.
The cause is that hearing impaired children are less able to
predict words from their context, because of their limited vocabulary and
experience. This holds for children in general and for hearing impaired
persons in particular. Thus, if they miss a few words of a sentence, they are less
able than other students to fill in those gaps.
• The performance difference between the two groups increased as the listening
environment became less favorable. For example, in what would be an extremely
good classroom environment (SNR = +6 dB; RT = 0.4 second), children with
hearing impairment obtained perception scores of only 52% as compared to 71%
for the normal hearers (19% difference). In acoustical conditions quite commonly
reported in the classroom (SNR = +12 dB; RT = 1.2 seconds), children with a
Sensorineural Hearing Loss (SNHL) obtained perception scores of just 41% as
compared to 69% for children with normal hearing (28% difference). So children
without a hearing loss performed only 2 percentage points lower, while the children
with a hearing impairment performed 11 percentage points lower.
• Although not shown in this table, it is interesting to note that the addition of a
normal hearing aid did not improve perceptual ability and, in fact, made
understanding even more difficult in many listening conditions. Certainly, it is
reasonable to assume that learning and academic achievement will be
significantly compromised with such poor perceptual scores. It must be noted that
there are several new developments in this market that do improve the perception
quality for hearing impaired children, like a modern FM system where the teacher
is wearing a microphone and this signal is transmitted to a child. This method
increases the signal sound level without also increasing the noise level. Also the
recent developments of Cochlear Implants (CI), electronic devices that
stimulate the auditory nerve in the cochlea directly, have helped hearing impaired
people to improve their ability to understand spoken language.
Crandel [xxviii] expanded this research further and tested how children with
minimal SNHL performed compared to children with normal hearing. The results are
displayed in Figure 7-2.
Figure 7-2 Mean speech recognition scores (in % correct) of
children with normal hearing (shaded bars) and children with
minimal degrees of sensorineural hearing loss (clear bars) in quiet
and at various signal-to-noise ratios.
The main conclusion from Figure 7-2 is that at a seriously deteriorated noise
level, an SNR of -6 dB, the children with a minimal SNHL performed well below
children with normal hearing, while at better SNRs the difference was much smaller and the
children with a minimal SNHL performed only a little below children with normal hearing.
7.2 Speech perception of children compared to adults
A matter discussed in [viii] is the speech perception of children with
normal hearing compared to adults. It was concluded that children below the age of 13 to
15 required better acoustical surroundings to achieve the same perception score as adults.
From this and conclusions on the present measurement methods it can be argued that a
special method should be developed to investigate the SI for children in general and
children with a hearing impairment especially. This test should include the increased
acoustical requirements of the children and the SRT of the hearing impaired person. Also
it would have a great benefit if the actual speech of the teacher could be used as the test
signal in the actual classroom environment, since this would include the noise produced
by the other children and the actual sound level of the source, in this case the teacher.
This is called an “in situ” measurement.
8 Coninx method
We concluded in the previous chapters that the STI is the best method currently
available for measuring the SI on an objective basis; however, the measurement cannot be
performed “in situ” with the actual signal as a source. The Speech Reception Threshold
(SRT) can be used as a correction on the modulation index, but it only reduces the
modulation index by a factor that depends on the intensity, as can be seen in Equation 5-1.
In the previous chapter it has been shown that minor reductions in the acoustical quality
can lead to significantly lower scores for hearing impaired persons, which is not taken
into account in that equation.
This chapter discusses a new measurement method developed by Frans Coninx:
the Coninx method. This method consists of an SRT measurement for the hearing impaired
child and an in situ SNR measurement.
8.1 The SRT measurement
The SRT for a hearing impaired child can be measured using the Adaptive Auditive
Speech Test (AAST) that has been developed by Coninx [xxix]. This measurement uses a
headphone, a variable SNR and basic words understandable for little children. One word
is presented to the child masked with white noise and then a choice must be made
between six figures. This is shown in Figure 8-1.
Figure 8-1 AAST speech test.
The test repeats a loop: the noise level is increased, while the sound level of the
spoken word is kept equal, until an error is made in the chosen word; the noise level is
then decreased until the right answer is given again. In this way an accurate estimate of
the person’s speech reception threshold can be made. Since this test can be performed
while the child is using its hearing aid and no reverberation is present, the estimate of
the SRT includes the influence of the hearing aid, or CI. A full description of the
measurement is found in [xxix].
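The adaptive loop can be illustrated with a simple one-up/one-down staircase. This is only a sketch under stated assumptions: the step size, the number of reversals and the simulated listener (`simulated_child`, with an invented logistic response and a guessing floor of one in six pictures) are hypothetical and are not taken from the AAST procedure specified in [xxix].

```python
import math
import random


def simulated_child(snr_db, true_srt_db, rng, slope=1.0, n_choices=6):
    """Hypothetical listener: returns True if the right picture is chosen.

    Guessing among n_choices pictures sets the floor; above the floor a
    logistic curve centred on true_srt_db is assumed.
    """
    guess = 1.0 / n_choices
    p_heard = 1.0 / (1.0 + math.exp(-slope * (snr_db - true_srt_db)))
    return rng.random() < guess + (1.0 - guess) * p_heard


def staircase_srt(respond, start_snr_db=10.0, step_db=2.0,
                  n_reversals=8, max_trials=500):
    """One-up/one-down staircase on the SNR.

    After a correct answer the SNR is lowered (more noise), after an error
    it is raised (less noise). The estimate is the mean SNR at the points
    where the answers switch between correct and wrong (reversals).
    """
    snr = start_snr_db
    last_correct = None
    reversals = []
    for _ in range(max_trials):
        correct = respond(snr)
        if last_correct is not None and correct != last_correct:
            reversals.append(snr)
            if len(reversals) >= n_reversals:
                break
        last_correct = correct
        snr += -step_db if correct else step_db
    # Without any reversal there is no threshold estimate; return the last SNR.
    return sum(reversals) / len(reversals) if reversals else snr
```

A call such as `staircase_srt(lambda s: simulated_child(s, -5.0, random.Random(1)))` would not work as intended because it creates a new generator on every trial; the generator should be created once and captured by the lambda.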
8.2 The SNR measurement
The method is based on the SNR on a specific receiver location, usually the
location of the child with the hearing impairment. The signal is the actual speech of the
teacher and therefore the measurement can, and must, be performed during classroom
hours, so with the children present. To identify the noise level the sound pressure level
just before and just after the teacher talks is used. In this way a SNR is obtained that is
actually present during the speech signal. This is an advantage over using an average
noise level, since the exact noise level is known during the specific sentence from the
teacher. To identify when the teacher is talking a microphone is placed at the teacher’s
mouth. The sound pressure level at that microphone will be much higher when the
teacher is talking compared to silent periods, therefore this microphone can be used to
identify the times when the teacher talks. It is therefore vital that the timing of the two
microphone signals is identical. The measurement needs to be performed for at least
twenty minutes to get enough samples to make an accurate calculation of the SNR. Using
this method does not require any additional weighting of frequency bands, since the
actual spectrum of the teacher’s speech is used.
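As an illustration of how the teacher microphone can mark the speech periods, the sketch below applies a level threshold to a series of short-term levels. It assumes that the recording has already been reduced to one level in dB per frame and that both microphones share the same frame clock; the function name, the threshold and the minimum segment length are hypothetical choices, not part of the Coninx method as described.

```python
def speech_segments(teacher_levels_db, threshold_db, min_frames=3):
    """Return (start, end) frame index pairs where the teacher microphone
    level stays at or above threshold_db for at least min_frames frames.
    The end index is exclusive."""
    segments = []
    start = None
    for i, level in enumerate(teacher_levels_db):
        if level >= threshold_db:
            if start is None:
                start = i  # speech begins
        elif start is not None:
            if i - start >= min_frames:
                segments.append((start, i))
            start = None  # speech ended
    # Close a segment that runs to the end of the recording.
    if start is not None and len(teacher_levels_db) - start >= min_frames:
        segments.append((start, len(teacher_levels_db)))
    return segments
```

The resulting frame ranges can then be applied to the signal of the second microphone, which is why identical timing of the two signals matters.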
The difference in the sound level at the receiver’s location between “teacher is
quiet” (noise) and “teacher talks” (signal) determines the SNR. From Figure 8-2, with
green being the signal level and red being the noise level, we can see that the SNR is
given by the equations below. The average is taken from the SNR when the signal starts
and the signal ends.
SNR_{begin} = P_g - P_{r1}
Equation 8-1
SNR_{end} = P_g - P_{r2}
Equation 8-2
SNR = \frac{1}{2} \left[ 2 \cdot P_g - P_{r1} - P_{r2} \right]
Equation 8-3
where:
• P_g is the average sound level during the signal, green in figure
• P_{r1} is the average sound level before the signal, red in figure
• P_{r2} is the average sound level after the signal, red in figure
By keeping the time period after the speech sufficiently short we can be certain
that none of the students are talking, by means of answering a question or other forms of
desired speech. The suggested time value for this is 0.5 seconds. If other students are
talking amongst themselves undesirably that is considered to be noise.
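The calculation for a single utterance can be sketched as follows. The sketch assumes that the levels at the child’s position are available as one dB value per frame, that the average level is the arithmetic mean of those dB values, and that `frames_per_window` frames correspond to the suggested 0.5 seconds; these names and choices are illustrative assumptions.

```python
def mean(values):
    return sum(values) / len(values)


def utterance_snr(child_levels_db, start, end, frames_per_window):
    """SNR for one teacher utterance, following Equations 8-1 to 8-3.

    child_levels_db: per-frame levels at the child's position
    start, end: frame range of the utterance (end exclusive)
    frames_per_window: frames in the noise window before and after
    """
    before = child_levels_db[max(0, start - frames_per_window):start]
    after = child_levels_db[end:end + frames_per_window]
    if not before or not after:
        return None  # utterance touches the edge of the recording
    p_g = mean(child_levels_db[start:end])   # signal level
    snr_begin = p_g - mean(before)           # Equation 8-1
    snr_end = p_g - mean(after)              # Equation 8-2
    return 0.5 * (snr_begin + snr_end)       # Equation 8-3
```

Averaging the values returned for all utterances in a recording would then give one SNR figure for the lesson.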
Figure 8-2 Schematic overview of a single SNR test. In green the signal level, in red the noise level.
Since this measurement only uses the SNR, no estimate of the STI can be made; it
is still under consideration to include a RT measurement. Phonak Hearing Systems, a
company that develops hearing systems, developed an algorithm that can calculate the
RT from live speech. This algorithm is used in the new hearing aid (called echo block
[xxx]) and it is being investigated if it is possible to use the algorithm on a recorded
signal as well. If these two parameters are calculated for the separate octave bands the
mathematical method in chapter 6.2 can be used to obtain a STI value and therefore can
be compared to a STI measurement in the same classroom.
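To illustrate how an SNR and an RT can be combined into an STI-like value, the sketch below uses the widely used modulation transfer function for a diffuse, exponentially decaying sound field with stationary noise. It is a single-band simplification: it omits the octave-band weighting and any further corrections, and it is not claimed to be identical to the method in chapter 6.2. The function names are hypothetical.

```python
import math

# The 14 modulation frequencies (Hz) commonly used for the STI.
MOD_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def modulation_index(f_mod_hz, rt_s, snr_db):
    """Modulation reduction by reverberation and by noise."""
    reverb = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * f_mod_hz * rt_s / 13.8) ** 2)
    noise = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb * noise


def band_sti(rt_s, snr_db):
    """Transmission index averaged over the modulation frequencies."""
    total = 0.0
    for f in MOD_FREQS_HZ:
        m = min(modulation_index(f, rt_s, snr_db), 1.0 - 1e-12)
        apparent_snr = 10.0 * math.log10(m / (1.0 - m))
        apparent_snr = max(-15.0, min(15.0, apparent_snr))  # clip to +/-15 dB
        total += (apparent_snr + 15.0) / 30.0
    return total / len(MOD_FREQS_HZ)
```

In this simplified form a shorter RT or a higher SNR always gives a higher value, which is the behaviour expected of the full method as well.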
From the SNR, the RT and the SRT it can be concluded whether adaptations to
the classroom should be made, whether the child requires another hearing aid system to
increase the sound level of the teacher, or whether the acoustical surroundings and
hearing aid in place are adequate.
8.3 The differences compared to STI
The differences between the STI method and the Coninx method lie in the way
the SNR and RT are obtained and used, although the RT is not yet incorporated in the
Coninx method. Both methods aim to obtain an indication of the SI, but the STI
method uses a modulated signal and the reduction in the modulation, due to noise and
reverberation, measured at the receiver location. The Coninx method uses actual speech
and “in situ” background noise to obtain a direct SNR.
Concluding we can say that the differences lie in the signal used and the
incorporation of the two parameters that influence the SI. The advantage of the STI is that
it is not influenced by changing parameters and therefore should result in the same
figures over and over again if no changes to the environment have taken place. This is
because an electronic signal is used, which gives identical results in
identical situations.
In the Coninx measurement method the fact that an actual “live” vocal input is
used is both the advantage and the disadvantage. The measurement is influenced by the
vocal effort of the teacher which might change from measurement to measurement. But
this is also the strength of the measurement since the actual input the listeners receive is
used and any adaptation the teacher makes to become understandable in the classroom is
taken into account. The same argument can be used for the fact that the actual disturbance
of the students and background noise is used during class hours. This ensures that the
noise used in the measurement is the actual noise that disturbs the signal, while the STI is
usually processed in an empty classroom.
The Coninx measurement method also has the advantage that the absorption of
the students and the effect on the SNR is captured in the figures, while with the STI an
adaptation has to be made (if it is made) which is based on assumptions. This same
reasoning holds for the background noise as well, the STI can process a background
noise, but again doesn’t use the actual background noise on the receiver position, but at
most a measured level for several octave bands while the classroom is empty.
There is also a great difference in the mathematical difficulty between the two
measurements. The STI needs various formulas and weighting factors, whereas the Coninx
measurement method only needs the SNR. No weighting is required, since the obtained
SNR uses the actual speech of the teacher in which the weighting is captured. Although
the measurement seems promising and is easy to perform, it is just as time consuming
as a STI measurement, since sufficient data should be recorded in order to get reliable
results.
Table 8-1 Differences in measurement setup from STI to the Coninx method.
                          STI                               Coninx method
Measures/variables used   Modulation reduction              SNR
Signal                    Electronic signal                 Actual, in situ, speech
Noise used                In situ, but no children          In situ including children
Measurement time          15 min                            15-20 min
SRT used                  Adaptation can be made            Yes, via AAST measurement
Absorption used           In situ, but no children          In situ including children
Equipment                 Laptop, microphone, loudspeaker   Laptop, 2 microphones
Frequency weighting       Yes                               No, actual spectrum used
9 Simulations
9.1 Simulations setup
In order to get acquainted with all the acoustic variables relevant for classroom
acoustics (STI, SNR and RT) and effects of adaptations in a classroom two typical
classrooms were simulated in Catt-Acoustic ([xxxi], [xxxii], [xxxiii] or
http://www.catt.se). This program uses RTC-II (Randomized Tail-corrected Cone-
tracing, second version), Ray-tracing (method for calculating the path of waves through a
system [xxxiv]), to evaluate the acoustic variables.
Both rooms were simulated with dimensions of length*width*height =
8m*7m*3m, resulting in a volume of 168 m3. The difference of the two rooms lies in the
absorptive qualities of the materials present in the rooms. The first room had plastered
walls and a high quality absorptive ceiling; this room is referred to as the “good”
classroom. The second classroom had brick walls and a cement ceiling, so it has little
absorption; this room is referred to as the “poor” classroom.
Another important aspect, which could lead to big differences in the tested
acoustic variables, besides the absorption is the amount of background noise. In chapter
13.6 it is shown that it is advisable to keep the noise below Noise Criteria 30 (NC 30).
The noise levels from NC 30 are shown in Table 9-1.
Table 9-1 Noise Criteria 30 noise limits per octave band
Freq. band [Hz]   125    250    500    1000   2000   4000   8000   16000
Noise [dB]        47,5   40     35     31,5   28,5   27,5   27     26
Catt-Acoustic calculates STI not using any background noise initially and then
makes an adaptation using the given background noise levels. Thus a STI and “STI in
noise” are obtained; these two values can give a good idea of the impact of the
background noise present in a room. If the actual background noise in the simulated room
is known these values should be used to get a better estimate of the STI in noise. With
these noise levels we are also able to calculate a SNR if we assume the background noise
is diffuse.
In order to run the simulations the source and receiver positions need to be given.
The position of the teacher has been chosen front center in the classroom; nine
microphone positions are spread throughout the classroom to get a good idea about the
STI throughout the classroom. In the simulations the teacher is assumed to use his/her
voice at normal vocal effort. The sound levels at 1 meter distance for normal and raised
effort are shown in Table 9-2. The configuration of the classrooms and the microphones
are shown in Figure 9-1 and Figure 9-2 respectively.
Table 9-2 Sound levels at 1m in front of the speaker for normal and raised vocal effort, from Catt
Acoustic simulation program.
Freq. band [Hz]                         125    250    500    1000   2000   4000
Sound level @ 1m, normal effort [dB]    51,2   57,2   59,8   53,5   48,8   43,8
Sound level @ 1m, raised effort [dB]    55,5   61,5   65,6   62,4   56,8   51,3
Figure 9-1 Classroom orientation in 3-D, tables in brown, children pink, glass light blue and arrows
showing receiver orientation.
Figure 9-2 Microphone distribution top-down 2-D view

The simulations that have been done are described schematically in Table 9-3. In
the first five simulations data for RT, C50 and STI are evaluated, in the last four only
C50 and STI.
Table 9-3 Simulation descriptions
#    Name                                           Description
1)   Normal                                         The setup with all children and the teacher present
2)   Reflector                                      Normal setup with a reflective askew panel over the teacher’s head
3)   No people                                      Normal setup with no children or teacher present
4)   Columns                                        Normal setup with additionally highly absorptive columns in the corners of the classroom
5)   Reflector and Columns                          Normal setup with highly absorptive columns and reflector
Extra simulations for C-50 and STI measurements
6)   Extra sources                                  Normal setup with two extra omni-directional sources at the side walls at the same y-position as the teacher. Sources at normal vocal effort.
7)   Extra sources louder                           Normal setup with two extra omni-directional sources at the side walls at the same y-position as the teacher. Sources at raised vocal effort.
8)   Full adaptations                               Normal setup with reflector, highly absorptive columns and extra sources.
9)   Aimed Source Gain 6 dB Reflector and Columns   Reflector and highly absorptive columns setup with two extra directional sensitive sources at the side walls at the same y-position as the teacher. Sources had a gain of 6 dB
In the poor room the influence of loudspeaker arrays, or aimed sources, without
the columns and reflector has been investigated as well. Loudspeaker arrays are small
arrays of multiple sources mounted into one housing; this results in the capability of
aiming the sources at a specific area. The vertical and horizontal directivity from the
Digital Directivity Control (DDC) method as used by Duran Audio are shown in Figure
9-3. Using these arrays results in more direct and early reflected sound and less late
reflected sound. More specifications are available on [xxxv].
Figure 9-3 Directivity characteristic from a DDC system
To investigate the STI for hearing impaired people extra simulations were run with
a higher background noise. The reason for this is that hearing impaired people need a
better SNR in order to understand speech. From Table 7-1, when comparing results with
0 dB SNR to +6dB SNR, it can be seen that hearing impaired children score about equal
to non hearing impaired children when the SNR is 6 dB higher. Thus when using a
system developed to investigate the STI for non hearing impaired people, the background
noise can be raised by 6 dB to simulate results for hearing impaired people. Simulations
with 6 dB higher background noise are done for situations 1), 5), 9) and a situation where
the aimed sources had a gain of 12 dB.
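The 6 dB shift can be written out per octave band. The sketch below starts from the NC 30 levels of Table 9-1 and adds 6 dB to every band; the helper names are hypothetical, and the total level shown is a plain energetic sum without any frequency weighting.

```python
import math

# NC 30 octave band levels in dB, taken from Table 9-1.
NC30_DB = {125: 47.5, 250: 40.0, 500: 35.0, 1000: 31.5,
           2000: 28.5, 4000: 27.5, 8000: 27.0, 16000: 26.0}


def raise_noise(levels_db, delta_db):
    """Raise every band level by the same number of dB."""
    return {band: level + delta_db for band, level in levels_db.items()}


def total_level_db(levels_db):
    """Energetic (unweighted) sum of the band levels."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0)
                                 for level in levels_db.values()))


hearing_impaired_noise = raise_noise(NC30_DB, 6.0)
```

Raising all bands by 6 dB lowers the SNR in every band by 6 dB for an unchanged speech level, which is the effect the extra simulations rely on.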
9.2 Simulation results
9.2.1 Reverberation time

The first parameter calculated is the reverberation time, in this case using T30
(results are displayed as 2* T30, which is equal to T60). The results are displayed in Figure
9-4 and Figure 9-5.
[Chart: Good Classroom RT. T-30 (s) versus frequency (Hz), 125 Hz to 4 kHz, for the setups Normal, Reflector, No people, Columns, and Reflector and Columns]
Figure 9-4 Reverberation time for several setups in the good classroom
[Chart: Poor Classroom RT. T-30 (s) versus frequency (Hz), 125 Hz to 4 kHz, for the setups Normal, Reflector, No people, Columns, and Reflector and Columns]
Figure 9-5 Reverberation time for several setups in the poor classroom
From these two figures we see that the effect of taking the people out of the
room is much more significant in the room with poor acoustics: the average increase in
RT from the normal setup to the setup with no people is 11% in the good room versus
101% in the poor room. This is shown in more detail in Table 9-4 and Table 9-5. The
explanation for this is that in a room with hardly any absorption the presence of twenty
four children and a teacher can increase the amount of absorption present by 100% and
thus, by Equation 3-4, halve the reverberation time. Furthermore we see
that adding the sound absorbing columns (setup 4) reduces RT by 29% in the
good room versus 27% in the poor room compared to the normal situation. Finally we
conclude that adding the reflective panel (setup 2) has no impact on the RT (-1% in
the good room and 0% in the poor room compared to the normal setup).
In the good room the results at a frequency of 2 kHz are inconsistent: removing the
people slightly reduces the RT, and adding a reflector and columns gives a different result than
adding only the columns, which is not in line with the effects at other frequencies. This
result is considered an anomaly, so no conclusions are drawn from it.
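The halving argument can be checked with a short calculation. The sketch assumes that Equation 3-4 is Sabine’s relation T = 0.161 V / A (this is an assumption, since the equation is not repeated here) and uses the room volume of 168 m3 together with the average RT values of the poor room from Table 9-5.

```python
SABINE_CONSTANT = 0.161  # s/m, usual value for air at room temperature


def absorption_area(volume_m3, rt_s):
    """Equivalent absorption area A (m2) from Sabine's relation."""
    return SABINE_CONSTANT * volume_m3 / rt_s


def reverberation_time(volume_m3, absorption_m2):
    """Reverberation time T (s) from Sabine's relation."""
    return SABINE_CONSTANT * volume_m3 / absorption_m2


VOLUME_M3 = 8 * 7 * 3                            # 168 m3
a_empty = absorption_area(VOLUME_M3, 2.51)       # poor room, no people
a_occupied = absorption_area(VOLUME_M3, 1.25)    # poor room, normal setup
added_by_people = a_occupied - a_empty
```

Under this assumption the empty poor room has roughly 10.8 m2 of absorption and the occupied room roughly 21.6 m2, so the people add about as much absorption as the room itself provides.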
Table 9-4 Reverberation time [s] for the good room
Setup\Frequency [Hz]          125    250    500    1000   2000   4000   average
Normal                        0,46   0,66   1,05   0,93   1,01   0,73   0,81
Reflector                     0,46   0,65   0,98   0,97   0,94   0,78   0,80
No people                     0,51   0,77   1,19   1,07   1,00   0,83   0,90
Columns                       0,42   0,53   0,68   0,68   0,66   0,49   0,58
Reflector and Columns         0,42   0,53   0,62   0,68   0,46   0,48   0,53
Setup\Delta from normal [%]   125    250    500    1000   2000   4000   average
Reflector                     0%     -2%    -7%    4%     -7%    7%     -1%
No people                     11%    17%    13%    15%    -1%    14%    11%
Columns                       -9%    -20%   -35%   -27%   -35%   -33%   -29%
Reflector and Columns         -9%    -20%   -41%   -27%   -54%   -34%   -34%
Table 9-5 Reverberation time [s] for the poor room
Setup\Frequency [Hz]          125    250    500    1000   2000   4000   average
Normal                        1,43   1,67   1,56   1,04   0,94   0,85   1,25
Reflector                     1,43   1,67   1,56   1,05   0,96   0,84   1,25
No people                     2,2    2,94   3,33   2,75   2,18   1,68   2,51
Columns                       1,17   1,12   1,02   0,77   0,72   0,67   0,91
Reflector and Columns         1,16   1,11   1,06   0,79   0,74   0,67   0,92
Setup\Delta from normal [%]   125    250    500    1000   2000   4000   average
Reflector                     0%     0%     0%     1%     2%     -1%    0%
No people                     54%    76%    113%   164%   132%   98%    101%
Columns                       -18%   -33%   -35%   -26%   -23%   -21%   -27%
Reflector and Columns         -19%   -34%   -32%   -24%   -21%   -21%   -26%
9.2.2 Signal to Noise Ratio

The second parameter calculated is the Signal to Noise Ratio (SNR). Since Catt-Acoustic
does not calculate this directly we have developed a method to calculate it from
the Sound Pressure Level (SPL). This method is explained in the appendix, Chapter 13.2.
With this calculation the SNR for all classroom setups can be examined; the results are
shown in Figure 9-6 and Figure 9-7. The SNR is averaged over all octaves to investigate
the impact of the different setups. If an actual room is measured it is preferred to look at
each octave separately to be able to identify the specific octave(s) which have the biggest
negative influence on the SI.
[Chart: Signal to Noise Ratio averaged for all octave bands, Good classroom. SNR (dB) versus mic position 0 to 8, for the setups Normal, Reflector, No People, Absorptive columns, and Absorptive columns and Reflector]
Figure 9-6 Signal to noise ratio in the good classroom.
[Chart: Signal to Noise Ratio averaged for all octave bands, Poor classroom. SNR (dB) versus mic position 0 to 8, for the setups Normal, Reflector, No People, Absorptive columns, and Absorptive columns and Reflector]
Figure 9-7 Signal to noise ratio in the poor classroom
The conclusions from these figures are:

• Addition of absorptive columns increases the SNR, on average by 0.4 dB in the
good room and by 1.4 dB in the poor room
• Addition of the reflector increases the SNR, on average by 1.2 dB in both rooms
• Removing the people decreases the SNR, on average by 0.1 dB in the good
room and by 1.4 dB in the poor room. This is explained by the fact that there is
less absorption, thus more reflections and thus more noise.
9.2.3 Speech Transmission Index

The third parameter to be investigated is the STI; the results are shown in Figure
9-8 and Figure 9-9. For STI in noise simulations a background noise level is used equal
to Noise Criteria 30 (NC30) which is about 35 dB (A).
9.2.3.1 Impact of adaptations on STI
[Chart: Good Classroom STI. STI (%) versus mic position 0 to 8, for the setups Normal, Reflector, No people, Columns, Reflector and Columns, Extra Sources, Extra Sources louder, Full Adaptations, and Aimed Source Gain 6 dB Col./Refl.]
Figure 9-8 STI simulation in the good classroom for different setups
[Chart: Poor Classroom STI. STI (%) versus mic position 0 to 8, for the setups Normal, Reflector, No people, Columns, Reflector and Columns, Extra sources, Extra Sources louder, Full Adaptations, Aimed source Gain 6 dB, and Aimed Source Gain 6 dB Col./Refl.]
Figure 9-9 STI simulation in the poor classroom for different setups
The conclusions from these two figures are:

• In the normal situation the poor room has an average STI of 56%, regarded as fair
and the good room has an average STI of 65%, regarded as good
• Removing the people results in an average decline of the STI of 4% in the good
room and 13% in the poor room.
• Addition of the absorptive columns increases the STI by 5% in the poor room,
(which just qualifies it as good) and 3% in the good room.
• Addition of the reflector increases the STI by 1% in both rooms, 0.5% in the front
row and 1.5% in the last row, where it is most needed. This is explained by two
factors: One, in the front row the direct sound is more significant and two the
reflector’s angle is set to aim the sound to the back of the classroom.
• Addition of extra omni directional sources increases the STI in the good room
much more, 5% and 7% (extra sources louder), compared to 1% and 2% in the
poor room. This is caused by the fact that there is more reverberation in the poor
room, and thus increasing the signal strength does not increase the SNR as much.
• The addition of the aimed sources, absorptive columns and the reflector increases
the STI in both rooms by 10.5%, which brings the intelligibility score of the poor
room to good and that of the good room to excellent.
From these results we conclude that solutions which are not impacted by reflections, and
thus less impacted by late sound from the source(s), are the preferred solution in
classrooms with poor sound absorptive qualities.
The overview for all setups with their average STI over all microphone positions
is given in Table 9-6.
Table 9-6 Summary of the STI measurements

Setup                                     STI Good room [%]   STI Poor room [%]
Normal                                    65,4                55,5
Reflector                                 66,2                56,0
No people                                 61,3                42,6
Columns                                   68,4                60,8
Reflector and Columns                     69,4                61,0
Extra Sources                             70,6                57,0
Extra Sources louder                      72,3                57,4
Full Adaptations                          73,9                63,4
Aimed Source Gain 6 dB Column/Reflector   75,8                66,1
9.2.3.2 STI for hearing impaired persons

To investigate the STI for hearing impaired persons we only investigate the good
classroom and show the impact of adding 6 dB of background noise.
[Chart: two panels of STIuser [%] with noise, scale 40 to 80, source A0. Left panel background SPL per octave band: 47,5 40,0 35,0 31,5 28,5 27,5 27,0 dB. Right panel background SPL per octave band: 53,5 46,0 41,0 37,5 34,5 33,5 33,0 dB]
Figure 9-10 Good room, normal hearing impaired
From Figure 9-10 and Figure 9-11 we see that for hearing impaired people, STI values may
drop as much as 15% (in the back of the room) under the same acoustic conditions. When
no adaptations to the classroom have been made the best position for the hearing
impaired person is the first row (closest to the teacher). But when using the aimed
sources, Figure 9-11, the best position is the third row in the middle of the classroom.
This position is influenced significantly by the direction in which the sources are aimed.
The chosen setup was directed at the middle of the last row as shown by the lines from
B0 and B1 in Figure 9-11.
[Chart: two panels of STIuser [%] with noise, scale 40 to 80, sources A0, B0 and B1. Left panel background SPL per octave band: 47,5 40,0 35,0 31,5 28,5 27,5 27,0 dB. Right panel background SPL per octave band: 53,5 46,0 41,0 37,5 35,0 33,5 33,0 dB]
Figure 9-11 Good room, Col/Refl. Aimed sources, gain 6 dB Hearing impaired
9.2.4 C-50
The final parameter to be investigated is the C-50. Since the C-50 is a measure
that quantifies the ratio of early to late sound energy, the background noise is not used.
This is easily explained when looking at the equation of C-50:
C_{50} = 10 \log_{10} \left( \frac{\int_{0}^{50\,\mathrm{ms}} p^2(t)\,dt}{\int_{50\,\mathrm{ms}}^{\infty} p^2(t)\,dt} \right)
Equation 9-1
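With a steady background noise the late integral in the denominator, which runs to
infinity, would grow without bound; the ratio would then tend to zero and C-50 would tend
to minus infinity, regardless of the room.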
The results for the two rooms are shown in Figure 9-12 and Figure 9-13.
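For reference, the C-50 calculation can be sketched for a sampled impulse response. The functions below are an illustrative assumption of how such a calculation could look, not the procedure used by Catt-Acoustic; `synthetic_decay` generates an idealised, noise-free envelope whose pressure drops by 60 dB in one reverberation time.

```python
import math


def c50_db(impulse_response, sample_rate_hz):
    """Ratio of the energy in the first 50 ms to the energy after 50 ms, in dB.

    The response must be longer than 50 ms, otherwise the late energy is zero.
    """
    split = int(round(0.050 * sample_rate_hz))
    early = sum(p * p for p in impulse_response[:split])
    late = sum(p * p for p in impulse_response[split:])
    return 10.0 * math.log10(early / late)


def synthetic_decay(rt_s, sample_rate_hz, duration_s):
    """Idealised envelope: pressure falls by 60 dB (factor 1000) in rt_s seconds."""
    n = int(duration_s * sample_rate_hz)
    return [10.0 ** (-3.0 * (i / sample_rate_hz) / rt_s) for i in range(n)]
```

For such an ideal exponential decay the result approaches 10 log10(10^(0.3/T) - 1), so a shorter reverberation time T gives a higher C-50.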
[Chart: Good Classroom C-50 Average. C-50 (dB) versus mic position 0 to 8, for the setups Normal, Reflector, No people, Columns, Extra Sources, Extra Sources louder, and Full Adaptions]
Figure 9-12 C-50 simulation in the good classroom
[Chart: Poor Classroom C-50 average. C-50 (dB) versus mic position 0 to 8, for the setups Normal, Reflector, No people, Columns, Extra sources, Extra Sources louder, and Full Adaptions]
Figure 9-13 C-50 simulation in the poor classroom
From these two figures we can draw the following conclusions:
• The C-50 is lower for the poor classroom (by 6 dB in the normal setup)
• Removing the people reduces the C-50 by 3.5 dB in the poor classroom and by 2
dB in the good classroom
• Addition of extra omni directional sources results in a lower C-50, by about
0.5 dB in both classrooms, regardless of their signal
strength
• The addition of absorptive columns results in a higher C-50, by 2.5 dB in the poor
classroom and by 2 dB in the good classroom
• The addition of the reflector also has a positive impact, again especially in the
back of the classroom: in both rooms there is no impact for the first 3 receiver positions
and an increase of 0.5 dB for the last four receiver positions. This is as expected, since the
angle of the reflector is such that it is “aiming” the sound reflections to the back
of the classroom.
• So our conclusion on C-50 is that it is not a useful measure when background
noise is a disturbing factor. It does however hold information on the ratio of
useful and disturbing sound from the source alone.
9.3 Simulation Conclusions
From these simulation results the following conclusions have been made:
• The reverberation time can be manipulated by placing absorbing materials
• The impact of increasing the absorption present in the room is more significant in
a room with poor basic acoustics
• The SNR is relatively stable under various conditions
• The SNR can be improved by strategic placement of reflecting panels
• The SNR is positively correlated to the amount of absorption present in the room
• The STI is sensitive to all changes in the acoustic conditions in place
• The STI improves with the amount of absorption present in the room
• Using additional sources to increase the STI has more positive impact when the
RT is lower. This is explained by the fact that increasing the sound level is more
beneficial when the DRR is good. This is shown as well in the decrease of C-50
when omni directional sources are added to the setup.
• Aimed sources, or any other method that increases only direct or early reflected
sound, can improve the STI (by improving the SNR) even when the basic acoustic
setup is poor.
• The acoustics are extremely important for people with a hearing impairment:
increasing the background noise by 6 dB, and thus simulating hearing
impairment, decreases the STI significantly, and even an acoustically good room
gives poor results.
• The C-50 simulations are only sensitive to the amount of absorption present and
do not show any positive impact from adding sources; this is explained by the fact
that this increases the early sound just as much as, or even less than, the late sound.
• The combined conclusion is that neither RT nor SNR nor C-50 alone can predict
the SI in all situations, only the STI captures all influences.
10 Case Signis
As a separate assignment a classroom at a school of Signis, a community of schools for
hearing impaired children, has been investigated. There had been complaints from
teachers that they had difficulties hearing themselves speak due to reverberant sound.
Results from a RT measurement were already available; therefore it was decided to
reproduce the classroom in Catt-Acoustic. This case has been used to make the step from
the simulation program to actual measurements and to compare the simulation results of
an actual classroom to in situ measurements. After sufficient testing adaptations to the
classroom were suggested and implemented. The final situation has been tested as well
and the results were as predicted by the program. A complete description of this process
can be found in the Appendix, Chapter 13.1. An interesting additional feature that has
been investigated in this case is the influence of applying carpet instead of a cement,
wooden or linoleum floor. The conclusion is that the main benefit of a carpet floor is not
its absorbing quality but the much lower background noise, since the movement of
chairs and feet creates much less sound.
11 Measurements and Results
11.1 Measurement method
The measurements in actual classrooms aim to investigate the acoustic quality of
the classroom; therefore a measurement is performed without the presence of the children
to investigate only the effects of the room. In this measurement the RT and STI are
measured; this is done for various microphone positions. The setup is similar to that of
the simulation (Figure 9-2) with the source in front of the classroom and the microphone
positions evenly distributed throughout the classroom. Also an additional measurement is
done with the source in a representative position to simulate the case where one of the
students is talking; this is referred to as the second source position.
The second measurement performed is a measurement of the background noise:
the sound from the surroundings was recorded on a laptop and analyzed to fit an NC
curve. In this measurement a position in the middle of the classroom is chosen. This
measurement indicates the amount of background noise present in the room when the
children and the teacher are quiet.
This measurement can be done both in absence and in presence of the children, so
the effect of their presence on the background noise can be investigated. It is important
that in that case they do not make more noise than in situations that the teacher is
speaking; also the teacher should not speak at that time.
The last measurement on the classroom acoustics, the Coninx method, is done in
situ. The results of this measurement need to be validated using the two previous
measurements. The setup of the Coninx method is explained below:
Via two wireless microphones, one at the teacher’s mouth and one attached to the
ear of the child with the hearing impairment, the sound level is monitored for about one
hour. During this period the teacher is educating the class as normal, but it is important
that the teaching is whole-class and not individual, since during individual
lessons the teacher is normally close to one child and thus no information on the
classroom acoustics is obtained during that measurement time.
Since the teacher microphone is so close to the teacher’s mouth, head movements
and background noise do not influence this measurement; it is expected that the
background noise is always at least 25 dB lower. The sound recorded by the microphone
at the child’s ear is affected by background noise. The noise levels are in the same order
of magnitude as the sound level of the teacher; the expected range is -5 to +15 dB. Via this
measurement the SNR can be obtained.
To analyze these recordings the spectrum analysis function of the program
Audacity is used, which calculates the sound pressure level for the relevant frequencies
for a selected audio fragment. The analysis method and an example of an audio fragment are
shown in the appendix, chapter 13.3.
11.2 Measurement results
The measurements have been performed in five different schools in nine classrooms
in total. The list of classrooms and a short description is shown in Table 11-1, an
impression of the outside and inside of the classrooms is given by the pictures in Figure
11-1.
Table 11-1 List of classrooms, see pictures below
#   School             Teacher   Description
1   De Meule           Female    School in quiet neighborhood in Venlo. Brick walls and acoustic panels in the ceiling
2   De Meule           Female    School in quiet neighborhood in Venlo. Brick walls and acoustic panels in the ceiling
3   Willem Alexander   Female    Small school in a quiet neighborhood in Nieuwegein. Wood plated walls and acoustic panels in the ceiling
4   De Rietpluim       Female    Modern school in Nuenen, sometimes heavy trucks passing the school. Well engineered classrooms with attention to walls and ceiling regarding absorptive qualities
5   De Rietpluim       Female    Modern school in Nuenen, sometimes heavy trucks passing the school. Well engineered classrooms with attention to walls and ceiling regarding absorptive qualities
6   De Touwladder      Female    Old school in quiet neighborhood in Sint-Michielsgestel. Classroom with very high ceiling and brick walls. Larger room volume could lead to higher RT.
7   De Touwladder      Female    Old school in quiet neighborhood in Sint-Michielsgestel. Classroom with very high ceiling and brick walls. Larger room volume could lead to higher RT.
8   De Masten          Male      Old school in quiet neighborhood in Rosmalen. Brick walls and wood plated ceiling.
9   De Masten          Female    Old school in quiet neighborhood in Rosmalen. Brick walls and wood plated ceiling
Measurements with the Coninx method were performed in all nine classrooms;
the RT and STI were measured in classrooms 1, 3, 5, 6 and 8. For these classrooms the RT
and STI have been tested on 10 positions. From Figure 11-2 it can be concluded that there is no
significant difference in the RT for the 10 positions (from -0.03 to +0.03 seconds, or 6%
from the average), therefore the average RT of all ten positions is used. For positions 3
and 3a this is extra important since they were placed 5-10 cm apart, thereby testing the
reproducibility and sensitivity of the test.
Figure 11-1 Picture 1 and 2 De Meule, 3 and 4 Willem Alexander, 5 and 6 De Rietpluim, 7 and 8 De
Touwladder, 9 and 10 De Masten
[Chart: Average Reverberation time per microphone position. RT (s) versus mic position (Mic0 to Mic8, including Mic3a), for the 1st source position and the 2nd source position]
Figure 11-2 Investigation for source position on RT
11.2.1 STI Results
The results per microphone position for the STI in classroom 1 are shown in
Figure 11-3. Ignoring the three positions closest to the source, the difference from the
average ranges from -0.01 to +0.02, or 2%. The average STI is therefore used here as
well.
[Figure: STI per microphone position. Vertical axis: STI (%), 0,50 to 1,00; horizontal axis: mic position (Mic0, Mic1, Mic2, Mic3, Mic3a, Mic4 to Mic8); series: 1st and 2nd source position.]
Figure 11-3 Investigation on source position for the STI
Now that it has been shown that we can use the average RT and STI in the
classroom an overview of these parameters is given for the investigated classrooms in
Figure 11-4 and Figure 11-5.
[Figure: reverberation time for five classrooms. Vertical axis: RT (s), 0,00 to 0,60; horizontal axis: measurement (classrooms 1, 3, 5, 6 and 8).]
Figure 11-4 RT overview for all classrooms.
[Figure: STI for five classrooms. Vertical axis: STI (%), 0,50 to 1,00; horizontal axis: measurement (classrooms 1, 3, 5, 6 and 8); series: STI and STI in noise.]
Figure 11-5 STI overview for all classrooms
For classrooms 3, 5, 6 and 8 an overview is also given of the STI in noise.
From this it is clear that noise significantly reduces the STI, as was also shown in the
simulations. From these figures we also see that the RT in all classrooms is between
0,41 s and 0,57 s and the STI in noise between 63% and 69%. These figures are good for
normal hearing students. However, as is shown in the simulations, children with a hearing
impairment would require a better STI to be able to understand the teacher. This can be
solved by any device that increases the sound level of the teacher's speech without
increasing that of the background noise. From these measurements we conclude that no
changes have to be made to the acoustic environment of any of these classrooms.
In Figure 11-6 the background noise measurement for classroom 1 is shown. This
figure is representative for the other classrooms as well; there are slight, but
insignificant, differences in the results. From the figure we can conclude that NC-30 is
indeed the right noise level to use in our simulations. It must be noted that these
measurements were performed in an empty classroom, so no noise made by the children is
taken into account.
Figure 11-6 Noise Criteria investigation for classroom 1.
11.2.2 Coninx method results
Using the Coninx method, ten sound fragments have been analyzed for each
classroom. To calculate an SNR, the sound just before, during and just after a sentence is
analyzed, as shown in Figure 8-2. These fragments are in the order of 0.2 s, 2-5 s and
0.2 s respectively. Fragments are chosen such that the teacher was addressing the whole
class and it was important for all children to hear what the teacher said. During the
recording it was noted when this was the case, so that useful audio fragments could be
found. It is very important for the Coninx method to use random but proper audio
fragments, since the choice of audio fragments strongly determines the final results. For
example, if only fragments are chosen in which the teacher is reading a book (a situation
in which children are normally very quiet), the SNR will appear to be extremely good,
while during other lessons (e.g. math) children will be louder and therefore contribute
more to the background noise. The results of this measurement are shown in Table 11-2,
which gives the SNR per octave for all the classrooms. The analysis in Audacity is
explained in the appendix, chapter 13.3.
Table 11-2 SNR per octave

Classroom 125Hz 250Hz 500Hz 1000Hz 2000Hz 4000Hz 8000Hz
1 7,1 13,2 18,2 16,8 13,6 14,2 14,5
2 9,1 17,8 18,6 17,0 12,0 12,2 14,1
3 9,7 12,6 17,9 16,1 13,1 10,8 11,3
4 7,2 15,1 15,3 12,1 9,5 9,4 9,9
5 7,3 20,3 23,0 17,8 14,7 16,8 19,8
6 6,5 15,6 17,3 11,7 10,7 8,9 9,0
7 9,1 16,0 20,8 20,0 16,5 15,0 16,1
8 14,5 23,1 24,8 20,0 16,9 15,9 11,2
9 12,3 22,7 22,1 16,0 16,5 15,9 13,0
The plain average SNR has been compared with a weighted average that uses the
weighting factors from the STI (these factors assign each octave band a percentage
contribution, adding up to 100%). The comparison is shown in Table 11-3.
Table 11-3 Difference in average SNR and weighted SNR

Classroom SNR SNR weighted Difference
1 15,1 15,1 0,0
2 15,3 14,8 -0,5
3 13,6 13,6 0,0
4 11,9 11,4 -0,5
5 18,7 18,2 -0,5
6 12,2 11,8 -0,4
7 17,4 17,4 0,0
8 18,1 18,3 0,2
9 17,7 17,3 -0,4
From this table it can be concluded that the SNRs in all classrooms are good, and
only classrooms four and six may need some acoustical adaptation, or vocal training for
the teacher, to improve the intelligibility for all children. The difference between the
average and the weighted average SNR (for female speech the 125Hz band is ignored) is at
most 0.5 dB in magnitude, or less than 4%, thus no further investigation on the weights of
the frequency bands is required.
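The plain averages of Table 11-3 can be reproduced from Table 11-2 with a short sketch. It assumes, as the tabulated values suggest, that the 125 Hz band is dropped for the female teachers and kept for the male teacher of classroom 8. The STI weighting factors are not listed in this section, so only the unweighted average is computed. The names `SNR_PER_OCTAVE` and `average_snr` are illustrative and not part of the original analysis.

```python
# SNR per octave band in dB (125 Hz ... 8000 Hz), copied from Table 11-2.
SNR_PER_OCTAVE = {
    1: [7.1, 13.2, 18.2, 16.8, 13.6, 14.2, 14.5],
    2: [9.1, 17.8, 18.6, 17.0, 12.0, 12.2, 14.1],
    3: [9.7, 12.6, 17.9, 16.1, 13.1, 10.8, 11.3],
    4: [7.2, 15.1, 15.3, 12.1, 9.5, 9.4, 9.9],
    5: [7.3, 20.3, 23.0, 17.8, 14.7, 16.8, 19.8],
    6: [6.5, 15.6, 17.3, 11.7, 10.7, 8.9, 9.0],
    7: [9.1, 16.0, 20.8, 20.0, 16.5, 15.0, 16.1],
    8: [14.5, 23.1, 24.8, 20.0, 16.9, 15.9, 11.2],
    9: [12.3, 22.7, 22.1, 16.0, 16.5, 15.9, 13.0],
}

# Classroom 8 is the only one with a male teacher (classroom overview table).
MALE_TEACHER_CLASSROOMS = {8}


def average_snr(classroom):
    """Unweighted mean SNR in dB over the octave bands of one classroom."""
    bands = SNR_PER_OCTAVE[classroom]
    if classroom not in MALE_TEACHER_CLASSROOMS:
        # Assumption: the 125 Hz band is ignored for female speech.
        bands = bands[1:]
    return sum(bands) / len(bands)


for classroom in sorted(SNR_PER_OCTAVE):
    print(classroom, round(average_snr(classroom), 1))
```

Under these assumptions the printed values agree with the SNR column of Table 11-3 to one decimal.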
The results from the Coninx method for classrooms 1, 3, 5, 6 and 8 are
shown in Figure 11-7 below.
[Figure: SNR for five classrooms. Vertical axis: SNR [dB]; horizontal axis: classroom (1, 3, 5, 6 and 8); series: SNR weighted.]
Figure 11-7 SNR results of Coninx method.
In the previous analysis an average SNR for each classroom has been used, but
the results also hold information on the variation in the SNR between different lessons
(reading a book, math, grammar etc.). The standard deviation is used as a measure of this
variation. Assuming a normal distribution, the average SNR minus one standard deviation
is a lower boundary for 84% of the situations (so in only 16% of the situations will the
SNR be less than this boundary). These results are shown in Table 11-4 and Figure 11-8.
Table 11-4 SNR and standard deviation

Classroom SNR weighted STDEV SNR - STDEV
1 15 6,2 9
3 15 7,3 8
5 18 3,7 14
6 12 3,1 9
8 18 3,6 15
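The 84% figure follows directly from the normal distribution and can be checked with the Python standard library. This is only a sketch of the statistical argument; the helper name `lower_boundary` is illustrative.

```python
from statistics import NormalDist

# Fraction of a normal distribution that lies above (mean - 1 standard deviation).
fraction_above = 1.0 - NormalDist(mu=0.0, sigma=1.0).cdf(-1.0)
print(f"{fraction_above:.1%}")  # prints 84.1%


def lower_boundary(mean_snr_db, stdev_db):
    """SNR that is exceeded in roughly 84% of the situations, assuming normality."""
    return mean_snr_db - stdev_db
```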
[Figure: SNR for five classrooms. Vertical axis: SNR [dB]; horizontal axis: classroom (1, 3, 5, 6 and 8); series: SNR weighted and SNR - STDEV.]
Figure 11-8 SNR results including standard deviation.

From these results we see that the standard deviation can have a significant impact: in
classrooms 1 and 3 the 84% boundary of the SNR can only be regarded as fair (below 10 dB),
while the average SNR was considered good (10-15 dB).
Since the standard deviation is strongly dependent on the number of data points used,
additional audio fragments have been analyzed for classroom number three. The results
of using five, ten and twenty data points are shown in Table 11-5, from which we can see
that the standard deviation decreases by 1,7 dB from five to ten data points and by 0,5 dB
from ten to twenty data points, while the SNR varies by 4 dB and 1 dB respectively.
This means the acoustic circumstances are indeed very variable in this classroom and the
high standard deviation was not only caused by the low number of audio fragments
analyzed. It can be concluded that ten is the absolute minimum number of data points to be
analyzed and that at least twenty is preferred. If the data processing were automated, it
is suggested to increase the number of data points further to obtain a more reliable
result. This would not only ensure that all variations are taken into account, but also
minimize the impact of a single peak value on the SNR and STDEV.
Table 11-5 Influence on STDEV with increased amount of data points used
Classroom 3
5 data points 10 data points 20 data points
SNR 11 15 14
STDEV 9,0 7,3 6,7
SNR - STDEV 2 8 7
11.2.3 Comparing STI to Coninx method
In Figure 11-9 the STI results are compared to the results from the Coninx
method.
[Figure: STI vs SNR. Vertical axis: STI, 0,70 to 0,82; horizontal axis: SNR, 5 to 20; exponential trend line y = 0,6366 e^(0,0126x), R² = 0,4137.]
Figure 11-9 Comparing STI to SNR

From the figure above we see that a higher SNR corresponds to a higher STI, but the
correlation is poor. It is a straightforward conclusion that the SNR alone can’t predict
the STI, but that does not completely explain the low correlation. The two measurements do
not use the same source signal, nor exactly the same background noise (the noise from the
children is not included in the STI measurement), so the results would differ even if the
reverberation time were identical in all classrooms. When comparing the STI in noise to
the SNR the following figure is obtained.
[Figure: STI in noise vs SNR. Vertical axis: STI, 0,62 to 0,70; horizontal axis: SNR, 5 to 20; exponential trend line y = 0,5704 e^(0,0094x), R² = 0,6381.]
Figure 11-10 Comparing STI in noise to SNR

From this figure we see that the correlation improves significantly, but is still low.
A correlation based on five data points is of course not very strong, but what can be seen
is that the SNR measurement alone is not completely in line with the STI measurement.
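An exponential trend line of the form y = a·e^(bx) is commonly obtained by a least-squares fit of ln(y) against x. The sketch below applies that approach to the rounded weighted SNR and STI-in-noise values listed in Table 12-1. Whether the trend line in Figure 11-10 was produced in exactly this way is an assumption, and because the inputs are rounded the coefficients come out close to, but not identical with, those in the figure. The function name `fit_exponential` is illustrative.

```python
import math

# Weighted SNR [dB] and measured STI in noise (rounded values from Table 12-1).
snr = [15.12, 13.64, 18.22, 11.81, 18.29]
sti_in_noise = [0.67, 0.66, 0.66, 0.63, 0.69]


def fit_exponential(x, y):
    """Least-squares fit of ln(y) = ln(a) + b*x; returns a, b and R^2 in log space."""
    n = len(x)
    log_y = [math.log(v) for v in y]
    mean_x = sum(x) / n
    mean_ly = sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((li - mean_ly) ** 2 for li in log_y)
    sxy = sum((xi - mean_x) * (li - mean_ly) for xi, li in zip(x, log_y))
    b = sxy / sxx
    a = math.exp(mean_ly - b * mean_x)
    r_squared = sxy ** 2 / (sxx * syy)
    return a, b, r_squared


a, b, r_squared = fit_exponential(snr, sti_in_noise)
print(f"y = {a:.4f} * exp({b:.4f} x), R^2 = {r_squared:.3f}")
```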
From the comparison between the Coninx method and the STI it can be concluded that
information on the RT is required. Another important conclusion is that the STI
measurement is unable to capture all the factors that play a role in the acoustics when
the classroom is in use.
11.3 Measurement results summary
11.3.1 STI Measurement
• The RT measurement result (Figure 11-2) shows that the RT is equal throughout
the classroom and thus an average RT can be used.
• The STI measurement result (Figure 11-3) shows that the position in the room is
important: the closer the position is to the source, the better the STI. If an
average is used for one room, results within the critical distance (where the STI
approaches 1) should not be taken into account.
• The STI measurement is reproducible and not very sensitive to small differences
in the situation. This is shown by the repetition test in which the receiver is
moved 5 cm from the original receiver location (points 3 and 3a in Figure 11-3).
• The RTs for the tested classrooms range from 0.4 to 0.6 seconds, which is good
to sufficient for children without a hearing impairment. For classrooms 6 and 8
an improvement in the acoustics to reduce the reverberation time would be
beneficial.
• The STI in noise results are closer together, and the influence of the room
acoustics is in line with the results. The exception is classroom number eight,
where the good STI in noise is explained by the lower background noise, as
shown in the figure below.
Figure 11-11 Left part, classroom number five. Right classroom number eight.
11.3.2 Coninx method results summary
• The results from the SNR measurement show three classrooms where the SNR is
less than 15 dB, which is too little for hearing impaired children.
• The results from the SNR alone do not completely explain the results from the
STI measurement.
• The standard deviation is a measure of the change in acoustic circumstances and it
is shown that it does not decrease significantly when the number of data points is
doubled. This indicates that the number of data points used is not the cause of the
high standard deviation, but rather the variability in circumstances during the
measurement.
• The Coninx measurement method gives more information on the actual situation
in the classroom regarding background noise (due to the children and the
surroundings), vocal effort of the teacher and the influence of the changing
position of the teacher and/or children.
12 Discussion
In Chapter 11 it has been concluded that the Coninx method produces results that are only
poorly correlated (assuming an exponential relation) with the STI measurement. This means
that some development is required before the Coninx method can reliably measure the SI.
In the next section an improved measurement method is proposed, based on the results and
conclusions from the Coninx method and the STI measurement.
12.1 Coninx-Zeilstra method
It has been shown that the Coninx method contains more information than the STI
method regarding variable circumstances: the influence of the presence of the children,
the changing vocal effort of the teacher, changes in background noise, and the changing
locations of the teacher and the children. Therefore the improved measurement method
should capture these aspects of the Coninx method and use a similar measurement setup,
but incorporate the RT to be able to show a complete picture of the acoustics in the
room. In addition, the data processing of the Coninx method is currently manual, which
leaves room for subjectivity in the choice of data points; this should be changed to
an automated process.
Taking all these remarks into account we aim at an improved method that has the
following characteristics:
• Use of a real time, “live” source (the teacher)
• In situ measurement (children present, normal lesson)
• Use of a teacher and student microphone (the teacher microphone indicates when
the teacher is talking since the background noise level will not exceed the speech
level, the audio fragment recorded by the student microphone is used to calculate
the SNR)
• Use of Reverberation Time, per octave
• Automated data processing for SNR measurement
• Result showing both average SNR and standard deviation
• Use of at least 10 data points (preferably more than twenty)
• An observer should be present to indicate useful or non useful timeframes during
the recording (this is required to indicate when the teacher is teaching all children
and not an individual group)
• Overall result incorporating both measures, SNR and RT, using details per octave
With all these adaptations the improved method is no longer referred to as the Coninx
method, but as the Coninx-Zeilstra method. Finally it is required that the results from the
Coninx-Zeilstra method will be benchmarked against the STI measurement. This enables
the method to calculate a single result from the separate SNR and RT measurements.
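The automated SNR processing named in the list above could look roughly like the following sketch. It is not an existing implementation: it assumes that both channels have already been reduced to one level in dB per short time frame, that a fixed threshold on the teacher channel marks speech, and that the SNR of a fragment is the mean student-channel level during speech minus the mean level in the frames just before and after it. The names `speech_runs` and `fragment_snrs`, the threshold and the example levels are all hypothetical, and the observer's marking of useful time frames is not modeled.

```python
from statistics import mean, stdev


def speech_runs(teacher_db, threshold_db):
    """Return (start, end) frame index pairs where the teacher level exceeds the threshold."""
    runs, start = [], None
    for i, level in enumerate(teacher_db):
        talking = level > threshold_db
        if talking and start is None:
            start = i
        elif not talking and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(teacher_db)))
    return runs


def fragment_snrs(teacher_db, student_db, threshold_db, flank=2):
    """SNR per speech fragment: student level during speech minus level in the flanks."""
    snrs = []
    for start, end in speech_runs(teacher_db, threshold_db):
        lo, hi = max(0, start - flank), min(len(student_db), end + flank)
        flank_idx = list(range(lo, start)) + list(range(end, hi))
        # Skip fragments without flanks or whose flanks overlap other speech.
        if not flank_idx or any(teacher_db[j] > threshold_db for j in flank_idx):
            continue
        noise_db = mean(student_db[j] for j in flank_idx)
        speech_db = mean(student_db[start:end])
        snrs.append(speech_db - noise_db)
    return snrs


# Hypothetical frame levels in dB for the teacher and student microphone.
teacher = [-60.0, -60.0, -30.0, -30.0, -30.0, -60.0, -60.0, -60.0, -28.0, -28.0, -60.0, -60.0]
student = [-55.0, -55.0, -43.0, -42.0, -44.0, -56.0, -54.0, -55.0, -45.0, -46.0, -56.0, -55.0]

snrs = fragment_snrs(teacher, student, threshold_db=-40.0)
print(snrs, mean(snrs), stdev(snrs))
```

Reporting both `mean(snrs)` and `stdev(snrs)` corresponds to the requirement that the result shows the average SNR and the standard deviation.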
12.1.1 Measurement protocol
In this section the measurement protocol of the Coninx-Zeilstra method is
described step by step, which should make it possible to perform the measurement
in any classroom.

Step 1 Setting up audio receiver and transmitters
Description: The wireless receivers for the microphones are connected to the audio input
of the laptop to enable the record function. It is required to be able to record two
channels (one for the teacher and one for the student). The two microphones are placed;
the teacher microphone should be close to the mouth, the student microphone preferably
close to the ear.
Required devices: Laptop, wireless audio transmitter and receiver (for example 2 times
Sennheiser SK2 body pack transmitter, EM1 diversity receiver and 2 times MKE2 microphone)

Step 2 Start audio recording
Description: The recording is started and the lesson can proceed as normal. The observer
notes which time intervals are relevant for processing and which are not.
Required devices: -

Step 3 Measurement of the RT
Description: The calculation of the RT can be performed on the recorded audio fragment
using the algorithm from Phonak or using an audio analyzer. If the audio analyzer is
unable to measure the RT per octave, bandwidth limited noise can be used.
Required devices: When using the algorithm no additional devices are required. In case of
the audio analyzer: audio playback system, audio analyzer (for example the Phonic PAA3)

Step 4 Data analysis
Description: The recorded data is analyzed to compute the SNR using the time intervals
indicated as useful. The method to analyze the data is explained in appendix 13.3. No
method is available at this time to analyze the data automatically.
Required devices: Audio analysis program (for example Audacity)

Step 5 Calculate the STI
Description: The results from steps three and four are combined to calculate one STI
figure. The method is described below.
Required devices: -
12.1.2 Calculating STI from the SNR and RT
To compare the results from the Coninx SNR measurement to the STI, the RT of the
classrooms is required. Using Equation 6-2, Equation 5-2 and Equation 5-3 an STI is
calculated from the SNR and RT, which can be compared against the measured STI. The
overview is shown in the table below. The STI measurement yields two STI figures: one
disregarding background noise and one adapted for the separately recorded background
noise. The latter has been used to compare the results of the Coninx-Zeilstra method.
Table 12-1 Comparing calculated STI (from RT and SNR) to STI in noise
Classroom  RT [s]  SNR [dB]  STI calculated (from SNR and RT)  STI in noise (measured)  Delta
1 0,52 15,12 0,67 0,67 0,00
3 0,45 13,64 0,70 0,66 0,04
5 0,41 18,22 0,73 0,66 0,07
6 0,57 11,81 0,65 0,63 0,02
8 0,56 18,29 0,67 0,69 -0,02
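Equations 6-2, 5-2 and 5-3 are defined earlier in the thesis and are not repeated here. As an illustration only, the sketch below uses the widely known modulation transfer function formulation, in which reverberation and noise each reduce the modulation index, the result is converted to an apparent SNR limited to ±15 dB, and the transmission index is averaged over 14 modulation frequencies. It takes a single broadband SNR and RT, whereas the method in the text works per octave, so it should not be expected to reproduce Table 12-1 exactly. The function name `sti_from_snr_and_rt` is illustrative.

```python
import math

# The 14 modulation frequencies (Hz) commonly used for the STI.
MODULATION_FREQUENCIES = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                          3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def sti_from_snr_and_rt(snr_db, rt_s):
    """Single-band STI estimate from one SNR (dB) and one reverberation time (s)."""
    noise_factor = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    total = 0.0
    for f in MODULATION_FREQUENCIES:
        reverb_factor = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * f * rt_s / 13.8) ** 2)
        # Keep m strictly below 1 so the logarithm stays finite.
        m = min(reverb_factor * noise_factor, 1.0 - 1e-12)
        apparent_snr = 10.0 * math.log10(m / (1.0 - m))
        apparent_snr = max(-15.0, min(15.0, apparent_snr))
        total += (apparent_snr + 15.0) / 30.0
    return total / len(MODULATION_FREQUENCIES)


# Classroom 1 from Table 12-1: RT 0,52 s and SNR 15,12 dB.
print(round(sti_from_snr_and_rt(15.12, 0.52), 2))
```

The sketch shows the qualitative behavior used in the discussion: the estimate rises with a higher SNR and falls with a longer RT.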
In the figure below the SNR and the RT of the classrooms are plotted. The blue line
indicates an STI score of 70% calculated from the SNR and RT, the brown line indicates a
65% score, and the labels show the measured STI.
[Figure: STI relation. Vertical axis: SNR [dB], 8,00 to 20,00; horizontal axis: RT [s], 0,30 to 0,60; classrooms labeled with the measured STI in noise (C1 0,67; C3 0,66; C5 0,66; C6 0,63; C8 0,69); lines of calculated STI 0,70 and 0,65.]
Figure 12-1 RT and SNR versus measured STI

The table and figure above show that classroom number five has the largest difference
compared to the measured STI: 0.73 calculated from SNR and RT versus 0.66 actually
measured. This can be explained by three factors:
1. The STI does not use actual background noise but an assumed amount of
background noise and the actual background noise is lower.
2. The teacher raises her voice when the background noise rises.
3. The absorption due to the presence of the children improves the STI.
The measured STI excluding the noise adaptation for classroom number five was 0.81, so
this could well explain the difference between the calculated STI and the STI in noise.
Plotting the calculated STI versus the measured STI in noise results in the figure below.
[Figure: comparing STI calculated vs measured. Vertical axis: STI measured, 0,60 to 0,80; horizontal axis: STI calculated, 0,60 to 0,80.]
Figure 12-2 STI calculated (Using SNR) compared to STI measured

From Figure 12-2 and Table 12-1, comparing the calculated STI (from the SNR and RT)
and the measured STI, the following is concluded:
• The calculated STI is higher than the measured STI in four out of five classrooms
• The average difference is only 0,02 point (or 2%), which is less than 2.5%, the
standard deviation of an STI measurement
It is clear that more measurements are required to be able to benchmark the Coninx-
Zeilstra measurement against the STI, but from these results it can be concluded that the
SNR measurement combined with an RT measurement is able to predict STI outcomes
with good accuracy.
To investigate the influence of the variable circumstances on the STI, the SNR-STDEV
(as opposed to the SNR in the two figures above) and the measured RT are shown
against the measured STI in noise in the figure below.
[Figure: STI relation. Vertical axis: SNR-STDEV [dB], 8 to 20; horizontal axis: RT [s], 0,30 to 0,60; classrooms labeled with the measured STI in noise (C1 0,67; C3 0,66; C5 0,66; C6 0,63; C8 0,69); lines of calculated STI 0,70 and 0,65.]
Figure 12-3 RT and SNR-STDEV versus measured STI
Now classrooms number five and one show the biggest difference against the measured
STI, 6% and -5% respectively. The difference in classroom number one can be explained by
the following factors:
1. The STDEV is relatively large: 6 dB, compared to an average SNR of 15 dB.
2. The calculated STI was 3% lower without the STDEV adaptation.
For this calculated STI a comparison with the measured STI is also shown.
[Figure: comparing STI calculated vs measured. Vertical axis: STI measured, 0,60 to 0,80; horizontal axis: STI calculated, 0,60 to 0,80.]
Figure 12-4 STI calculated (using SNR-STDEV) compared to STI measured

This figure shows a similar result to Figure 12-2: the calculated STI is in line with the
measured STI, and the average difference is now even 0.00 points.
Finally it can be concluded that the Coninx-Zeilstra method can predict STI
measurements and thus the SI of the classroom. The method even holds more information
than the STI regarding the variability of the acoustic circumstances and the actual
speech signal, background noise and absorption present in the classroom during lessons.
13 Appendix
13.1 Case Signis
13.1.1 The Problem
In the Signis classrooms education is given to small groups of children with a
hearing impairment. In one of these classrooms the acoustics are of such low quality that
the teachers are hindered in their work, probably due to long reverberation times in the
low frequency range. Permission has been given to buy and install sound absorbing
panels, with a choice between two types: Tectum panels and Merfocell panels.
The specifications of these panels are shown in Figure 13-1 and Table 13-1.
Figure 13-1 Absorption coefficient Merfocell panel
Table 13-1 Absorption coefficient Tectum panel (25 mm)

Freq. [Hz] 125 250 500 1000 2000 4000
Abs. Coef. [%] (see footnote 3) 7 13 25 47 72 48
The first straightforward conclusion is that thicker panels of Merfocell have a
higher absorption coefficient and that the Merfocell panel has a higher absorption at
equal thickness.
3 When mounted directly against the wall
13.1.2 The Classroom
The classroom has dimensions of L 5,51 m x W 7,21 m x H 3,15 m and is shown in the
photographs of Figure 13-2.
Figure 13-2 Photo shoot of the classroom
From this set of pictures it can be immediately concluded that there is very little
sound absorption present and only the ceiling seems to have a significant absorbing
function. To investigate the problem the school hired an external company to measure the
STI and RT. The results of their measurements are shown in Table 13-2.
Table 13-2 RT and STI in empty classroom

Freq. [Hz] 125 250 500 1000 2000 4000 Avg.
RT [s] 0,56 0,60 0,54 0,51 0,49 0,44 0,52
STI 72 61 72 76 74 81 73
The STI values in Table 13-2 are categorized as good (meaning 60<STI<80), but
at 250 Hz, in an empty classroom, the STI value sits at the bottom of this
category. Without background noise that value should be somewhere around 80. The
reverberation time for this size of classroom (125 m³) is also a bit high, even
more so for hearing impaired children. When children are present in the
classroom this RT would decline a bit, but the background noise also increases, which
would result in lower STI values. Moreover, children with a hearing impairment require a
better than average acoustic quality of a classroom and thus need higher STI values and
a lower RT to be able to perceive speech at sufficient quality.
13.1.3 The Analysis
We know that for an acoustically good room the absorption coefficients for each
octave should be in the order of 20 to 30%. In a room where high quality acoustics is
required these values should be closer to, and may sometimes exceed, the upper bound
of 30%. To determine the absorption coefficients for this classroom the acoustics have
been simulated in Catt-Acoustic. This resulted in an underestimation of the absorption
coefficients for low frequencies, probably because the ceiling is most likely mounted
over an air gap, which results in much higher absorption of the low frequencies compared
to ceiling panels mounted directly on the concrete. The simulation has been run with
children present in the classroom; however, since the simulated results are corrected
with the measured results, this does not have a negative influence on the results. The
children themselves are also an absorbing surface of about 10 m², which is about 2.5%
extra absorption for each octave. They contribute a bit more at the higher frequencies
and a bit less at the lower frequencies. The results of the simulation are shown in
Table 13-3.
Table 13-3 Calculated reverberation time and space average absorption coefficient percentages for
each octave band.
Freq. [Hz] 125 250 500 1000 2000 4000 Avg.
RT [s] 0,91 0,64 0,53 0,47 0,43 0,4 0,56
Abs. Coef. [%] 10,7 15,3 18,5 20,7 22,5 22,8 18,4
From this measurement the actual space average absorption coefficient
percentages can be calculated. The result of this calculation is shown in Table 13-4.
Table 13-4 Calculated absorption coefficient percentages based on the measurement and simulation.
Freq. [Hz] 125 250 500 1000 2000 4000 Avg.
Abs. Coef. [%] 17,4 16,3 18,9 19,1 19,7 20,7 18,7
It can be concluded that for a classroom that requires a higher quality of acoustics
the values of the absorption coefficients are too low.
13.1.4 The solution
To solve this problem neither of the two panels can be used when mounted
directly to the wall, since more low frequency sound should be absorbed. Mounting a
rigid panel over an air gap could offer the solution here. The soft panel can then be
mounted on top of this rigid panel to absorb the higher frequencies, or on any other
location in the room. To calculate the depth of the air gap Equation 13-1 has been used:
$$f_{res} = \frac{1}{2\pi}\sqrt{\frac{\kappa p_0}{m d}}$$
Equation 13-1 Resonance frequency
fres is the resonance frequency of the construction, meaning that at that frequency the
absorption is most effective. κ is the ratio of specific heats, for which 1,4 is used in
this particular case. p0 is the standard atmospheric pressure of 1013,25 hPa. m is the
mass of the panel in kg/m² (this value depends on the choice of the panel, for Merfocell
it is 1,2 and for Tectum it is 8,4) and d is the depth of the air gap in m. The goal is to
calculate the air gap for a given frequency, so the equation is rearranged to:
$$d = \left(\frac{1}{2\pi f_{res}}\right)^2 \frac{\kappa p_0}{m}$$
Equation 13-2 Depth of the air gap as a function of fres
The 40 mm Merfocell panels have been chosen since these panels have higher
absorption coefficients at all frequencies. Since the problem lies in the low frequency
range a resonance frequency of 175 Hz has been chosen. This yields an air gap of 98 mm,
which is to be filled with glass wool to improve the absorption near the resonance
frequency. If the Merfocell panel were mounted directly over an air gap, this should
be done at a distance of ¼ λ, which is 34 cm at a frequency of 250 Hz. If a smaller
gap is used this will result in less absorption in the low frequency domain.
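The two distances quoted above can be checked with a few lines of Python. The sketch evaluates Equation 13-2 with the values given in the text; the speed of sound of 343 m/s used for the quarter wavelength is an assumption, and the function names are illustrative.

```python
import math

KAPPA = 1.4             # ratio of specific heats of air (value used in the text)
P0 = 101325.0           # standard atmospheric pressure: 1013,25 hPa expressed in Pa
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def air_gap_depth(f_res_hz, panel_mass_kg_m2):
    """Depth of the air gap in m for a given resonance frequency (Equation 13-2)."""
    return (1.0 / (2.0 * math.pi * f_res_hz)) ** 2 * KAPPA * P0 / panel_mass_kg_m2


def quarter_wavelength(frequency_hz):
    """Quarter of the wavelength in m at the given frequency."""
    return SPEED_OF_SOUND / frequency_hz / 4.0


# Merfocell panel (1,2 kg/m2) tuned to 175 Hz, and a quarter wavelength at 250 Hz.
print(round(air_gap_depth(175.0, 1.2) * 1000.0), "mm")
print(round(quarter_wavelength(250.0) * 100.0), "cm")
```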
With these two possible setups the question is how many m² of panels should be
used to accomplish a good acoustic environment.
Absorption coefficient percentages of 70% for low frequencies are assumed to be
reasonable after contact with the producer of the Merfocell panels. Furthermore the extra
absorption for each octave should be in the order of 8%, which is 20 m² of effective
absorbing surface. In the normal situation there are children and a teacher present who
provide about 10 m² of effective absorption; this leaves 10 m² of effective absorbing
surface for the panels. Dividing this by the absorption coefficient of the panels results
in 14 m² of panels to be used.
13.1.5 Advice
With the current classroom in mind, the advice is to apply a 1 m wide strip of
40 mm thick Merfocell panels over the entire length of the classroom (on the side of the
blackboard and on the opposite side) immediately under the ceiling. These panels are to
be mounted over an air gap of 9 to 10 cm filled with glass wool. This construction results
in 14,42 m² of panels. The estimated space averaged absorption coefficients and the
estimated reverberation times are shown in Table 13-5.
Table 13-5 Estimated reverberation times and space averaged absorption coefficients after suggested
adaptations
Freq. [Hz] 125 250 500 1000 2000 4000 Avg.
RT [s] 0,45 0,45 0,40 0,40 0,35 0,30 0,39
Absorption [%] 22 22 24 26 27 28 25
Concluding from this table the reverberation time decreases by 0,13 s, an
improvement of 25 %. In the low frequencies this improvement is also achieved, which
was the main goal of the case.
13.1.6 Result
The results of the adaptations have been measured with an RT and STI measurement
and are shown in Table 13-6.
Table 13-6 Measured RT and room average absorption coefficients after adaptations.
Freq. [Hz] 125 250 500 1000 2000 4000 Avg.
RT [s] 0,38 0,39 0,38 0,34 0,32 0,27 0,35
Absorption [%] 26 26 25 31 30 31 28
So it can be concluded that a 50 % increase in absorption has been achieved and a
27 % decrease in reverberation time. These results are even slightly better than the
predicted results and therefore it can be concluded that the given advice was correct and
that the simulation program is able to predict adequate RT results. Additionally, carpet
was applied to the floor, which according to the teachers made the noise created by the
movement of chairs and feet much less noticeable.
13.2 From SPL to SNR
The Catt-Acoustic program is not capable of calculating the SNR directly, but does
calculate a Sound Pressure Level (SPL) at all the microphone positions. Catt-Acoustic
calculates only the total SPL, so both early and late sound from the source increase the
SPL, while the late sound should be considered as noise. To obtain a clean SPL, the SPL is
simulated using 100% absorptive surfaces (except for the reflector), so that only the
direct sound contributes to the SPL. This results in an SNR that declines rapidly as the
distance to the source grows, but the differences between different setups remain
visible.
Calculating the background noise is done using an energy balance, assuming the
background noise is in steady state.
$$\frac{dE}{dt} = E_{in} - E_{out}$$
Equation 13-3
In a steady state this reduces to:
$$E_{in} = E_{out}$$
Equation 13-4
Where Ein is the energy from the source, in this case the background noise, and
Eout the energy lost through absorption or transmission. Eout is thus directly related to
the amount of absorption present in the room. The reason to calculate the background noise
using this balance is that when the incoming background noise is identical but the
absorption differs between two situations, the noise present in the room itself will be
larger in the room with less absorption. This is best explained by taking the two
extremes:
• Assume a steady source in an anechoic room; the sound level will be equal to the
direct sound level from the source.
• Assume the same steady source in a reverberation room; if absolutely no sound
energy is lost through absorption etc., the sound level will continue to rise to
infinity.
To clarify the calculation an example is discussed here. Suppose a steady noise of 30 dB
enters the room and the average absorption of all the surfaces is 50%. Equation 13-4 is
in balance when 50% of the noise in the room corresponds to 30 dB, since the energy going
in and out are then equal. The total noise energy is therefore double that of 30 dB,
meaning 33 dB.
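The worked example can be written as a one-line relation. If a fraction α of the noise energy in the room is removed and this equals the incoming energy, the room level is the incoming level minus 10·log10(α). The sketch below is one reading of the example above, not a formula quoted from the text; the function name `room_noise_level` is illustrative.

```python
import math


def room_noise_level(incoming_db, absorption):
    """Steady-state noise level in dB when a fraction `absorption` of the energy is removed."""
    if not 0.0 < absorption <= 1.0:
        raise ValueError("absorption must be in the interval (0, 1]")
    return incoming_db - 10.0 * math.log10(absorption)


print(round(room_noise_level(30.0, 0.5), 1))  # 30 dB incoming noise, 50% absorption
print(round(room_noise_level(30.0, 1.0), 1))  # fully absorbing (anechoic) case
```

As the absorption approaches zero the level grows without bound, which corresponds to the reverberation room extreme described above.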
13.3 Audio fragments analysis
13.3.1 Analysis of the test signal
To test the spectrum analysis of the program Audacity, a reference signal
(1 kHz noise at 90 dB) has been recorded and analyzed. The results are
shown in the figure below.
Figure 13-3 Spectrum analysis test signal

The figure shows the specific fragment of the signal that was chosen and the
result of the frequency spectrum analysis. In the spectrum analysis we can see that there
is a peak at 1 kHz at 0 dB. Since no absolute levels are required to obtain an SNR, no
further benchmarking has been performed. It can be seen from the analysis that both
the frequency and the sound level from the analysis can be used.
13.3.2 Analysis of audio fragment
In the figure below an example of an audio fragment analysis is shown. The blue marked
area is the time period in which the teacher speaks (in this case 5 s). A period of 0,2
to 0,5 s is analyzed before and after the speech period.
Figure 13-4 Audio fragment analysis example
13.3.3 Reproducibility test for audio fragment analysis

To make sure that the analyzed audio fragments are not very sensitive to small
differences in the time frame used, two fragments have been analyzed twice using
deliberately different time frames. The results of this test are shown in the tables below,
using the following abbreviations:
• b, the signal before the speech measured by the student’s microphone
• s, the signal during the speech measured by the student’s microphone
• a, the signal after the speech measured by the student’s microphone
• t, the signal during the speech measured by the teacher’s microphone
Table 13-7 First results

Measurement   b       s       a       t       SNR
1             -54,3   -42,2   -55,3   -43,8   12,6
2             -55,4   -45,9   -56,0   -43,5   9,8
average       -54,9   -44,0   -55,6   -43,6   11,2
Table 13-8 Repeated results

Measurement   b       s       a       t       SNR
1             -53,1   -42,1   -55,2   -43,8   12,0
2             -55,4   -45,9   -57,3   -43,6   10,4
average       -54,2   -44,0   -56,3   -43,7   11,2
Table 13-9 Difference between two results
Measurement   b       s       a       t       SNR
1             1,2     0,0     0,1     -       -0,6
2             0,1     -0,0    -1,3    -0,1    0,6
average       0,6     -0,0    -0,6    -0,1    -0,0
From these tables we conclude that repeating the analysis can give different
results (mainly in the audio parts before and after the speech), so care must be taken
to analyze the correct part of the audio fragment. For the two fragments chosen here
the deliberate changes in the analyzed time frame cancel out in the average SNR and
mainly affect the before and after measurements. This is as expected, since these short
periods can be influenced significantly by a single noise event. The sound levels
measured during speech, at both the student and the teacher, are far less sensitive to
differences in the analysis.
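The SNR column of the tables above can be reproduced with a small sketch. The relation
used here, speech level minus the arithmetic mean of the levels before and after the
speech, is inferred from the tabulated numbers and should be read as an assumption, not
as the definition used in the thesis; the function name is illustrative.

```python
def fragment_snr(before_db: float, speech_db: float, after_db: float) -> float:
    """SNR in dB: speech level minus the mean of the two noise levels.

    Assumption: the noise level is the arithmetic mean of the 'before' and
    'after' levels; this matches the tables but is inferred, not quoted.
    """
    noise_db = (before_db + after_db) / 2.0
    return speech_db - noise_db


# Student-microphone levels (b, s, a) from Table 13-7 and Table 13-8.
first = [(-54.3, -42.2, -55.3), (-55.4, -45.9, -56.0)]
repeated = [(-53.1, -42.1, -55.2), (-55.4, -45.9, -57.3)]

for label, rows in (("first", first), ("repeated", repeated)):
    snrs = [fragment_snr(b, s, a) for b, s, a in rows]
    mean_snr = sum(snrs) / len(snrs)
    print(label, [f"{x:.2f}" for x in snrs], f"mean={mean_snr:.2f}")
```

Because the tabulated levels are rounded to one decimal, the recomputed values can
differ from the SNR column by about 0,1 dB.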
13.4 International regulation
The ISO 9921:2003 standard [xxxvi] regulates speech intelligibility and the vocal
effort of the speaker. Classroom communication falls in the category described in
paragraph 5.3 of [xxxvi], which means that a good intelligibility rating is required and
that at most a normal vocal effort can be demanded of the speaker, in this case the
teacher. Person-to-person communication in work situations, offices, meeting rooms and
auditoria all falls under paragraph 5.3, since a relaxed type of communication is in
place there. In these situations, which occur in offices and during meetings, lectures
and performances and which last for a longer period of time, a good level of
intelligibility is recommended at a normal vocal effort. Critical communication in short
sentences also falls under paragraph 5.3, but here a fair intelligibility is sufficient,
with a loud vocal effort. All intelligibility demands and vocal efforts are summarized
in Table 13-10.
Table 13-10 Recommended minimal performance ratings for intelligibility and vocal effort in four
applications, from [xxxvi]
Application                                   Minimum intelligibility   Maximum        Description
                                              rating                    vocal effort   (paragraph)
Alert and warning situations                  Poor                      Loud           5.2
(correct understanding of simple sentences)
Alert and warning situations                  Fair                      Loud           5.2
(correct understanding of critical words)
Person-to-person communications (critical)    Fair                      Loud           5.3
Person-to-person communications               Good                      Normal         5.3
(prolonged normal communication)
Public address in public areas                Fair                      Normal         5.4
Personal communication systems                Fair                      Normal         5.5
The vocal effort is expressed by the equivalent continuous A-weighted sound-pressure
level of speech measured at a distance of 1 m in front of the mouth. The relation
between vocal effort and the corresponding level is given in Table 13-11 for a typical
male speaker.
Table 13-11 Vocal effort of a male speaker and related A-weighted speech level (dB re 20 μPa) at 1 m
in front of the mouth.
Vocal effort   LS,A,1m [dB]
Very loud      78
Loud           72
Raised         66
Normal         60
Relaxed        54
A more elaborate explanation of vocal effort is given in appendix A of the ISO
9921:2003 standard [xxxvi]. The intelligibility rating is given by the STI value or by
objective tests; the relation to the STI is shown in Table 13-12.
Table 13-12
STI value   Intelligibility score
0-0,3       Bad
0,3-0,45    Poor
0,45-0,6    Fair
0,6-0,75    Good
0,75-1,0    Excellent
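Tables 13-10 to 13-12 can be combined into a small check of the classroom requirement
(a good rating at no more than a normal vocal effort). The sketch below is illustrative
only. It assumes that an STI value on a class boundary belongs to the higher class and
that any speech level up to the tabulated 60 dB counts as a normal vocal effort; neither
choice is specified in the text above, and all names are hypothetical.

```python
# Lower STI bound of each rating, taken from Table 13-12.
# Assumption: a value exactly on a boundary belongs to the higher rating.
STI_RATINGS = [(0.75, "Excellent"), (0.6, "Good"), (0.45, "Fair"),
               (0.3, "Poor"), (0.0, "Bad")]

# A-weighted speech level at 1 m per vocal effort, from Table 13-11 (dB).
VOCAL_EFFORT_DB = {"Relaxed": 54, "Normal": 60, "Raised": 66,
                   "Loud": 72, "Very loud": 78}


def sti_rating(sti: float) -> str:
    """Map an STI value between 0 and 1 to its intelligibility rating."""
    if not 0.0 <= sti <= 1.0:
        raise ValueError("STI must lie between 0 and 1")
    for lower_bound, rating in STI_RATINGS:
        if sti >= lower_bound:
            return rating
    raise AssertionError("unreachable")


def meets_classroom_requirement(sti: float, speech_level_db: float) -> bool:
    """Good intelligibility at no more than a normal vocal effort."""
    good_enough = sti >= 0.6
    effort_ok = speech_level_db <= VOCAL_EFFORT_DB["Normal"]
    return good_enough and effort_ok


print(sti_rating(0.52), meets_classroom_requirement(0.52, 60))
print(sti_rating(0.68), meets_classroom_requirement(0.68, 58))
```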
13.5 Dutch Noise Regulation
In the Netherlands the “Wet Geluidshinder” and the “Bouwbesluit” regulate noise
levels and noise reduction/shielding levels. Additional information on the regulation
can be found in the NVN 3438 standard [xxxvii]. This standard regulates noise levels and
reverberation times for all rooms and circumstances.
The law states, among other things, that noise levels at the front of the school should
be measured from 7:00 until 19:00, since the noise levels in the evening and at night
are not relevant if the school is not in use at that time. The noise levels are measured
per noise source, such as industry, roads, railways or airplanes, and are in general
limited to 50 dB(A) at the front of the building under consideration. For educational
buildings the noise level is the equivalent sound level in dB(A) over the time period
specified above. The equivalent noise level (LAeqw) is given by Equation 13-5.
$$L_{Aeqw} = 10 \log_{10}\left( \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} \frac{p_A^2(t)}{p_0^2}\, dt \right)$$
Equation 13-5 Equivalent noise level in dB.
Here pA is the A-weighted sound pressure in Pa, p0 is the reference pressure of 20 μPa,
and t1 and t2 are the start and end of the measurement period.
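For sampled data the integral in Equation 13-5 becomes a mean of squared samples. The
sketch below assumes equally spaced samples of an already A-weighted pressure signal;
the function name and the example signal are hypothetical and serve only to illustrate
the formula.

```python
import math
from typing import Sequence

P0 = 20e-6  # reference pressure in Pa


def equivalent_level(pressures_pa: Sequence[float]) -> float:
    """Equivalent level in dB of equally spaced A-weighted pressure samples.

    With uniform sampling, the time average in Equation 13-5 reduces to the
    mean of the squared samples. An all-zero signal has no defined level.
    """
    if not pressures_pa:
        raise ValueError("at least one sample is required")
    mean_square = sum(p * p for p in pressures_pa) / len(pressures_pa)
    return 10.0 * math.log10(mean_square / P0 ** 2)


# Hypothetical signal: quiet for 90% of the time, ten times louder for 10%.
samples = [0.002] * 90 + [0.02] * 10
print(round(equivalent_level(samples), 1))
```

Because the average is taken over squared pressures, the short loud period dominates
the result: the equivalent level lies much closer to the loud level than to the quiet
one.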
Further regulation is left to the Bouwbesluit, of which the 2003 version is
considered here. The information relevant for this research is found in paragraphs
3.1-3.5 on noise control. The regulation for educational buildings is discussed here;
further information can be found in [xxxviii].
• A building protects the residential area from noise of its surroundings.
• The external walls of educational buildings should provide a sound reduction,
determined according to NEN 5077, of no less than the difference between the
external noise (of industry, road and railway) and the internally allowed noise (30
dB(A) for noise sensitive rooms⁴ and 35 dB(A) for other educational rooms), with a
minimum of 20 dB(A).
• If by law a higher noise level (of industry, road and railway) is allowed at the
front of the school, the noise reduction, determined according to NEN 5077,
should be no less than the difference between that noise level and the maximally
allowed internal level in dB(A).
• Internal walls that are adjacent to non noise sensitive rooms should also provide
enough shielding to assure noise levels of at most 30 dB(A), with an upward
margin of 2 dB(A). For example, the walls shielding a classroom from the adjacent
auditorium should reduce the noise to 32 dB(A), averaged over the time interval
specified above.
• In case of air traffic noise the noise shielding is determined by Table 13-13; if
the noise level lies between the listed Ke values, linear interpolation yields the
minimal noise shielding. The same reasoning as above applies to internal walls
for air traffic noise.
Table 13-13 Source Bouwbesluit 2003 [xxxviii]

Noise shielding in case of Air traffic noise
Noise level in Ke Minimal noise shielding in dB(A)
36-40 30-33
41-45 33-36
46-50 36-40
>50 40
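The facade rule and the interpolation in Table 13-13 can be sketched as follows. This
is one possible reading, not the official calculation: the band edges of the table are
taken as breakpoints and joined by straight lines, values below the lowest tabulated
Ke level are treated as not covered, and all function and variable names are
hypothetical.

```python
from typing import Optional

# Allowed internal level in dB(A) per room type, as listed above.
ALLOWED_INTERNAL_DBA = {"noise_sensitive": 30.0, "other_educational": 35.0}


def required_facade_reduction(external_dba: float, room_type: str) -> float:
    """External level minus allowed internal level, never below 20 dB(A)."""
    return max(external_dba - ALLOWED_INTERNAL_DBA[room_type], 20.0)


# (Ke, dB(A)) breakpoints read from Table 13-13.
# Assumption: the band edges are joined by straight lines.
KE_BREAKPOINTS = [(36.0, 30.0), (40.0, 33.0), (45.0, 36.0), (50.0, 40.0)]


def air_traffic_shielding(ke: float) -> Optional[float]:
    """Minimal shielding in dB(A); None below the lowest tabulated Ke value."""
    if ke < KE_BREAKPOINTS[0][0]:
        return None
    if ke >= KE_BREAKPOINTS[-1][0]:
        return KE_BREAKPOINTS[-1][1]
    for (ke_lo, db_lo), (ke_hi, db_hi) in zip(KE_BREAKPOINTS, KE_BREAKPOINTS[1:]):
        if ke_lo <= ke <= ke_hi:
            fraction = (ke - ke_lo) / (ke_hi - ke_lo)
            return db_lo + fraction * (db_hi - db_lo)
    raise AssertionError("unreachable")


print(required_facade_reduction(58.0, "noise_sensitive"))
print(air_traffic_shielding(38.0))
```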
• In case of a renovation or temporary building the Mayor and Aldermen (Dutch:
Burgemeester en Wethouders, B&W) can reduce the required noise shielding, for
industry, road and railway noise as well as for air traffic noise, by a maximum
of 10 dB(A).
• In case of a temporary building Table 13-13 is overruled by Table 13-14.
Table 13-14 Source Bouwbesluit 2003 [xxxviii]

Noise shielding for temporary buildings in case of Air traffic noise
Noise level in Ke Minimal noise shielding in dB(A)
40-50 30-35
51-55 35-40
>55 40
⁴ Noise sensitive rooms are classrooms etc. and non-sensitive rooms are auditoria etc.
• A new building protects against noise from installations.
• A flushable toilet, a tap, a mechanical ventilation system, a hot water device, an
installation to increase water pressure or an elevator on the same parcel may
cause a maximum of 30 dB(A), measured according to NEN 5077; B&W can increase
this value by 10 dB(A) in case of renovation or a temporary building.
• A new building protects against noise from adjacent rooms with the same function
(like education).
• The insulation index for airborne noise, determined according to NEN 5077, for
the sound transfer from playgrounds, gyms etc. to a room where education is given
is at least 0 dB.
• The insulation index for contact noise (via floor, walls, ceiling), determined
according to NEN 5077, for the sound transfer from playgrounds, gyms etc. to a
room where education is given is at least 10 dB.
• The insulation index for airborne noise, determined according to NEN 5077, for
the sound transfer from rooms where materials are being processed with tools to a
room where education is given is at least 0 dB.
• The insulation index for contact noise (via floor, walls, ceiling), determined
according to NEN 5077, for the sound transfer from rooms where materials are being
processed with tools to a room where education is given is at least 10 dB.
• Here again B&W can reduce these indices by 10 dB in case of a renovation.
• The building must be built in a way that assures sound absorption such that
nuisance from reverberation is limited.
• For classrooms the average reverberation time, measured according to NEN 5077,
must be below 1 s.
• A new building must provide protection against noise crossing over between
different usage functions.
• The insulation index for airborne noise, determined according to NEN 5077, for
transmission from a closed room to a usage function situated on another parcel is
at least 0 dB.
• The insulation index for contact noise, determined according to NEN 5077, for
transmission from a closed room to a usage function situated on another parcel is
at least 0 dB.
• The insulation indices for airborne noise and for contact noise, for transmission
from a closed room to another closed room situated on another parcel, are each at
least -5 dB.
• The insulation indices for airborne noise and for contact noise, for transmission
from a closed room to a residential function situated on the same parcel, are each
at least 0 dB.
• The insulation indices for airborne noise and for contact noise, for transmission
from a closed room to another closed room, not belonging to a residential
function, situated on the same parcel, are each at least -5 dB.
• For transmission from a residential area used as playground or where materials are
being processed with tools to an education function situated on the same parcel,
the insulation index for airborne noise is at least 0 dB and the insulation index
for contact noise is at least 10 dB.
• All these indices can be lowered by B&W by 10 dB(A) in the case of a
renovation or temporary building.
From the NVN 3438 standard we learn that the maximum noise levels in
classrooms also depend on the distance from the teacher to the students. We can safely
assume that for some students this distance is over three meters at all times, so the
maximum noise values in Table 13-15 are lowered by 5 dB(A) (down to a minimum of 35
dB(A)). For distances below one meter the values are increased by 5 dB(A). The standard
gives a similar table for the degree of concentration, shown in Table 13-16.
These base levels are corrected for the kind of noise, as listed below Table 13-16;
different corrections do not add up, only the largest correction is used. Again, the
resulting levels do not go below 35 dB(A).
Table 13-15 Noise base levels for communication categories, from NVN 3438 standard [xxxvii].
Category Communication level Noise base value (dB(A))
A None 80
B Very low 75
C Low 65
D Intermediate 55
E Fair 45
F High 35
Table 13-16 Noise base levels for concentration categories, from NVN 3438 standard [xxxvii].
Category Concentration level Noise base value (dB(A))
A None 80
B Low 75
C Intermediate 55
D High 35
1. Steady noise
• Steady noise at constant sound level (computer cooling fan); no correction.
• Steady noise at varying sound levels (machine turning on and off, refrigerator);
5 dB(A) correction.
• Noise contains sudden increases in sound level and/or tonal components; 10
dB(A) correction.
2. Information content of the noise
• No information; no correction.
• Does contain information (conversation of others); 10 dB(A) correction.
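The way the base values, the distance adjustment and the corrections combine can be
sketched as below. Two points are assumptions, since the text does not spell them out:
the corrections are taken to lower the base value, and the distance adjustment is
applied separately from the noise-type corrections. The names are illustrative only.

```python
from typing import Sequence

# Noise base values per communication category, from Table 13-15 (dB(A)).
COMMUNICATION_BASE_DBA = {"A": 80, "B": 75, "C": 65, "D": 55, "E": 45, "F": 35}
FLOOR_DBA = 35  # no target level below this value


def target_noise_level(category: str, distance_m: float,
                       corrections_dba: Sequence[int]) -> int:
    """Maximum noise level in dB(A) for a communication category.

    Assumptions: corrections lower the base value, and the distance
    adjustment is applied separately from the noise-type corrections.
    """
    level = COMMUNICATION_BASE_DBA[category]
    if distance_m > 3.0:
        level -= 5
    elif distance_m < 1.0:
        level += 5
    # Corrections do not add up: only the largest one is applied.
    level -= max(corrections_dba, default=0)
    return max(level, FLOOR_DBA)


# Fluctuating steady noise (5) that also carries speech from others (10).
print(target_noise_level("D", 4.0, [5, 10]))
```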
The different categories are explained in [xxxvii] and will not be discussed
here. It is clear that educational tasks are classified in the highest
concentration and communication classes.
The NVN 3438 standard [xxxvii] also provides guidelines for reverberation times; Figure
13-5 is copied from the standard.
The exact measurement method for the reverberation time is explained in appendix B of
[xxxvii] and is left out here, since a different setup is used in this project.
Figure 13-5 RT guidelines as a function of room volume for multiple concentration levels, from
[xxxvii]
13.6 European Guidelines
An elaborate discussion of the separate national guidelines is left out, since it
is beyond the scope of this research; nonetheless a comparison between the different
regulations is useful. Therefore the values are simply copied into this chapter without
further discussion; all values and figures are from [xxxix] unless stated otherwise. It
should be noted that these values apply to empty classrooms. The ISO 9921:2003
standard gives minimum requirements for the speech intelligibility and for the
maximum vocal effort that can be demanded from the speaker.
13.6.1 Reverberation Time
The RT is standardized in Portugal, France, Belgium and Italy, among other
countries.
In Portugal the maximum RT is coupled to the frequency; for the range from
500-4000 Hz, which is most representative of human speech, the maximum RT for
classrooms is 0,8 s and the minimum 0,6 s.
In France the RT is coupled to the volume of the room: for classrooms
smaller than 250 m3 the range of the RT is 0,4-0,8 s and for larger rooms it is 0,6-1,2 s.
In Italy the RT depends on both room volume and frequency, as shown in
Figure 13-6 and Figure 13-7. From these figures we conclude that an RT of 1 s is allowed
at 2000 Hz, where the RT has its minimum, when the room volume is larger than 500 m3.
Figure 13-6 Italian standard for the maximal RT as a function of the room volume
Figure 13-7 Italian standard for the maximal RT as a function of frequency
In Belgium the RT is determined by the “Type-Bestek” 110 (1979) and is also a
function of the room volume, as shown in Figure 13-8. At a room volume of 250 m3 the
RT has a range of about 0,65-1 s.
Figure 13-8 Belgian RT as a function of room volume.
Besides these national regulations, a lot of research has been done and many
recommendations have been made, with varying results, so for now we will assume that
the above values are sufficient.
All values are shown in Table 13-17.
Table 13-17 Reverberation Time Limits for several European Countries
          Netherlands   Portugal   France   Italy   Belgium
          [s]           [s]        [s]      [s]     [s]
minimum                            0,4              0,7
maximum   1             0,8        0,8      1       1
13.6.2 Signal to Noise Ratio
The SNR depends not only on the classroom acoustics but also on the sound
level of the speech to be received and the distance to the source. Guidelines on
these values are therefore very hard to obtain. Furthermore, different groups of
listeners (like people with a hearing impairment, the elderly or the young) require
different SNRs than people with normal hearing to obtain a good Speech Intelligibility
(SI). It is generally accepted that an SNR of 15 dB should result in a good SI; this is
also the maximal value in the STI method. For people who have difficulties with
hearing or understanding speech, an SNR of 20 dB is the optimal value. This value is
mainly found in the literature for people with a mental handicap or hearing impairment,
as in [xl].
Unlike the SNR itself, the background noise can be directly influenced by
acoustical parameters, and guidelines for it are therefore more readily available in the
literature. Portuguese regulation states that the background noise must not exceed 35 dB.
French, Italian and Belgian regulations state that the background noise must not exceed
38 dB, 36 dB and 30-45 dB respectively. The Belgian regulation lets the allowed
background noise vary with the surroundings of the school, so for louder
surroundings a higher background noise level is allowed. Here again the literature in
general agrees with these values.
All values are shown in Table 13-18.
Table 13-18 Maximum acceptable background noise levels for several European countries
Netherlands Portugal France Italy Belgium

in
(LAeq) (LAeq) (LAeq) (LAeq)
[dB(A)] [dB(A)] [dB(A)] [dB] [dB(A)]
Classroom 30 35 38 36 30-45
Music room 35 30-40
Gym 40 43 40 35-50
A similar table, taken from [xli], is shown in Table 13-19.
Table 13-19 Guidelines for Reverberation time and Noise levels from several regulations and
researchers.
                                      Reverberation       Background noise level      Other
                                      (R-60)              Room          Equipment
                                                          NC     dBA    RC     dBA
ASHA/Consortium Recommendations⁵      0.4 seconds         20     30     --     --     Signal-to-noise ratio >15 dB
ASA Recommendations                   0.6 - 0.8 seconds   --     30-35  25     --     Signal-to-noise ratio +15 dB
ANSI S12.2-1995                       --                  25-30  --     25-30  --
Acoustic Guidelines, Swedish Board    equivalent to 0.6   30     --     --     30     90% ceiling area absorbent; walls
of Housing, Building and Planning     seconds                                         provide 44 dB sound reduction
(1994)
School Standard, Portugal             --                  --     --     --     --     New schools not permitted where
(DIN 254/87)                                                                          sound level (exterior) >65 dBA
--classrooms generally                1 - 1.3 seconds     30     --     --     --     Walls permit <50 dB sound transmission
--classrooms for students with        0.4 - 0.6 seconds   25     --     --     --     Walls permit <45 dB sound transmission
  hearing loss
Equipment Standard, Los Angeles       --                  --     --     25-30  35     --
County Unified School District
Washington State Health Department    --                  --     35     --     --     --
WAC 248-64-320
Architectural Acoustics, Egan⁶        --
--classrooms generally                0.6 - 0.8 seconds   30-35  38-42  25     34     Walls permit <50 dB sound transmission
--classrooms for students with        < 0.5 seconds       25-30  --     20     --     Walls permit <35 dB sound transmission
  hearing loss
Sound field Amplification,            < 0.4 seconds       25     35     --     --     --
Crandell et al.
Range of classroom recommendations    --                  --     30-47  --     --     --
from 18 acoustics textbooks⁷

⁵ A consortium of organizations representing persons with hearing, speech, and language impairments
(Alexander Graham Bell Association for the Deaf, Inc., the American Speech-Language-Hearing
Association (ASHA), Auditory-Verbal International, Inc., The National Center for Law and Deafness, The
National Cued Speech Association, and Self Help for Hard of Hearing People (SHHH)) organized to
submit consensus recommendations on classroom acoustics in comment to the Board's proposed rule on
access to State and local government facilities.
⁶ Egan classifies a Noise Criteria range of less than 20 as necessary for excellent listening conditions, as in
concert halls, broadcast and recording studios, recital halls, large auditoriums, and churches. An NC range
between 20-30 produces ‘very good’ listening conditions, appropriate for theaters, small auditoriums, large
meeting rooms, teleconferencing facilities, executive offices, courtrooms, chapels, and large meeting
rooms. An NC range of 25-35 is recommended for sleeping rooms. NC 30-35 will produce ‘good’ listening
conditions for offices, small meeting rooms, libraries, and classrooms.
⁷ Taken from Rettinger, Michael, A Handbook of Architectural Acoustics and Noise Control: A Manual for
Architects and Engineers, TAB, Blue Ridge Summit, PA, 1988 (pp. 232-233).
13.7 ANSI Classroom Requirements
In June 2002 the American National Standards Institute, Inc. (ANSI) released a
new classroom acoustics standard entitled “Acoustical Performance Criteria, Design
Requirements, and Guidelines for Schools” (ANSI S12.60-2002). This standard was
developed by an interdisciplinary working group in cooperation with the U.S.
Architectural and Transportation Barriers Compliance Board (the Access Board).
The need for a standard has been researched on multiple occasions; one example is the
1995 study by the General Accounting Office (GAO) in America. It concluded that
noise was the single most prevalent problem in classrooms, and it turned out that not
only children with a hearing impairment had problems with the noise present. The noise
also influenced the speech perception of:
• People with permanent hearing problems due to sensorineural, conductive, mixed
or central losses, which affected approximately 10-15% of school aged children
• Children with an attention deficit disorder, learning disabilities, phonological
disorders, auditory processing deficits or specific language impairment
• People whose native language is not the language spoken in class
• Children with ear infections
• Children with Otitis Media with Effusion (OME), a temporary condition that is
very common in young children that causes impaired hearing; this is now the
most common cause for children to visit doctors, affecting approximately 2/3 of
first-graders
It even turned out that children in general had problems with noise, since their
language skills are not yet fully developed. All these groups therefore need a better
signal to noise ratio and lower reverberation times.
This study, along with many others, resulted in the current ANSI S12.60 standard, which
states that:
• Background noise levels due to steady noise sources such as road traffic and
Heating, Ventilating, and Air-conditioning (HVAC) systems, are limited to an
overall A-weighted sound level of 35 dB and an overall C-weighted sound level
of 55 dB in most classrooms.
• When the noisiest hour is dominated by unsteady noise from transportation
sources (aircraft, highways, and trains), the limits for most classrooms are: (1) an
hourly average A-weighted sound level of 40 dB, and (2) the A-weighted sound
level must not exceed 40 dB for more than 10% of the hour.
• The limits for all large core learning spaces (volumes over 20,000 cubic feet) and
all ancillary learning spaces (spaces used for informal learning and social
interaction) are 5 dB higher than the limits of the two paragraphs above. Core
learning spaces include classrooms, instructional pods, libraries, music rooms, etc.
Ancillary learning spaces include corridors, gymnasia, cafeterias, etc.
• The T60 reverberation time in typically sized core learning spaces (up to 10,000
cubic feet) must not exceed 0.6 seconds. The maximum reverberation time in
larger core learning spaces (10,000 to 20,000 cubic feet) must not exceed 0.7
seconds. General recommendations are provided for all ancillary learning spaces
as well as core learning spaces over 20,000 cubic feet.
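The steady-noise and reverberation limits listed above can be collected in a small
lookup. This is a sketch under stated assumptions, not an implementation of the
standard: it covers only the steady-noise limits and the T60 limits (the rules for
unsteady transportation noise are left out), it treats a volume of exactly 10,000 cubic
feet as a typically sized space, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Limits:
    steady_dba: int            # steady background noise, A-weighted
    steady_dbc: int            # steady background noise, C-weighted
    rt60_s: Optional[float]    # None: only general recommendations apply


def ansi_limits(volume_ft3: float, ancillary: bool = False) -> Limits:
    """Limits for a learning space as summarized in the list above."""
    large = volume_ft3 > 20_000
    # Large core spaces and all ancillary spaces get 5 dB higher noise limits.
    bonus = 5 if (large or ancillary) else 0
    if ancillary or large:
        rt60 = None
    elif volume_ft3 <= 10_000:
        rt60 = 0.6
    else:
        rt60 = 0.7
    return Limits(35 + bonus, 55 + bonus, rt60)


def complies(volume_ft3: float, level_dba: float, level_dbc: float,
             rt60_s: float, ancillary: bool = False) -> bool:
    """Check measured steady noise levels and T60 against the limits."""
    limits = ansi_limits(volume_ft3, ancillary)
    noise_ok = level_dba <= limits.steady_dba and level_dbc <= limits.steady_dbc
    rt_ok = limits.rt60_s is None or rt60_s <= limits.rt60_s
    return noise_ok and rt_ok


print(ansi_limits(8_000))
print(complies(8_000, level_dba=38, level_dbc=50, rt60_s=0.5))
```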
The standard also regulates the sound insulation between a classroom and its
surroundings, through walls, ceiling and floor. These values are expressed in the Sound
Transmission Class (STC) and the Impact Insulation Class (IIC), the latter describing
the insulation against impact noise provided from one space to the one below it. We will
not go into these values in this research; interested readers are directed to [xlii].
14 Bibliography
[i] Evans, G.W., Bullinger, M., Hygge, S., Chronic noise exposure and psychological
response: a prospective study of children living under environmental stress.
[ii] Noise from Civilian Aircraft in the Vicinity of Airports: Implications for Human
Health, Health Canada.
[iii] Evans, G.W., and Maxwell, L (1997), "Chronic noise exposure and reading deficits:
The mediating effects of language acquisition," Environment and Behavior 29(5), 638-
656.
[iv] Lukas, J.S., DuPree, R.B and Swing, J.W, (1981) "Effects of noise on academic
achievement and classroom behavior", Office of Noise Control, Cal. Dept. of Health
Services, FHWA/CA/DOHS-81/01, Sept 1981.
[v] Bronzaft, A.I. and McCarthy, D.P. (1975), "The effect of elevated train noise on
reading ability," Environmental Behavior, 7, 517-528.
[vi] Bronzaft, A.L. (1982) "The effect of a noise abatement program on reading ability",
J. Environmental Psychology, 1, 215-222
[vii] Sutherland, L.C., and Lubman, D., The Impact of Classroom Acoustics on
Scholastic Achievement, 17th Meeting of the International Commission for Acoustics,
Rome, Italy, Sept. 2-7, 2001.
[viii] Crandell, C., Smaldino, J., (2000), Classroom Acoustics for children with normal
hearing and with hearing impairment
[ix] Beranek, L. (1954). Acoustics. New York: McGraw-Hill
[x] Peutz, V. (1971). Articulation loss of consonants as a criterion for speech
transmission in a room. Journal of the Audio Engineering Society, 19, 915–919
[xi] Leavitt, R., & Flexer, C. (1991). Speech degradation as measured by the Rapid
Speech Transmission Index (RASTI). Ear and Hearing, 12, 115–118.
[xii] Sabine, W. (1964) Collected Papers on Acoustics. Dover Publications
[xiii] Crandell, C., Smaldino, J., (1995), An update for classroom acoustics for children
with a hearing impairment. Volta review, 1, 4-12
[xiv] Crandell, C., & Bess, F. (1986, October). Speech recognition of children in a
“typical” classroom setting. Asha, 29, 87.
[xv ] http://de.wikipedia.org/wiki/Deutlichkeit_(Akustik)
[xvi] V. M. A. Peutz, “Articulation loss of consonants as a criterion for speech
transmission in a room”, 1971
[xvii] Bradley, John S.; “Predictors of speech intelligibility in rooms”, in: Journal of the
Acoustical Society of America, 1986, (80) nr. 3, pp. 837-845.
[xviii] T. Houtgast, H. Steeneken, "A Review of the MTF Concept in Room
Acoustics and its use for Estimating Speech Intelligibility in Auditoria," J. Acoust.
Soc. Am., Vol. 77, No. 3, (1985).
[xix] French, N. R., and Steinberg, J. C. (1947). "Factors governing the intelligibility of
speech sounds," J.Acoust.Soc.Am. 19, 90-119
[xx] Houtgast, T and Steeneken, H.J.M. (1971) “ Evaluation of Speech Transmission
Channels Using Artificial Signals”, Acustica 25, 355-367
[xxi] Steeneken, H.J.M., Houtgast, T, The revised STI r method, Speech Communication,
v.28 1999
[xxii] Houtgast, T, Steeneken, H.J.M. et al (2002) Past, present and future of the Speech
Transmission Index, 24
[xxiii] Duquesnoy, A.J.H.M. and Plomp, R. (1980) “Effect of reverberation and noise on
the intelligibility of sentences in case of presbyacusis”. J. Acoust. Soc. Am. 68,
p.537-544.
[xxiv] Steeneken, H. J. M., & Houtgast, T. (1980). A physical method for measuring
speech-transmission quality. Journal of the Acoustical Society of America, 67, 318-326.
[xxv] IEC_60268-16_2003 International standard, Sound system equipment – Part 16:
Objective rating of speech intelligibility by speech transmission index
[xxvi] Hongisto, V., (2005) A model predicting the effect of speech of varying
intelligibility on work performance. Blackwell Munksgaard
[xxvii] Finitzo-Hieber, T., & Tillman, T. (1978). Room acoustics effects on monosyllabic
word discrimination ability for normal and hearing-impaired children. Journal of Speech
and Hearing Research, 21, 440–458.
[xxviii] Crandell, C., Smaldino, J., & Flexer, C. (1995). Sound field FM amplification:
Theory and practical applications. San Diego, CA: Singular Press.
[xxix] Coninx, F., Konstruktion und Normierung des Adaptiven Auditiven Sprach Test
(AAST), 2005
[xxx] www.phonak.com
[xxxi] Real Time Walkthrough Auralization - The First Year (about CATT-Walker™)
B.-I. Dalenbäck CATT, M. Strömberg Valeo Graphics, IOA Copenhagen, May 2006
[xxxii] The Rediscovery of Diffuse Reflection in Room Acoustics Prediction
Bengt-Inge Dalenbäck, ASA Cancun, December 2002
[xxxiii] Verification of Prediction Based on Randomized Tail-Corrected Cone-Tracing
and Array Modeling, B.-I. Dalenbäck, 137th ASA/2nd EAA Berlin (March 1999)
[xxxiv] General ray tracing Procedure, G. H. Spencer and M. V. R.K. Murty (1962)
[xxxv] http://www.duran-audio.com/
[xxxvi] ISO 9921:2003 Ergonomics, Assessment of speech communication.
[xxxvii] NVN_3438 Standard, Ergonomics. Annoyance due to noise at the workplace.
Target values for noise levels and reverberation time in relation to disturbance of
communication and concentration.
[xxxviii] Bouwbesluit 2003
[xxxix] Vallet, Invited Paper: “Some European Standards on Noise in Educational
Buildings”.
[xl] Van Berlo, D. “De akoestische kwaliteit in woon- en werkruimte”, in: Meuwese-
Jongejeugd, J.; Van Berlo, D.; (ed.), Akoestische aanpassingen in zorginstellingen voor
mensen met een verstandelijke handicap en slechthorendheid, Delft, 2000
[xli] http://www.access-board.gov/acoustic/Acoustics-request.htm
[xlii] ANSI S12.60 (2002) Acoustical Performance Criteria, Design Requirements, and
Guidelines for Schools.