GDP 7
Electronic Timpani
ABSTRACT
A group design project report submitted for the award of Master of Engineering
The timpani is a large orchestral instrument that lends weight and substance to music.
It must be large to produce its characteristic low frequencies, but its size renders it
awkward, expensive and difficult to transport. A training instrument could be used to
learn the skills of timpani playing, without requiring access to a set of classical timpani.
A design for an electronic timpani is here proposed to meet this requirement.
The system described includes a bespoke ‘pad’ that senses the player’s motions, an
electronic circuit for initial signal conditioning and a software program that emulates
the output sound of a timpani based on these inputs. The algorithms required for input
processing are detailed, along with several areas for their future development. During
the course of the project a series of timpani recordings was collected. This then formed
the basis of the sound generation algorithms. It is likely that any future projects will
wish to improve and expand this collection of recordings.
A working prototype has been produced which can be used as a platform for further
development. This project may eventually evolve into a commercially viable product.
Contents
Acknowledgements
1 Introduction
1.1 Project Brief
1.2 Objectives
1.3 Introduction to the Timpani
1.3.1 History and Role
1.3.2 Timpani Constitution
1.3.3 Playing a Timpani
1.4 Ambiguous Terms
4 Specification
4.1 General attributes
4.2 Playing area
4.3 Operation
4.4 Sound
5 Hardware
7 Input Processing
7.1 Accessing Serial Port Data
7.1.1 POSIX
7.1.2 Processing Requirements Issues
7.1.3 Framework of Serial Port Software
7.1.4 Interpretation of Serial Data
7.2 Input Processing Operation
7.2.1 Introduction
7.2.2 Framework of Input Processing Software
7.2.3 Buffering
7.2.4 Noise
7.2.5 Zero Point
7.2.6 Finding the Noise Threshold and the Zero Point
7.2.7 Hit Strength
7.2.8 Peak Detection
7.2.9 Determining Radial Distance
7.2.10 Ringing Effects and Bouncing of Pad
7.2.11 Damping Detection
7.2.12 Trade-off
7.2.13 Main Structure of Software
7.2.14 Interfacing with the Sound Generation
7.3 Future work
7.3.1 Digital Filtering
7.3.2 Machine Learning
7.3.3 Improved Peak Detection
7.3.4 Improved Hardware Implementations
8 Sound Generation
8.1 Anechoic Chamber Recordings
8.1.1 Purpose of Experiment
8.1.2 Method
8.1.3 Conclusion
8.2 Wavetable Synthesis
8.3 Pitch Shifting
8.3.1 Zero Order Hold / Linear Interpolator
8.3.2 Looping
8.3.3 Waveform Based Pitch Shifter
8.3.4 Separate Source/Filter Pitch Shifting
8.4 Data Driven Modelling Approach
8.4.1 Principal Component Analysis
8.4.2 Implementation
8.5 Dataspace Creation
8.5.1 File Index
8.5.2 Clipping
8.5.3 Noise
8.5.4 Frequency of Recordings
8.5.5 Hit Strength of Recording
8.5.6 Trimming
8.5.7 Tailing
8.5.8 Forming the Dataspace
8.6 Model
8.6.1 Recording Generation
8.6.2 Frequency Interpolation
8.6.3 Hit Strength Interpolation
8.7 Damping and Glissando
8.7.1 Glissando
8.7.2 Damping
8.8 Implementation
8.8.1 Data Types
8.8.2 Global Data Structures
8.8.3 Testing
12 Conclusions
12.1 Success
12.2 Team
12.3 Input Handling
12.4 Performance
12.5 Product Viability
Bibliography
Glossary
B Table of Authorship
C Time Planning
D Software Listings
D.1 timpani.h
D.2 PIC program (timp8.c)
D.3 Matlab Prototype Sound Generation Modelling
H Timpani Photographs
J M-File index
K CD Contents
L Recording Index
Acknowledgements
The team would like to express their gratitude to the following for their invaluable help:
Mr John Abendstern, timpanist with the Welsh National Opera Orchestra, for initiating
and supporting the project, and for his generosity and encouragement
Dr Steve Gunn, for initiating and supervising the project, providing technical assistance
and guidance
Dr Christine Shadle for helping supervise the project and the generous provision of
laboratory space
Dr Matthew Wright for providing invaluable comments and criticism from an acoustics
perspective
Dr John Fithyan for allowing access to the ISVR anechoic chamber
Dr Keith Holland for help with, and use of, recording equipment
Mr Rob Stansbridge for the loan of the sound level meter
Mrs Ros Austin for ordering the PC components
Chapter 1
Introduction
1.1 Project Brief
“Timpani are the most important orchestral percussion instruments. They play a central
role in the orchestra because they underline important chords. However, they are large
and expensive items. The goal of this project is a challenging but interesting one: to
design and build an electronic timpani that is portable, cheap and that reproduces the
fidelity of a classical timpani. Since the unit could be used for teaching, it should
imitate a classical timpani by producing sounds when struck with a mallet (the sound
will depend upon the region struck) and allow for damping when a hand is applied to
it. The project would be undertaken with John Abendstern, principal timpanist of the
Welsh National Opera Orchestra.” (33)
1.2 Objectives
There are several different inputs that the electronic timpani system should respond to.
These inputs were prioritised, in consultation with our sponsor and supervisors.
1. Strike strength
4. Damping
5. Glissando
6. Timpani type
7. Mallet type
For a useful training instrument at least the first four of the inputs must be included.
The project will therefore attempt to design and implement a prototype that can respond
to the first four, and create a reasonable timpani sound in real-time.
1.3 Introduction to the Timpani
1.3.1 History and Role
The timpani, also called ‘kettledrum’, was the first drum to be used in the orchestra over
300 years ago. The instrument is capable of a wide range of effects, from a low rumble to
resounding and powerful drumbeats. It can emphasise or add a crucial dramatic element
to the melody. The timpani is therefore the most important percussion instrument of
the orchestra.
1.3.2 Timpani Constitution
The timpani normally consists of a large copper bowl with a drumhead made of calfskin
or plastic stretched across the top. When struck with mallets of different composition
and hardness, the timpani produces a specific pitch that is determined by the drum’s
size. Most orchestras use four or five timpani of varying sizes, from 16 inches to 32 inches.
The pitch is controlled by tightening the drumhead with a mechanical foot-pedal. The
instrument is fine tuned by adjusting the tension of the drum head with a set of screws
arranged around the edge of the drum skin.
1.3.3 Playing a Timpani
strike can produce a wide range of effects. Due to the dynamics of the skin, and the
interaction with the outer shell, the sound produced by the timpani is strongly affected
by the position and strength of the strikes. A very common form of playing is a ‘roll’,
which entails fast consecutive strikes on the timpani head.
When necessary, the player attenuates the sound prematurely by damping the timpani’s
drum head, placing either a finger or the whole hand gently on the drum skin. Again,
changing the way damping is performed modifies the impact it has on the note.
Most timpani incorporate a mechanical pedal that changes the pitch of the sound. The
pedal alters the tension of the membrane in a range of approximately 3:1, which corresponds
to a musical sixth. The player can judge the position of the pedal by looking at
a tuning gauge mounted on the side of the timpani. If the pedal position is adjusted
whilst a note is still ‘ringing’, then a ‘glissando’ (or sliding note) will be generated.
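The tension range quoted above can be related to a musical interval with a short calculation. The following Python sketch is illustrative only: it assumes an ideal membrane, for which frequency is proportional to the square root of the tension, and ignores air loading and stiffness. The function name `interval_semitones` is hypothetical.

```python
import math

def interval_semitones(tension_ratio):
    """Interval in equal-tempered semitones for a given tension ratio.

    Assumes an ideal membrane, where frequency is proportional to the
    square root of the tension (air loading and stiffness are ignored).
    """
    frequency_ratio = math.sqrt(tension_ratio)
    return 12.0 * math.log2(frequency_ratio)

semitones = interval_semitones(3.0)
print(f"3:1 tension -> {semitones:.2f} semitones")
```

Under this assumption a 3:1 tension range gives roughly 9.5 semitones, which lies between a major sixth (9 semitones) and a minor seventh (10 semitones).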
The timpanist is required to have a good musical ear, as the drums may need to be
tuned several times and to different pitches during a performance. Two or three notes
(depending on the number of drums) are named at the beginning of the score and these
are the notes that the player must initially tune to.
Finally, some orchestral musical pieces require quite obscure forms of timpani playing,
such as hits on the shell of the timpani.
1.4 Ambiguous Terms
Throughout the report, due to its inherent ambiguity, the term ‘sample’ has been avoided.
The following terms have been used:
• Recording: Recording of the sound created following a single strike of the timpani,
i.e. there is a recording of a strike for each ‘frequency’, ‘position’ and ‘hit
strength’
Chapter 2
Research into the Sound of Timpani
This project spans both the engineering and musical worlds and as such it is important
that commonly used terms are mutually comprehensible. Table 2.1 compares some
important terms.
2.1 Psychoacoustics
This section explores some useful background information on the psychoacoustics of
musical instruments.
2.1.1 Attack
The attack of a note includes both the initial transients and onset of harmonics formed
as an instrument is struck or excited in some way. Many of these initial harmonics, and
the frequency contribution formed from the initial transients, will quickly decay, leaving
the ‘ringing’ section of the note. The brain primarily uses the attack part of the note
to identify an instrument. Listeners cannot reliably identify musical instruments when
the onset and offset phases of the harmonics are removed. The attack contains a high
proportion of the information that gives the instrument its characteristic sound. Experiments
have demonstrated this by using the phase information from one instrument,
such as a clarinet, to replace the phase information from another instrument, such as a
saxophone. Listening tests showed that the majority of people identified the instrument
as the one from which the phase information had been taken. Fig 2.1 shows the onset
and offset phases of several different instruments (28).
2.1.2 Latency
Mode:         11    21    31    41             51
Freq. ratio:  1     1.5   2     2.44 (≈ 2.5)   2.9 (≈ 3)
2.1.3 Harmonics
Only the first five to seven harmonics of the fundamental note can be resolved separately
by the hearing system (i.e. if one of these harmonics were to be removed or altered it
would be clearly audible). For higher harmonics this is not the case. The pitch of
the instrument is governed by the relationship between these harmonics or ‘modes’. For
percussive instruments these modes form non-harmonic ratios which give a characteristic
percussive sound where it is more difficult to determine the pitch. The timpani sound
lies between that of a tonal and a percussive instrument, conveying a definite sense of pitch.
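To illustrate how close the timpani modes come to harmonic ratios, the frequency ratios listed earlier in this chapter for modes 11 to 51 can be compared with the nearest harmonic-series ratios (1 : 1.5 : 2 : 2.5 : 3). The Python sketch below expresses each deviation in cents. The dictionary and function names are illustrative and do not come from the project software.

```python
import math

# Mode labels and frequency ratios as listed in the text, relative to mode 11.
measured = {"11": 1.0, "21": 1.5, "31": 2.0, "41": 2.44, "51": 2.9}
# Nearest harmonic-series ratios (1 : 1.5 : 2 : 2.5 : 3).
harmonic = {"11": 1.0, "21": 1.5, "31": 2.0, "41": 2.5, "51": 3.0}

def cents(ratio, reference):
    """Deviation of `ratio` from `reference` in cents (1200 per octave)."""
    return 1200.0 * math.log2(ratio / reference)

deviation = {mode: cents(measured[mode], harmonic[mode]) for mode in measured}
for mode, value in deviation.items():
    print(f"mode {mode}: {value:+.1f} cents")
```

With the listed values, modes 41 and 51 come out roughly 42 and 59 cents flat of the harmonic ratios, while the lower three modes match them exactly.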
2.2.1 Modes
1. Air loading of the membrane lowers the frequency of the modes below 350Hz.
2. Modes of the air enclosed within the kettle interact with membrane modes of a
similar shape.
3. The bending stiffness of the membrane raises the frequency of the higher harmonics.
The first of these probably has the highest significance, whereas the latter three are
likely to have more effect on the note decay. It might be concluded that the shape of
the timpani is optimal, fine tuning the prominent partials into harmonic ratios through
the interaction of the enclosed modes of air within the kettle; however, it is suspected
that this is not a strong effect.
The most significant loss in the timpani membrane is due to the radiation of sound to the
air (more so than mechanical loss in the membrane, viscothermal loss in the confined air
and mechanical loss in the walls). A baffled membrane vibrating in its ‘01’ mode radiates
its energy far more quickly than other modes and therefore decays rapidly. The ‘0n’ modes
are excited by a centre hit, which generally produces an unpleasant sound. The ‘01’ mode
radiates as a monopole because it is enclosed within the kettle (fig. 2.4). In the absence
of the kettle this would become an unbaffled source and radiate as a dipole
from either side of the membrane. Other modes radiate as dipoles or quadrupoles,
and as such the energy from these modes is radiated less rapidly (40).
A centre hit on the timpani therefore decays quickly, as it excites
the ‘0n’ modes, which partly accounts for the ‘dull thump’ sound produced. The other
reason for the less tonal nature of a centre hit is that the ‘0n’ modes excited are not in
tuned harmonic ratios.
Good numerical models for the timpani already exist (1), detailing the evolution over
time of a timpani’s sound and vibration. In essence, the non-linear interaction between
mallet and membrane can be considered, taking into account the contribution from
various damping parameters, including:
• Membrane material
• Boundary absorption
• Losses in felt
• Acoustic radiation
The motion of the membrane is then coupled with both the external and internal sound
field, so that the acoustic pressure field can be calculated for a given point in space. This
simulation uses three-dimensional finite element modelling to validate results, producing
time-domain snap-shots of the membrane displacement, pressure fields and the pressure
jump across the boundary (fig. 2.5). Results are surprisingly accurate as can be seen
by comparing theoretical and simulated modes (table 2.4). The model could easily be
used to help separate parameters and determine geometrical and physical parameters.
Potential applications include optimisation of timpani cavity volume.
The initial hit can be seen at the onset by the negative displacement (shown in green),
followed by propagation of the transverse bending wave. In the final diagram a positive
displacement is shown as the membrane pushes the mallet away from the drum. The
presence of a guided wave in the acoustic pressure field is also of interest, as it precedes
the elastic wave front and is travelling at three times the speed (compare (b) in
diagrams).
The numerical modelling paper (1) considers a strike from the initial point of contact
regardless of how the mallet is set in motion. However, this model neglects the variation
of contact time and the hysteretic cycle of relaxation in the felt and the stick. Other sources
use the same method to measure the input to the timpani from the mallet (47). These
methods consist of measuring the force and acceleration with an impedance head. (An
impedance head is simply a force gauge and accelerometer in one package.) The force
and acceleration yield a similar shape and the effective mass can therefore be calculated
from the transfer function of these signals on a dual channel FFT analyser. Research
has found the effective mass to be 20% higher than the static mass of the mallet head,
which is likely to be due to the inertia of the mallet handle and player’s arm. The force
is obtained from the acceleration of the mallet, using an accelerometer on the mallet,
and the previously calculated effective mass.
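The relationship between force, acceleration and effective mass described above can be sketched numerically. In the Python example below a time-domain least-squares fit of F = m·a stands in for the transfer-function measurement made on a dual channel FFT analyser, which is a substitution made purely for illustration. The mallet mass and acceleration samples are invented values, and the function name `effective_mass` is hypothetical.

```python
def effective_mass(force, acceleration):
    """Least-squares estimate of m in F = m * a from paired samples."""
    numerator = sum(f * a for f, a in zip(force, acceleration))
    denominator = sum(a * a for a in acceleration)
    if denominator == 0:
        raise ValueError("acceleration samples are all zero")
    return numerator / denominator

# Hypothetical data: a 0.030 kg mallet head whose effective mass is 20% higher.
static_mass = 0.030                                    # kg, illustrative only
acceleration = [0.0, 150.0, 400.0, 250.0, 50.0]        # m/s^2, illustrative only
force = [1.2 * static_mass * a for a in acceleration]  # N

m_eff = effective_mass(force, acceleration)
print(f"effective mass = {m_eff:.4f} kg ({m_eff / static_mass:.2f} x static)")
```

Once the effective mass is known, the force of a later strike could be estimated by multiplying it by the acceleration measured on the mallet.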
Depending upon the modelling approach used, it may be necessary to be able to quantify
a hit in this way. Brief experiments were made to investigate this but were not continued
as it had a low priority.
Chapter 3
Initial System Development
At the outset it was important to explore the overall system structure. This section
outlines how and why the current system was chosen.
3.1 Brainstorming
During the course of several meetings a range of ideas was generated and researched.
Initially, brainstorming sessions were used, with all participants’ ideas written on a white
board. This helped stimulate discussion and highlight areas where decisions were to be
made. A zero-criticism approach to idea generation was used to foster a fast and flowing
spirit of creative interaction. At the end of each brainstorming session, group members
were assigned areas to research before the next meeting.
3.2.1 Inputs
Inputs have to be received from the timpani player and passed to a sound generator
at some point in the process. There are several well-established protocols that allow
digital musical instruments to interact with other hardware. The most popular of these
is Musical Instrument Digital Interface (MIDI), which has played a pivotal role in the
development of digital music. It offers a limited amount of control, but has the advantage
that a large range of existing hardware and components can be used to develop and test
it. An alternative protocol, ZIPI, is much more flexible. This provides a method of
communicating a greater range of musical inputs than MIDI, such as vibrato or position
of percussive strikes. It also allows control of single notes rather than just whole channels.
It is unfortunately quite rare.
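The limited control offered by MIDI can be seen in the layout of a Note On message, which carries only a channel, a note number and a 7-bit velocity, with no field for strike position. The Python sketch below builds such a message. The mapping from a normalised hit strength to velocity, the function names and the chosen note number are assumptions made for illustration and are not part of the project software.

```python
def midi_note_on(channel, note, velocity):
    """Build the three bytes of a MIDI 1.0 Note On message.

    channel is 0-15; note and velocity are 7-bit values (0-127).
    """
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15")
    if not (0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("note and velocity must be 0-127")
    return bytes([0x90 | channel, note, velocity])

def strength_to_velocity(strength):
    """Map a normalised hit strength (0.0-1.0) onto the 7-bit velocity range."""
    clamped = min(max(strength, 0.0), 1.0)
    return round(clamped * 127)

# Note number 45 is an arbitrary choice for this example.
message = midi_note_on(channel=0, note=45, velocity=strength_to_velocity(0.8))
print(message.hex())
```

Any additional input, such as the radial position of a strike, would have to be carried by separate messages, which is one reason a more flexible protocol or a direct data link may be preferred.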
Another approach would be to send readings from the input devices directly to the
processing unit, via some form of analogue to digital conversion and communication. It
should be remembered that this system is likely to have several separate input devices,
one corresponding to each timpani. Consequently, any complex hardware that is required
would have to be repeated for each. The possibility of “daisy chaining” the input devices,
to cut down on the amount of cabling required, should be considered. It may even be
possible to use some form of wireless communication.
• Direct output from each input device (or ‘timpani’) via internal loudspeaker and
amplifiers.
• If any form of analogue to digital conversion is used, then the sampling rate must
be specified.
In addition to all of these, the number of output channels must be selected: either mono,
stereo or some form of surround sound.
There are three main levels at which conversion to output signal could happen:
It is clear that the prototype system should closely resemble a potential final solution,
but must also serve as a development platform. It is therefore proposed that all the
complex processing is performed in a central processing unit (i.e. desktop PC), to which
the external hardware supplies data via some standard form of communication. If this
system were developed into a commercial product, symbolic communication and integration
with protocols like ZIPI may be practical. This is, however, considered impractical
at this stage.
This system will therefore consist of a single external hardware device that captures
user inputs and passes this information to a desktop PC via a standard input port. The
PC software will then use this data to generate a waveform, which will be output
through a sound card.
Low frequency sound is non-directional; it is difficult for the human auditory system to
determine the direction from which the sound is coming. From an audience’s perspective,
a mono source at the position of the timpani would be as effective as a stereo source.
It may be considered that the timpani player is experiencing a stereo effect due to the
positioning of several timpani, but for the purposes of development this may be ignored.
Digital to analogue output will therefore be mono at 44100 Hz. (This sample rate allows
for higher frequency components than the electronic timpani may need to produce, but
is the standard rate for CDs and is therefore available on most sound output hardware.)
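As a minimal sketch of what a mono output at this rate involves, the Python example below generates a single-channel buffer containing an exponentially decaying sine and packs it as PCM data. The 16-bit sample format, the 110 Hz test frequency and the decay time are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import struct

SAMPLE_RATE = 44100            # Hz, as chosen in the text
NYQUIST = SAMPLE_RATE / 2      # highest representable frequency, Hz

def decaying_tone(frequency, duration, decay_time):
    """Return mono samples in [-1, 1] of an exponentially decaying sine."""
    count = int(SAMPLE_RATE * duration)
    return [
        math.exp(-n / (SAMPLE_RATE * decay_time))
        * math.sin(2.0 * math.pi * frequency * n / SAMPLE_RATE)
        for n in range(count)
    ]

def to_pcm16(samples):
    """Pack samples as little-endian signed 16-bit PCM, one channel."""
    return b"".join(struct.pack("<h", int(s * 32767)) for s in samples)

samples = decaying_tone(frequency=110.0, duration=0.5, decay_time=0.2)
pcm = to_pcm16(samples)
print(len(samples), len(pcm), NYQUIST)
```

Half a second of mono audio at this rate is 22050 samples, or 44100 bytes at 16 bits per sample, and the highest frequency that can be represented is 22050 Hz.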
It is clear that the proposed system divides into two parallel components, namely hardware
and software. Although these systems interact they can largely be developed
independently. The team therefore split into two task groups to tackle these areas. The
following chapters detail the development process.
Chapter 4
Specification
The specification for the electronic timpani is derived from the requirements of the user,
as presented in the description of the project brief.
4.1 General attributes
• The mass of the electronic timpani system should be such that it can be easily
carried by one person.
• For the purpose of portability, the entire electronic timpani system
should fit within the luggage compartment of a typical saloon car.
• The system should be rugged enough to survive the handling it is likely to receive
with acceptable levels of degradation.
• If non-standard timpani mallets are required, they should be of similar mass and
dimensions to typical classical timpani mallets.
4.2 Playing area
• The electronic timpani must contain a playing area which is struck with mallets.
• The playing area must be circular, or a section of a circle, and the surface must
be flat.
• The size of the playing area must be comparable to that of a classical timpani:
between 16 and 32 inches in diameter.
• The playing area must be at a similar height to the drumhead of a classical timpani,
or must be suitable for placement on a table or similar surface.
• The surface of the playing area must be strong enough to withstand a reasonable
succession of strong strikes from all varieties of timpani mallet.
• The surface must be such that the reaction force applied to the timpani mallet is
as similar as possible to that from a classical timpani drum head.
• Noise produced by a strike on the playing surface must not interfere with sound
generated by the system.
4.3 Operation
• The system must incorporate a tuning device similar to the pedal of a classical
timpani and should be responsive to the position of this device.
4.4 Sound
• The electronic timpani should create a sound similar to that of a classical timpani
given the same input conditions.
• The electronic timpani should be able to reproduce playing techniques such as rolls
and glissando (as described in section 1.3.3).
• The level of noise and distortion present in the sound output of the electronic
timpani should not be noticeable under normal operation.
• The output of the electronic timpani should be compatible with commonly used
audio equipment.
• The frequency content of the sound output need not contain components higher
or lower than the audible range of human hearing.
Chapter 5
Hardware
This section describes the design processes and development of the electronic timpani
hardware. First the requirements of the hardware are defined, and a detailed assessment
of the available sensor technologies is presented. The development sections then describe
the design processes for the other principal features of the hardware.
It is important to re-iterate that the hardware is not required to calculate the position
and strength of the mallet strikes, but instead to provide raw sensor information to the
software. (The algorithms used for the software calculation are described later in Section
7.1.4.) This design feature aims to maximise the flexibility of the system, since changes
to software can be implemented significantly faster than changes to hardware.
One of the first stages in the design process was to research the available sensor technologies.
This was an immediately tangible area that was certain to be an essential part
of the electronic timpani. The choice of sensors used would dictate the requirements and
structure of the remaining hardware. To gain a larger range of ideas, the brief for the
research simply stated that some method would be required to transform the player’s
movement into a signal that could be interpreted by the software. Limiting the sensor
requirement to this simple brief allowed the broadest spectrum of technologies to be
considered.
The initial brainstorming session yielded the following principal ideas (as well as various
other ideas):
• Camera
The output of a camera ‘watching’ the strikes can be processed to extract the
position of the mallet and/or playing surface.
• Drum Pads
• Microphone
Signals from a microphone placed under a simulated timpani skin can be processed
to extract information about the mallet strikes.
• Piezo Transducers
Piezo sensors produce a voltage across the output terminals when subjected to a
mechanical stress.
• Strain Gauges
The electrical resistance of the gauge varies with strain.
• Accelerometers
An electrical output signal is generated, proportional to the acceleration to which
the device is subjected. This was not further investigated as it would rely on other
technologies to provide positional and damping information, which would render
it superfluous.
• Resistive Foam
Force can be detected by changes in the resistance of the foam as it is compressed.
As there was no one apparent optimum solution, many of these ideas warranted further
investigation. The following sections present each method, and compare the advantages
and disadvantages.
5.1.1 Camera
Cameras are being used in many measuring environments where it is important that
the measurand is undisturbed. Their traditional Achilles heel has been the processing
power requirements; however, recent advances in the personal computing market have
gone a long way to mitigating this. Other problems, such as cost and ruggedness, have
been substantially surmounted by the emergence of the webcam - a small and very cheap
device. It is therefore sensible to assess them for inclusion in the project.
Upward Looking
This method relies on placing an upward looking camera underneath the skin of a mock
timpani. The camera would measure the disturbance of a graticule (grid of lines, “blobs”
and fiducial marks) that would be printed on the underside of the timpani skin. This
approach has been successfully used on a much smaller scale in robotic finger tips (30).
Side Mounted
Alternatively a side mounted camera could track the movement of the mallet. This
could potentially give much better information about strike speed, but would sacrifice
information about strike position. For this approach to work it is essential to distinguish
the mallet from other background data. This could be done in several ways:
• Frame Differencing
Video hardware often allows the difference between two frames to be accessed,
rather than the frames themselves. This would make extracting the moving components
of a scene reasonably trivial. Once this has been done the processor must
then distinguish between the mallet head, the mallet stick, the arms of the player
and any other background movement. This should be possible in a controlled environment
but may be difficult in a practical situation where there may be a lot
of background movement.
• Colour Keying
The processor could be “keyed” to look for a certain colour. If a colour were found
that rarely occurred in the background then identification of the mallets would
be easier. The system would have to be able to compensate for different lighting
conditions, and may therefore not be practicable.
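The frame differencing approach described above can be sketched in a few lines. The Python example below compares two small greyscale frames, marks the pixels whose level changed by more than a threshold, and takes their centroid as a crude estimate of where the movement occurred. The frame contents, the threshold and the function names are invented for illustration; a practical system would operate on camera images and would still need to separate the mallet from other moving objects.

```python
def moving_pixels(previous, current, threshold):
    """Return (row, col) positions whose grey level changed by more than threshold."""
    changed = []
    for r, (row_prev, row_cur) in enumerate(zip(previous, current)):
        for c, (p, q) in enumerate(zip(row_prev, row_cur)):
            if abs(q - p) > threshold:
                changed.append((r, c))
    return changed

def centroid(points):
    """Mean (row, col) of a list of pixel positions, or None if it is empty."""
    if not points:
        return None
    rows = sum(r for r, _ in points) / len(points)
    cols = sum(c for _, c in points) / len(points)
    return rows, cols

# Two tiny illustrative 4x4 greyscale frames: a bright "mallet head" moves
# from row 0 to row 1 in column 2.
frame_a = [[10, 10, 200, 10],
           [10, 10, 10, 10],
           [10, 10, 10, 10],
           [10, 10, 10, 10]]
frame_b = [[10, 10, 10, 10],
           [10, 10, 200, 10],
           [10, 10, 10, 10],
           [10, 10, 10, 10]]

changed = moving_pixels(frame_a, frame_b, threshold=50)
print(changed, centroid(changed))
```

Note that the difference image marks both the old and the new position of the object, so the centroid falls between them.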
All of the above techniques rely on good calibration; if the system is to be able to detect
a strike it must first know where the timpani skin is. This may be achievable by placing
the camera in the same plane as the skin (which would waste half the image), but if the
camera were placed elsewhere then it would be difficult to handle the issues presented
by depth compensation.
Advantages
• There is little or no effect on the behaviour of the timpani skin itself, thus allowing
the playing experience to be as realistic as possible.
• This method would give very detailed information about the playing surface. If
the data were in a processable format, this could lead to an extremely high-fidelity
system.
• One potential advantage of a camera based system is to avoid latency. The camera
could literally “see it coming”. This may allow the system to compensate for slight
processing delays by pre-empting the input.
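The idea of pre-empting a strike can be illustrated with a simple extrapolation. The Python sketch below estimates when the mallet will reach the plane of the skin from its height in the two most recent frames, assuming constant velocity between frames. This is a deliberate simplification, and the frame times, heights and function name are hypothetical.

```python
def predict_contact_time(times, heights):
    """Extrapolate when the mallet height reaches zero (the skin plane).

    Uses the last two samples and assumes constant velocity between them.
    Returns None if the mallet is not moving towards the skin.
    """
    if len(times) < 2 or len(heights) < 2:
        raise ValueError("at least two samples are required")
    (t0, h0), (t1, h1) = zip(times[-2:], heights[-2:])
    velocity = (h1 - h0) / (t1 - t0)
    if velocity >= 0:
        return None
    return t1 - h1 / velocity

# Illustrative samples: frames 0.04 s apart, mallet height in metres.
contact = predict_contact_time([0.00, 0.04, 0.08], [0.09, 0.06, 0.03])
print(contact)
```

With these samples the mallet is descending at 0.75 m/s, so contact is predicted 0.04 s after the last frame. A real mallet accelerates during the stroke, so such an estimate would only be a starting point.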
Disadvantages
• The cost of the camera system is likely to be high. This may be prohibitive
depending on what grade of system is required.
• This type of system is likely to produce far too much data, creating a processing
load that does not contribute to the final output.
• Technologically this is a difficult option and although it may produce good results
it may not be possible to achieve them in the given time frame.
After their first appearance in the 1970s, drum pads have evolved to give an impressive
sound output and good tactile response. Appendix A contains details of a currently
available position sensing drum pad. There are two main types of pad: those that use a
rigid rubberised surface and those that use a skin for a more realistic playing experience.
The drum pad will need little adaptation for use as an electronic timpani. The design is
such that the units can be mounted on commercially available stands, played with un-
modified timpani sticks and interfaced to the rest of the system with the manufacturers’
standard connectors.
Advantages
• The electronic drum pad is already robustly designed and attractively styled, and
the outputs are engineered to interface with a drum synthesizer module. They can
be readily adapted for the purpose of the electronic timpani.
• The two primary sensing priorities are fulfilled in a single unit: force and position
sensing.
Disadvantages
• The maximum diameter of position sensing pads is limited to twelve inches (300mm).
The specifications of the electronic timpani require a diameter of at least sixteen
inches.
• Drum pads are considerably more expensive than raw sensors. However any sensors
built from scratch will entail the extra costs of a custom enclosure, mounting, and
signal amplification hardware.
• Current position sensing drum pads do not provide the radial accuracy required
for this application.
Resistive foam has been used in the robotics industry in an attempt to construct tactile
sensors (7). The technique uses a resistive elastomer, normally carbon doped rubber.
This rubber is placed on a substrate that contains an array of dots and rings. The foam
provides the conducting path between each dot and ring. During use, each ring-dot pair
is energised in turn, and the resistance between them is measured. The resistance of the
foam varies when it is compressed, which means an “image” of the tactile surface can be
reproduced.
Advantages
• The system is reasonably rugged and should be able to survive general handling.
Disadvantages
• The elastomer has a non-linear time constant. It also exhibits hysteresis: the time
constant for the application of force is different from that for force removal. This
may mean that this system does not give useful dynamic information, which is essential
for this application.
• The sensor will wear out relatively rapidly, particularly with the cyclic loading.
The skin is likely to be struck non-uniformly, so high-use areas will degrade pref-
erentially.
This sensor is based on the physical principle of the Hall effect. A voltage, known as
the Hall voltage, is generated transversely to the current flow direction in an electric
conductor if a magnetic field is applied perpendicularly. Hall effect sensors are built
from semiconductor platelets as the effect is most pronounced in this material.
If a grid of small magnets were placed on the underside of a flexible membrane, an array of
Hall sensors could be used to detect membrane deformation, and hence strike information.
If the weight of the magnets proved too high and interfered with the movement of the
membrane, they could be replaced by small electromagnets.
Advantages
• Inexpensive
• The “feel” of a hit on the drum would be very realistic as a real timpani skin could
be used
Disadvantages
• If the magnets added significantly to the mass of the membrane, its dynamic
characteristics would be altered and it might not respond fast enough
• As the Hall voltage is only a few mV, the system may not be sensitive enough to
detect damping
5.1.5 Microphone
It could be possible to extract all the information required by listening to a drum skin
with a microphone. The position, velocity and damping will generate a unique signal
which could be processed into separate components. Microphones are often used to pick
up musical instruments; however, they are not usually used to pick up precisely how an
instrument is played.
Using Microphones
If the microphone is placed too close to the source there could be the possibility of
clipping or distortion. The frequency response of microphones is usually fairly good,
however there is usually a tail in the response at higher frequencies. Microphones are
tuned to work in either a diffuse or free field. A flat frequency response is available but
at a reduction in sensitivity and increase in cost.
The microphone will be very sensitive to ambient noise, which could be a significant
problem, for example in a teaching environment such as a classroom. The mounting of
the microphones may be difficult and complex processing would be required.
Advantages
• Electret microphones are cheap (as little as 50p), rugged and reasonably stable
• All the information on hit profile, location and damping is included and could be
contained in a single signal
• The microphone would also pick up any nuances of the playing that could be
missed by other types of sensor
Disadvantages
• The microphone will require a high level of processing to determine some of the
parameters, such as location of the hit
The word “piezo” is derived from the Greek word for pressure. Piezoelectric transducers
can be used both as sensors and actuators. Acoustic electronic drum heads generally use
piezoelectric transducers because they are effective at detecting deformation velocity.
Piezo sensors can be used to develop a customised “drum pad” that suits the require-
ments of the timpani, in terms of size, surface and responsivity. The construction of the
pad would have to shield the sensors from excessive shock loading (which could damage
them) without muffling them.
• Sensor Matrix
A matrix of piezos could be placed under a thin rubber sheet. Location could then
be determined by comparing their relative responses.
• Coaxial Cable
Piezoelectric coaxial cable is available which could be used as an alternative to a
matrix. It may be easier to arrange into concentric rings. This would give accurate
distance from centre, but no angle information.
• Rigid Plate
A rigid plate could be mounted upon piezos. This system would require more
processing to extract location information.
Advantages
• Inexpensive
• Piezos are small and light, allowing the final pad to be portable
Disadvantages
• There is the possibility that if the piezos were inadequately protected, they could
break. The ceramic piezoelectric material is brittle and this must be taken into
consideration when using this technology.
• This method alone would not easily allow the electronic timpani to detect damping.
While piezos are very good at detecting short-term impulses, their detection of the
sustained pressure on the pad (a hand damping the drum skin) would be poor.
Position sensing can be achieved by placing multiple strain gauges around the perimeter
of the playing area and analysing the distribution of the force between the sensors. The
playing surface will be rigid to allow the force of the hit to be transmitted instantly, and
undistorted, to the perimeter. Two force sensors would only resolve the hit location to
one dimension, but three would allow the resolution of two dimensions. The use of four
sensors would simplify processing requirements but will introduce problems if any one
sensor is marginally out of vertical alignment.
A force sensor would consist of a strain gauge bonded to a solid material. Selection
of a suitable material, and a suitable shape, may prove unnecessary - force sensors are
already available in the form of load cells. These are purpose-built housings with internal
strain gauges already bonded in position, allowing the user to simply apply a force on
the load cell.
Load cells generally incorporate the first stage of the necessary signal processing elec-
tronics: the use of a Wheatstone bridge to negate the effects of temperature and provide
a balanced output.
Recently patented load cell technology from C-Sensor uses polymer strain gauges and in-
corporates all processing circuitry in one commercially available, extremely cost-effective
solution. The units interface directly to the RS232 COM ports of a PC.
The polymer strain gauges operate on the principle of variations in capacitance, not
resistance. DC output signals are achieved by integrating the polymer voltage with
respect to time, whereas with the resistive principle no such processing is required.
Advantages
• Strain gauges and load cells can easily be configured to provide position sensing
and force/velocity measurement. For purposes of damping, they are especially
suitable as the output is not capacitive (as in the case of microphones or piezo
sensors) and so will sense a ‘DC force’ such as a hand placed on the playing area.
• Load cells are a neater solution than raw strain gauges, as the calibration and
assembly have been performed by the manufacturer. Polymer load cells are
convenient, as the amplification and communication electronics are integrated.
Disadvantages
• Raw strain gauges will require careful assembly and calibration, and the signal
conditioning circuitry will need to be designed.
• Traditional load cells suffer the disadvantage of cost: the very cheapest available
are about £50; in more convenient housings they will cost £250 each. (Minimum
of three required per timpani.)
• The patented polymer load cells may not be accurate enough - more research will
have to be conducted into the forces involved in heavy hits and light damping.
• They may also require calibration when turned on; this may be achievable auto-
matically.
• The polymer load cells do not inherently sense DC force. However, the processing
required to do so has been developed by the manufacturer and so should prove reliable.
5.1.8 Conclusion
Several sensors were ruled out because they had significant disadvantages. These were:
• Camera: Too complex to build and the required processing power would not be
justified
• Drum Pads: Not available in the required size and do not manage damping or
position well enough
Following this research, the team had sufficient information to decide on the technology
to use. All the techniques have their advantages and disadvantages, but the two most
suitable appear to be the piezoelectric transducers (piezos) and the strain gauges. Piezos
are cheap, rugged and readily available, but do not respond to DC signals. Strain gauges
have the advantage of providing better low frequency/DC information, but at a higher
cost.
Strikes and damping create signals with significantly different frequency content. The
duration of a mallet strike is of the order of milliseconds, and so its signal consists of
high-frequency components, while the pressing of a hand on the timpani may last for
several seconds, and so consists of much lower-frequency components.
Given the suitability of piezos to the detection of high frequencies, and the suitability of
strain gauges to the detection of low frequencies, the team concluded that a combination
of the two devices would provide the ideal solution. Piezos could be used to detect strikes,
and strain gauges would detect damping. The detection and processing of strikes and
damping would thus be separated as early as possible, at the sensing stage.
• Following consultation with the project sponsor, the detection of strikes was to be
a higher priority than the detection of damping.
• Compared to piezos, strain gauges would either require more development time to
implement into a hardware solution, or would cost significantly more if a ready-
made implementation were to be purchased.
• Obtaining a stable hardware platform was essential in order to allow the software
development to be started.
• At this early stage in the project it was important to conduct some experimentation
and gain some practical insight into the operation of the sensors. Piezoelectric
transducers, costing less than £1 each, were an ideal starting point.
It was decided that some piezoelectric transducers should be purchased and assessed.
Strain gauges would be added at a later date, once the hit detection was operational.
After experimenting with layers of rubber to protect the transducer, and observing the
voltage output of the transducer with an oscilloscope, it was decided that the
piezoelectric transducers were suitable as sensors for the electronic timpani. The
development of the sensor assembly is described in Section 5.2.2.
An overall structure was then chosen based on the hardware requirements and the se-
lection of sensor technology:
• A flat ‘drum plate’ (the playing surface) should be supported on a sensor system,
allowing all the force of the mallet strikes to be transmitted to the sensors.
• The output from the sensors would require some stages of analogue processing,
followed by a conversion to digital values.
• The digital information would require ordering and formatting, before being trans-
mitted to the PC.
• All the electronics would ideally be fitted onto one circuit board.
Initial tests with piezoelectric transducers had been performed with the transducer
placed between layers of rubber to provide protection for the brittle ceramic layer in
the transducer. When struck hard with a timpani mallet, output pulses of about 5V
were evident on an oscilloscope. When the plate was placed on top of the transducer
and rubber, this output was reduced significantly to only 0.5V, as the rigid plate on
top of the transducer merely compressed it and did not flex it. Therefore some system
was required both to protect the transducer and also to ensure that the impact of a
strike would cause it to flex. Figures 5.1 and 5.2 illustrate the sensor assembly that was
devised to meet these requirements. With this assembly, output voltages of up to 50V
are typically produced for a hard strike. The hard rubber base provides a relatively stiff
surface, against which the force of the strike can act from above. The layer of foam
rubber damps the mechanical oscillation of the plate.
5.2.3 Configuration
The number of sensors used, and their configuration underneath the drum plate, affect
other areas such as the data to be transmitted and the nature of the analogue processing.
• Exactly three sensors would be required in order to correctly resolve strike position.
Firstly this is a minimum governed by the requirement to obtain resolution in two
dimensions: one sensor measures a single magnitude, two sensors provide resolution
in one dimension, and three sensors provide resolution in two dimensions. Secondly,
three contact points will sit firmly on any surface, whereas four may not. More than
three sensors would provide redundant information, which would require careful
handling.
• Alternatively an array of sensors under the entire pad - possibly hundreds - would
indicate the location of a hit with one output being considerably higher than the
others. The disadvantage of this approach is the volume of information to be
transmitted and processed.
Figure 5.1: Sensor assembly between drum plate and base plate
As a result of these design considerations, the three sensor assemblies would be placed on
the underside of the plate at separations of 120° as shown in Figure 5.3. These assemblies
would consist of a combination of strike detection sensors and damping detection sensors
if necessary.
Figure 5.3: Configuration of the sensor assemblies on the underside of the drum plate
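The way three sensor readings resolve a strike position in two dimensions can be sketched as follows. This is a minimal illustration rather than the algorithm used in the prototype: it assumes a rigid plate, treats the three readings as the shares of the strike force carried by each assembly, and takes the force-weighted centroid of the sensor positions. The sensor angles, the radius of 0.2m and the function names are assumptions made for the example.

```python
import math

# Assumed geometry: three sensor assemblies on a circle, 120 degrees apart.
SENSOR_ANGLES_DEG = (90.0, 210.0, 330.0)


def sensor_positions(radius):
    """Return the (x, y) coordinates of the three sensors for a given radius."""
    return [
        (radius * math.cos(math.radians(a)), radius * math.sin(math.radians(a)))
        for a in SENSOR_ANGLES_DEG
    ]


def strike_position(readings, radius=0.2):
    """Estimate the strike location as the force-weighted centroid.

    For a rigid plate on three supports, the moments of the three support
    forces balance the moment of the strike, so the strike lies at the
    centroid of the sensor positions weighted by their readings.
    """
    total = sum(readings)
    if total <= 0:
        raise ValueError("no strike detected")
    points = sensor_positions(radius)
    x = sum(f * px for f, (px, _) in zip(readings, points)) / total
    y = sum(f * py for f, (_, py) in zip(readings, points)) / total
    return x, y
```

Equal readings place the strike at the centre of the plate, while a reading on one sensor alone places it directly above that sensor. With piezoelectric transducers the readings are dynamic rather than static, so in practice this would only be a first approximation.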
As a sounder, converting electrical energy into acoustic (or mechanical) energy, a piezo-
electric transducer is commonly regarded as being primarily capacitive (8) (9). A typical
sounder consists of two thin conductive plates, separated by a thin layer of insulating
piezoelectric material, giving it capacitive characteristics. This capacitance is also ob-
served when the device acts as a sensor, and the output signal is therefore differentiated.
It is important to limit the current drawn from the piezoelectric transducer if the series
output resistance is high (10-100kΩ), as it will lose a large proportion of the output
voltage when current is drawn. Output resistance of the transducer is determined by
comparing loaded and unloaded voltage outputs. The difference between these voltage
outputs, combined with the value of the loading resistance, gives a value of approximately
30kΩ. Any resistive loading of the transducers of below 30kΩ will therefore more than
halve the output voltage, and so the signal must be amplified.
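The loaded/unloaded comparison described above can be sketched as a simple voltage divider calculation. This is an illustrative model only: it treats the transducer as an ideal source in series with a purely resistive output resistance, and the function names and example voltages are assumptions, not measurements from the project.

```python
def output_resistance(v_unloaded, v_loaded, r_load):
    """Estimate series output resistance from loaded and unloaded voltages.

    With a load resistor R_L across the output, the divider gives
    v_loaded = v_unloaded * R_L / (R_L + R_out), which rearranges to
    R_out = R_L * (v_unloaded / v_loaded - 1).
    """
    return r_load * (v_unloaded / v_loaded - 1.0)


def loaded_fraction(r_load, r_out=30e3):
    """Fraction of the unloaded voltage that remains across a given load."""
    return r_load / (r_load + r_out)


# Hypothetical example: the output halves when loaded with 30 kilohms,
# which implies an output resistance of 30 kilohms.
example_r_out = output_resistance(v_unloaded=10.0, v_loaded=5.0, r_load=30e3)
```

A load equal to the output resistance leaves exactly half the voltage, and any smaller load leaves less than half, which is the condition stated in the text.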
A piezoelectric sounder typically exhibits an extremely high peak in the frequency re-
sponse spectrum (typically at around 4kHz), due to the lack of mechanical and electrical
damping. The useful bandwidth is extremely narrow, and the low sensitivity across most
of the frequency spectrum renders the sensor unusable.
Research and initial testing of the piezoelectric transducers confirmed that they are not
suitable for picking up DC signals, and so could not be used for damping. The output
is effectively proportional to the velocity of the drum plate, and not its displacement.
A potential solution, to allow them to be used for damping, would be to integrate the
output and thereby derive a value for displacement. Although this solution is initially
appealing, drift errors will occur due to the integration of noise.
Damping information can be extracted from the piezoelectric sensors because a distinc-
tive profile is generated. This is discussed in Section 7.2.11. Using this method, piezo-
electric transducers will suffice for the purposes of the prototype without the addition
of strain gauges (Section 5.1.8). Strain gauges may still present an optimal solution.
As the principal interface between the player and the electronic timpani, the drum plate
is required to accurately transmit information to the sensors when hit by the player.
Key parameters of the plate that had to be qualitatively defined were mass and stiffness.
Mass should ideally be low, as the entire plate must move in order to transmit the impact
of the mallet through to the sensors, and a high mass would decrease the movement
for a given impact energy. Stiffness should be high in order to minimise the bending
experienced by the plate when hit, again in order to transmit the impact.
The following stages are needed to pass the values from the sensors through to the PC:
1. Amplitude adjustment
2. Buffering
3. Level shifting
4. Analogue-to-digital conversion
5. PIC microprocessor
6. RS232 communication to PC
The first requirement of the analogue processing circuitry is that the output of the
piezoelectric transducers should be individually adjustable. This would account for
manufacturing tolerances and ensure that the outputs are the same for a given strength
of hit. The scope for this adjustment is provided by the variable attenuation of the
signals with a 1MΩ potentiometer.
A balance must be achieved between resolution and headroom. If the sensor signal level
after attenuation is too high, it will exceed the maximum voltage detectable by the buffer
and analogue-to-digital converter (ADC), and cause clipping to occur, with undesirable
effects. If the attenuation is too high, the full span of the resolution of the ADC will
not be used, and resolution is lost. The calibration was performed with a professional
timpanist, the project sponsor, playing what was considered to be the hardest reasonable
hit.
The signals must be buffered in order to meet the input requirements of the
analogue-to-digital converter (ADC). The ADC specification sheet (11, p. 16) recommends
that the output impedance of the analogue source should be less than 5kΩ to prevent noise
pickup. However, as discussed, the output from the piezoelectric transducer is adjusted
through a 1MΩ potentiometer, which in this application will have an output impedance
of up to 250kΩ. To overcome this incompatibility, the output from the potentiometer
is buffered using the circuit shown in Figure 5.4: an operational amplifier (‘op-amp’)
connected to present a high impedance to the input source while also having a
low impedance output and unity gain.
[Figure 5.4: op-amp buffer circuit, with Input and Output terminals labelled]
The buffer is supplied from both positive and negative rails, and so is capable of buffering
both positive and negative signals from the sensors.
Experimentation with the piezoelectric transducers and the analogue signal processing
circuitry indicated that the expected output from the potentiometer, and therefore from
the buffer, is within -2.5V and +5V. Negative voltages are produced during the short
period of oscillation after the initial pulse of a strike. The input to the ADC must not
fall below 0V, and must not exceed +5V. The signal levels must therefore be shifted,
and this is achieved using a voltage divider to scale the voltage relative to +5V (fig.
5.5). This shift creates an output of 80 from the ADC for a zero voltage output from
the piezos. Subsequent software must take this into account.
[Diagram: the output from the sensor potentiometers (-2.5V to +5V) is shifted into the
input range of the ADCs (0V to +5V), with 0V from the sensors mapped to +1.56V]
Figure 5.5: Level shift required to meet input voltage range of ADC
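The relationship between the shifted zero level and the ADC output of 80 can be checked with a short calculation. The sketch below assumes an 8-bit converter with a 5V full-scale range (consistent with the 8-bit sensor readings described later in this chapter); the constant and function names are illustrative.

```python
V_REF = 5.0          # ADC full-scale input voltage, from the text
FULL_SCALE = 255     # assumed 8-bit converter
ZERO_LEVEL_V = 1.56  # shifted voltage for 0V from the piezos (Figure 5.5)


def adc_code(voltage):
    """Convert an input voltage to the nearest ADC code, clipping at the rails."""
    clamped = min(max(voltage, 0.0), V_REF)
    return round(clamped / V_REF * FULL_SCALE)


# ADC code produced when the piezo output is zero.
ZERO_CODE = adc_code(ZERO_LEVEL_V)


def remove_offset(code):
    """Subtract the zero-level code so that a silent sensor reads 0 in software."""
    return code - ZERO_CODE
```

Subsequent software would subtract this zero-level code from every reading, so that positive values correspond to the initial pulse of a strike and negative values to the oscillation that follows it.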
5.4.4 Pedal
The pedal input should transform a physical displacement into a quantity measurable
by the PC. This is achieved using a commercially available electronic keyboard pedal,
which provides an analogue signal. The pedal simply contains a potentiometer, which is connected
between the positive and negative supply rails, giving an output voltage proportional to
its angular position.
The only electrical requirement of the pedal potentiometer is that its output impedance
should be low enough to allow the holding capacitor of the PIC ADC sufficient time to
charge. The maximum output impedance, as recommended by the PIC datasheet (12),
is 10kΩ. The output impedance of the potentiometer can be shown to be a maximum
of 0.25R where R is the total potentiometer resistance, so in this case R is limited to a
maximum of 40kΩ.
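The 0.25R figure follows from treating the wiper as the junction of two resistors whose far ends are both tied to low-impedance supply rails, so that the two sections appear in parallel. A minimal numerical check, with illustrative function names and an assumed sweep resolution, is sketched below.

```python
def wiper_output_impedance(r_total, position):
    """Output impedance seen at the wiper of a potentiometer.

    `position` is the wiper travel as a fraction from 0 to 1. The two track
    sections, position * R and (1 - position) * R, appear in parallel when
    both ends of the track are connected to low-impedance rails.
    """
    lower = position * r_total
    upper = (1.0 - position) * r_total
    return lower * upper / r_total


def worst_case_impedance(r_total, steps=1000):
    """Sweep the wiper and return the largest output impedance found."""
    return max(wiper_output_impedance(r_total, i / steps) for i in range(steps + 1))
```

The maximum occurs at mid-travel, giving 10kΩ for a 40kΩ potentiometer. The same relationship gives 250kΩ for a 1MΩ track, which matches the figure quoted for the attenuation potentiometers.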
The system used for communication to the PC affects the digital components of the
hardware.
The deciding factor for a communication system is likely to be the data transfer rate, so
an approximate value should first be established. There are three sensor values to trans-
mit, and a digital resolution of 10 bits. Previous timpani modelling work (1) suggests
that the duration of a mallet strike is of the order of 10ms. To capture this information
accurately, a sampling period of 1ms would suffice. Based on these approximations, a
minimum data rate of 30kbps would be required.
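The estimate above is simply the product of the number of values, the resolution and the sampling rate. The sketch below reproduces it and compares the result with the interface rates quoted in this section; the dictionary keys and function names are illustrative, and protocol overheads such as start and stop bits are ignored.

```python
# Data rates quoted in this section, in bits per second.
QUOTED_RATES_BPS = {
    "USB (1.5Mbps)": 1_500_000,
    "USB (12Mbps)": 12_000_000,
    "Parallel port": 150_000,
    "RS-232": 100_000,
}


def required_bit_rate(num_sensors=3, bits_per_value=10, sample_rate_hz=1000):
    """Minimum raw data rate: values per sample x bits per value x samples per second."""
    return num_sensors * bits_per_value * sample_rate_hz


def adequate_interfaces(required_bps):
    """Names of the quoted interfaces whose rate meets the requirement."""
    return sorted(name for name, rate in QUOTED_RATES_BPS.items() if rate >= required_bps)
```

Three 10-bit values sampled every 1ms give 30,000 bits per second, and every wired interface listed has a quoted rate above this.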
5.5.1.2 USB
The USB interface is rapidly becoming the standard for PC peripherals. Printers, scan-
ners, modems, digital cameras and MP3 players are all making use of this flexible and
compact serial protocol. High data transfer rates are achievable: 12Mbps and 1.5Mbps
are the two common speeds and USB2 promises a 480Mbps rate. Online support is
available, but the protocol is complex and development time may be high (13) (14).
Parallel port communication is the traditional standard for printers, with data rates of
up to 150kbps. The use of the parallel port would obviate the need to convert the data
to serial format. However the ready availability of purpose-built serial chips allows this
conversion to be performed conveniently.
5.5.1.4 Bluetooth
Bluetooth is a wireless protocol and has only recently arrived on the market. As such,
support is limited and hardware is expensive, but data transfer rates can be as high as
10MBps for the latest versions.
5.5.1.5 RS-232
The RS-232 serial protocol is widely acknowledged as the most user-friendly interface.
As such, it is frequently used for basic electronics projects, and the support available
online is extensive (15). Data rates of over 100kbps are possible, and debugging software
can be downloaded freely.
All of these communication systems exceed the requirements for the data transfer rate.
RS-232 was chosen as the preferred option, due to the comprehensive support, low cost
hardware, and above all the availability of peripheral controllers with built-in RS-232
support.
The hardware needs some components that can be configured to control the communi-
cation to the PC. These must provide:
All of these features were available on a single PIC microcontroller, the PIC16F871 (12),
which is commonly available. The analogue inputs are digitised with an internal 10-bit
ADC.
The original intention was to use the PIC analogue inputs for all three piezoelectric
transducers. Once the PIC had been purchased and programmed, it became apparent
that it operated by multiplexing the analogue input channels into one ADC, as illustrated
in figure 5.6.
The PIC would therefore sample the analogue channels sequentially, with a delay of at
least 40µs between successive samples (12), or 80µs between the first and last of the three
conversions. Approximating the pulse of a mallet strike as a linear rise over a period of
2ms, this would lead to an error of 4% in the readings. This should be avoided, so the
PIC internal ADC was deemed unsuitable.
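The 4% figure can be reproduced by comparing the time skew between the first and last conversions with the assumed rise time of the strike pulse. The sketch below uses the values from the text; the function name is illustrative.

```python
def skew_error_fraction(conversion_delay_s, num_channels, rise_time_s):
    """Fractional error caused by sampling channels one after another.

    For a signal rising linearly over `rise_time_s`, the last channel is read
    (num_channels - 1) conversion delays after the first, by which time the
    signal has moved by that fraction of its full rise.
    """
    skew = conversion_delay_s * (num_channels - 1)
    return skew / rise_time_s


# Values from the text: 40 microseconds per conversion, three channels,
# and a 2 millisecond linear rise.
error = skew_error_fraction(40e-6, 3, 2e-3)
```

Because strike position is derived by comparing the three readings, an error of this size between channels would translate directly into a position error.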
[Figure 5.6: within the PIC, analogue channels 1 to 3 pass through a switch (multiplexer) into a single ADC]
[Figure: each sensor feeds an analogue sample-and-hold, whose analogue signals pass to the PIC]
[Figure: each sensor feeds its own ADC, whose digital signals pass to the PIC]
The ADC sampling rate must be faster than the rate at which the PIC will be reading it,
to ensure that no readings are repeated. The delay between successive conversions of the
ADC is 100µs, as specified in the ADC datasheet (11). This gives a sample frequency
of 10kHz. The PIC would be sampling at around 1kHz as calculated in Section 5.5.1.1,
and so the ADC sample rate meets this requirement.
The PIC internal ADC would still be suitable for the tuning pedal, as timing is not
critical - the pedal output is a much slower signal than the output from the piezoelectric
transducers.
The code written to program the PIC microcontroller evolved through
several stages. Initially the functionality was simple, in order to verify the operation of
the analogue and serial circuitry. As development of the serial input processing code
continued, and to facilitate debugging of the hardware, additional modes of operation
were included in the PIC code. Once the final mode of operation was decided upon,
the code was trimmed back down to an efficient minimum and optimised to exclude
redundant functions.
One of the intermediate stages of code involved a mode that only transmitted the sensor
readings when they exceeded a certain noise threshold. This allowed strikes on the drum
plate to be captured on the PC instead of being lost in the stream of continuous data.
Another example enabled the full ten bits of data from the PIC internal ADCs to be
used, to increase resolution. As a maximum of 8 bits per data word were admissible
over the RS-232 serial link, this involved rearrangement of the bits before transmission.
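The report does not state how the ten bits were rearranged, so the following is only one hypothetical arrangement, shown to illustrate the idea: each 10-bit value is split into two 5-bit halves and sent as two bytes. The function names and the choice of a 5/5 split are assumptions.

```python
def pack_10bit(value):
    """Split a 10-bit reading into two bytes of five data bits each (hypothetical layout)."""
    if not 0 <= value <= 0x3FF:
        raise ValueError("value must fit in 10 bits")
    high = (value >> 5) & 0x1F  # upper five bits
    low = value & 0x1F          # lower five bits
    return bytes([high, low])


def unpack_10bit(data):
    """Rebuild the 10-bit reading from the two bytes produced by pack_10bit."""
    high, low = data
    return (high << 5) | low
```

A layout of this kind doubles the number of bytes per reading, which is one reason an 8-bit format may be preferred when the sampling rate matters more than the resolution.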
Two different sets of voltage levels are used for serial communication. Between microchips
(TTL or CMOS), 0V represents the low signal level (digital 0), and +5V represents
a high signal level (digital 1). For communication over longer distances, RS-232 uses
larger voltages of inverted polarity in order to improve reliability: +10V is typically the
low signal level (digital 0) and -10V is typically the high signal level (digital 1). Figure 5.9
shows the voltage levels of the data word ‘137’.
Figure 5.9: Voltage levels of serial communication for TTL/CMOS and RS-232
The PIC is required to transmit three 8-bit readings from the piezoelectric sensors via
the external ADCs and one 8-bit reading from the pedal input via the internal ADC. As
one byte of RS-232 data can contain a maximum of 8 bits of information, these readings
will each fit neatly into one byte. Each byte is made up to ten bits with the addition of
the start bit and stop bit. The data must be sent sequentially, so a known start byte is
required to ensure that communication is synchronised and correctly interpreted. This
start byte is given the value of 255, as sensor readings are rarely expected to reach this
high. All sensor or pedal readings with a value of 255 are trimmed to 254, again to avoid
misinterpretation. To maximise the frequency at which sensor readings are taken, start
bytes are alternated with pedal readings, so that for every two sets of sensor readings
there is one start byte and one pedal reading.
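The framing scheme described above can be sketched as an encoder and a matching decoder. The exact byte order within the cycle is an assumption made for illustration (start byte, three sensor readings, pedal reading, three sensor readings), and the function names are hypothetical; only the start value of 255 and the trimming of readings to 254 come from the text.

```python
START_BYTE = 255  # reserved synchronisation value, from the text


def clamp_reading(value):
    """Trim a reading of 255 down to 254 so it cannot be mistaken for the start byte."""
    return min(value, 254)


def encode_cycle(sensors_a, sensors_b, pedal):
    """Build one 8-byte cycle: start byte, first sensor set, pedal, second sensor set."""
    first = [clamp_reading(v) for v in sensors_a]
    second = [clamp_reading(v) for v in sensors_b]
    return bytes([START_BYTE, *first, clamp_reading(pedal), *second])


def decode_stream(data):
    """Return a list of (sensors_a, pedal, sensors_b) for each complete cycle.

    Bytes are skipped until a start byte is found, which resynchronises the
    receiver if it begins listening part-way through a cycle.
    """
    cycles = []
    i = 0
    while i + 8 <= len(data):
        if data[i] != START_BYTE:
            i += 1
            continue
        frame = data[i + 1:i + 8]
        cycles.append((tuple(frame[0:3]), frame[3], tuple(frame[4:7])))
        i += 8
    return cycles
```

Because no reading can take the value 255, the receiver can always locate the start of a cycle, at the cost of losing the top quantisation level.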
5.6 Noise
Minimising noise is essential to successful circuit design. The prototype circuit board
for the electronic timpani has been constructed using wire-wrap pins to facilitate devel-
opment, and the noise level is consequently higher than it would be with a commercial
equivalent.
Noise is evident on the data received by the PC: the two least significant bits on
the ADCs can be seen randomly oscillating between 0 and 1 when monitored with an
oscilloscope.
• The design of the circuit features dual power supply regulation to separate analogue
and digital power rails. This prevents noise from digital devices affecting the
processing of the analogue signals.
• Filtering capacitors have been added to the power supply connections of many of
the ICs, wherever circuit board space allowed. These prevent both the input of
noise into the IC power supply as well as the output. Clocked ICs are particularly
prone to producing noise, due to the increase in current drawn on
each clock edge (16). An increase in current causes a voltage drop across the series
resistance of the connection from the power supply, and this drop in supply voltage
will affect other ICs. These capacitors must be placed as close as possible to the
power pins of the IC in order to maximise their effectiveness.
• The wire-wrap pins of the IC sockets are typically 25mm long. These exacerbate
the noise levels by acting as tiny aerials, transmitting and receiving noise
electromagnetically. Trimming these pins down to a more suitable length of 5-10mm,
without jeopardising the ability to make future wire-wrap connections, reduced
the noise by one quantisation level (out of 255). In fact the datasheet for the
ADC specifically recommends against the use of wire-wrap sockets (11), but for
prototyping purposes, the flexibility of the wire-wrap connections is a priority,
and the decrease in signal-to-noise ratio (SNR) must be tolerated.
There are many areas for further improvements to the circuit design to reduce noise.
• Signal routing: Analogue and digital signals must be well separated. Efforts have
been made to route the signals with this constraint, but cross-talk is unavoidable
when wire-wrap wires are used (17).
The requirements of the electronic hardware were significantly simpler than if a fully
embedded system had been used. This allowed the hardware to be rapidly defined and
implemented, so that the software (particularly the input processing) could be devel-
oped. The issues arising during the hardware development had to be resolved quickly
to produce a working prototype. Consequently there are several areas for future devel-
opment, notably the addition of strain gauges and potential improvements to resolution
and communications speed. These developments are detailed further in the project
recommendations, Section 11.
Chapter 6
This section will outline the design processes that were followed during the selection and
programming of the electronic timpani system software.
Chapter 6 Software System Design and Implementation 46
The proposed system uses a single processing unit to generate the
sound for all of the timpani in the system. Rather than supporting multiple timpani,
the prototype will manage only one; however, it will be
designed so that the extension to multiple timpani inputs is trivial. The software will be
responsible for scanning a serial port and extracting piezo and pedal position information
from this data stream. It will then perform hit detection and sound generation (hit
detection details will be covered in chapter 7). The objective here is to produce a
modularised platform that can be used to test and develop sound generation techniques
for this application. As part of this process, initial sound generation modules will be
designed and implemented (chapter 8 covers sound generation details).
This system will certainly require some form of signal processing. There are commercially
available signal processing PCI boards that would manage this effectively. They provide
very high performance calculations using hardware acceleration, including dedicated
digital signal processors (DSPs) and field programmable gate arrays (FPGAs). No
group members had enough experience to use these devices effectively within the project
time frame, so it was concluded that it was impractical to use them. An alternative
approach would be to use a small, embedded-style PC. These are compact systems that
would be ideally suited to inclusion in a final product. They were not considered
suitable as a development platform due to their limited processing capability and high
cost. A standard desktop computer was used as it could serve a dual role, both as a
workstation and as a target platform.
6.2.2 Language
Several possible language options were available to the project. There were two
conflicting requirements: the language needed to be flexible and feature-rich to speed up
development, but also needed to be efficient to allow the final process to be implemented
in real-time. Some languages, like Java, may have partially fulfilled both these
requirements, but it was decided that a combination of Matlab-based prototyping and a C-based
implementation would meet both these requirements fully. C++ was considered, as it
would have given distinct advantages in terms of modularity and structure. However, experience
with C++ varied dramatically within the group and it was decided that the effort required
outweighed the potential benefits.
Rather than interacting directly with a piece of sound hardware, it is normal to use
system services to handle the communication. This allows the same application to
work independently of the actual device. There are several APIs available:
• Direct DOS Sound (Microsoft): Many older Microsoft-based programs accessed
the system's sound card via a system of hardware drivers and direct interrupt
calls. These programs are still partially supported by more recent releases of
the operating system.
• Open Sound System (OSS/Free) (Linux): A very basic system that opens
the sound card as a character device, using ioctl() calls to configure output
options (46).
• aRts (Linux): Analogue Real Time Synthesizer. This sits on top of either OSS
or ALSA and acts as a sound server using the Common Sound Layer interface.
This is usually available within the KDE desktop environment. There is also a
Gnome equivalent.
Most of these were specifically designed to allow multiple programs to access the sound
hardware resources simultaneously. From an electronic timpani perspective, this is at
best a waste of effort and at worst potentially embarrassing (a “you have mail” notice
halfway through a concert would certainly be undesirable). The system therefore needed an
API that would give direct and exclusive access to the sound hardware with a minimum
of additional services. Early research indicated that the OSS/Free system provided by
Linux might suit these needs well.
Although OSS looked like a promising sound API, it was still important to compare the
relative merits of the available operating systems given the application’s requirements.
There were two main contenders, namely Windows 2000/XP or Linux. A Windows
based system was considered for reasons of familiarity.
Linux was selected as it seemed well suited to semi real-time applications and also
provided the OSS sound API which appeared suitable for this project. The software will
be developed with a minimum amount of operating system dependence to allow it to be
easily ported at a later date should the need arise.
This section briefly outlines some of the principles that lie behind the implementation
of this system.
6.3.1 Modularisation
To the greatest extent possible, given the restrictions of time and programming language,
the software should be separated into logically independent modules which have clearly
defined interfaces. Once established these interfaces should allow parallel, independent,
development of software, and allow future maintenance and development of isolated
components with a minimum amount of system modification. It is possible that different
parts of the program may run concurrently, using some time-slicing method (e.g. POSIX
Threads), and this should be considered when defining the interfaces.
Modules should communicate via globally accessible data structures, with only one mod-
ule allowed to write to each data structure. It will be considered good practice for a
module not to rely on the values/state of its outputs (i.e. only read upstream, only
output downstream). This is particularly important if the interface operates across a
thread boundary.
6.3.3 Concurrency
It may be necessary to run several parts of the program concurrently. This is very
common in programs that manage asynchronous hardware input and output. Any part of
the program that handles hardware should not “busy-wait”. Busy-waiting is effectively
polling the hardware resource until it is in a state to supply/receive data. Most operating
systems provide services where a program can pause, freeing the processor until the
hardware is ready. Use of such services is essential if the system resources are to be
managed correctly.
Where possible, functions and variables that serve similar purposes should be named
consistently. This will increase system maintainability and facilitate development.
6.4 Layers
The system architecture was to be divided into three logically independent layers, namely
hardware interface, storage (effectively communications) and processing. To the greatest
extent possible, these layers would be developed and tested independently (fig. 6.1).
Figure 6.1: Software layers (top to bottom: Processing, Storage, Hardware)
6.5 Interfaces
The layers communicate via a series of interfaces. These are defined in two ways: by the
functions that each piece of software must provide, and by the data structures that
they must use. The result of this process can be seen in the timpani.h header file, which is
included in Appendix D.1.
There are two processes over which the program does not have control, namely the rate
at which the PIC sends data to the program and the rate at which the sound card plays.
It must therefore adapt the rate at which it is running so that it does not send more
data than the soundcard can deal with (otherwise the write() function will just pause
the program until it is ready to deal with the excess data), and also so that it reads
the data coming from the PIC correctly. Managing this task is not trivial. The system
generates the waveform of an entire timpani sound at the moment of a strike. It does
this as a single loop, but due to latency constraints, the playing of this waveform must
commence before the generation has completed. (It takes several output write cycles to
complete the generation.) The program must either handle all this scheduling itself, or
run parts of the program concurrently. It has been found that using POSIX threads is
an ideal solution.
6.6.3 Limitations
Great care must be taken to ensure that threads do not corrupt each other's variables.
There are several methods of controlling access, including mutual exclusion locks
(mutexes) and semaphores. It has been unnecessary to use either of these approaches in
the program, because the data structures have been arranged so that only one thread
writes to each of them (the others just read them). It should be noted that some of the
variables are declared as “volatile”. This is very important because it prevents compiler
optimisations that would otherwise break the communication between threads (the compiler
could keep a local copy of a variable, so that changes made by other threads would not
be seen).
Handling the exit of the threaded program can be complex. The application is not
responsible for writing files, so to exit neatly all it must do is release the memory that
it has allocated, free the hardware and exit. A single global variable is therefore used.
A signal handling function waits for a signal interrupt (SIGINT) to arrive and then sets
this global variable to be false. Once this is set to false all the threads’ main loops exit
simultaneously.
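The exit mechanism described above can be sketched as follows. This is a minimal illustration rather than the project's code: the names running, install_exit_handler() and run_until_stopped() are hypothetical, and the worker loop is a stand-in for the real thread loops. The flag is declared volatile sig_atomic_t, the type that C guarantees can be written safely from a signal handler; C11 atomics would give stronger guarantees when several threads read the flag.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Shared run flag (hypothetical name): written only by the signal handler,
 * read by every loop that must stop when the program is asked to exit. */
static volatile sig_atomic_t running = 1;

/* Signal handler: does nothing except clear the flag. */
static void handle_sigint(int signum)
{
    (void)signum;
    running = 0;
}

/* Install the SIGINT handler. Returns 0 on success, -1 on failure. */
int install_exit_handler(void)
{
    struct sigaction sa;
    sa.sa_handler = handle_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGINT, &sa, NULL);
}

/* Stand-in for a thread's main loop: iterates until the flag is cleared
 * or max_iterations is reached, and returns the number of iterations. */
long run_until_stopped(long max_iterations)
{
    long n = 0;
    while (running && n < max_iterations)
        n++;
    return n;
}
```

Because every loop tests the same flag, no loop needs to know about any other; each simply falls out of its while statement the next time it checks.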
• Main Loop Busy Wait: The main loop currently busy-waits which is not ideal.
There may be some method of using a function to trigger it, but for now a solution
is not apparent.
• Scheduling: Currently the default scheduling is used, which means clicks may be
heard on the speaker if any other program is run at the same time as the timpani
application. These are caused by the output loop not keeping up with the sound
card. There are several methods of increasing a thread’s priority, which should
stop this happening, but a satisfactory arrangement has yet to be achieved (in
effect re-prioritising the thread locks the whole system).
This section outlines the steps that compose the life cycle of the timpani software.
6.7.1 Initialisation
Each main section of the program has an initialisation function of the form
void init_module_name(), where module_name is the name of the module. This function
allocates all the memory required by the module during its entire life cycle (i.e.
no other function in the module should perform dynamic allocation). This is done so that the
mlockall() function can be used to ensure that all the program’s memory stays resident
in RAM and is not paged out to disk. (Loading memory pages to and from disk can cause
significant unexpected time delays, which may have an impact on a real-time program.)
Once they have allocated memory, the initialisation functions then assign meaningful
start values, which may involve scanning hardware, loading data from files or simply
zeroing arrays.
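A minimal sketch of this allocate-once pattern is given below. The ExampleBuffer type and the function names are hypothetical (the real structures are defined in timpani.h), and the functions return a status code purely so that failures are visible in the sketch. Note that mlockall() can fail if the process lacks the privilege or resource limit needed to lock memory, so its return value should be checked.

```c
#include <stdlib.h>
#include <sys/mman.h>

/* Hypothetical module state, standing in for one of the real tables. */
typedef struct {
    short  *samples;
    size_t  length;
} ExampleBuffer;

static ExampleBuffer example_buffer;

/* Allocate everything the module will ever need and zero it.
 * Returns 0 on success, -1 if the allocation fails. */
int init_example_module(size_t length)
{
    example_buffer.samples = calloc(length, sizeof *example_buffer.samples);
    if (example_buffer.samples == NULL)
        return -1;
    example_buffer.length = length;
    return 0;
}

/* Called once, after every init function has run, so that all allocated
 * pages are already mapped. Returns 0 on success, -1 on failure. */
int lock_program_memory(void)
{
    return mlockall(MCL_CURRENT | MCL_FUTURE);
}

/* Release the module's memory at termination. */
void term_example_module(void)
{
    free(example_buffer.samples);
    example_buffer.samples = NULL;
    example_buffer.length = 0;
}
```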
Initialisation functions are called in a specific order because, where possible, modules
refer to variables in each other rather than using global definitions (e.g. the WaveTable
struct queries the SoundCardConfiguration struct to find out how many samples per
second are being output). The basic order is:
• Storage layers (input processing; strike table; continuous input table; wave table;
channel buffer and sound output buffer)
After initialisation functions have terminated and the memory has been allocated, the
input and output threads are created. It is not necessary to spawn a thread for the
generation module, as it uses the ‘main’ thread.
[Figure: data flow through the stages Serial Port Capture, Input Processing (Hit & Damp Detection), Wave Generation, Channel Filter (Glissando, Damping), Combination (Clipping), Post Filter and Write Output, linked by the Input Channel and Strike Channel]
1. InputFrame Generated
The serial port class creates a continuous stream of InputFrames, which contain a
reading for each of the piezo sensors and the pedal. These InputFrames are passed
to the input processing module via the save_input_frame(InputFrame *input)
function.
2. Hit Detection
The input processing module monitors the InputFrame stream, saving a damping and
pedal value for every InputFrame using save_continuous_input(). If the input
processing module detects a hit, it saves it to the StrikeTable via the save_strike()
function.
3. Wave Generation
The generation module (mod_generate) continuously scans the StrikeTable to see if
a new strike has been recorded. If it has, then it selects the parameters it requires
and uses them to store the generated wave in the WaveTable. If all the waves
in the WaveTable are still being used (all notes are still ringing) then the
generation module sends a kill request to stop the oldest note before re-using its entry.
5. Channel Combination
Once the channel filter has created a small fragment for each channel, these are
combined by the combine_channels() function, which also ensures that the fragments
clip properly (rather than overflowing the output variable). There are several
techniques that can be used to combine channels whilst avoiding clipping. The
simplest is to divide the amplitude of each channel by the number of channels
before combining them. This is a very conservative approach that would entirely
remove the possibility of the output clipping, but it does so at the expense of
dynamic range. The sounds being played are fairly specialised (i.e. a sharp attack
followed by rapid decay). This means that clipping is only likely to occur if two
strong strikes are played in quick succession. It may be necessary to perform some
sort of amplitude reduction on the channels to avoid clipping, but this will have
to be calibrated manually once the multi-timpani system has been implemented.
6. Post Filtering
This single output fragment is passed to the post filter module (which is currently
unused). This could be used to implement some system-wide features, like tone
control.
7. Sound Playback
Once it has passed through the post filter module, the output fragment is written to the sound
card.
[Figure: buffers between generation and output, labelled Generated Wave Table, Sample Channel Buffer and Output Buffer]
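The clipping behaviour described in the channel combination step can be sketched as below. This is an illustration under stated assumptions, not the project's combine_channels(): it assumes signed 16-bit samples, and the function name and signature are hypothetical. The channels are summed in a wider accumulator and the result is saturated at the 16-bit limits, so that a loud passage clips instead of wrapping round to the opposite sign.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum n_channels fragments sample by sample into out, saturating at the
 * limits of a 16-bit sample instead of letting the sum wrap around. */
void combine_channels_sketch(const int16_t *const *channels, size_t n_channels,
                             size_t fragment_len, int16_t *out)
{
    for (size_t i = 0; i < fragment_len; i++) {
        int32_t sum = 0;  /* wide accumulator for the per-sample total */
        for (size_t c = 0; c < n_channels; c++)
            sum += channels[c][i];
        if (sum > INT16_MAX)
            sum = INT16_MAX;   /* clip positive overshoot */
        if (sum < INT16_MIN)
            sum = INT16_MIN;   /* clip negative overshoot */
        out[i] = (int16_t)sum;
    }
}
```

The conservative alternative mentioned in the text would divide each channel sample by n_channels before summing, which removes the need for the two saturation tests at the cost of dynamic range.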
6.7.3 Termination
Once a termination request has been detected (i.e. a SIGINT has been received), the
input and output threads terminate. After they have terminated, the main thread calls
munlockall() to release the memory lock, and calls the terminate functions in the reverse
order to the initialise functions. Once all the memory is free, the hardware file locks are
released and the program exits normally.
Originally the post filter class was provided as a useful way of optimising the sound
generation (by only performing some operations on the combined channels rather than
individually), however it has been found to be unnecessary. This means that extension
to multiple timpani would involve minimal changes to the software structure. Changes
would include:
• Update sound generation module to reflect different size timpani (this is optional,
it would be possible to have several different timpani of the same type without
this)
The current software architecture assumes that the output of the initial sound generation
model can be expressed as a single sound wave, representing a timpani strike. These, as
mentioned, are stored in the WaveTable structure and accessed by mod_channel_filter.
This is a reasonably good assumption, but does not allow for a model that can respond
to inputs that occur after the initial strike. It may be possible to generate a model
that has two free parameters (for example damping and pedal position), that would
then have to be generated in the channel filter. In this case the WaveTable would
have to be replaced by some data structure that stored the parameters for that model,
instead of a direct output waveform. This would require significant reconstruction of the
system architecture as the WaveTable/ChannelFilter interface happens across a thread
boundary. Where possible it is recommended that two separate models, one for strike
related inputs and one for continuously variable inputs, are used in parallel.
Chapter 7
Input Processing
This chapter aims at describing how the serial port on a computer can be read and how
this was implemented in the system.
As mentioned earlier, the electronic timpani is to provisionally use the RS232 serial
interface which allows easy communication between the hardware (PIC processor) and
the computer (for later versions, other standards such as USB or FireWire could be
considered).
7.1.1 POSIX
POSIX stands for Portable Operating System Interface, and is an IEEE standard designed
to facilitate portability by defining a common interface across versions of Unix. Most
‘UNIX-like’ systems are therefore POSIX-compatible operating systems. Applications designed for
such operating systems could theoretically also run under Microsoft Windows, provided
appropriate libraries are also included.
When using the POSIX standard, the serial port is accessed via a character device file,
which is opened with some connection parameters defining interface protocol (such as
parity, baudrate and handshaking). For Linux, the serial port character device file names
are /dev/ttyS0 and /dev/ttyS1. The specifications of these parameters in the context of
this project will be detailed in section 9.1. Inputting and outputting to the serial port
is then performed by reading and writing to this device file.(48)
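As an illustration of how such a device file is typically opened and configured through the POSIX termios interface, a minimal sketch follows. The framing chosen here (8 data bits, no parity, one stop bit, no handshaking) and the baud rate passed by the caller are assumptions made for the example; the parameters actually used by the project are those given in section 9.1. The function names are hypothetical.

```c
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Edit a termios structure in place for raw 8N1 transfer without flow
 * control (assumed settings). Returns 0 on success, -1 on failure. */
int set_raw_8n1(struct termios *t, speed_t baud)
{
    t->c_iflag &= ~(tcflag_t)(IGNBRK | BRKINT | PARMRK | ISTRIP |
                              INLCR | IGNCR | ICRNL | IXON | IXOFF);
    t->c_oflag &= ~(tcflag_t)OPOST;
    t->c_lflag &= ~(tcflag_t)(ECHO | ECHONL | ICANON | ISIG | IEXTEN);
    t->c_cflag &= ~(tcflag_t)(CSIZE | PARENB | CSTOPB);
    t->c_cflag |= CS8 | CLOCAL | CREAD;
    t->c_cc[VMIN]  = 1;   /* read() returns as soon as one byte arrives */
    t->c_cc[VTIME] = 0;   /* no inter-byte timeout */
    if (cfsetispeed(t, baud) != 0)
        return -1;
    return cfsetospeed(t, baud);
}

/* Open the device, save its current settings in *saved so that they can
 * be restored later, and apply the new ones. Returns the file descriptor,
 * or -1 on failure. */
int open_serial_sketch(const char *path, speed_t baud, struct termios *saved)
{
    struct termios t;
    int fd = open(path, O_RDWR | O_NOCTTY);
    if (fd < 0)
        return -1;
    if (tcgetattr(fd, saved) != 0) {
        close(fd);
        return -1;
    }
    t = *saved;
    if (set_raw_8n1(&t, baud) != 0 || tcsetattr(fd, TCSANOW, &t) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Restoring the port at exit is then a single call to tcsetattr() with the saved structure, followed by close().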
Receiving the data from the serial port can be achieved by polling it. This involves
the computer continuously looking for data waiting at the serial port. This is a very
inefficient technique, as processing time is wasted when the CPU is polling and checking
the port when no data is available. Other parts of the program require the CPU resources
and this section should therefore only run when it is required. A system function call,
select(), allows the input processing to pause until new data is available, freeing CPU
resources.
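The difference from polling can be seen in a short sketch of a blocking wait built on select(). The function name wait_for_input() and the timeout parameter are hypothetical; the timeout is included only so that the caller can periodically check whether the program has been asked to exit.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Sleep until fd is readable or timeout_ms elapses, without using the CPU.
 * Returns 1 if data is waiting, 0 on timeout, -1 on error. */
int wait_for_input(int fd, long timeout_ms)
{
    fd_set readable;
    struct timeval tv;
    int n;

    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(fd + 1, &readable, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return n > 0 ? 1 : 0;
}
```

A read() issued after the function returns 1 will not block, so the input thread only runs when there is work for it to do.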
As shown in the timpani header file (Appendix D.1), the serial port part of the software
contains three functions:
• init_serial_port() opens the port, stores the old configuration parameters and
sets new ones.
• update_serial_port() is responsible for extracting an individual set of three piezo
sensor values within a frame from the incoming serial data, and then calling the
appropriate save_input_frame() function to store them (section 5.5.6 contains
details of the frame constitution).
• term_serial_port() closes the port and restores the old configuration parameters.
The update_serial_port() function manages the serial input stream. From this, it
extracts sensor readings and stores them in an InputFrame. These InputFrames are
stored using the save_input_frame() function.
A state variable is used to ensure that the input stream frames do not get unsynchronised
when individual bytes are lost in transmission.
When the program is waiting for a start byte, it is set into an ‘idle state’ where the state
variable is equal to 0. Every time a start byte is received, the state variable is set to
1. At any other point, the state variable is incremented. When the last sensor value is
read in (the state variable is 8), the program is set back into the idle state. This process
is illustrated by fig. 7.1.
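The state variable scheme can be sketched as a small byte-at-a-time parser. The start byte value of 255 is taken from the noise discussion later in this chapter; the count of seven values per frame is an assumption inferred from the state variable reaching 8 after the last value, and the type and function names are hypothetical (the actual frame layout is given in section 5.5.6).

```c
#include <stdint.h>

#define START_BYTE   255  /* reserved start-of-frame marker */
#define FRAME_VALUES 7    /* assumed number of readings after the start byte */

typedef struct {
    int     state;                  /* 0 = idle, 1..FRAME_VALUES = next slot */
    uint8_t values[FRAME_VALUES];   /* readings of the frame being received */
} FrameParser;

/* Feed one received byte to the parser. Returns 1 when a complete frame
 * is available in p->values, otherwise 0. */
int frame_parser_feed(FrameParser *p, uint8_t byte)
{
    if (byte == START_BYTE) {       /* always resynchronise on a start byte */
        p->state = 1;
        return 0;
    }
    if (p->state == 0)              /* idle: discard until a start byte */
        return 0;

    p->values[p->state - 1] = byte;
    p->state++;
    if (p->state > FRAME_VALUES) {  /* last value stored: back to idle */
        p->state = 0;
        return 1;
    }
    return 0;
}
```

If a byte is lost, the next start byte arrives before the frame is complete and simply restarts the count, so only the damaged frame is discarded.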
7.2.1 Introduction
Before any sound generation can take place, the raw data must be converted into a series
of strikes, damps and pedal movement. These are provided to the next processing stage
as two structures: a ContinuousInputTable and a StrikeTable.
• The StrikeTable is only updated when a hit has been detected, and provides in-
formation about the strike hit strength, distance from centre and pedal position
at time of strike.
As can be seen in the timpani header file (Appendix D.1), the input processing part of
the system involves four functions called from the main program:
• init_input_processing() is used to allocate memory to the various table buffers
needed for this processing stage.
• save_input_frame() fills the table buffers appropriately with values based on
calculations from the raw sensor readings.
• update_input_processing() calls the appropriate internal functions to detect and
interpret hit and damping information.
• term_input_processing() frees the memory allocated for the buffers.
7.2.3 Buffering
Initially, the save_input_frame() function fills a rolling buffer, holding the values of the
sensors obtained from the serial port. This is done because later calculations require
past values.
7.2.4 Noise
Section 5.6 highlighted the noise present within the data at the input to the PC. High
and low frequencies of noise need to be considered separately. On a digital signal, run-
length averaging can be used to remove high frequency noise (it is effectively a low-pass
filter, fig. 7.2).
• A smoother curve without quantisation errors and with reduced glitches and high-
frequency noise fluctuations.
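A run-length (running) average can be sketched in a few lines of C. The function name and the use of double-precision samples are choices made for this illustration; len is the filter length and must be at least 1.

```c
#include <stddef.h>

/* out[i] is the mean of the last len inputs up to and including in[i].
 * The first len-1 outputs average only the samples available so far. */
void running_average(const double *in, double *out, size_t n, size_t len)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += in[i];               /* newest sample enters the window */
        if (i >= len)
            sum -= in[i - len];     /* oldest sample leaves the window */
        size_t count = (i + 1 < len) ? i + 1 : len;
        out[i] = sum / (double)count;
    }
}
```

Keeping a running sum means each output costs one addition and one subtraction regardless of the filter length.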
Figure 7.3 displays the magnitude of the frequency response of a 5-point run-length
averaging filter. The X-axis represents normalised radian frequency ω̂ (in radians),
which is defined as:
\[ \hat{\omega} = \frac{2\pi f}{f_s} \tag{7.1} \]
where
f is frequency
f_s is the sampling frequency
Only values from ω̂ = −π to ω̂ = π are shown, as the plot is periodic with period equal
to 2π. The magnitude of the frequency response of a run-length averaging filter is zero
at specific frequencies. These are equal to: (36)
\[ \hat{\omega} = \frac{2\pi k}{L} \tag{7.2} \]
where
L is the length of the running averager
k is an integer = 1, 2, 3, ..., L − 1
And so, for a 5-point running average, the first zero point is equal to 2π/5. This signifies
that signals with a period of less than 5 sampling periods will be attenuated considerably.
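The location of these zeros follows from the closed form of the frequency response. The derivation below is a standard result, given here as a sketch; it assumes an averager with L equal weights of 1/L (so that the gain at zero frequency is 1) and uses m as a summation index.

```latex
H(e^{j\hat{\omega}}) = \frac{1}{L}\sum_{m=0}^{L-1} e^{-j\hat{\omega} m}
\quad\Longrightarrow\quad
\left| H(e^{j\hat{\omega}}) \right|
  = \frac{1}{L}\left| \frac{\sin(L\hat{\omega}/2)}{\sin(\hat{\omega}/2)} \right|

% The numerator vanishes when L\hat{\omega}/2 = k\pi, i.e. \hat{\omega} = 2\pi k / L,
% while the denominator is non-zero for k = 1, \dots, L-1.
% For L = 5 the zeros inside (-\pi, \pi) are therefore
\hat{\omega} = \pm\frac{2\pi}{5} \approx \pm 1.26
\qquad\text{and}\qquad
\hat{\omega} = \pm\frac{4\pi}{5} \approx \pm 2.51

% Using equation 7.1, the first zero corresponds to f = f_s / 5,
% i.e. a signal whose period is exactly 5 sampling periods.
```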
Figure 7.3: Plot of the magnitude of the frequency response for a 5-point running
averaging filter (x-axis: normalised radian frequency; y-axis: magnitude)
In order to account for low frequencies a noise threshold value must be set. Any readings
that are below this threshold are not certain to be accurate as they may be due to noise.
No calculations should therefore be performed and no interpretations should be made
on such readings.
As the input data generated by the ADCs is limited to the range 0 to 254 (the number 255
being used as a start byte in the frame), representing a voltage range from -2.5V to
5V (section 5.4.3), it is necessary to discover exactly which quantisation level a sensor
reading of 0V corresponds to. The precise quantisation level generated by the ADCs
cannot be established from theory alone because the resistors used
for the 2.5V reference voltage have tolerances associated with them. It is crucial to
determine this value precisely, as all quantities need to be normalised according to this
level.
The best way to set both the zero point and the noise threshold accurately is to perform
a quick calibration during the initialisation of the input processing. The zero point is
found by calculating the average over 1000 readings. The noise threshold is set as being
twice the maximum reading.
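A sketch of such a calibration pass is shown below. It assumes that "the maximum reading" means the largest deviation from the zero point observed while the pad is at rest; this interpretation, along with the type and function names, is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    double zero_point;       /* quantisation level corresponding to 0V */
    double noise_threshold;  /* deviations below this are treated as noise */
} Calibration;

/* Derive the zero point and noise threshold from n readings taken while
 * the pad is untouched (the text uses n = 1000). */
Calibration calibrate(const uint8_t *readings, size_t n)
{
    Calibration c = {0.0, 0.0};
    double sum = 0.0;
    double max_dev = 0.0;

    if (n == 0)
        return c;

    for (size_t i = 0; i < n; i++)
        sum += readings[i];
    c.zero_point = sum / (double)n;

    for (size_t i = 0; i < n; i++) {
        double dev = readings[i] - c.zero_point;
        if (dev < 0.0)
            dev = -dev;                /* absolute deviation */
        if (dev > max_dev)
            max_dev = dev;
    }
    c.noise_threshold = 2.0 * max_dev; /* twice the largest resting deviation */
    return c;
}
```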
Once a strike has been detected, it is necessary to evaluate the strike strength. The
definition of the interface to the sound generation software specifies the strength of hit
must be represented by a value from 0 to 1.
The following measures could typically be used to estimate this strength from the three
piezo readings:
• Maximum
• Median
• Minimum
• Mean
• RMS value
The maximum, median and minimum cannot be used as an estimation of the hit strength
because it is difficult to decouple position from strength information. Calculating the
hit strength using these measures is further complicated by variation in sensor response.
The RMS value could justifiably be used as a measure of the hit strength. However,
as negative values invalidate this measure, the piezo outputs would have to be carefully
limited and normalised before they can be used in this fashion. The mean was selected
as a hit strength measure to simplify implementation. Few considerations need to be
taken into account, and values can be used in any form. It is a sufficient approximation
of a true measure of hit strength, providing it is scaled up.
It should be noted however that these measures all assume that the piezo outputs vary
linearly with strength of hit. Although this may be a close approximation, experi-
mentation on the prototype should be carried out to determine whether linearisation
transformations are necessary.
A strike can be detected by locating the sharp peak, characteristic of its shape (fig. 7.4).
Figure 7.4: Strike profile for Sensor A, Sensor B, Sensor C and their mean (x-axis:
time in samples; y-axis: strength in quantisation levels)
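The first derivative at sample n can be approximated by the backward difference between
successive samples:
\[ \left.\frac{dy}{dt}\right|_n \simeq \frac{y_n - y_{n-1}}{\Delta t} \tag{7.3} \]
where y is the function, y_n is the value of the function at time t = n and ∆t is the
separation between samples.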
The second derivative is determined from the difference between two first derivatives
over a known time period:
\[ \left.\frac{d^2 y}{dt^2}\right|_n \simeq \frac{\left.\frac{dy}{dt}\right|_n - \left.\frac{dy}{dt}\right|_{n-1}}{\Delta t} \tag{7.4} \]
Because the signal is digitised, the zero may not occur at a sampling point. However,
its presence can be inferred from a zero-crossing, which is indicated by a sign change.
The following equations indicate a zero-crossing:
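\[ \left.\frac{dy}{dt}\right|_n > 0 \quad\text{and}\quad \left.\frac{dy}{dt}\right|_{n-1} < 0 \tag{7.5} \]
or
\[ \left.\frac{dy}{dt}\right|_n < 0 \quad\text{and}\quad \left.\frac{dy}{dt}\right|_{n-1} > 0 \tag{7.6} \]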
To distinguish maxima from minima, the second derivative is then required. Only a
transition from positive to negative of the first derivative will indicate a negative second
derivative:
\[ \left.\frac{d^2 y}{dt^2}\right|_n < 0 \tag{7.7} \]
Equations 7.4 and 7.7 therefore imply that only equation 7.6 would indicate a maximum.
So, from equations 7.6 and 7.3 the following conditions are necessary and sufficient for
the detection of a maximum:
\[ y_{n-1} > y_n \tag{7.8} \]
and
\[ y_{n-1} > y_{n-2} \tag{7.9} \]
This demonstrates that if the middle out of three values is the highest, a maximum
exists at point yn−1 .
To avoid detection of noise-induced local maxima and minima, a larger set of readings
can be used. A peak is then redefined as the condition when the middle value is higher
than all the others in a larger buffer:
It can also be observed that this derivation filter requires N future samples. As a causal
system is required, and no forward-prediction is feasible, a delay of N samples needs to
be introduced.
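The buffered peak test can be sketched as follows. The function names are hypothetical, and half_width plays the role of N: the buffer holds 2N + 1 samples and a peak is reported only when the middle sample is strictly higher than all the others.

```c
#include <stddef.h>

/* Returns 1 if signal[centre] is strictly greater than every other sample
 * within half_width samples on either side, otherwise 0. The caller must
 * ensure that the whole window lies inside the array. */
int is_peak(const int *signal, size_t centre, size_t half_width)
{
    for (size_t i = centre - half_width; i <= centre + half_width; i++)
        if (i != centre && signal[i] >= signal[centre])
            return 0;
    return 1;
}

/* Scan for the first peak. Returns its index, or -1 if none is found.
 * In a real-time system the peak at index c can only be reported once
 * sample c + half_width has arrived, which is the delay noted above. */
long find_first_peak(const int *signal, size_t n, size_t half_width)
{
    for (size_t c = half_width; c + half_width < n; c++)
        if (is_peak(signal, c, half_width))
            return (long)c;
    return -1;
}
```

With the sequence 0, 5, 4, 6, 20, 7, 3, 2, 1, a three-sample window (half_width of 1) reports the small bump at index 1, whereas a seven-sample window (half_width of 3) skips it and reports the true peak at index 4.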
Two distinct quantities need to be obtained from the sensor readings: radial distance
and strength of hit. To extract radial distance, the readings must be normalised in such
a way that this information is separated from strength information. This is achieved by
dividing all three values by the largest. The result will be that none of the normalised
values are greater than 1, and the values are all relative to one another.
Figure 7.5: Peak detection using a buffer of length 7, plotted for Sensor A, Sensor B,
Sensor C and their mean. A peak is detected when the middle value is higher than the
others in the buffer.
Assuming the perceived strengths of hit of sensors vary linearly with distance to the
sensors, the following vector calculations can be performed:
The three sensor readings can be represented by three radial vectors pointing towards
the sensors as shown in figure 7.6. Through simple trigonometry, it is then possible to
find the expressions for the x and y coordinates of the sum of the vectors. They are
given by:
\[ X = B \sin 120^\circ + C \sin 240^\circ = \frac{\sqrt{3}}{2}B - \frac{\sqrt{3}}{2}C \tag{7.10} \]
\[ Y = A + B \cos 120^\circ + C \cos 240^\circ = A - \frac{B}{2} - \frac{C}{2} \tag{7.11} \]
The radial distance is then obtained by calculating the norm of the vector:
\[ \text{Radial distance} = \sqrt{X^2 + Y^2} \tag{7.12} \]
Testing the position-finding equations above, in conjunction with the electronic timpani
hardware, revealed that the strength perceived by the piezo sensors does not vary lin-
early with distance. The assumption made previously is therefore inaccurate and it is
necessary to linearise equation 7.12 for the radial distance by applying some unknown
function to it. This function would depend on the response of the sensors, the mechan-
ical interaction of the sensors with the drum plate and layers of material in the sensor
assembly.
The number of variables involved, and the complexity of the modelling problem, sug-
gested that an empirical approach would be a more efficient use of time and would
allow adjustment to account for unpredictable variations in the sensors’ characteristics.
Raising equation 7.12 to a power is a convenient way of attempting this linearisation.
A range of experiments were performed to determine the optimum power. Any further
attempts at refining this transformation would not be productive due to inconsistency
in the piezo readings.
These equations only hold inside the triangle formed by the piezo sensors (fig. 7.7).
Strikes that occur outside this triangle result in radial distances larger than 1. Only
radial distances that lie between the centre and a sensor are accurate.
To compensate for this, angular position was required. The following two methods were
envisaged to calculate this:
• Calculating the angular distance from the X - Y coordinates. (see figure 7.8).
\[ \theta = \arctan\left(\frac{Y}{X}\right) \bmod 120^\circ \tag{7.13} \]
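Figure 7.7: Valid range of the position equations (Position < 1) inside the triangle formed
by Sensor A, Sensor B and Sensor C, with Position = 0 at the centre
Figure 7.8: Location of a hit relative to Sensor A, Sensor B and Sensor C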
the three sensor readings, and median is the middle one. When the pad is struck
directly above a sensor, the other sensors should both be equal and close to zero,
so median − minimum = 0. When the pad is hit exactly between two sensors,
median = maximum, so median − minimum is at a peak. This measure thus
varies from 0 to 1 with angular distance from the nearest sensor.
The first method was rejected because the values of the X and Y coordinates could not
be relied upon. In effect, it would be using X and Y to correct X and Y .
As the original transformation is accurate on the lines between the centre and each
sensor, where median − minimum = 0, the new transformation should not differ from
the original in this case. This could be taken as a boundary condition. Thus, the
transformation formula was set to:
\[ \text{Transformed distance} = \frac{\text{Old distance}}{1 + k(\text{median} - \text{minimum})^p} \tag{7.14} \]
where k is the amount to scale the values down when the strike occurs on the edge
between sensors, and p is a constant that makes this transformation consistent over the
whole edge of the pad.
The values of p and k were initially estimated, and then fine-tuned through experimen-
tation, so that they also accounted for sensitivity differences between sensors. k was
found by setting median − minimum to 1 in order to eliminate the effect of p, and
finding which value would provide a position of 1 when the pad is hit on the edge. Once
k was set, p was then adjusted to obtain the desired consistency of values around the
edge of the pad.
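The way in which equation 7.14 separates the two constants can be made explicit. In the sketch below, d_old is the untransformed radial distance, m stands for median − minimum, and d_edge is a hypothetical symbol for the untransformed distance measured when the pad is hit on its edge midway between two sensors.

```latex
d_{\text{new}} = \frac{d_{\text{old}}}{1 + k\,m^{p}},
\qquad m = \text{median} - \text{minimum}

% On a line from the centre to a sensor, m = 0, so for any p > 0
m = 0 \;\Longrightarrow\; d_{\text{new}} = d_{\text{old}}
% and the boundary condition is satisfied whatever the value of k.

% Midway between two sensors, m = 1 and 1^{p} = 1, so p drops out:
m = 1 \;\Longrightarrow\; d_{\text{new}} = \frac{d_{\text{old}}}{1 + k}

% Requiring a position of 1 for an edge hit then fixes k:
\frac{d_{\text{edge}}}{1 + k} = 1
\;\Longrightarrow\;
k = d_{\text{edge}} - 1

% With k fixed, p only shapes the correction for 0 < m < 1.
```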
Vibrations caused by strikes on the drum plate propagate through the material and thus
produce ‘ringing’ effects in the piezo readings. In terms of strike detection, this results
in the creation of misleading secondary peaks that should be ignored. Ringing effects
become insignificant after about 30 to 40 samples, which is equivalent to about 12 ms
(fig. 7.4). Considering that the period of a typical timpani roll is much longer (about 50 ms),
it is quite safe not to run any processing or detection algorithms for 20 ms (or 60
samples) after a hit has been detected.
A further detrimental effect occurs when the pad is not properly secured and bounces
off the surface after a hard hit. The piezo transducers perceive this as a second strike. The
time between the actual hit and the perceived hit due to the bounce is likely to be longer
than the period of a fast roll, which means it cannot simply be ignored. For this reason, it
is essential that the pad be properly secured.
The profile of a damp is very different to that of a hit (fig. 7.9). When the pad
is depressed, the piezo transducers deform generating a slowly rising signal. Once the
piezos stop being deformed, the signal peaks, and starts to fall slowly, eventually reaching
zero. Removing the hand from the pad allows the transducers to relax back into their
original state, which generates a negative curve.
[Figure: profile of a damp, with the onset and offset of the damp marked. Axes: strength
(quantisation levels) against time (samples).]
The onset and offset therefore need to be found in order to detect damping. The two
pulses that are generated by the piezoelectric sensors are associated with the slope (first
derivative) of the actual damping information that needs to be extracted. The
mathematically rigorous method for extracting damping information is therefore integration.
Although this could be performed by the cumulative summing of readings, the problems
with this technique have already been discussed (section 5.2.5).
A more conceptual approach may be preferable. As the damping is a much slower signal
than a hit, the two can be distinguished using a threshold on rise time. Looking at the
rise time towards a peak is not sufficient though, as local peaks occur quite frequently
in the damping signal and would be detected as small strikes.
To ensure this does not interfere with the damping distinction process, it would be
better to measure the time for the whole pulse to occur. Experimental data revealed
that the signal associated with a strike on the pad lasted for about four to ten samples,
depending on the hit strength. It would therefore be safe to assume that a signal that
takes more than twenty samples to rise, reach its peak and drop below the noise
threshold again is caused by damping.
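The pulse-length test can be sketched as follows. This is a hedged illustration rather than the project code: the function name and return format are hypothetical, and only the twenty-sample limit is taken from the text.

```python
def classify_pulses(samples, noise_threshold, damp_min_length=20):
    """Label each excursion above the noise threshold as a strike or a damp.

    Returns a list of (start_index, length, label) tuples, where length is
    the number of samples the signal stayed above the noise threshold.
    """
    pulses = []
    start = None
    for i, value in enumerate(samples):
        if value > noise_threshold and start is None:
            start = i  # the pulse begins
        elif value <= noise_threshold and start is not None:
            length = i - start  # the pulse has dropped back below the threshold
            label = "damp" if length > damp_min_length else "strike"
            pulses.append((start, length, label))
            start = None
    return pulses
```

A pulse is only classified once it has ended, so a decision on a damp is delayed by at least the length of the pulse.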
When this algorithm was implemented, a problem was soon discovered with this method
of detection: the onset of the damping is too slow and noisy (fig. 7.9). During the onset
of damping, signal noise causes chatter around the detection threshold, which causes
a stream of small hits to be detected. This effect could be compensated for by using
hysteresis in the threshold setting. Another approach is to smooth the signal to remove
noise by performing a running average (section 7.2.4).
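Hysteresis in the threshold could look like the sketch below. The two threshold levels and the function name are assumptions made for illustration; the report does not give values for them.

```python
def above_with_hysteresis(samples, on_threshold, off_threshold):
    """Return one boolean per sample: True while the signal counts as active.

    The state switches on only above on_threshold and off only below
    off_threshold (with off_threshold < on_threshold), so noise that stays
    between the two levels cannot toggle the state.
    """
    active = False
    states = []
    for value in samples:
        if not active and value > on_threshold:
            active = True
        elif active and value < off_threshold:
            active = False
        states.append(active)
    return states
```

A signal hovering around a single threshold would produce several detections, whereas here it produces one continuous active period.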
Once the onset of the damping is detected, it is essential to find when the hand is removed
from the pad. The problem with using a peak-detection based algorithm, such as the
one previously described, is that strikes also create a series of negative peaks. Without
additional distinction algorithms, these would be interpreted as damping endings.
Rather than using negative peaks to detect the end of a damp, a counter records how
long the signal spends below the negative noise threshold (which is defined as
zero point − positive noise threshold). This counter is reset every time the signal rises
above the negative threshold. By comparing the time spent below the negative noise
threshold with a minimum duration, a damp offset and a strike bounce can be
distinguished. A minimum duration of between 20 and 100 readings was found to be reasonable.
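The counter can be sketched as below. The names are hypothetical and the default duration of 20 readings is simply the lower end of the range quoted above.

```python
def find_damp_offsets(samples, zero_point, noise_threshold, min_duration=20):
    """Return the indices at which the end of a damp is recognised.

    A counter tracks how long the signal has stayed below the negative noise
    threshold; short negative excursions (strike bounces) never reach
    min_duration and are ignored.
    """
    negative_threshold = zero_point - noise_threshold
    offsets = []
    count = 0
    for i, value in enumerate(samples):
        if value < negative_threshold:
            count += 1
            if count == min_duration:  # report once per excursion
                offsets.append(i)
        else:
            count = 0  # reset whenever the signal rises above the threshold
    return offsets
```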
7.2.12 Trade-off
With the current analogue to digital conversion resolution of 8 bits for the piezos, the
solution lies ultimately in a trade-off between rejection of spurious hits when damping,
and detection of light hits. In order to get damping working properly for the prototype,
all hits weaker than a strength of 2 quantisation levels are rejected. Additionally, to
remove problematic glitches in the signal, such as those caused by high frequency com-
ponents of the noise, the peaks that have a rise time that is too small for them to be
caused by strikes could be ignored.
The code in the update_input_processing() function has been divided into sub-functions
to modularise the individual detection tasks (fig. 7.10):
• is_end_of_damp(): Looks for an offset of a damp and returns true if one is found.
• is_hit(): Performs hit detection and returns true when a strike has been detected.
It also returns the strength of the hit.
• was_hit(): When a strike is very light, the program first ensures it is not the start
of a damp. This function returns true when it has established that a small recent
peak was a strike on the pad rather than a damp.
[Figure 7.10: flowchart of the detection sub-functions is_damp(), is_end_of_damp(),
is_hit() and was_hit(). Damp strengths are saved to the continuous table and hit
strengths to the strike table; small hit strengths are first checked by was_hit().]
In order to make the program flexible, all the control parameters were defined at the
start of the file. Where possible, functions were written in a versatile manner using these
parameters to modify the behaviour.
The channel filter module requires the level of damping continuously. The piezo sensors
only provide a measure of velocity, which creates a positive peak during the onset of
a damp and a negative peak during the offset (fig. 7.9). In order to provide a continuous
measure of the damping, the maximum value of the damping signal is held until
the end of the damp. This creates a signal that is analogous to an integral of the input.
Once this value is scaled and normalised between 0 and 1, it is stored in a
struct and passed to the save_continuous_input() function, which stores it in the
ContinuousInputTable.
At the same time, a low-pass filter with a cut-off frequency of about 20Hz would remove
the strike information and provide a clean profile of the damping. As was explained
earlier, the mathematical method of extracting the damping information would involve
integrating the signal, which could be approximated by calculating a running sum of the
readings. However, the difference between the onset and offset of damping, caused by
the non-linearity of the piezoelectric transducers, would generate a DC drift. In order
to resolve this, a further high pass filter of 0.1Hz would have to be applied to the signal,
which would remove this drift, and tail-off the damping after 10 seconds if an offset of
a damp was not detected.
The main reason why this method would be more effective is that it resolves all the
problems associated with the distinction between strikes and damps. The simplest form
of digital filter is a finite impulse response (FIR) filter (such as the run-length averaging
filter described in section 7.2.4), which depends solely on scaled current and old input
values. The problem associated with using such a filter is that it often necessitates large
delays and storage while not being very flexible. For example, this delay is noticeably
large for a low-pass filter with a low cut-off frequency.
A more efficient way of implementing such filters would be to use infinite impulse
response (IIR) filters that depend on current and old values of the input as well as old
values of the filter output. As the past values of the filter output are fed back to its
inputs, an IIR filter is potentially unstable. If designed correctly, however, IIR filters
are stable and require fewer terms than an FIR filter.
A simple first-order example takes the form
\[
y[n] = a\,x[n] + b\,y[n-1]
\]
where
x[n] is the input to the filter at time n
y[n] is the output of the filter at time n
y[n − 1] is the output of the filter at time n − 1
a and b are constants.
If a and b are positive, the filter is low-pass, whereas if they are negative, the filter
is high-pass. Modifying these constants would emphasize or de-emphasize the various
frequencies. In order to ensure the filter is stable, b must be between -1 and 1. There are
many more examples of filters, which would have more complex equations. These include
the standard Bessel, Butterworth or Chebyshev filters, and custom designs. There are
many textbooks that cover filter design in far more depth (36) (41) (38).
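As a hedged illustration, a first-order filter of this kind can be sketched as below. It assumes the recurrence y[n] = a·x[n] + b·y[n − 1], which is consistent with the symbols defined above, and an initial output of zero; the function name is hypothetical.

```python
def first_order_iir(samples, a, b):
    """Filter samples with y[n] = a*x[n] + b*y[n-1], taking y[-1] as 0."""
    if not -1.0 < b < 1.0:
        raise ValueError("b must lie strictly between -1 and 1 for stability")
    output = []
    previous = 0.0
    for x in samples:
        previous = a * x + b * previous  # feed the last output back in
        output.append(previous)
    return output
```

Only one previous output value has to be stored, in contrast to the long history of inputs needed by an FIR filter with a low cut-off frequency.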
Machine learning could be used as an alternative to digital filtering. There are several
techniques that would transform the sensor readings directly into hit and damping
information. Foremost among these are Support Vector Machines (SVMs), which can be used
for both classification (strike detection) and regression (generalised model fitting).
The input processing subsystem has three inputs, namely the three sensor readings, and
should provide two outputs: radial distance and strength. In order to perform data
collection rigorously, a method is required for consistently striking the pad at specific
locations and with specific strengths. Such a device was constructed with a view to
implementing an SVM solution (fig. 7.11). Once the dataspace has been obtained, the
SVM would generate (offline) a model of the relationship between sensor readings and
the required outputs. This model could then be used to transform sensor readings in
real-time. Inclusion of other output variables, such as angular strike position, would not
require significant alterations to the system and may prove useful in later applications
(e.g. steel drum).
Problems may occur with dimensionality, if the readings need to be considered as streams
rather than individual time slices. This may make machine learning unsuitable. The
quality of the result will depend on how effectively the training data is collected and
how complex a model is required.
It can be observed that the peaks in the signal from the sensors do not occur syn-
chronously (fig. 7.4). This is probably due to the dynamics of the drum plate. The
strength and positional information may be imprecise and inconsistent, as they are
based on the mean of the sensor signals. It would therefore be better to perform peak
detection on the individual sensor signals.
• Increasing the resolution of the analogue to digital conversion of the piezos will
improve the ability to differentiate between a beginning of a damp and a light hit.
• To avoid additional processing and trade-off between detection of damps and hits,
it would be more advisable to use a sensor type that supports ‘DC force’ detection,
such as strain gauges (as suggested in chapter 5).
Figure 7.11: Mechanism for dropping a ball from a specific height at a certain location
on the pad
Chapter 8
Sound Generation
This chapter covers the production of sounds from the electronic timpani. Details of
measurements made on the timpani in the anechoic chamber are explained
and proposals for future experimentation are made. The complete dataspace creation
process is then described.
Two very different modelling approaches, a data driven learning approach and the use
of Wavetable Synthesis, are compared. Finally, the prototyping of the sound generation
model is presented.
The purpose of the experiment was to make many recordings of how the timpani sounds
to the player when struck and played in different ways. A complete map can then be
constructed, such that the sound is known for a given pedal position, strike position,
and hit strength. These were the key input dimensions as defined in the project aims.
This map could be extended to include the effect of different playing techniques such as
damping, glissando and rolls. From this, the sound of the real timpani can be replayed,
learnt or mathematically modelled.
Microphone recordings should be made at the player’s head position; the electronic
timpani is more likely to be used for practice and teaching purposes, rather than as a
performing instrument. If scope allowed, recordings could be made in other positions
to give the electronic timpani more versatility. For this experiment only one recording
position will be used. Whilst in a real practice/teaching situation there would be
reflection from walls, floor and ceiling, the recordings obtained by this method should
be independent of their surroundings. If the electronic timpani were to sound to the
listener as though it were in a particular environment, this would be achieved by
applying a filter to the ‘pure’ recordings.
All possible combinations of the input dimensions should be covered (such that
the total number of recordings needed would be the number of steps in each variable
multiplied together). It is possible that for a machine learning approach, not all the
combinations would be necessary. Instead a selective range (including the corner points)
could be used.
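The size of the full grid can be illustrated with a short sketch. The step counts used here (8 strike positions, 5 hit strengths and 5 frequency steps) are those described for the 28” timpani in section 8.1.2; the variable names are hypothetical.

```python
from itertools import product

positions = range(1, 9)    # 8 strike positions, centre to rim
strengths = range(1, 6)    # 5 hit strength levels
frequencies = range(1, 6)  # 5 pedal (frequency) steps

# Every combination of the three input dimensions.
full_grid = list(product(frequencies, strengths, positions))

# Only the extreme value of each dimension: the corner points of the grid.
corners = list(product((1, 5), (1, 5), (1, 8)))

print(len(full_grid), len(corners))
```

The full grid grows multiplicatively with each added dimension or step, whereas the number of corner points only doubles per dimension.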
8.1.2 Method
Having set up the equipment (fig. 8.1), the following method was used to obtain the
recordings (a photograph of the anechoic chamber experiment can be found in Appendix
H.2):
1. The timpani drum skin was marked with 8 positions using a water-based marker
pen, linearly from the centre to the rim (position 1 in the centre, position 8 on the
rim).
2. All recordings were made at a 48kHz sampling rate using the DAT.
3. The gain on the recording amplifiers was adjusted such that all but the
strongest strikes would not clip, using the clip indicator LEDs. For the strongest
strikes the gain was decreased by 10dB. The gain on the near-field channel was
-10dB on the input and +10dB on the output. The far field channel had -30dB
attenuation, as the microphone type used was more sensitive.
4. The DAT gain was set to the middle of the range (it was suspected that the DAT
recorder was a significant source of noise).
5. The sound level meter (SLM) was set to peak-hold, to provide an indication of the
hit strength. If the level appeared significantly lower or higher than expected, then
the strike would be repeated.
6. Using the 28” timpani at the lowest frequency, the timpani was struck by the same
person at each of the positions marked on the drum skin. The player judged the
same strike level for each position. This was first done at hit strength 1 out of 5 (5
being the hardest hit that a timpanist would be expected to normally use). The
steps in between would be judged to be linearly increasing, both by the player and
confirmed by the sound level meter.
7. Each of the positions was struck, once for each hit strength level. A second take
would be used if the recordings were invalid for any reason. When each of the hit
strengths had been recorded the pedal would be moved to the next position for
the next frequency step.
8. The lowest frequency is at the highest pedal position and the highest frequency at
the lowest pedal position. Five frequency steps were used using linear positions on
the tuning gauge on the side of the timpani. (e.g. ‘frequency 3’ is in the middle
of the tuning gauge)
9. The 25” timpani was then recorded in the same way as the 28”, except that only
the highest and lowest frequencies and only 4 positions were used as there would
not be time available to complete the full set.
10. Finally as many examples of rolls, glissando, multiple hits and other playing tech-
niques as possible would be taken, time permitting.
8.1.3 Conclusion
It was initially intended that the recordings taken would be a prototype set that could be
used to understand what would be needed for the electronic timpani. Initial experiments
on the modelling of the sound would be possible, with the intention that a better set of
recordings would later be taken when the precise requirements of the experiment would
be better understood.
It is ironic that the very aim of the project, to create an affordable portable device,
would become most evident during this experiment. Due to the limited availability of any
timpani drums and the need to insure the timpani against damage during transportation,
the timing of the experiment was out of the group’s control and the opportunity to repeat
the experiment did not arise.
During the use of the ISVR’s anechoic chamber, significant problems with external
ambient noise were experienced. Whilst the chamber has an impressive attenuation
through its walls (approx. 60dB), a noise event external to the chamber at, for
example, 100dB SPL will still be present in the chamber at 40dB SPL.
(Noise events above 100dB SPL are easily generated in transmission loss testing in the
reverberation chamber adjacent to the anechoic chamber.) The noise events present at
the time meant that many recordings had to be abandoned and the process of collecting
was considerably lengthened. It is also possible that some of the recordings contain noise
events that were not spotted during the experiment.
The recordings obtained formed a reasonably exhaustive set in terms of the three main
variables: frequency, position and hit strength. However, it was noted that variations
in the hit strength were less predictable than variations in the frequency or position.
This is to be expected, in part because the hit strength levels were generated
in a qualitative rather than quantitative manner. Because of this, however, it is
not possible to determine whether there is an inherently unpredictable element in the
sound produced at a constant strike velocity (perhaps due to minor changes in the
trajectory or hand-mallet interaction). The fact that a hit strength level recorded at one
frequency cannot be directly compared to a hit strength level at a different frequency
means that the recordings cannot be used with full confidence.
The recordings have a noise floor that is close to -50dB, which is reasonable for most
purposes; however, as the dynamic range of the timpani is in excess of 100dB, a lower
noise floor would ideally be required. The large dynamic range of the timpani may present
a significant problem when finally replaying on either headphones or loudspeakers. For
example, if the volume is set such that the loudest parts are comfortable to listen to and
within the range of the equipment, then the quiet sections may be virtually inaudible.
Whilst checks were made throughout the experiment to make best use of the dynamic
range and to check for clipping, a degree of clipping is present on a number of the
recordings. It would appear that the warning indicators on the equipment did not respond
quickly enough and are therefore not a reliable method of determining whether clipping is
occurring.
The experiment did highlight areas that should be changed if the experiment were to be
repeated:
• Use of accelerometer
Fundamentally in the experiment, a quantitative measure of the hit strength is
crucial for correct modelling of the sound of the timpani. It was not possible to
acquire the equipment given the short notice of the experiment.
• Tuned timpani
It was suspected that the timpani did not have correct tuning, particularly given
the movement of the timpani and their susceptibility to rapid de-tuning.
• Experienced timpanist
An experienced timpanist would be able to tune and play the timpani as described
in the above section (8.1.3.1).
• Better SNR
Given the extremely large dynamic range of the timpani, a low noise floor in the
recordings is essential.
Wavetable Synthesis in its basic form simply stores waveforms of the entire note of an
instrument in memory, which can then be played back with pitch shifting to produce
different notes. It is the most commonly used form of electronic musical synthesis.
Usually for modern systems there will be several notes recorded for a given instrument
(multi-sampling), for example one per octave is needed to reproduce a reasonable piano
without obvious jumps. A degree of pitch shifting is almost always necessary, as memory
does not permit every note on an instrument to be stored.
Looping, enveloping and filtering of these stored waveforms can further the ability to
‘play’ an instrument and provide expression. Expressivity is defined as “the variation
of the spectrum and time evolution of a signal for musical purposes. That variation is
usually considered to have a deterministic component and a random component.” (ref.
(39)) The deterministic element is taken as being that controlled by the user during the
performance, for example hitting a piano key harder produces a ‘brighter’ note. The
random component is the change that is not possible to control.
Storing different parts of the note gives the ability to change the duration of the note
and the nature of the way in which it has been played. For the sustain or ‘ringing’
section of a note, a single period of the note may be all that is stored in the wavetable
memory. Storing the attack of an instrument is important, as documented in section
2.1.1. Figure 8.3 is a recognized representation of a standard envelope form to each note
and consists of attack, decay, sustain and release. Looping is required for the sustain
portion of the note. Each section of the envelope may be stored in the wavetable, or
some of the sections may share the same memory array but are used with a different
amplitude envelope. Envelopes are not usually complex, unless used in very specific
circumstances such as the electronic reproduction of a piano (which is not designed to
sound like any other instrument).
• Looping
Looping should include many periods of the waveform to give a note a natural
sound; one might refer to this as making the note “animated”. Usually at least
100 periods of the waveform may be used in a good quality wavetable reproduction.
• Amplitude Envelope
Amplitude envelope variation is usually a function of the velocity with which the
instrument has been played. The final waveform may be reproduced by multiplying
the waveform in memory, by a time varying amplitude envelope.
• Filtering
Filtering is used to produce further effects on the note, for example tremolo. Many
samplers in use today have 1 to 4 pole ARMA (Auto Regressive Moving Average)
filters, which can be used to create spectral modifications in the time domain.
Typically a filter may be a low pass filter with adjustable Q or resonance.
• Digital Summation
For a device to support many notes playing at the same time there must be several
channels. The number of channels corresponds to the number of notes the device
can support; this is often referred to as the polyphony of the instrument. Channels
need to be added together before being passed through to the digital analogue filter.
To avoid ‘word growth’ when channels are added together a scaling rule is usually
applied, often of the form 1/√N. This scaling is appropriate as the signals are
usually uncorrelated.
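Two of the operations listed above, the amplitude envelope and the digital summation, can be sketched as below. The function names are hypothetical, and the envelope is assumed to be supplied as one gain value per sample.

```python
import math


def apply_envelope(waveform, envelope):
    """Multiply a stored waveform by a time-varying amplitude envelope."""
    return [w * e for w, e in zip(waveform, envelope)]


def mix_channels(channels):
    """Sum N channels sample by sample and scale the result by 1/sqrt(N)."""
    scale = 1.0 / math.sqrt(len(channels))
    length = max(len(c) for c in channels)
    mixed = []
    for i in range(length):
        # Channels that have already finished contribute nothing.
        total = sum(c[i] for c in channels if i < len(c))
        mixed.append(total * scale)
    return mixed
```

The 1/√N rule assumes uncorrelated channels. Identical, fully correlated channels would still grow by a factor of √N after scaling, so some headroom is needed in that case.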
There are two forms of pitch shifting: asynchronous pitch shifting and synchronous pitch
shifting. The former is far simpler to implement but gives less control over the results.
This method effectively alters the playback speed of the signal by adjusting the sampling
rate in the digital to analogue converter (DAC). Adjusting the sampling rate is simple,
but can only be implemented on a ‘per channel’ basis. In most applications it is useful
if the data has a fixed sampling rate at the output: where signals need to be combined,
they must all have the same sampling rate. This method is termed ‘asynchronous’
as each output channel must run at a different speed. However, if the length of the
recording is not a concern, this form of pitch shifting can have little effect on the quality
of the signal.
Synchronous pitch shifting is necessary if the length of the signal should not vary pro-
portionally with the rate of pitch shifting. As this method is implemented using sample
rate conversion algorithms, accessing wavetable memory at regular time intervals is more
logical. Also digital summation, and mixing of channels is possible for post processing
with a single data stream. However, quality issues arise if the conversion algorithms are
not carefully chosen. To summarise:
[Table: summary of asynchronous and synchronous pitch shifting]
Notes can only be shifted by a certain amount before they sound unnatural. For recordings
that have significant time domain features, such as percussion instruments, pitch
shifting can become quite objectionable (39). For this reason in particular, pitch shifting
the attack of a note is likely to create more problems than shifting the ‘ringing’ part.
The simplest form of sample rate conversion uses a zero order hold interpolator (39).
Consider a waveform stored in an array denoted by Wavetable[n], where n
refers to the location of the datapoint in the array. Also define a variable Offset, which
has both an integer and a fractional part, representing the current offset into the array.
Then for output datapoints x[n] for each index:
During playback the datapoints in Wavetable[n] may either be dropped or repeated
depending upon the rate at which the Offset is being incremented (fig. 8.4). The waveform
will be shifted up in frequency if the Offset value is incremented by more than 1.0 and
down in frequency if it is incremented by less. For example, a value of 2.0 would represent
a doubling of the frequency.
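A minimal sketch of zero order hold playback is given below. It assumes that the integer part of the offset selects the stored datapoint and that the offset is advanced by a fixed increment per output sample; the function name is hypothetical.

```python
def play_zero_order_hold(wavetable, increment, n_out):
    """Read up to n_out samples from wavetable using a zero order hold.

    increment > 1.0 drops datapoints (pitch up); increment < 1.0 repeats
    them (pitch down). Playback stops at the end of the table.
    """
    output = []
    offset = 0.0
    for _ in range(n_out):
        index = int(offset)  # the integer part selects the stored datapoint
        if index >= len(wavetable):
            break
        output.append(wavetable[index])
        offset += increment
    return output
```

With an increment of 2.0 every other datapoint is dropped, while an increment of 0.5 plays each datapoint twice.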
This method can be greatly improved by the use of linear interpolation. This uses the
fractional part of the offset to calculate the value of the new datapoint. The formula
now takes the form:
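A hedged sketch of the linearly interpolated read is given below. It assumes the usual weighting, in which the two datapoints either side of the offset are blended according to the fractional part; the function name is hypothetical.

```python
def play_linear_interp(wavetable, increment, n_out):
    """Read up to n_out samples from wavetable using linear interpolation."""
    output = []
    offset = 0.0
    for _ in range(n_out):
        index = int(offset)
        if index + 1 >= len(wavetable):
            break  # no right-hand neighbour left to interpolate towards
        frac = offset - index  # fractional part of the offset
        left, right = wavetable[index], wavetable[index + 1]
        output.append((1.0 - frac) * left + frac * right)
        offset += increment
    return output
```

Unlike the zero order hold, an offset that falls between two datapoints yields an intermediate value rather than a repeated one.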
8.3.2 Looping
More complicated forms of synchronous pitch shifting can be implemented if the array
is read continuously in a loop, that is, when the read point is at the end of the array
it reads from the start of the array. The read and write points to the array are
independent of one another and therefore move around the loop at different speeds. If the
array is read faster than it is written to, then sections of the original waveform must be
repeated (fig. 8.5). Similarly, if the writing is occurring faster than the reading, sections
of the waveform are neglected. The problem with looping is that a discontinuity occurs
as one ‘pointer’ overtakes the other. Truncated signals often have a characteristic ‘bell
like’ sound, whereas gaps in the signal have a characteristic buzzing sound. Several
methods have been used to reduce this problem, the simplest being to fade across the
discontinuity. This effectively reduces the amplitude and extent of the additional
frequency contribution formed by artefacts. Clearly this process requires significant extra
processing.
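Fading across a discontinuity can be sketched as a linear crossfade between the segment being left and the segment being joined. This is only one possible fade shape, chosen here for simplicity, and the function name is hypothetical.

```python
def crossfade(outgoing, incoming):
    """Blend two equal-length segments with a linear fade.

    The weight of `incoming` rises steadily across the segment while the
    weight of `outgoing` falls, so the join contains no abrupt step.
    """
    if len(outgoing) != len(incoming):
        raise ValueError("segments must have equal length")
    n = len(outgoing)
    faded = []
    for i in range(n):
        w = (i + 1) / (n + 1)  # fade weight, strictly between 0 and 1
        faded.append((1.0 - w) * outgoing[i] + w * incoming[i])
    return faded
```

Each output sample in the faded region needs two reads and two multiplications, which illustrates the extra processing mentioned above.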
A waveform based pitch shifter detects the pitch of a signal and separates the period
information from this, so that the sections of recording that are repeated or deleted are
varied in length appropriately. This technique tends to force the discontinuities formed to
contribute only to the existing harmonic spectrum, so that they are better masked. However
it is very difficult to implement this method if there is not a clear and exclusive tonal
component. A timpani waveform would be unsuited to this method. Non-harmonic
frequencies of the principal component cannot be dealt with.
This technique involves separating a signal into source and filter components. The source
component can then be shifted independently of the filter component. For example
speech synthesis uses this technique. The part of the speech that gives the information
is stored in the filter part and represents the movement of the mouth, tongue and lips.
The source component represents the rest of the vocal tract including the voice box.
This is a powerful technique as the formants in the signal are not pitch shifted and
therefore the resultant signal is more natural sounding.
Where the two components are generated separately, this technique is effective and can
be easily implemented. If this is not the case, adaptive inverse filtering is necessary to
separate the two components. This method would require a high processing load, although
it is possible that in this project the inverse filtering could be completed off-line. The
dataspace would contain the deconstructed signals, and therefore the processing
requirement would be significantly reduced.
Strike related
• Hit strength
• Pedal position
• Damping level
If just the strike related variables are considered, a mapping could be found between
the output waveform and the input variables. If the input and output are considered as
vectors:
\[
s = \begin{bmatrix} \mathrm{str} \\ \mathrm{pos} \\ \mathrm{pedal} \\ \mathrm{damp} \end{bmatrix} \tag{8.3}
\]
\[
W = f(s) \tag{8.5}
\]
It should be noted that n is the number of data points in an output sample. For a 15
second sample at 44100Hz this would be 661500, representing a very highly dimensional
output space.
There are several techniques available to perform this kind of conversion. Two key issues
should be considered.
2. This system will have to use a data set to fit the mapping, which will inevitably
contain some level of noise. The model should be noise tolerant, and should not
overfit the data (so that the noise becomes a feature).
Support Vector Regression may satisfy the second condition (if not the first). It would
not be feasible to fit a model for every data point in the output vector, so some method
must be used to reduce the size of the output vector (output space).
The output space, as an array, is first adjusted to a central origin by finding the mean of
all the output vectors and subtracting that from all of them. It is then multiplied with
its own transpose. The eigenvectors and eigenvalues of the resulting array are then
calculated and sorted. These eigenvectors are by definition orthogonal and represent a new space. The
objective with this process is to identify which are the key eigenvectors by disregarding
all those with relatively low eigenvalues. Assuming that the output space could be
reasonably remodelled using only a low number of eigenvectors, it may be possible to
reduce the output space from 661500-dimensional to, say, 20-dimensional. It may then
be reasonable to fit a model for those dimensions.
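The reduction described above can be written out as follows. The symbols here are assumptions introduced for illustration: m is the number of recordings, W_i the i-th output vector of length n, X the array of centred output vectors and d the number of eigenvectors retained.

```latex
\begin{align*}
\bar{W} &= \frac{1}{m} \sum_{i=1}^{m} W_i
  && \text{mean output vector} \\
X &= \begin{bmatrix} W_1 - \bar{W} & \cdots & W_m - \bar{W} \end{bmatrix}^{T}
  \in \mathbb{R}^{m \times n}
  && \text{centred output space} \\
C &= X^{T} X, \qquad C v_j = \lambda_j v_j, \qquad
  \lambda_1 \ge \lambda_2 \ge \cdots
  && \text{sorted eigen-decomposition} \\
z_i &= V_d^{T} (W_i - \bar{W}), \qquad
  V_d = \begin{bmatrix} v_1 & \cdots & v_d \end{bmatrix}
  && \text{reduced } d\text{-dimensional output} \\
W_i &\approx \bar{W} + V_d z_i
  && \text{approximate reconstruction}
\end{align*}
```

Under these assumptions a model would only need to be fitted for the d components of z rather than for all n datapoints. As C is an n by n array, it may be more practical to decompose the much smaller m by m array X X^T instead, which has the same non-zero eigenvalues.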
8.4.2 Implementation
Due to project time constraints, a machine learning generation technique has not yet
been implemented. The system structure is designed to allow later implementation with
minimal redesign. It should be noted that this system does not provide a method for
dealing with those inputs that can vary after the strike occurs.
Having spliced the recordings from the anechoic chamber, it was necessary to index
them. This task was done using Matlab. Each recording existed as an array in a single
‘.mat’ file with a name corresponding to the size of timpani, frequency, hit strength,
position and take number for each hit (e.g. m1f1s1p1t1.mat). The edited recordings
and the original ‘.wav’ recordings are included on the project CD. Certain parameters
needed to be ascertained for each recording:
• Size:
The total number of datapoints in the array. If the recordings need to be the same
length (perhaps necessary for the learning approach to the modelling) this would
have to be the length of the longest recording. It is also useful to know how long
the recordings will ring for.
• Clipping:
Some method of determining whether or not any clipping occurred.
• Hit strength:
An objective measure of strike force was not generated at the time of the recordings,
so if the recordings can provide the data it will be a useful resource.
• Noise:
A measure of the noise in each recording would be more useful than an overall
level. If the signal-to-noise ratio were too low it would create problems for any
‘learning’ approach to the modelling.
• Frequency:
The precise frequency could be determined from the recordings, as it was not
measured during the experiment.
• Offset adjustment:
It was observed that there was a minor DC offset in the mean value. Its origin is
unclear, but it was easily removed. [offset.m]
• Gain adjustment:
Before any quantification of the hit strength could be made, it would be necessary
to adjust the gain of some of the recordings. (When recording the higher strength
hits, the gain had to be reduced by 10dB to avoid clipping and make better use of
the dynamic range, as described in section 8.1.2.) It did not make any difference
whether the level of these was increased or the level of the rest of the recordings
decreased, as they would all eventually be normalized to 16bit signed integers.
[gain10up.m]
The complete index can be found in Appendix L. The code for these functions is on
the project CD [makefileindex.m], along with the code for the other parts in this section
shown in square brackets.
8.5.2 Clipping
8.5.3 Noise
A measure of the noise was taken to be the RMS value of the last half-second of each
recording. It is assumed that during this part of the recording, ringing of the note has
decayed significantly such that these datapoints represent background noise only. [Used
in: findend.m, makefileindex.m, offset.m, singleindex.m, tailandcut.m and trim.m]
Frequency is determined from the frequency spectrum of each recording in the period
after the decay of the initial transients.
The frequency of the recording should correspond to the second sharp peak in the spectrum.
The fundamental frequency or ‘01’ mode of the timpani is lower than the perceived
pitch (as explained in section 2.2.1, this is the ‘missing fundamental’ phenomenon). In
practice, however, the second peak is harder to identify. For those recordings corresponding
to positions close to the centre of the drum the ‘01’ peak is strong and the ‘11’ peak is
considerably reduced, whereas for other positions the ‘01’ peak is difficult to identify.
Therefore a routine had to be created that found the first several peaks. The peak
corresponding to the ‘11’ mode could then be identified manually using the common ratios
determined by previous research (table 2.4).
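The peak-finding routine described above can be sketched in C as follows. This is a minimal illustration and not the project's frequency.m: the function name, the threshold parameter and the simple neighbour comparison are all assumptions. It returns the first few local maxima of a spectrum so that the ‘11’ peak can then be picked out by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Collect the bin indices of the first `max_peaks` local maxima in a power
 * spectrum that rise above `threshold` (a hypothetical parameter).
 * With 1 Hz bin spacing the bin index is the frequency in Hz.
 * Returns the number of peaks found. */
size_t find_first_peaks(const double *psd, size_t n_bins, double threshold,
                        size_t *peak_bins, size_t max_peaks)
{
    size_t found = 0;
    assert(psd != NULL && peak_bins != NULL);

    for (size_t k = 1; k + 1 < n_bins && found < max_peaks; k++) {
        /* A peak is larger than its left neighbour and no smaller than
         * its right neighbour, so a flat top is reported only once. */
        if (psd[k] > threshold && psd[k] > psd[k - 1] && psd[k] >= psd[k + 1]) {
            peak_bins[found++] = k;
        }
    }
    return found;
}
```

The threshold keeps small ripples in the noise floor from being reported as peaks; a suitable value would have to be chosen by inspection of the spectra.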
The Matlab power spectral density (PSD) function was used to generate the frequency
spectra. A 48k window was used to give a 1Hz frequency resolution; however, several
other window sizes were used to confirm that the results obtained were consistent
(i.e. did not contain aliasing). [frequency.m]
It is not possible to compare the amplitude of the recordings from the same hit strength
level. For example, a ‘hit strength 4’ at ‘frequency 4’ is not at the same level as a ‘hit
strength 4’ at any other frequency. This is most likely due to the way in which the
recordings were made: with no reference hit level, and an indirect measure of the hit
strength each time.
Secondly, this assumes that the amplitude of the recordings is directly proportional to
the hit velocity. There is no one correct way of measuring the level from the recording,
so the RMS average of the first 100ms was used as this gave a reasonable correlation
with the sound level meter readings.
Finally, as was first noticed during recording, the level dropped with increasing distance
from the centre of the timpani head. Comparing the spectrogram of a centre hit (fig.
8.7) with the spectrogram of a quarter hit (fig. 8.12) shows how the higher frequency
contributions increase towards the centre of the head, which helps to identify the cause
of this phenomenon.
It is preferable that the recordings should not be modified, so that this position
dependence is preserved. In order to compare values, a method was devised to identify
which recordings were inconsistently higher or lower than the general trend:
1. First the average value from the eight positions, for a given frequency and hit
strength, was taken.
2. Then the ratio of this average value over the actual value would be plotted for
each position (fig. 8.8).
3. These plots show how the level varies linearly with position for a given frequency.
The gradient of this line, however, is not the same for each ‘frequency’ and ‘hit
strength’.
4. Using the general trend a corrected measure of the ‘hit strength’ could be formed.
[hitstr.m].
5. Where the difference between the average level and the corrected level was consid-
ered too large, then the recording could be corrected.
Figure 8.9 shows the value at ‘position 4’ lies outside of the expected range (using a
tolerance of 20%). Note that by this method only recordings with spurious levels are
corrected.
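One possible reading of steps 1 to 5 is sketched below in C. It is an illustration only and may differ from hitstr.m: the function names, the fixed array bound and the choice to fit the line through all positions (including any spurious one) are assumptions. The ratio of average level over actual level is fitted with a straight line, and a position is flagged when its ratio departs from the fitted trend by more than the tolerance (0.2 for the 20% quoted above).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_POSITIONS 16   /* hypothetical upper bound on positions */

/* Least-squares fit of y = slope * x + intercept for x = 1, 2, ..., n. */
static void fit_line(const double *y, size_t n, double *slope, double *intercept)
{
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (size_t i = 0; i < n; i++) {
        double x = (double)(i + 1);
        sx += x;
        sy += y[i];
        sxx += x * x;
        sxy += x * y[i];
    }
    *slope = ((double)n * sxy - sx * sy) / ((double)n * sxx - sx * sx);
    *intercept = (sy - *slope * sx) / (double)n;
}

/* level[i] is the measured level at position i + 1 for one frequency and
 * hit strength. Sets flagged[i] to 1 where the average/actual ratio lies
 * more than `tolerance` (as a fraction) from the linear trend.
 * Returns the number of flagged positions. */
size_t flag_spurious_levels(const double *level, size_t n, double tolerance,
                            int *flagged)
{
    double ratio[MAX_POSITIONS];
    double mean = 0.0, slope, intercept;
    size_t count = 0;

    assert(n >= 2 && n <= MAX_POSITIONS);
    for (size_t i = 0; i < n; i++) mean += level[i];
    mean /= (double)n;                                          /* step 1 */
    for (size_t i = 0; i < n; i++) ratio[i] = mean / level[i];  /* step 2 */

    fit_line(ratio, n, &slope, &intercept);                     /* step 3 */

    for (size_t i = 0; i < n; i++) {
        double expected = slope * (double)(i + 1) + intercept;
        double deviation = ratio[i] - expected;
        if (deviation < 0.0) deviation = -deviation;
        flagged[i] = deviation > tolerance * expected;          /* step 5 */
        count += (size_t)flagged[i];
    }
    return count;
}
```

Because the spurious value also pulls the fitted line towards itself, a more careful version might refit after excluding flagged positions.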
[Figure 8.8: Position dependence. Ratio of average level over actual level plotted against position (1 to 7), with the linear fit y = 0.1466*x + 0.176]
The method by which the level should be raised or lowered is considered in detail in
section 8.6.3 (interpolation of hit strength). The conclusion is that a basic amplitude
adjustment makes a reasonable approximation. However, it has been observed that the
listener is less likely to notice that a recording has been changed, if it has been shifted
from a different frequency. Also, as two takes were made during recording (usually
because the first was considered incorrect) there were several options by which to correct
the hit strength measure. The order of preference for correction takes the following form:
1. Use Take two: The second takes were usually closer to the expected value.
2. Frequency: Creating a replacement by pitch shifting the recording closest in
frequency with the same ‘position’ and ‘hit strength’ was preferable to adjusting
the amplitude. [checkstrfreq.m]
3. Low Level: For those recordings with very low levels, the higher of the two
takes would be used or amplitude shifted down from the ‘hit strength 2’ recording.
[checkstr.m]
4. Level: Usually corrected to the expected value, unless inconsistent with adjacent
recordings (in this case the level was adjusted to the average between adjacent
recordings).
Figure 8.9: Corrected value vs position. ‘Position 4’ value lies outside expected range
If the initial data had been collected in a more controlled manner, the measure of the
hit strength would not need to be taken from the recordings themselves.
8.5.6 Trimming
Recordings must start as soon as they are played. They should also start with
a zero value so that discontinuities do not create clicks as the sound card plays the
recording. The start of the recording is taken to be the first datapoint that exceeds
a certain noise threshold. By inspection, events in the noise generally do not exceed four
times the RMS noise level.
A simple loop would search through the recording until a datapoint was greater than four
times the RMS noise level. A new recording would then be constructed by backtracking
by a single datapoint and setting its value to zero. [trim.m]
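The noise measure of section 8.5.3 and the trimming loop can be sketched together in C. This is an illustration rather than a transcription of trim.m; the function names are assumptions, and the comparison is done on squared values so that "greater than four times the RMS noise level" becomes "square greater than sixteen times the mean square", which avoids a square root.

```c
#include <assert.h>
#include <stddef.h>

/* Mean-square value of the last `tail_len` samples, which are assumed to
 * contain background noise only (e.g. the last half-second). */
double tail_mean_square(const double *x, size_t n, size_t tail_len)
{
    double sum = 0.0;
    assert(tail_len > 0 && tail_len <= n);
    for (size_t i = n - tail_len; i < n; i++) {
        sum += x[i] * x[i];
    }
    return sum / (double)tail_len;
}

/* Finds the first sample whose magnitude exceeds four times the RMS noise,
 * steps back one sample and sets that sample to zero. Returns the index at
 * which the trimmed recording should start, or n if nothing exceeds the
 * threshold. If the very first sample is already above the threshold there
 * is nothing to step back to, so index 0 itself is zeroed. */
size_t trim_start(double *x, size_t n, double noise_mean_square)
{
    const double limit = 16.0 * noise_mean_square;   /* (4 * RMS)^2 */
    for (size_t i = 0; i < n; i++) {
        if (x[i] * x[i] > limit) {
            size_t start = (i > 0) ? i - 1 : 0;
            x[start] = 0.0;
            return start;
        }
    }
    return n;
}
```

The trimmed recording is then the samples from the returned index to the end.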
8.5.7 Tailing
Two problems had to be overcome to implement the correct tailing of the recordings:
firstly, finding the end of the recording and secondly determining what envelope should
be used to tail it to zero. There is no ideal way by which the recording can be tailed,
as the timpani will continue to ring indefinitely whereas the recordings must have finite
length. Therefore, from the point at which the ringing is no longer audible, the recording
must be tailed to zero. If a ‘learning’ approach to the modelling were employed, the
method of tailing should be considered carefully as it may introduce artefacts into the
dataspace. A linear envelope was created to tail the recordings with a length dependent
upon the total length of the recording (fig. 8.10).
The end of the note was detected using a routine to search backwards through the
recording, until a number of datapoints exceeded a set level within a given time period
(based upon the RMS noise determined earlier). Although this eliminated random
noise, unwanted sound events (such as an audible knock) would cause an erroneous end
detection. Another routine examined this end point to differentiate knocks from the end
of the note. [tailandcut.m]
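A simplified sketch of the backward search and the linear tail is given below in C. It is not tailandcut.m: the block-wise search, the function names and all parameters are assumptions, and the second routine that distinguishes knocks from the true end of the note is not reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Searches backwards in blocks of `window` samples and returns the index
 * one past the last block that holds at least `min_count` samples whose
 * magnitude exceeds `level`. Requiring several samples rejects isolated
 * noise spikes. Returns 0 if no block qualifies. */
size_t find_end(const double *x, size_t n, double level,
                size_t window, size_t min_count)
{
    size_t end = n;
    assert(window > 0);
    while (end >= window) {
        size_t count = 0;
        for (size_t i = end - window; i < end; i++) {
            double mag = (x[i] < 0.0) ? -x[i] : x[i];
            if (mag > level) count++;
        }
        if (count >= min_count) return end;
        end -= window;
    }
    return 0;
}

/* Applies a linear fade over the `taper_len` samples before `end`,
 * so that the sample at end - 1 becomes exactly zero. */
void linear_tail(double *x, size_t end, size_t taper_len)
{
    assert(taper_len >= 2 && taper_len <= end);
    size_t first = end - taper_len;
    for (size_t i = 0; i < taper_len; i++) {
        double gain = (double)(taper_len - 1 - i) / (double)(taper_len - 1);
        x[first + i] *= gain;
    }
}
```

In the report the taper length depends on the total length of the recording; here it is simply passed in by the caller.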
A single large array was created by combining the individual recordings. When compiling
the dataspace an index would be created, giving the start point and length of each
recording in the array. The dataspace would also contain any information required to
quantify the array, including the number of steps in each variable (hit-strength, position
and frequency), and the normalised tables of these steps. It also includes the ‘frequency
map offset’, which is used to calculate the frequency shift needed when re-sampling. The
description of the dataspace and the precise order of this information is given in section
9.4. [create.m and trans.m]
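The index of start points and lengths can be illustrated with a short C sketch. The struct and function names are hypothetical, and the order in which recordings are concatenated is whatever order the caller supplies; the actual layout is the one defined in section 9.4.

```c
#include <assert.h>
#include <stddef.h>

/* One index entry per recording in the combined array. */
typedef struct {
    size_t start;    /* offset of the first datapoint in the large array */
    size_t length;   /* number of datapoints in the recording */
} IndexEntry;

/* Fills `index` from the individual recording lengths, in the order the
 * recordings are concatenated. Returns the total array length. */
size_t build_index(const size_t *lengths, size_t count, IndexEntry *index)
{
    size_t offset = 0;
    assert(lengths != NULL && index != NULL);
    for (size_t i = 0; i < count; i++) {
        index[i].start = offset;
        index[i].length = lengths[i];
        offset += lengths[i];
    }
    return offset;
}
```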
8.6 Model
Generation was formed in two parts: selecting which recording in the dataspace to use,
and generating a new waveform by manipulating it. It is intended that the dataspace
and the sound generation code could be independently replaced without changes being
made to the other.
The generation responded to three inputs, namely: frequency, strike position and hit
strength interpolation. Damping (section 8.7.2) and glissando (section 8.7.1) are also
considered but must reside in the channel filter. Throughout this section, code was
implemented in Matlab first, to verify the method in principle and to check that it would
work correctly with the dataspace. [model.m]
The dataspace contains recordings at five discrete frequencies. It was necessary to pitch
shift the closest of these to the desired frequency, as governed by the pedal position. As
explained in section 8.3, methods of pitch shifting essentially fall into two categories:
asynchronous and synchronous. Asynchronous pitch shifting is not an option as it must
be possible to have more than one note ringing at the same time.
It was not necessary to program different methods of pitch shifting
in order to test them. Many recording/editing programs, such as Cooledit and Goldwave,
contain these functions with reasonable documentation of how they have been
implemented. Tests showed that looping methods (section 8.3.2) create an
unpleasant sound due to the discontinuities occurring at the end of the repeated/cut
segment. Careful selection of the segment length to match the pitch of the recording
improved this, but the results were still far worse than those achieved
by the simpler methods. These more complicated methods would also have been less
efficient; given the size of the recordings, this would increase latency.
The pitch shifting method implemented is based upon a form of linear interpolation
resampling (section 8.11), but is more efficient because only a single multiplication is
used. If n is the index at which the datapoint is to be created from the original recording,
then for every n from one to the length of the recording in the dataspace, we take the
datapoints from the recording either side of the index n to calculate the value of the new
datapoint at n. The integer index in the recording immediately below n corresponds to the
‘floor’ of n.
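A minimal C sketch of linear interpolation resampling is shown below. It is an illustration under assumptions (the function name, the `shift` convention of output pitch over input pitch, and the handling of the last sample are not taken from the project code). The fractional read position is advanced by addition, so each output datapoint costs one multiplication.

```c
#include <assert.h>
#include <stddef.h>

/* Resamples `in` by the ratio `shift` (output pitch / input pitch), so a
 * shift above 1.0 reads through the recording faster and raises the pitch.
 * Writes at most `out_cap` samples and returns the number written. */
size_t resample_linear(const short *in, size_t in_len, double shift,
                       short *out, size_t out_cap)
{
    size_t written = 0;
    double pos = 0.0;                     /* fractional read index */
    assert(shift > 0.0);

    while (written < out_cap) {
        size_t below = (size_t)pos;       /* floor, since pos >= 0 */
        if (below + 1 >= in_len) break;   /* need a datapoint either side */
        double frac = pos - (double)below;
        /* below + frac * (above - below): a single multiplication */
        double value = (double)in[below]
                     + frac * (double)(in[below + 1] - in[below]);
        out[written++] = (short)value;
        pos += shift;
    }
    return written;
}
```

Accumulating `pos` by repeated addition introduces a small rounding drift over a long recording, which is unlikely to be audible for the shifts discussed here.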
Pitch shifting by this method introduces negligible degradation of sound quality for
small changes in pitch. The maximum amount of pitch shifting the recording is likely to
need, given that there are five discrete frequencies almost linearly covering the frequency
range, can be calculated at around 5-6%. However, recordings shifted by the whole range
of the timpani pedal (approx. 50%) still sounded reasonably consistent to all those
who listened. While the discerning ear would be able to tell the difference, this formed
a good starting point. It is possible to play a note on a timpani drum at one end of
the scale and shift it to the top (although an unlikely method of playing normally) and
therefore this simple method could be quite effective. The same method of
pitch shifting was therefore used in both the generation part of the program and the
channel filter part of the program.
With a little experimentation, it was clear that it would be more difficult to create
convincing results with hit-strength interpolation. The ear could be better deceived if
there were more discrete steps in hit strength. In this scenario one may not need to
interpolate between them. The differences between two different hit levels are more
easily quantified in terms of physical changes in the system. It is understandable that
if more energy were put into the system, non-linear effects would be more prominent
(e.g. wrinkling of the timpani membrane close to the impact due to its inertia). In
terms of the differences in the waveform created, it is evident that this would increase
the contribution of the higher harmonics and inharmonic higher frequencies in general
(those more associated with a traditional drum sound). The duration of the note is also
longer. A spectrogram of a recording illustrates this clearly (fig. 8.12).
At first a somewhat ‘blind’ approach to this problem was proposed. Identifying a general
trend in the increase in level and duration of certain harmonics as the hit strength is
increased may make it possible to make a lower strength recording sound similar to a higher
strength recording and vice versa. This could be implemented in the form of time
varying FFT filters or, more crudely, with narrow band filters to increase or decrease
the contribution from certain harmonic frequencies. This method assumes that all the
information for a different strength hit is contained within the given recording and that
the general effect of a transformation in the input produces a relatively simple transformation
in the output. It is easy to see at this stage, how the prospect of a ‘learning’ approach
Experimentation to implement the process described above proved slow and inconclusive.
After informal discussion with staff in the ISVR, in particular Dr. P. White, a better
approach to the task was considered, but on the whole there is no clear solution to
the problem. It is possible that a transfer function between the Fourier spectrum of
two recordings of different strength could be calculated, however this would not take
account of their phase. This would be a particular problem given that the two recordings
would not have the same length. Conceivable methods of dealing with the phase, such
as averaging or using the phase information from the higher of the two hit strengths,
would be unreliable. This approach was unlikely to yield results within the time frame
of the project. It has therefore been abandoned.
To reduce the audible steps that would be created by only five different levels of hit
strength, a basic amplitude mapping was used. For the most part this proved relatively
successful, only creating a clearly audible discontinuity where the dataspace contained
readings that were inconsistent. These problems could be avoided by using a better
controlled and more detailed dataspace.
8.7.1 Glissando
For a single timpani all ringing notes share a common pitch; there is only one skin on a
timpani. In the electronic timpani the recordings for the note may have been generated
with the pedal at different frequencies, and would therefore need shifting by different
amounts. As identified earlier (section 8.3), pitch shifting the attack is not ideal. As
a range of timpani recordings is used, initial pitch shifting is minimised. However, as
the glissando only affects the ringing portion of the note a greater range of pitch shifting
is acceptable.
Consider two notes generated one second apart. The pedal on the timpani is depressed
such that the pitch increases linearly from 100 to 200Hz over two seconds (fig. 8.13). The first
note is generated at 100Hz and is continuously pitch shifted upwards. After one second
this has reached 150Hz and at the same time the second note is generated at 150Hz. After
two seconds both notes are ringing at 200Hz; however, the first is being shifted by a
factor of 2.0 but the second by a factor of 1.33.
Glissando is implemented using the resampling method used in the generation module.
This is fully explained in the concise system overview (section 9.5.2.1).
8.7.2 Damping
As with glissando the damping routine must occur in the channel filter. This is because
the system must remember the amount of damping that has been applied to each ringing
note at any given point in time. If this were not the case, as soon as the damping event
discontinued, the waveform generated would return to its previous level. Consider a hit
that is then damped. The damping is released and the note is still ringing but at an
attenuated level. If a new hit occurs at this point there should be no attenuation applied
to it.
The damping was first implemented as an exponential envelope by which the waveform
was multiplied. However a more efficient and simpler routine occurs in the present
code. This simply multiplies the waveform by a cumulatively decreasing factor (eq.
8.8). The rate at which the factor decreases is related to the damping value from the
ContinuousInputTable using equation 8.7.
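The idea of a cumulatively decreasing factor that is remembered per ringing note can be sketched in C as follows. Equations 8.7 and 8.8 are not reproduced here, so the relation between the damping value and the rate of decrease (`1 - rate * damping` per datapoint) is an assumption made purely for illustration, as are the struct, function and parameter names.

```c
#include <assert.h>
#include <stddef.h>

/* Per-channel state: the attenuation already applied to this ringing note.
 * A new hit starts a channel with gain = 1.0, so it is not attenuated by
 * damping applied to earlier notes. */
typedef struct {
    double gain;
} ChannelState;

/* Attenuates one packet in place. `damping` is the normalised damping
 * value (0 to 1); `rate` is a hypothetical tuning constant (0 to 1). */
void damp_packet(ChannelState *ch, short *packet, size_t n,
                 double damping, double rate)
{
    assert(ch != NULL && packet != NULL);
    assert(damping >= 0.0 && damping <= 1.0 && rate >= 0.0 && rate <= 1.0);

    double step = 1.0 - rate * damping;   /* 1.0 when no damping is applied */
    for (size_t i = 0; i < n; i++) {
        ch->gain *= step;                 /* cumulative: never recovers */
        packet[i] = (short)((double)packet[i] * ch->gain);
    }
}
```

Because the gain is stored in the channel, releasing the damping leaves the note ringing at its attenuated level rather than returning it to the original level.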
8.8 Implementation
Once the method of sound generation had been selected, it was re-implemented in C.
The most significant change was in data type. By default Matlab uses a double data
type to handle all calculations. This is an eight byte long data type with floating point
precision. Whilst C is entirely capable of handling doubles, it was decided to base sound
handling modules on the short data type. This was done for the following reasons:
• Memory Size
The program needs to allocate significant amounts of memory to store the datas-
pace and the wavetable. These need to be held in system RAM for performance
reasons. Data stored as shorts takes up a quarter of the space that the same data
would as a double.
• Output
The sound card takes short data, and so if any other type were used for processing
it would have to be converted before writing to the sound card. This means that
any extra accuracy gained by using the larger data type would be lost just prior
to output.
• Processing
The processing of integer data types, like shorts, is dramatically faster than float-
ing point types. There are parts of the process, like resampling, that currently
rely on floating point values. It may be possible to optimise these later.
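The move from doubles to shorts can be illustrated with a small conversion routine in C. This is a sketch under assumptions: the function name is hypothetical, and the 90% scaling is borrowed from the dataspace description rather than from the real-time code. It assumes a 16-bit short, as on the project PC.

```c
#include <assert.h>
#include <stddef.h>

/* Converts double samples to 16-bit shorts. `peak` is the largest
 * magnitude expected in the input; it is mapped to 90% of full scale.
 * Anything beyond the short range is clamped rather than wrapped. */
void to_int16(const double *in, short *out, size_t n, double peak)
{
    assert(peak > 0.0);
    const double scale = 0.9 * 32767.0 / peak;
    for (size_t i = 0; i < n; i++) {
        double v = in[i] * scale;
        if (v > 32767.0) v = 32767.0;
        if (v < -32768.0) v = -32768.0;
        /* round to nearest instead of truncating towards zero */
        out[i] = (short)((v < 0.0) ? v - 0.5 : v + 0.5);
    }
}
```

Each converted sample occupies two bytes instead of eight, which is the factor of four saving noted in the list above.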
To minimise the number of variables required in function calls, the storage and hardware
data was all stored in a globally accessible data structure called DataSpace. As all the
data was accessed via a single struct there was minimal namespace pollution. Although
this did allow the software to be developed quickly, it imposed some difficulties when
debugging, as it was possible for functions to access and change data that they
should not. When integrating modules, care was therefore taken to ensure that they
only read “up-stream” and that they created outputs as expected by other modules.
8.8.3 Testing
As each section was developed a separate main.c file was written. This created a new
version of the program specifically designed to exercise and test that part of the program.
This was particularly important before threads were implemented, as it was not possible
to run both the inputting and outputting modules on a single thread.
Chapter 9
Concise System Overview
This chapter is a concise overview of the current electronic timpani working prototype.
The sections detail hardware, software architecture, input processing, sound generation
and dataspace creation.
[Figure: system overview block diagram. Inputs: mallet strikes on the drum plate, and the tuning pedal. Hardware components: piezoelectric transducer sensors, level adjustment and buffering, analogue-to-digital conversion, PIC microprocessor. Software components in PC: input processing, wave table, n channels, channel combination. Output: sound to amplifier or headphones. Key: data storage, processing, control.]
9.1 Hardware
The electronic timpani hardware system consists of two separate units: a rigid drum
plate with sensor assemblies attached, and an electronic circuit board housed in a plastic
case. The sensor outputs are sampled by the circuitry, and the readings are transmitted
via a serial link to the PC. The circuit diagram is shown in Appendix E.
The drum pad is intended to simulate the playing surface of the timpani, with movement
detected by the sensor assemblies. The plate is an aluminium honeycomb-structure sandwich
panel of diameter 340mm and is struck by the timpani player with mallets. Three
sensor assemblies support the plate from underneath, attached equidistantly around the
plate circumference. Each sensor assembly consists of a piezoelectric transducer bonded
to layers of rubber. The arrangement of the layers of rubber is such that the transducer
is flexed in a similar mode to that of its normal operation as a sounder, thereby
maximising the amplitude of the transducer output signal.
The electrical signals from the transducers are passed into the electronic circuit board,
and undergo several stages of processing before the analogue to digital conversion. The
first stage is to attenuate the signals using a potentiometer to allow individual adjust-
ment. The signals are then buffered to amplify the current, and a level shift takes place
by means of a voltage divider to meet the required input range of the analogue-to-digital
converters (ADCs).
Three 8-bit ADCs (National Semiconductor type ADC0804) send values in parallel to a
PIC microcontroller (Microchip type 16F871), which provides the requisite flexibility. It
runs at a clock speed of 12MHz and also converts the analogue voltage from the tuning
pedal into digital values, using its internal ADC.
The sensor and pedal information is transmitted according to the following protocol:
Start byte
Sensor A reading
Sensor B reading
Sensor C reading
Tuning pedal reading
Sensor A reading
Sensor B reading
Sensor C reading
• maintain synchronisation
This data sequence is transmitted approximately once every 0.6ms to the PC using the
RS-232 serial interface, via a level converter and cable. The serial interface runs at
115200 bits per second (bps). No data flows from the PC to the PIC.
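On the PC side, reading this sequence could look something like the C sketch below. It assumes the eight-byte order listed above (start byte, sensors A to C, pedal, sensors A to C). The value of the start byte is not stated in this section, so `START_BYTE` is a hypothetical placeholder, and the names `Frame` and `parse_frame` are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_LEN  8
#define START_BYTE 0xFF   /* hypothetical value; not given in the report */

typedef struct {
    unsigned char sensor[2][3];   /* [reading set][sensor A, B, C] */
    unsigned char pedal;          /* tuning pedal reading */
} Frame;

/* Scans `buf` for the first complete frame. Bytes before the start byte
 * are skipped, which is how synchronisation is regained after a dropped
 * byte. Returns the number of bytes consumed, or 0 if no complete frame
 * is available yet. */
size_t parse_frame(const unsigned char *buf, size_t len, Frame *out)
{
    assert(buf != NULL && out != NULL);
    for (size_t i = 0; i + FRAME_LEN <= len; i++) {
        if (buf[i] != START_BYTE)
            continue;
        const unsigned char *p = buf + i + 1;
        for (size_t s = 0; s < 3; s++) {
            out->sensor[0][s] = p[s];       /* first A, B, C readings */
            out->sensor[1][s] = p[4 + s];   /* second A, B, C readings */
        }
        out->pedal = p[3];
        return i + FRAME_LEN;
    }
    return 0;
}
```

This sketch does not handle a data byte that happens to equal the start byte; a real protocol would need the PIC to keep readings out of that value or to use some other framing rule.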
The electronic timpani uses a range of processing and storage modules to perform com-
plex real time input processing and sound generation. This section outlines the overall
system architecture.
9.2.1 Platform
A single PC is used to translate sensor readings (communicated via the serial port)
into an audio output. A Linux platform was selected as it offered good low-level access
to hardware and process scheduling facilities. Where possible standardised functions
and libraries (notably POSIX) are used to facilitate porting to other operating systems
at a later date. The real-time software is implemented in C, a flexible and efficient
programming language.
9.2.2 Structure
The software is divided into three layers: hardware, global storage and processing.
The storage layer is used to store the intermediate results created by the separate
processing modules without them needing to call each other’s functions. This helps
modularisation and also allows the program to be multi-threaded.
• StrikeTable
The strike table records information that relates directly to individual strikes. This
currently includes strength, radial distance and pedal position at time of strike.
This could later be extended to allow for multiple timpani and mallet types.
• ContinuousInputTable
This table records values for those inputs that vary between strikes. This
currently includes damping level and pedal position.
• WaveTable
This table contains a set of waveforms which hold the output of the generation
module. This table is queried by the channel filter to create sound fragments.
• ChannelBuffer
The sound fragments created by the channel filter are held in this buffer before they
are combined into the SoundOutputBuffer.
• SoundOutputBuffer
The SoundOutputBuffer is passed to the post filter module before being written to
the soundcard. It should be noted that the post filter module is currently unused.
9.2.3 Threading
In order to ensure that the hardware interfaces are dealt with at the correct rate, each
is given a separate thread. This means that the program effectively executes at three
separate points concurrently. This also allows the output thread to begin playing a
waveform before the generation thread has finished creating it.
9.3.1 Description
The input processing module scans the information provided by the PIC and extracts
damping and strike information. The hit strength and radial position of these strikes
are then calculated, and provided to the sound generation part of the program.
9.3.2 Operation
Once the appropriate initialisation is performed, the data output from the PIC into the
serial port of the PC is read in frame by frame. A frame is defined as a set of piezo
values and the pedal value.
The frames are stored in a buffer and are then subjected to a running average
filter that attenuates high frequency noise. During the initialisation process, the zero-volt
offset is measured by averaging 1000 readings while the pad is not being hit. The
noise level is also estimated at this stage. It is taken to be the maximum value occurring
during this initialisation period.
Detecting the occurrence of a strike or damping requires the detection of peaks within
the sensor readings. This is achieved by finding an instance where the first derivative in
the signal is zero and the second derivative is negative.
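The calibration and peak detection described above can be sketched in C for a single sensor. The function names are hypothetical, and two interpretations are assumed: the noise level is taken as the largest deviation from the measured offset, and on sampled data the condition "first derivative zero, second derivative negative" is taken as the first difference changing from positive to zero or negative.

```c
#include <assert.h>
#include <stddef.h>

/* Estimates the zero-volt offset as the mean of `n` idle readings and the
 * noise level as the largest deviation from that offset. */
void calibrate(const int *idle, size_t n, int *offset, int *noise)
{
    long sum = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) sum += idle[i];
    *offset = (int)(sum / (long)n);

    *noise = 0;
    for (size_t i = 0; i < n; i++) {
        int dev = idle[i] - *offset;
        if (dev < 0) dev = -dev;
        if (dev > *noise) *noise = dev;
    }
}

/* Returns the index of the first peak above `noise_level` in readings
 * that have already had the offset removed, or n if there is none. */
size_t find_strike_peak(const int *x, size_t n, int noise_level)
{
    for (size_t i = 1; i + 1 < n; i++) {
        int d_prev = x[i] - x[i - 1];    /* first difference arriving at i */
        int d_next = x[i + 1] - x[i];    /* first difference leaving i */
        int second = d_next - d_prev;    /* second difference at i */
        if (x[i] > noise_level && d_prev > 0 && d_next <= 0 && second < 0)
            return i;
    }
    return n;
}
```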
The output variables for strength of damping and strength of strike are generated by
calculating the mean sensor value and then normalising it from 0 to 1.
The sound generation module uses a range of pre-recorded timpani sounds stored in
a dataspace, which is saved as a file called ‘dataspace.dat’. This is a file containing
all recordings, plus additional information to index and quantify the dataspace. (The
dataspace.dat file is compiled with trans.m on the project CD.)
• 8 radial positions
• 5 frequency steps from linear pedal positions on the timpani using the tuning gauge
• 5 hit strength levels, judged by hand and verified with a sound level meter (spurious
levels for individual hits corrected)
All recordings have been trimmed such that the value of the first datapoint of each
recording is zero. Each recording also has a linear taper to a zero value at the end
of the recording (where the end is the point at which the ‘ringing’ falls below the noise
level). The data is stored as 16-bit signed integers, scaled so that the highest value in
all recordings is at 90% of the maximum (32767). There is a small degree of clipping on
some of the recordings (see file index J).
2. Frequency map offset: Value used when calculating the correct frequency shift
needed for pitch shifting. It corresponds to the ratio of the lowest frequency (in
Hz) to the highest frequency (in Hz) stored in the dataspace. (Stored as a float)
5. Recordings: The recordings stored consecutively in the same order as the index
of lengths above.
The tables of normalised variables are calculated by the following equations. For all
integers, n, from 1 to the number of frequency steps, F, the normalised frequencies are:
$$nf(n) = \frac{df(n) - df(1)}{df(F) - df(1)} \qquad (9.1)$$
$$ns(n) = \frac{n}{H} - \frac{1}{2H} \qquad (9.2)$$
where ns(n) is the array of normalised hit strengths. For all integers, n, from 1 to the
number of positions, P, the normalised positions are:
$$np(n + 1) = \frac{1}{P - 1} \cdot n \qquad (9.3)$$
$$np(1) = 0 \qquad (9.4)$$
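Equations 9.1 to 9.4 translate directly into C, as sketched below. Two readings of the notation are assumed here: df is taken to be the table of recorded frequencies in Hz, and H the number of hit strength steps. Array indices are zero-based in C, so element n - 1 holds the value for the report's index n. The function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Equation 9.1: frequencies mapped linearly onto 0..1. */
void normalise_frequencies(const double *df, size_t F, double *nf)
{
    assert(F >= 2 && df[F - 1] != df[0]);
    for (size_t n = 0; n < F; n++) {
        nf[n] = (df[n] - df[0]) / (df[F - 1] - df[0]);
    }
}

/* Equation 9.2: ns(n) = n/H - 1/(2H), i.e. the centre of each of H
 * equal-width bands between 0 and 1. */
void normalise_strengths(size_t H, double *ns)
{
    assert(H >= 1);
    for (size_t n = 1; n <= H; n++) {
        ns[n - 1] = (double)n / (double)H - 1.0 / (2.0 * (double)H);
    }
}

/* Equations 9.3 and 9.4: positions evenly spaced from 0 to 1 inclusive. */
void normalise_positions(size_t P, double *np)
{
    assert(P >= 2);
    for (size_t n = 0; n < P; n++) {
        np[n] = (double)n / (double)(P - 1);
    }
}
```

With five hit strengths, for example, equation 9.2 gives 0.1, 0.3, 0.5, 0.7 and 0.9.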
The electronic timpani project attempts to recreate the authentic sound of an
acoustic timpani using digital processing. This section outlines some of the techniques
that are used.
Normalised values for the frequency, position and strength are read from the StrikeTable.
The model determines which recording in the dataspace is closest to these values. This
recording is then resampled to match the actual frequency (from the pedal position at the
time of the strike). The shift needed to resample is calculated using equation 9.5, where
FreqMapOffset is the ratio of the lowest frequency (in Hz) over the highest frequency
(in Hz) in the dataspace, Pedal is the normalised pedal position from the StrikeTable
and nf(closest) is the closest frequency variable in the dataspace. The amplitude is also
adjusted to the correct level. The resampling uses the linear interpolation method as
explained in section 8.6.2. No adjustment to the recording for the position is made.
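The selection and shift calculation might be sketched in C as follows. Equation 9.5 is not reproduced in this section, so the shift formula here is a reconstruction under a stated assumption: that a normalised value x between 0 and 1 maps linearly onto frequency, giving a frequency proportional to FreqMapOffset + x(1 - FreqMapOffset). The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the table entry nearest to `value`. */
size_t closest_index(const double *table, size_t n, double value)
{
    size_t best = 0;
    assert(n > 0);
    for (size_t i = 1; i < n; i++) {
        double d = table[i] - value;
        double b = table[best] - value;
        if (d < 0.0) d = -d;
        if (b < 0.0) b = -b;
        if (d < b) best = i;
    }
    return best;
}

/* Assumed form of the resampling shift: the ratio of the frequency wanted
 * (from the normalised pedal position) to the frequency of the chosen
 * recording, both expressed relative to the highest dataspace frequency. */
double pitch_shift(double freq_map_offset, double pedal, double nf_closest)
{
    double span = 1.0 - freq_map_offset;
    assert(freq_map_offset > 0.0 && freq_map_offset <= 1.0);
    return (freq_map_offset + pedal * span)
         / (freq_map_offset + nf_closest * span);
}
```

A shift of 1.0 leaves the recording unchanged; the same calculation, with the current pedal value read continuously, would serve for glissando in the channel filter.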
In order to implement glissando and damping, each channel has a resampling and decay
routine in the channel filter (section 8.7.1). These are not continuous in real-time on a
datapoint by datapoint basis but occur on the ‘packets’ of data that are passed through
from the sound generation module to the channel combine filter. As packets are only a
fraction of a second long, it appears to the player as though these do occur in real-time.
With the current ‘packet’ size, steps in frequency and damping are not audible during
a glissando event.
9.5.2.1 Glissando
Resampling uses the same linear interpolation method used in the sound generation
module, where the shift is calculated using equation 9.5. In this case, the variable
nf(closest) is the frequency at which the recording is generated in the sound generation
model, and Pedal is the normalised frequency read from the ContinuousInputTable to
get the current pedal position.
9.5.2.2 Damping
The recordings are attenuated in the channel filter during a damping event. The amount
of attenuation depends upon the level of damping applied. The normalised damping
value is continuously read from the ContinuousInputTable. Datapoints within the recording
are then multiplied by a cumulatively decreasing factor. The rate at which the factor
decreases is related to the normalised damping value. (For further information see
section 8.7.2.)
Chapter 10
Project Management
This section details how the project was managed, including task breakdown, resource
allocation, budget and communication.
The project was divided into several related, but distinct sections. This allowed team
members to focus on those areas particularly suited to their skills.
Stephen Emsen:
• Generation of code for interpreting and processing the input to differentiate damp-
ing and strike information
Christopher Heal:
• Project secretary
Richard Sunderland:
Robin Willis:
• Project coordination
All team members researched the hardware for the sensor selection at the start of the
project, and contributed to the discussion of the drum plate/sensor configuration. The
project report was written and edited with contributions from all members. All members
were involved in the anechoic chamber experiment.
Although the team initially had six members, the majority of the project was performed
by four. Two members chose to leave the course after the team had been allocated.
Adam Johnston, studying Mechanical Engineering, chose to graduate at the end of Part
III, and Peter Wood, studying Electronic Engineering, assumed a sabbatical post on the
Students’ Union Council in November 2001.
10.3 Budget
The project budget of £700 was provided by the Faculty of Engineering. This budget
was to include a project PC and all external hardware. A proposed cost breakdown was
submitted at the start of the project, but was later adjusted slightly to include an extra
hard-drive and transportation insurance for the University’s timpani. Even with these
extra costs the initial budget was found to be sufficient.
Most medium-term group projects require access to dedicated laboratory space so that
equipment can be left correctly configured. Unfortunately both the department (ECS)
and the Faculty of Engineering were unable to provide this. The unprompted generosity
of the ISIS research group, in their provision of laboratory space, contributed signifi-
cantly to this project. The team felt that consideration should be given to the provision
of well-equipped multi-disciplinary facilities.
10.5 Communication
Most formal discussion was performed using a threaded email-list. This allowed our
supervisors and sponsor to keep track of progress. This became especially important
during the period in which our supervisor was abroad. Informal communication, espe-
cially after combined lectures, helped with day-to-day organisation of tasks. In the early
stages a dedicated website proved useful for sharing research and links. This website
was not maintained towards the end of the project because it did not provide enough
benefit to justify the time required.
Group meetings created an opportunity for more formal discussions with the project
supervisor. This was important at the early stages in providing drive, feedback and
advice.
Formal presentations and meetings allowed the group to benefit from more in-depth and
critical appraisal, from both a musical and an academic standpoint.
Some of the university timpani were used during the course of the project in order to
obtain recordings. However, the availability was limited because they are in constant use
by the Music Department. The team was granted access to them on two occasions. The
first, in early December, was used for initial recordings in the Turner Sims concert hall.
The second, for two days later in December, was used for more formal recording in the
ISVR anechoic chamber. For this latter recording session, an insurance policy was taken
out to cover the timpani for accidental damage, and the help of the University Security
staff was requested in order to transport the timpani from the Music Department to the
ISVR.
Initially the team hoped that a close working relationship with the University Orchestra
would develop. This would have enabled the team to continually monitor and improve the
‘look and feel’ of the electronic timpani. Peter Wood had several good contacts within
the orchestra, and the team lost these when he left. The now smaller team was forced to
manage resources more judiciously, and as such could not afford the time required to
develop a useful working relationship.
With any project there is a trade-off between the number of options considered in detail
and the amount of work that can be dedicated to developing the chosen solution. One
of the project’s key objectives was to produce a working first prototype that could be
used as a development platform for future work. As the project had a tight time scale,
it was important to identify which components were absolutely necessary and focus
attention on those. Input dimensions were prioritised so that the important ones would
be achieved first. Wherever possible the system was designed to facilitate extension and
modification at a later date.
The team was forced to reorganise its task allocation six weeks into the project. The work
and roles that had been allocated to Peter Wood were divided amongst the remaining
members. Robin’s role as a manager was slightly less important with the smaller team
and he took on a more technology-focused workload to compensate for the missing
member. The team was still determined to achieve the agreed objectives and so the
work schedules were not adjusted.
As Robin was not directly involved with the programming, the organisation and man-
agement of the timpani software (with the exception of the PIC program) was handled
by Richard. The work was divided into three main sections:
Although this project has laid a very solid foundation there is plenty of scope for further
development. This section outlines some of the ways the project could continue.
Chapter 11 Recommendations for Future Work 122
1. Multiple Timpani
Timpani are commonly used in groups of three or more. The system has been
designed with extensibility in mind and so the inclusion of further timpani should
not be difficult.
3. Tuning Pedal
The commercial keyboard pedal is satisfactory for the prototype electronic timpani.
Future versions would require a mechanism to more accurately simulate the action
of a real timpani pedal.
4. Communication Method
Currently RS-232 based communication is used. This is satisfactory for a single
timpani, but would not be suitable for several. A faster communication standard
such as USB or FireWire would allow further flexibility in the sensor sample rate.
A wireless system would make the system much more convenient to use.
5. Drum plate
Although the current drum plate operates effectively, it needs to be developed into
a rugged and portable device. Careful attention should be given to the selection
of materials to minimise the sound of strikes on the pad, and mimic the bounce of
a real skin.
6. Hardware resolution
The current hardware uses ADCs with a resolution of 8 bits. The resolution
could be increased to improve detection of soft hits. (This would require a faster
communication method.)
13. Marketing
It is hoped that this project may eventually evolve into a marketable product. To
date however, there has been no research into its potential.
Conclusions
Chapter 12 Conclusions 125
12.1 Success
This project has successfully taken an initial idea through to a working prototype. The
project provides a stable and well-designed development platform that will be a firm
foundation for future work in this area.
The electronic timpani responds to the four input dimensions defined in the project
objective. It also responds correctly to the fifth input dimension, namely glissandos.
Although the other dimensions (timpani type and mallet type) are not implemented,
this could be easily remedied by little more than including the appropriate recordings
in the dataspace.
12.2 Team
The group managed its changing resources effectively, allocating and re-allocating job
roles as the need arose. Care was taken to ensure that each team member’s skills and
enthusiasm were used to the benefit of the project.
Although strain gauges may provide a clean hardware solution to the detection of damp-
ing, there are several promising techniques that may render them unnecessary. These
include digital filtering and direct input transformation using machine learning.
12.4 Performance
The current prototype responds to inputs with no audible latency. Although detailed
testing has not been performed, initial qualitative analysis indicates that the system, as
it stands, would meet appropriate latency specifications.
It has not been possible to run the timpani software on a range of architectures and
operating systems. This has made it difficult to justify the choice of platform from a
performance perspective.
Currently thread scheduling is effective, but does not handle resource competition well
enough (e.g. if another program, such as X, demands processor resources, the output will
click). Further research into scheduling, especially the real-time variants, should yield a
solution to this problem.
Initial costing and system performance indicate that this project may well evolve into a
useful and potentially commercially viable training instrument.
Bibliography
BIBLIOGRAPHY 128
Terms
Recording Recording of the sound created following a single strike of the timpani, i.e.
there is a recording of a strike for each ‘frequency’, ‘position’ and ‘hit strength’
ddrum Clavia’s electronic drum pads, with the ability to detect position and pressure
Glissando Effect produced by the movement of the pedal while a note is playing,
changing the frequency of the sound
Damping Attenuation of the sound with the use of a hand placed gently on the timpani
skin
Abbreviations
dB Decibel
IC Integrated Circuit
PC Personal Computer
Clavia’s ‘ddrums’ and Roland’s ‘V-Drums’ have a reputation for being among the best
electronic drum pads available at present. Both are capable of detecting position as well
as strength of hit, and also have the ability to detect pressure applied to the head for
muting and pitch bending. However neither is available in sizes even close to that of a
timpani head, and neither is capable of glissando. Specifications can be found at ((19))
and ((20)).
Clearly it would not be possible to find out precisely how the pads worked, as
manufacturers prefer to keep designs out of competitors’ reach. However, the opportunity
arose to directly investigate the ddrum pads. Experiments were limited: there was no
scope in the budget to purchase the pads, so the tests had to be made at a retailer,
with their permission. This also meant that the pad could be dismantled to document
its internal workings only to the extent that no damage would be done to it.
Results
The pad has a cast aluminium shell with a pliable plastic membrane, which is formed
at the edges, making it more rigid at its edge than in its centre. This is then tightened
over a layer of fairly dense foam. It is held together by a set of tensioning screws around
a top ring. The shop assistant told us that in his older model an upper layer of foam
had a cut out in the middle, so that there was a hollow in the centre of the pad. It was
not possible to remove the foam to confirm the type of sensor used underneath for fear
of damaging the device.
The output of the pad had only two terminals so the pads had to use a fairly simple
sensor arrangement; the use of multiple devices or electronics within the pad would
require at least three terminals (+volts, ground, and signal). This tended to indicate
Appendix A Investigation of Clavia Ddrums 2
the use of a single piezoelectric sensor. An oscilloscope was used to first monitor the
direct output of the pad for given hits.
As shown in figure A.2, the output showed distinctly different waveforms for centre
hits and edge hits, the former being smoother and having little ringing. The edge hits
seem to have a little less decay and a sharper initial pulse. It would appear that the
differences in the vibrations picked up by the sensors arise from the way in which the
drum membrane is formed and tensioned. The pad detected the onset and offset of damping
as a smooth pulse but could not detect a continuous DC offset, again suggesting that
it probably uses piezoelectric sensors and nothing else.
The ddrum’s strength is in its electronics and firmware, which interpret the nuances
in the output of the pad and map them to the way the drum is being played.
Position, hit strength and damping appear to be determined using only a single
piezoelectric sensor.
Table of Authorship
This section outlines which team member was primarily responsible for each section of
the report. This division roughly reflects which parts of the project each team member
focused on; however, it should be noted that the team worked in a flexible manner, with
each member supporting the others’ work.
• Introduction 1: SME
• Hardware 5: REW
• Software Architecture 6
Time Planning
Figure C.1 shows the initial project time plan. Some changes were made towards the
end of the project, especially with regard to report writing, because the development of
the prototype took longer than expected: it was delayed by Multidisciplinary Projects
and Semester One exams.
Appendix D
Software Listings
D.1 timpani.h
#ifndef _timpani_h_defined
#define _timpani_h_defined
/********************************************************
* Timpani Group Design Project *
********************************************************
* file: /usr/local/timpani/include/timpani.h *
* *
* This is the general header file that is included *
* by all c files. It defines the shared global data- *
* structures and also the minimal functions that should *
* be provided by each module. It is possible for each *
* part of the program to declare its own header files *
* (especially the sound generation modules) but these *
* are not referred to here as the rest of the code need *
* not (and should not) know about them *
*******************************************************/
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/soundcard.h>
#include <sys/ioctl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/mman.h>
#include <pthread.h>
#include <unistd.h>
#include <termios.h>
#include <fcntl.h>
#include <errno.h>
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <signal.h>
#include <string.h>
/***********************************************************
* Sound Card Configuration *
* This struct is used to store the information required *
* to configure the sound card correctly. *
***********************************************************/
typedef struct SoundCardConfigurationStruct{
int setting; /* Used to configure how much memory */
/* the card uses as buffer */
int channels; /* 0=mono (chosen) 1=stereo */
/***********************************************************
* Serial Port Configuration *
* This struct holds the information required to initialise *
* and terminate the serial port handling and also to *
* convert the incoming bit stream into a series of *
* InputFrames. *
***********************************************************/
typedef struct SerialPortConfigurationStruct{
struct termios oldoptions; /*Serialport configuration */
/*struct used for backing up */
/*old settings */
int handle; /*The descriptor to the port */
int state; /*state variable used for */
/*interpreting input bytes */
int lastPedal; /*Used to store pedal value */
/*in between sets of piezos */
}SerialPortConfiguration;
/***********************************************************
* Continuous Input Table *
* Table to store processed information that reflects the *
* way the timpani changes from sample to sample. Currently *
* only two variables are stored: pedal position and damping.*
* Damping is reasonably difficult to calculate, effectively *
* requiring a low pass filter. All data stored as floats *
* normalised between 0->1 *
***********************************************************/
typedef struct ContinuousInputStruct{
float pedal;
float damping;
}ContinuousInput;
}ContinuousInputTable;
/***********************************************************
* Strike Table *
* Strike table stores processed information that relates *
* to independent strikes of a timpani skin. This currently *
* includes radial distance, strength and pedal at time of *
* strike. A later modification may extend this to include *
* a number which represents which timpani has been hit, *
* although this does have implications as to how the *
* filters work. Again normalised 0->1 *
***********************************************************/
typedef struct StrikeStruct{
float strength;
float pedal;
float radial;
}Strike;
/***********************************************************
* Input Frame Struct *
* This struct is used to communicate between the serial *
* port and the input processing. The pedal and reading *
* values are stored by serial port and the rest is *
* calculated by inputprocessing *
***********************************************************/
typedef struct InputFrameStruct{
int pedal; /* Pedal position for this frame */
int reading[3]; /* Direct sensor readings */
float smooth[3]; /* Smooth via average with past */
float mean; /* mean of smoothed readings */
}InputFrame;
/***********************************************************
* Input Processing Configuration Struct *
* This contains all the global variables required to *
* distinguish between a hit and damp given a series of *
* InputFrames *
***********************************************************/
typedef struct InputProcessingStruct{
int risetime; /*The rising time of the signal */
/*when signal > noise threshold */
int negative_falltime;/*The falltime of the signal */
/*when signal < -noise threshold */
int damptime; /*The duration of the damping */
bool wait_for_zero; /*If true, need to measure the */
/*whole pulse duration, not just */
/*rise time */
int wait_after_hit; /*Decrementing counter used to */
/*ignore readings after a strike */
bool dampingMode; /*TRUE if damp is in progress, */
/*FALSE otherwise */
volatile int index; /*Current index in storage buffers*/
int middle; /*Index of the middle value in buffer*/
int length; /*Length of buffers*/
int bytes; /*Space required for storage */
float zero_point; /*Quantisation level corresponding*/
/*to ground */
float max_noise; /*Noise threshold detected during*/
/*initialisation procedure */
float max_damp; /*Maximum damp value in current damp*/
struct StrikeStruct stored_strike; /*Used to store peak */
/*information when delay is needed*/
}InputProcessing;
/**********************************************************
* Waves Table Struct *
* This contains the output from mod_generate, which *
* creates complete samples of timpani strikes based on *
* information contained in the StrikeTable struct. *
* This struct appears overly complex, but it may form part*
* of an interface between two non-synchronised threads, *
* so the information required for reading and writing to *
* it is separated, so that each side has a set that it is *
* responsible for maintaining such that the other side can*
* observe its activity *
**********************************************************/
}Wave;
/***********************************************************
* Channel Buffer Struct *
* This is where the output from mod_channel_filter is *
* placed. It is basically a collection of buffers, the same*
* length as SoundOutputBuffer. Once they have been written *
* to, they are combined to form the output before it is *
* passed to mod_post_filter. *
***********************************************************/
}ChannelBuffer;
/***********************************************************
* Sound Output Buffer *
* This buffer contains a sample that will be written to *
* the sound card. *
***********************************************************/
/***********************************************************
* DataSpace Struct *
* combines the above structs to create a global variable *
* collection. This is available to all the 'modules' i.e. *
* mod_generate, mod_channel_filter and mod_post_filter. It *
* does not, however, contain the internal data required by *
* the modules, as that is internal to them and not *
* constrained by main program *
***********************************************************/
typedef struct DataSpaceStruct{
/* LOCAL VARIABLES */
/* These variables are mirrored in most */
/* sub-structs, but since it is so important */
/* this one was added as the definitive article */
}DataSpace;
/*--------------------------------------------------------**
**--[ SOUND GENERATION MODULE FUNCTIONS ]-----------------**
**--------------------------------------------------------*/
/*--------------------------------------------------------**
**--[ HARDWARE IO FUNCTIONS ]-----------------------------**
**--------------------------------------------------------*/
/*--------------------------------------------------------**
**--[ TABLE MANAGEMENT FUNCTIONS ]------------------------**
**--------------------------------------------------------*/
void term_sound_output_buffer();
#endif
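As an illustration of how the smooth and mean fields of InputFrame might be filled in by the input processing, the following sketch averages each zero-centred reading with the previous frame's smoothed value and then takes the mean across the three sensors. The two-point average, the function name smooth_frame and the passing of zero_point as an argument are assumptions made for this example; the header only states that smoothing is done "via average with past".

```c
#include <stddef.h>

#define NUM_SENSORS 3

/* Same fields as InputFrame in timpani.h */
typedef struct InputFrameStruct {
    int   pedal;                 /* Pedal position for this frame */
    int   reading[NUM_SENSORS];  /* Direct sensor readings        */
    float smooth[NUM_SENSORS];   /* Smoothed readings             */
    float mean;                  /* Mean of smoothed readings     */
} InputFrame;

/* ASSUMPTION: "average with past" is taken to be the mean of the
 * current zero-centred reading and the previous smoothed value.
 * prev may be NULL for the very first frame.                      */
void smooth_frame(const InputFrame *prev, InputFrame *cur, float zero_point)
{
    float sum = 0.0f;
    int i;

    for (i = 0; i < NUM_SENSORS; i++) {
        float centred = (float)cur->reading[i] - zero_point;
        float past = (prev != NULL) ? prev->smooth[i] : centred;

        cur->smooth[i] = 0.5f * (centred + past);
        sum += cur->smooth[i];
    }
    cur->mean = sum / (float)NUM_SENSORS;
}
```

A longer averaging window would reject more noise at the cost of a slower response to strikes, so the window length is a tuning choice rather than something fixed by the data structure.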
This program is compiled and programmed into the PIC. Readings from the external
ADCs, and the internal PIC ADC, are formatted and transmitted over the RS232 in-
terface to the PC.
/**************************************************************/
/* */
/* File name: TIMP8.C */
/* Description: PIC code for electronic timpani */
/* Inputs: 3 sets of 8-bit data (from 3 ADCs) */
/* 1 analogue input voltage to internal ADC */
/* Outputs: Inputs in prescribed protocol, over RS232 */
/* Version history: */
/* ADPICC1.c Initial code for RS232 data transmission */
/* ADPICC2.c Higher RS232 data rate implemented */
/* ADPICC3.c Pedal input added */
/* ADPICC4.c Byte splitting added */
/* TIMP5.c Byte splitting not required, */
/* external ADCs used */
/* TIMP6.c Triggered mode (third mode) added */
/* TIMP7.c Continuous non-strain gauge mode */
/* (fourth mode) added */
/* TIMP8.c Final protocol agreed upon */
/* (SB A B C pedal A B C), all else excluded */
/* */
/**************************************************************/
#include <16f871.h>
/* Configure RS232 */
#use rs232(baud=115200, xmit=PIN_e0, rcv=PIN_e1, parity = n, bits = 8)
main() {
int value_piezoa;
int value_piezob;
int value_piezoc;
int value_pedal;
int i;
set_adc_channel(0);
if (value_piezob == 255)
{
value_piezob = 254;
}
putc(value_piezob);
if (value_piezoc == 255)
{
value_piezoc = 254;
}
putc(value_piezoc);
}
}
}
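The protocol named in the header comment (SB A B C pedal A B C) can be unpacked on the PC by a small state machine. The sketch below is one possible decoder, not the project's own code. It assumes that the start byte SB has the value 255 (inferred from the PIC code, which limits piezo readings to 254), that the pedal byte never equals the start byte, and that the first set of piezo readings in a frame reuses the pedal value from the previous frame. The names FrameDecoder, PiezoSet and decode_byte are hypothetical.

```c
#define START_BYTE 255  /* ASSUMPTION: value of SB */

typedef struct {
    int state;       /* 0 = waiting for SB, 1-7 = position in frame */
    int lastPedal;   /* pedal value kept between sets of piezos     */
    int reading[3];  /* piezo A, B, C of the set being assembled    */
} FrameDecoder;

typedef struct {
    int pedal;
    int reading[3];
} PiezoSet;

/* Feed one received byte. Returns 1 when *out holds a complete set
 * of three piezo readings, 0 otherwise.                            */
int decode_byte(FrameDecoder *d, unsigned char byte, PiezoSet *out)
{
    int i;

    if (byte == START_BYTE) {   /* (re)synchronise on the start byte */
        d->state = 1;
        return 0;
    }
    if (d->state == 0)          /* not yet synchronised: discard     */
        return 0;
    if (d->state == 4) {        /* fifth byte of the frame is pedal  */
        d->lastPedal = byte;
        d->state = 5;
        return 0;
    }

    /* States 1-3 and 5-7 carry piezo A, B and C respectively */
    d->reading[(d->state - 1) % 4] = byte;

    if (d->state == 3 || d->state == 7) {
        out->pedal = d->lastPedal;
        for (i = 0; i < 3; i++)
            out->reading[i] = d->reading[i];
        d->state = (d->state == 3) ? 4 : 0;
        return 1;
    }
    d->state++;
    return 0;
}
```

With the settings in the #use rs232 line (115200 baud, 8 data bits, no parity), and assuming one start and one stop bit per byte, the link carries 11520 bytes per second. An 8-byte frame therefore allows at most 1440 frames, or 2880 sets of piezo readings, per second.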
%------------------------------------------------------
% Pre Program, loads dataspace specs from dataspace.mat
% and constructs mapping
%load database
%filetoload = ['D:\dataspace\dataspace.mat'];
%load(filetoload);
% Normalise Freq
df = [82 113 132 141 154]; %Hz
Lf = length(df);
a = 0;
for a = 1:Lf
nf(a) = (df(a)-df(1))/(df(Lf)-df(1));
end
sf = (df(1)/df(Lf));
rf = 1-sf;
b = 1/Ls;
c = -b/2;
for a = 1:Ls
ns(a) = c + (b*a);
end
% Normalise Position
Lp = 8;
b = 1/(Lp-1);
np(1)=0;
for a = 1:(Lp-1)
np(a+1) = b*a;
end
% Damping
damp = damp * 6;
%------------------------------------------------------
% Main Program
diff = abs(ped-nf(a));
if diff < b
nearf = a;
end
b = diff;
end
a = sf + (nf(nearf)*rf);
b = sf + (ped*rf);
shift = b/a;
damprate = 1;
if damp > 0.1 %arbitrary noise value chosen
for a = 1:lngth
damprate = damprate * damp;
dampdata(a)= newdata(a)*damprate;
end
end
%------------------------------------------------------
%play sample
playdata=newdata/32768;
sound(playdata,44100);
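The frequency normalisation and shift calculation in the Matlab listing can be restated in C as follows. This is a sketch that reuses the five recorded frequencies from the listing; the function names are hypothetical, and the nearest-recording search here keeps the smallest difference seen so far. Because sf + nf*rf reduces to df/df(Lf), the resulting shift is simply the ratio of the requested frequency to the frequency of the nearest recording.

```c
#include <stddef.h>

/* Recorded fundamental frequencies in Hz, as in the Matlab listing */
static const double DF[] = {82.0, 113.0, 132.0, 141.0, 154.0};
#define LF (sizeof DF / sizeof DF[0])

/* Map recorded frequency i onto the 0..1 pedal range */
double normalise_freq(size_t i)
{
    return (DF[i] - DF[0]) / (DF[LF - 1] - DF[0]);
}

/* Index of the recording whose normalised frequency is closest to
 * the pedal position ped (0..1)                                   */
size_t nearest_recording(double ped)
{
    size_t best = 0;
    double best_diff = -1.0;
    size_t i;

    for (i = 0; i < LF; i++) {
        double diff = ped - normalise_freq(i);
        if (diff < 0.0)
            diff = -diff;
        if (best_diff < 0.0 || diff < best_diff) {
            best_diff = diff;
            best = i;
        }
    }
    return best;
}

/* Resampling ratio needed to move recording nearf to pedal position ped */
double shift_ratio(double ped, size_t nearf)
{
    double sf = DF[0] / DF[LF - 1];  /* lowest frequency / highest frequency */
    double rf = 1.0 - sf;
    double a = sf + normalise_freq(nearf) * rf;  /* recorded pitch  */
    double b = sf + ped * rf;                    /* requested pitch */

    return b / a;
}
```

For example, a pedal position of 0.5 corresponds to 118 Hz; the nearest recording is the 113 Hz one, so the shift ratio is 118/113, or roughly 1.044.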
Appendix E Hardware Circuit Diagram 22
[Figure: hardware circuit diagram. Labels recoverable from the original drawing:
power supply input (6V - 20V DC); analogue and digital supply regulators giving
+5V(A) and +5V(D); analogue and digital grounds; voltage divider providing the
+2.5V ADC reference; piezoelectric transducers (1), (2) and (3), each with a 1M
resistor; LM324 buffer op-amps; ADC0804; sensor adjustment potentiometers;
potentiometer in tuning pedal; 12MHz clock; RS-232 level shifter; green LED
power indicator.]
Appendix F Final Budget Breakdown 24
Hardware
Piezoelectric transducers                                        £5.87
Adhesives                                                        £6.85
PIC microcontroller                                              £7.48
Prototyping circuit board                                        £2.54
Plastic case                                                     £3.50
Other hardware                                                  £40.98
Project PC components
AMD Athlon XP 1600+ 1.4GHz SoA (Processor)                     £106.00
Gigabyte SoA VIA KT266 ATX A (Motherboard)                      £75.00
IBM Deskstar 60GXP 20.6GB UDMA100 (Hard disk)                   £63.00
Suntek Computers Ltd. Viper ATX Midi Tower 250W (Case)          £42.00
Crucial Technology 256MB 184DIMM PC2100 NP CL2.5 (Memory x 2)   £52.62
AOpen 52x IDE Internal OEM (CD ROM)                             £21.00
Mitsumi 1.44MB Internal 3.5” (Floppy Disk)                       £6.50
Dabs Value 105 Key PS/2 (Keyboard)                               £4.00
Mitsumi 2 Button Classic PS/2 Retail (Mouse)                     £4.00
Extra PSU (350W) (Power Supply)                                  £9.40
Extra hard disk and delivery (HD for Linux)                     £81.03
Miscellaneous
Latex guide                                                     £35.70
Timpani transport insurance                                     £25.00
Total                                                          £673.22
Appendix G
Appendix G Initial Idea Sketches 26
Timpani Photographs
Appendix H Timpani Photographs 30
The following is a list of the software packages used during this project. This software
may be helpful for any future work on electronic timpani.
Windows software:
CoolEdit 2000 was used to import the recordings of the timpani from the DAT
recorder. It is primarily a sound editing package.
Goldwave was also used for sound editing. The interface was found to be easier to
work with than CoolEdit.
Matlab was used to perform offline frequency and amplitude shifting of the recordings.
Listen32 proved invaluable in capturing serial data transmitted by the PIC to the PC.
Linux software:
pdflatex was the package for compiling and producing the report. The Latex format
is highly recommended over Microsoft Word, as the documents are more professionally
presented and the individual parts of the document can be simultaneously edited by
multiple users.
Appendix J
M-File index
Many of the .m files are specific to a certain stage in the process of indexing the recordings
and converting into a single dataspace. If the files are to be used or edited at a later
stage by a third party it is important to be careful of the names of the .mat files that
they refer to and the order in which they can be used.
CD Contents
Appendix K CD Contents 36
Recording Index
Complete listing of the recordings used when constructing the dataspace. These can be
found on the project CD in ‘.mat’ format. The headings are as follows:
f Frequency (out of 5)
s Hit Strength (out of 5)
p Position (out of 8)
t Take (take 1 or 2)
size The number of datapoints in the sample (after trimming and
tailing)
freq The fundamental frequency of the recording in Hz, i.e. the
pitch of the note perceived
+n The maximum datapoint value in the last 0.5 seconds of
data
-n The minimum datapoint value in the last 0.5 seconds of data
mean The mean of the last 0.5 seconds of data
rms The RMS value of the last 0.5 seconds of data
cli Clipping, indicates the first datapoint in the recording at
which clipping occurs
lim Limiting, indicates the first datapoint in the recording that
reaches within 95% of the maximum possible value
r Review?, if the recording indicates limiting value but not
clipping a value of 1 indicates that a manual check of the
data should be made
1st100ms The rms value of the first 100ms of data as an indication of
the hit strength
hit The 1st100ms value is corrected to take account of the position
dependence of the level (see section 8.5.5). The value is normalised
such that the maximum value would be 100 before the
correction for position is made
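Several of these columns are simple scans over the sample data. As a sketch only, the following C functions show how the lim, +n and -n values might be derived. It is assumed here that the samples are 16-bit signed values, that "maximum possible value" means a magnitude of 32767, and that indices are counted from zero; the function names are hypothetical. At the 44100 Hz rate used in the Matlab playback code, the last 0.5 seconds corresponds to a tail of 22050 datapoints.

```c
#include <stddef.h>

#define FULL_SCALE 32767L  /* ASSUMPTION: 16-bit signed samples */

/* lim: index of the first datapoint whose magnitude reaches 95% of
 * full scale, or -1 if the recording never gets that close.       */
long first_limit_index(const short *x, size_t n)
{
    const long threshold = (FULL_SCALE * 95L) / 100L;
    size_t i;

    for (i = 0; i < n; i++) {
        long mag = (x[i] < 0) ? -(long)x[i] : (long)x[i];
        if (mag >= threshold)
            return (long)i;
    }
    return -1;
}

/* +n and -n: maximum and minimum datapoint in the last `tail`
 * datapoints of the recording.                                    */
void tail_extremes(const short *x, size_t n, size_t tail,
                   short *max_out, short *min_out)
{
    size_t start = (n > tail) ? n - tail : 0;
    size_t i;

    if (n == 0) {               /* empty recording: report zeros */
        *max_out = 0;
        *min_out = 0;
        return;
    }
    *max_out = x[start];
    *min_out = x[start];
    for (i = start + 1; i < n; i++) {
        if (x[i] > *max_out) *max_out = x[i];
        if (x[i] < *min_out) *min_out = x[i];
    }
}
```

A large +n or -n value in the tail suggests that the recording was trimmed before the note had fully decayed, or that background noise is present.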
Appendix L Recording Index 38