
Music and Shape
Studies in Musical Performance as Creative Practice

Series Editor John Rink
Volume 1
Musicians in the Making: Pathways to Creative Performance
Edited by John Rink, Helena Gaunt and Aaron Williamon
Volume 2
Distributed Creativity: Collaboration and Improvisation in
Contemporary Music
Edited by Eric F. Clarke and Mark Doffman
Volume 3
Music and Shape
Edited by Daniel Leech-Wilkinson and Helen M. Prior
Volume 4
Global Perspectives on Orchestras: Collective Creativity and Social Agency
Edited by Tina K. Ramnarine
Volume 5
Music as Creative Practice
Nicholas Cook
STUDIES IN MUSICAL PERFORMANCE AS CREATIVE PRACTICE
About the series
Until recently, the notion of musical creativity was tied to composers and the works
they produced, which later generations were taught to revere and to reproduce in
performance. But the last few decades have witnessed a fundamental reassessment
of the assumptions and values underlying musical and musicological thought and
practice, thanks in part to the rise of musical performance studies. The five volumes in
the series Studies in Musical Performance as Creative Practice embrace and expand
the new understanding that has emerged. Internationally prominent researchers,
performers, composers, music teachers and others explore a broad spectrum of
topics including the creativity embodied in and projected through performance,
how performances take shape over time, and how the understanding of musical
performance as a creative practice varies across different global contexts, idioms
and performance conditions. The series celebrates the diversity of musical
performance studies, which has led to a rich and increasingly important literature while
also providing the potential for further engagement and exploration in the future.
These books have their origins in the work of the AHRC Research Centre
for Musical Performance as Creative Practice (www.cmpcp.ac.uk), which
conducted an ambitious research programme from 2009 to 2014 focused on live
musical performance and creative music-making. The Centre’s close
interactions with musicians across a range of traditions and at varying levels of
expertise ensured the musical vitality and viability of its activities and outputs.
Studies in Musical Performance as Creative Practice was itself broadly
conceived, and the five volumes encompass a wealth of highly topical material.
Musicians in the Making explores the creative development of musicians in
formal and informal learning contexts, and it argues that creative learning is
a complex, lifelong process. Distributed Creativity explores the ways in which
collaboration and improvisation enable and constrain creative processes in
contemporary music, focusing on the activities of composers, performers and
improvisers. Music and Shape reveals why a spatial, gestural construct is so
invaluable to work in sound, helping musicians in many genres to rehearse,
teach and think about what they do. Global Perspectives on Orchestras
considers large orchestral ensembles in diverse historical, intercultural and
postcolonial contexts; in doing so, it generates enhanced appreciation of their creative,
political and social dimensions. Finally, Music as Creative Practice describes
music as a culture of the imagination and a real-time practice, and it reveals the
critical insights that music affords into contemporary thinking about creativity.
Music and Shape
Edited by
Daniel Leech-Wilkinson
Helen M. Prior
Oxford University Press is a department of the University of Oxford. It furthers
the University’s objective of excellence in research, scholarship, and education
by publishing worldwide. Oxford is a registered trade mark of Oxford University
Press in the UK and certain other countries.
Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America.
© Oxford University Press 2017
All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by license, or under terms agreed with the appropriate reproduction
rights organization. Inquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above.
You must not circulate this work in any other form
and you must impose this same condition on any acquirer.
Library of Congress Cataloging-in-Publication Data

Names: Leech-Wilkinson, Daniel. | Prior, Helen M.
Title: Music and shape / edited by Daniel Leech-Wilkinson, Helen M. Prior.
Description: New York, NY: Oxford University Press, [2017] |
Series: Studies in musical performance as creative practice; 3 |
Includes bibliographical references and index.
Identifiers: LCCN 2016042331 | ISBN 9780199351411 (hardcover) | ISBN 9780199351442 (oso)
Subjects: LCSH: Music—Psychological aspects. | Music—Performance—Psychological aspects.
Classification: LCC ML3838.M94947 2017 | DDC 781.1/7—dc23
LC record available at https://lccn.loc.gov/2016042331
9 8 7 6 5 4 3 2 1
Printed by Sheridan Books, Inc., United States of America
CONTENTS
List of contributors ix
List of illustrations xvi
About the Companion Website xxiii
Preface xxv
DANIEL LEECH-WILKINSON AND HELEN M. PRIOR
PART 1 Shapes mapped

Reflection Evelyn Glennie 3
1 Key-postures, trajectories and sonic shapes 4
ROLF INGE GODØY
Reflection Lucia D’Errico 30

2 Shape, drawing and gesture: empirical studies of cross-modality 33
MATS B. KÜSSNER
Reflection Anna Meredith 57

3 Cross-modal correspondences and affect in a Schubert song 58
ZOHAR EITAN, RENEE TIMMERS AND MORDECHAI ADLER
PART 2 Shapes composed

Reflection George Benjamin 89
4 Affective shapes and shapings of affect in Bach’s Sonata for
Unaccompanied Violin No. 1 in G minor (BWV 1001) 96
MICHAEL SPITZER
Reflection Steven Isserlis 127

5 Shape in music notation: exploring the cross-modal representation
of sound in the visual domain using zygonic theory 129
ADAM OCKELFORD
Reflection Alice Eldridge 165

6 The shape of musical improvisation 170
MILTON MERMIKIDES AND EUGENE FEYGELSON
PART 3 Shapes performed

Reflection Max Baillie 207
7 Shape as understood by performing musicians 216
HELEN M. PRIOR
Reflection Simon Desbruslais 242

Reflection Malcolm Bilson 248
8 Shaping popular music 252
ALINKA E. GREASLEY AND HELEN M. PRIOR
Reflection Steven Savage 278
PART 4 Shapes seen

Reflection Mark Applebaum 283
Reflection I-Uen Wang Hwang 302
9 Music and shape in synaesthesia 306
JAMIE WARD
Reflection Timothy B. Layden 321

Reflection Stephen Hough 323
Reflection Alex Reuben 324
10 Intersecting shapes in music and in dance 328
PHILIP BARNARD AND SCOTT DELAHUNTA
Reflection Richard G. Mitchell 351
PART 5 Shapes felt

Reflection Julia Holter 357
11 Musical shape and feeling 359
DANIEL LEECH-WILKINSON
Reflection David Amram 383
Reflection Antony Pitts 386
Notes 389
Index 397
LIST OF CONTRIBUTORS
Mordechai Adler graduated from Tel Aviv University in 2014 with a PhD in
musicology. His dissertation, ‘Cross-modal correspondence and musical
representation’, combines empirical studies of cross-modal perception with musical
analyses. Adler is currently developing a music education method using
cross-modal correspondences.
David Amram is one of the most prolific and performed composers of his
generation, and has left a unique mark on the world of music. He became the first
composer-in-residence with the New York Philharmonic in 1966 at the request
of Leonard Bernstein. At eighty-six Amram continues to work as a classical
composer, multi-instrumentalist, band leader, lecturer and guest conductor,
constantly composing as he tours the world.
Mark Applebaum is Associate Professor of Composition at Stanford
University. His solo, chamber, choral, orchestral, operatic and electroacoustic
work has been performed widely and includes notable commissions from the
Merce Cunningham Dance Company, the Fromm Foundation and the Vienna
Modern Festival. Many of his pieces challenge the conventional boundaries of
musical ontology. Applebaum is also an accomplished jazz pianist and builds
electroacoustic sound-sculptures out of junk, hardware and found objects.
Max Baillie is a leading instrumentalist of his generation, equally at home on
both violin and viola. As a performer he has appeared on stages from Carnegie
Hall to Glastonbury and from Mali to Moscow in a diverse spectrum of styles
including classical, pop, folk and electronic music, alongside leading artists
from around the world. He plays principal viola in the Aurora Orchestra and is
part of a series of unique creative projects which go beyond the concert stage.
Philip Barnard worked for the Medical Research Council’s Cognition and
Brain Sciences Unit (CBSU) in Cambridge from 1972 to 2011, where he
carried out research on how memory, attention, language, body states and
emotion work together. He is now retired but remains a visiting researcher with the
CBSU. Since 2003, he has been collaborating with Wayne McGregor | Random
Dance to develop productive synergies between choreographic processes and
our knowledge of cognitive neuroscience.
George Benjamin was born in 1960 and began composing at the age of seven.
After studying with Messiaen he worked with Alexander Goehr at King’s
College, Cambridge. His Ringed by the Flat Horizon was performed at the BBC
Proms when he was just twenty. Written on Skin has been scheduled by
numerous international opera houses since its 2012 premiere in Aix. He regularly
conducts some of the world’s leading orchestras and since 2001 has been the
Henry Purcell Professor of Composition at King’s College London.
Malcolm Bilson has been a key contributor to the restoration of the fortepiano
to the concert stage and to fresh recordings of the ‘mainstream’ repertory. He
has recorded the Mozart piano concertos with John Eliot Gardiner and the
English Baroque Soloists, and the complete Mozart and Schubert solo
sonatas. Bilson gives concerts, masterclasses and lectures around the world. He is a
member of the American Academy of Arts and Sciences and has an honorary
doctorate from Bard College.
Lucia D’Errico is an artist devoted to experimental music, performing on
plucked string instruments. As a performer and improviser, she collaborates
with contemporary music groups and with theatre, dance and visual art
companies. An artistic researcher at Orpheus Institute Ghent, she is part of the
ME21 research project. Her doctoral research (on the docARTES programme)
focuses on recomposing baroque music. She is also active as a freelance graphic
designer.
Scott deLahunta has worked as writer, researcher and organizer on a range of
international projects bringing performing arts with a focus on choreography
into conjunction with other disciplines and practices. He is currently Senior
Research Fellow at Deakin University (Australia) in partnership with Coventry
University (UK), R-Research Director (on sabbatical) at Wayne McGregor |
Random Dance, and Director of Motion Bank/The Forsythe Company.
Simon Desbruslais has an international reputation as a trumpet soloist,
specializing in the performance of baroque and contemporary music. His solo
disc Contemporary British Trumpet Concertos on Signum Classics includes
new works written for him by John McCabe, Deborah Pritchard and Robert
Saxton. He is a lecturer in music at the University of Hull, and has taught at the
universities of Oxford, Bristol, Nottingham and Surrey. He is writing a
monograph on the music and music theory of Paul Hindemith, based on his doctoral
dissertation from Christ Church, Oxford.
Zohar Eitan is a professor of music theory and music cognition at the
Buchman-Mehta School of Music, Tel Aviv University. His recent research
was published in Cognition, Journal of Experimental Psychology: Human
Perception and Performance, Experimental Psychology, Music Perception,
Psychology of Music, Musicae Scientiae, Empirical Musicology Review and
Psychomusicology.
Alice Eldridge is a researcher, lecturer and cellist with interdisciplinary
interests in biological systems and sound. She leads the Music Informatics Degree
at the University of Sussex, where she works across the creative arts,
technology and science. As a cellist she embraces collaboration and has performed
with a diverse array of personalities, including Steve Beresford, Russell Brand,
Icarus, Shih-Yang Lee, Vagina Dentata Organ and Evan Parker. She is a
member of the London Improvisers’ Orchestra and a regular at John Russell’s Fete
Qua Qua.
Eugene Feygelson took a performance-based PhD at King’s College London,
focusing on modes of nonverbal communication used in improvisatory
contexts. His other interests include classical and contemporary improvisation,
music cognition, and relationships between music and language, as well as
music’s role in human evolution. Feygelson’s postgraduate research, including
his Master’s from the University of Cambridge, was supported by the Jack
Kent Cooke Foundation.
Evelyn Glennie was the first musician to create and sustain a career as a solo
percussionist. She has played around two thousand five hundred concerts in
more than fifty countries, recorded thirty albums and won three GRAMMY
awards. In 1990 she released her autobiography, Good Vibrations, and since
then has written several essays, has contributed to a variety of publications and
often publishes printed music. She continues to invest in realizing her vision: to
Teach the World to Listen.
Rolf Inge Godøy is Professor of Music Theory at the Department of Musicology,
University of Oslo. His main interest is in phenomenological approaches to
music theory, taking our subjective experiences of music as the point of
departure for music theory. This work has been expanded to include research on
music-related body motion in performance and listening, using various
conceptual and technological tools to explore the relationships between sound and
body motion in the experience of music.
Alinka E. Greasley is Lecturer in Music Psychology at the University of Leeds,
where she teaches music psychology at all levels and leads the MA Applied
Psychology of Music programme. Her research lies mainly within the field of
social psychology of music, and her interests focus on people’s experiences with
and uses of music in everyday life, including musical preferences,
categorization of musical genres, functions of music, listening behaviour, electronic dance
music culture and DJ performance practice.
Julia Holter is a musician from Los Angeles interested in songwriting,
performing and various methods of recording. Her most recent recording was the
studio album Loud City Song (2013) on Domino Records. Since the release of her
previous two albums, Tragedy (2011) and Ekstasis (2012), she has performed
at venues and festivals throughout Europe, North America, Lebanon and
Australia. She has had pieces commissioned by and/or has performed with
ensembles such as the Los Angeles Philharmonic and Stargaze. She frequently
collaborates in group projects with artist and musician friends including Rick
Bahto, Ramona Gonzalez, Yelena Zhelezov, Laurel Halo, Mark So, Cat Lamb
and Laura Steenberge.
Stephen Hough is an English pianist with a catalogue of more than fifty CDs.
His iPad app ‘The Liszt Sonata’ was released by Touch Press in 2013. As a
composer, he has been commissioned by the Wigmore Hall, the Musée du Louvre,
the National Gallery (London), musicians of the Berliner Philharmoniker,
BBC Symphony Orchestra, Westminster Abbey and Westminster Cathedral.
He is on the faculty of the Juilliard School in New York and is a visiting
professor at the Royal Academy of Music in London and the Royal Northern College
of Music in Manchester.
I-Uen Wang Hwang moved from Tainan, Taiwan to the USA and earned her
PhD in music composition from the University of Pennsylvania (1998). Since
she is both a painter and a musician, a link between her music and art naturally
developed. She often paints to amplify her creativity while composing. The
Taiwan National Symphony Orchestra has commissioned three of her
symphonies, including Diptych of Taiwan (2010), which was included in the CD that
won Taiwan’s Golden Melody Award (2014) for best art music album.
Steven Isserlis is a British cellist who is acclaimed worldwide for his technique
and musicianship. He enjoys a distinguished career as a soloist, chamber
musician, educator and author. While his extensive performing and recording career
takes up the majority of his time, he has also written two books for children
about the lives of the great composers, and he gives frequent masterclasses all
around the world. For the past seventeen years he has been Artistic Director of
the International Musicians’ Seminar at Prussia Cove in Cornwall.
Mats Küssner is Research Associate in the Department of Musicology and
Media Studies at Humboldt University, Berlin. In 2014–15, he was Peter
Sowerby Research Associate in Performance Science at the Royal College of
Music. Küssner completed his PhD within the AHRC Research Centre for
Musical Performance as Creative Practice at King’s College London,
investigating embodied cross-modal mappings of sound and music. In 2013, he and
Daniel Leech-Wilkinson co-edited a special issue of Empirical Musicology
Review on ‘Music and Shape’.
Timothy B. Layden was born in the USA but is currently living and working in
the UK. He studied fine art at the University of the Americas (Mexico), before
receiving a doctorate in fine art from the University of Barcelona in 2005. He
is an interdisciplinary artist working primarily with sound, image and text. He
has been involved in diverse art and educational projects around the world.
Much of his work is inspired by his own experience of synaesthesia.
Daniel Leech-Wilkinson studied at the Royal College of Music, King’s College
London and Clare College, Cambridge, becoming first a medievalist and then,
since c. 2000, specializing in the implications of early recordings, especially
in relation to music psychology and performance creativity. He led a project
on ‘Expressivity in Schubert song performance’ within the AHRC Research
Centre for the History and Analysis of Recorded Music (CHARM), followed
by ‘Shaping music in performance’ as part of the AHRC Research Centre for
Musical Performance as Creative Practice. Books include The Modern Invention
of Medieval Music (2002) and The Changing Sound of Music (2009).
Anna Meredith is a composer and performer of both acoustic and electronic
music. She has been Composer in Residence with the BBC Scottish Symphony
Orchestra, RPS/PRS Composer in the House with Sinfonia ViVA, the classical
music representative for the 2009 South Bank Show Breakthrough Award and
winner of the 2010 Paul Hamlyn Award for Composers. HandsFree (2012), a
PRS/RPS 20x12 Commission for the National Youth Orchestra, was performed
at the BBC Proms, Barbican Centre and Symphony Hall as well as
numerous flashmob performances around the UK. Her debut EP Black Prince Fury
was released on Moshi Moshi records to critical acclaim including Drowned in
Sound’s ‘Single of the Year’.
Milton Mermikides is Lecturer in Music and Head of Composition at the
University of Surrey, Professor of Jazz Guitar at the Royal College of Music,
and deputy director of the International Guitar Research Centre. He is a
composer, guitarist and sound artist with a keen interest in a range of disciplines
including jazz, popular, electronic and ‘world’ music, improvisation, digital
technologies in analysis and creative practice, music perception, art/science
collaboration, and data sonification.
Richard G. Mitchell is a film composer. He graduated from Central Saint
Martins in fine art and film, where he composed for students at Saint Martins,
Royal College of Art and National Film School. His best-known works are
A Good Woman (Scarlett Johansson, Helen Hunt), To Kill a King (Tim Roth,
Rupert Everett), and Grand Theft Parsons (Johnny Knoxville), and he has an
Ivor Novello Award for Trial by Fire, a Royal Television Society Award for the
BBC The Tenant of Wildfell Hall, and a Polish Academy Award for Günter
Grass’s The Call of the Toad.
Adam Ockelford is Director of the Applied Music Research Centre at the
University of Roehampton, UK. His research interests are in music
psychology, education, theory and aesthetics (particularly special educational needs
and the development of exceptional abilities); learning, memory and creativity;
the cognition of musical structure; and the construction of musical meaning.
Recent books include Applied Musicology: Using Zygonic Theory to Inform
Music Education, Therapy and Psychology Research, and Music, Language and
Autism: Exceptional Strategies for Exceptional Minds.
Antony Pitts is a composer, conductor, producer, and winner of the Prix Italia,
Cannes Classical and Radio Academy BT Awards. From Hampton Court
Chapel Royal treble, New College Oxford Academic Scholar and Honorary
Senior Scholar, TONUS PEREGRINUS founder-director, Royal Academy of
Music Senior Lecturer and BBC Radio 3 Senior Producer to Artistic Director
of The Song Company, he has made music at London’s Wigmore Hall and
Westminster Cathedral, Amsterdam’s Concertgebouw, Berlin’s Philharmonie
Kammermusiksaal, and the Sydney Opera House.
Helen M. Prior is a lecturer at the University of Hull. Her work on music
and shape began when she was a postdoctoral researcher within the AHRC
Research Centre for Musical Performance as Creative Practice at King’s
College London. She has interests in musical performance, music and emotion,
and music perception and familiarity.
Alex Reuben makes movies characterized by dance, music and environment. His
films are exhibited by Picturehouse and Curzon Cinemas. Routes—Dancing to
New Orleans was selected in the ‘Top 20 Movies of the Decade’ (Geoff Andrew,
BFI/Time Out). Reuben is a lecturer at Camberwell and Central Saint Martins.
He has been commissioned by Sadler’s Wells, Channel 4 TV, DanceDigital
and the BBC, with awards from The British Council and Jerwood Charitable
Foundation, and he is director of Cinderella (RockaFela), a movie about
cognition and movement, for the Wellcome Trust and Arts Council England.
Steve Savage is an active producer and recording engineer. He has been the
primary engineer on seven records that received GRAMMY nominations,
including CDs for Robert Cray, John Hammond, The Gospel Hummingbirds
and Elvin Bishop. Savage holds a PhD in music and teaches musicology at
San Francisco State University. He has several books that frame his work as
a researcher and as a practitioner, including his most recent work Mixing and
Mastering in the Box (2014).
Michael Spitzer is Professor of Music at the University of Liverpool, having
previously taught for many years at Durham University. Author of Metaphor
and Musical Thought (2004) and Music and Philosophy: Adorno and Beethoven’s
Late Style (2006), his research explores the interfaces between music theory,
aesthetics and psychology. He inaugurated the series of international
conferences on music and emotion at Durham in 2009, and is presently writing a
history of music and emotion.
Renee Timmers is Reader in Psychology of Music at the University of Sheffield,
where she directs the research centre Music, Mind, Machine in Sheffield. She
was trained in the Netherlands in musicology and psychology. She carried out
postdoctoral research at King’s College London, Northwestern University and
Radboud University Nijmegen, among others. Her main areas of research
include expressive timing in music performance, perception and expression of
emotion in music, and multimodal experiences of music.
Jamie Ward is Professor of Cognitive Neuroscience at the University of Sussex.
He has an MA in Natural Sciences from the University of Cambridge and a
PhD in Psychology from the University of Birmingham. He previously held a
faculty position at University College London. He is Co-Director of Sussex
Neuroscience and was Founding Editor of the journal Cognitive Neuroscience.
He has a particular research interest in synaesthesia and, more generally, in the
question of how information is integrated between the senses.
LIST OF ILLUSTRATIONS
Figures
1.1 A pianola representation of the first eight bars of J. S. Bach’s Fugue
in C major, Well-Tempered Clavier Book I 6
1.2 The spectrogram of a sustained deep C double bass tone (top) and
the spectrogram of the same tone passed through a time-varying
wahwah filter (bottom) 17
1.3 The spectrogram of a distortion guitar sound with a downward
glissando followed by a slow upward expansion (top), and so-called
sound-tracings of this sound by nine listeners (bottom) 19
1.4 The score of the first two bars of the last movement of Beethoven’s Piano
Concerto No. 1 (top), and graphs showing the position, velocity and
acceleration of the vertical motion of the right-hand knuckles, wrist
(RWRA) and elbow (RELB) in the performance of these two bars 23
1.5 The top part shows motiongrams (i.e. video-based summary images
of motion trajectories; see Jensenius 2013 for details) of three
different successive dance performances by the same dancer to a
twenty-second excerpt from Lento from György Ligeti’s Ten Pieces
for Wind Quintet (Ligeti 1998), and the bottom part shows for the
purpose of reference three repetitions of the spectrograms of this
excerpt. 25
R.1 Schematization of bodily music-shape forces 31
3.1 Mean weighted pitch (black line) and mean absolute pitch interval
(grey line) per two-bar phrase 66
3.2 Mean intensity (left) and maximum intensity (right) per two-bar
phrase for three performers. Intensity was measured from commercial
recordings combining the piano and the vocal line. 67
3.3 Average rhythmic durations (black line) of the vocal line and
standard deviation of rhythmic durations (grey line) within successive
two-bar phrases 68
3.4 Median spectral centroid (Hz) per stanza for three performances of
Schubert’s ‘Die Stadt’ 69
3.5 Normalized phrase duration of successive two-bar phrases in the
performance by DFD, IB and TQ 77
R.2 Berg, Wozzeck, Act 3, bars 3–7 91
R.4 Berg, Wozzeck, Act 3, bar 114 92
R.11 Berg, Wozzeck, Act 3: harmonic connections 95
4.1 Bach, Sonata for Unaccompanied Violin No. 1 in G minor
(BWV 1001), Adagio, bars 1–13 100
4.2 Vivaldi, Violin Concerto Op. 3 No. 6, Largo, bars 1–6 105
4.3 Inflections of the fifth cycle 106
4.4 Tempo and dynamic map of Luca, bars 1–13 110
4.5 Tempo and dynamic map of Perlman, bars 1–13 111
4.6 Tempo and dynamic map of Kremer, bars 1–13 114
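4.7 Bach, Sonata for Unaccompanied Violin No. 1 in G minor
(BWV 1001), Fuga, bars 1–4 118
4.8 Bach, Sonata for Unaccompanied Violin No. 1 in G minor
(BWV 1001), Siciliana, bars 1–2 119
4.9 Bach, Sonata for Unaccompanied Violin No. 1 in G minor
(BWV 1001), Presto, bars 1–11 120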
4.10 (a) Hypermetrical reduction of Bach, Sonata for Unaccompanied
Violin No. 1 in G minor (BWV 1001), Presto, bars 1–6; (b) metrical
reduction of bars 6–8, revealing syncopation 121
5.1 Oboe and cor anglais duet from the third movement of Vaughan
Williams’ Fifth Symphony 131
5.2 Representation of primary interperspective relationships 132
5.3 Primary and secondary zygonic relationships 133
5.4 The image of a small black dot 135
5.5 Two small black dots 135
5.6 A primary interperspective relationship of location, whose value is
shown using Cartesian coordinates 135
5.7 A secondary zygonic relationship of location reflects the fact that
the difference in location between dots B and C is deemed to exist in
imitation of the difference between A and B. 136
5.8 Imitation of location at the tertiary zygonic level 137
5.9 The perceived orderliness inherent in a straight line modelled in
zygonic terms 137
5.10 One shape deemed to exist in imitation of another 138
5.11 Single interperspective values of difference cannot be imitated
between domains; therefore, systematic mapping and iconic
representation in Peircean terms are not possible. 139
5.12 Domains whose perspective values are capable of conveying
a sense of size can bear cross-modal imitation of ratios at
the secondary level and therefore have the capacity for iconic
representation. 140
5.13 Iconic representation of pitch in terms of location through
tertiary-level imitation 140
5.14 Example of the derivation of pitch through imitation of a ratio
between differences in location from a constellation in Stockhausen’s
Sternklang (1971) 143
5.15 Indirect connection between graphic and sound 144
5.16 An arbitrary shape is given meaning by convention. 145
5.17 The meaning of an arbitrary shape learned through imitation 146
5.18 Cross-modal relationship engendered by pitch-colour
synaesthesia 147
5.19 Taxonomy of the possible types of relationship between musical
sounds and visual images 149
5.20 A child’s transcription and performance of a rhythm 150
5.21 Regular cross-modal mapping between sound and score, and score
and sound 151
5.22 A congenitally blind child’s representation of pitch glides on
German film 153
5.23 Cross-modal imitation at the tertiary level assumed to underlie the
representation of a pitch glide as a straight diagonal line 154
5.24 Western staff notation embeds arbitrary symbols within a
semi-regular framework of pitch and time. 155
5.25 Music Time in braille music notation (represented in print form),
with explanations of the signs 157
5.26 The fingering for the opening four chords of Music Time presented
using guitar chord symbols 158
5.27 The three semiotic processes at work as a guitarist performs from a
chord symbol 159
5.28 Fragment of Jamie Roberts’ synaesthetically derived score of
Jean-Michel Jarre’s Oxygène, track 4 159
5.29 Types of semiosis functioning in a fragment of Jamie Roberts’
synaesthetic score of Oxygène 160
R.12 Opening of the Prelude of Bach’s Cello Suite No. 5 in scordatura
notation 166
6.1 An illustration of a complex chains-of-thought improvisation
methodology 173
6.2 An illustration of musical refractions. In the course of an
improvisation, a phrase is manipulated by the selection of one of
many transformational processes (1–8 present a few of countless
possibilities). 176
6.3 Coexisting interpretations of Phrase α 178
6.4 Improvised continuations of Phrase α 180
6.5 An illustration of how the fixing and variation of musical topics may
forge improvisational continuations from Phrase α 181
6.6 Coltrane’s cube: some possible phrases of Coltrane’s
Acknowledgement plotted in the three-dimensional musical space
of metric placement, rhythmic separation and chromatic
transposition, with a few coordinates illustrated with standard
notation 183
6.7 Phrase α existing at the centre of a three-dimensional musical space
with variously proximate neighbouring phrases 185
6.8 An impression of M-Space: phrase α sits at the centre of many
simultaneous dimensions of musical transformation. 186
6.9 A multi-level depiction of Smith’s solo on The Sermon 188
6.10 Five improvisational structures: 1) ‘Nuclear’. 2) ‘Field Series’.
3) ‘Pivot’. 4) ‘Merged’. 5) ‘Unbounded’. 190
6.11 Opening section of Léonard’s cadenza (L1/L2) and corresponding
sections from Beethoven’s Violin Concerto Op. 61 (B1/B2) 196
6.12 Second section of Léonard’s cadenza (L3/L4A/L4B) and
corresponding sections from Beethoven’s Violin Concerto Op. 61
(B3/B4) 197
6.13 Final section of Léonard’s cadenza (L5/L6A/L6B) and
corresponding sections from Beethoven’s Violin Concerto Op. 61
(B5/B6) 199
6.14 Graphic representation of Léonard’s cadenza illustrating the
relationship of musical proximity to Beethoven’s original score 200
R.13 The opening of the Allemande from J. S. Bach’s Partita in D minor
for solo violin: (R.13a) as usual and (R.13b) upside down 209
R.14 Allemande, bars 1–8, with a harmonic analysis of tonal centres and
harmonic rhythm 210
R.15 The passage in R.14 represented as a physical journey through space
between related tonal orbits 211
R.16 Allegro assai, bars 1–8, from J. S. Bach’s Sonata in C major for
unaccompanied violin 212
R.17 The passage in R.16 showing the harmonic rhythm 213
R.18 The Allegro assai, bars 13–16, showing melodic rhythm 214
7.1 Model of musical shaping. In the online version, each component is
numbered, and numbered examples of each component are presented
in linked tables. 222
R.19 Bach, B minor Mass, Gloria II, bars 57–61: a) Gesellschaft edition,
followed by b) a notated interpretation 244
R.20 Bach, B minor Mass, Cum sancto spiritu, bars 111–17 245
R.21 J. S. Bach, Complete Trumpet Repertoire, Vol. III with my
annotations 245
R.22 Tchaikovsky, Swan Lake Suite Op. 20a, ‘Intrada’, rehearsal
mark 13 246
R.23 Pritchard, Skyspace (2012), third movement, notated for piccolo
trumpet in A, bars 1–8 246
R.24 Beethoven, Piano Sonata in F minor, Op. 2 No. 1, first movement,
bars 1–9 249
R.25 Three-dimensional mixing metaphor 279
R.26 The Metaphysics of Notation, panel 4 285
R.27 The Metaphysics of Notation, panel 4 close-up: descending
‘shields’ 286
R.28 The Metaphysics of Notation, panel 4 close-up: sinusoidal curve 287
R.29 The Metaphysics of Notation, panel 5 288
R.30 The Metaphysics of Notation, panel 5 close-up: materialization of
rectilinear forms 289
R.31 The Metaphysics of Notation, panel 5 close-up: contrasting materials,
‘heart guitar’ and canonic dots 291
R.32 The Metaphysics of Notation, panels 3, 4, 5, 6 and 7 in stacked
arrangement 293
R.33 The Metaphysics of Notation, close-up: circle and oval pair inverted
across panels 3 and 4 294
R.34 The Metaphysics of Notation, close-up: ‘scroll’ with number five
inverted across panels 3 and 4 295
R.35 The Metaphysics of Notation, close-up: panels 4 and 5 inverted
shields, connection to the ‘heart guitar’ 296
R.36 The Metaphysics of Notation, close-up: panels 5, 6 and 7 dangling
angles, chain of circles, dot clock 297
R.37 The Metaphysics of Notation, panels 9 & 10 in stacked
arrangement 299
R.38 Watercolour paintings: (a) Red and White and (b) Fireworks 304
9.1 RP’s synaesthetic experience to overtone singing by Wolfgang
Saus 315
R.39 Timothy B. Layden, Dark Glistening. 322
10.1 Selected illustrations for the productions of (a) ATOMOS,
(b) ENTITY and (c) UNDANCE for Wayne McGregor | Random
Dance 330
10.2 (a) Still from video annotating form and flow in Forsythe’s
One Flat Thing; (b) Difference forms in movement viewed from
above 331
10.3 Relationships and representations that bridge sources of inspiration
and a finished work in contemporary dance 334
10.4 Examples of deep patterning in multimodal fusion 340
10.5 A core mammalian mental architecture with four subsystems, each
with three components (image, memory and processes) 341
10.6 Interacting cognitive subsystems: a nine-subsystem architecture for
the human mind 343
10.7 Extracts from the Mind and Movement educational resource that
illustrate (a) the development of imagery based upon musical stimuli
and (b) the translation of that imagery into innovative movement
material 346
Tables
0.1 Historical examples of the use of shape xxvi
3.1 Original text and English translation of ‘Am fernen Horizonte’ 62
3.2 ‘Die Stadt’, stanza 1 versus 2: contrasting and parallel
dimensions 63
3.3 ‘Die Stadt’, stanza 1 versus 3: contrasting and parallel
dimensions 63
3.4 Recorded performances of ‘Die Stadt’ by Fischer-Dieskau, Bostridge
and Quasthoff 75
3.5 Partial correlations between pitch and intensity after correction for
correlations with dynamic indications in the score (N = 12) 76
3.6 Correlations of duration with phrase intensity and forte indication
(N = 12) 78
7.1 Participants in the interview study 218
7.2 Participants who discussed each musical level 223
7.3 Participants who discussed each trigger 225
7.4 Participants who discussed each technical modification 227
7.5 Participants who discussed each heuristic 229
7.6 Differences between Elsie’s two versions of the extract 232
8.1 Number of popular musicians in Prior (2012b) who played each
genre of music 255
8.2 Layers of shaping in popular music performances 271
9.1 The ‘tone shapes’ reported by Zigler (1930) 309
11.1 Some of the synonyms for ‘shape’ collected for Prior (2010) 361
11.2 Associations reported in Eitan and Granot (2006) 362
11.3 Highest-scoring results from Eitan and Timmers (2010; Table 1), with
a proposed environmental cause for the participants’ preference 374
ABOUT THE COMPANION WEBSITE
Oxford University Press has created a password-protected website to accom-
pany Music and Shape, which contains additional illustrations, including all
the book’s colour illustrations, sound files, videos and excerpts from interviews.
Examples available online are indicated in the text with Oxford’s symbol.
Anna Meredith, I-Uen Wang Hwang and Timothy B. Layden have all pro-
vided artwork and corresponding sound files for the compositions they dis-
cuss in their ‘Reflections’ on shape. Adam Ockelford, Lucia D’Errico and Max
Baillie have used colour to clarify some of their illustrations. Zohar Eitan,
Renee Timmers and Mordechai Adler provide a score of Schubert’s ‘Die
Stadt’, discussed in their chapter. Helen M. Prior provides numerous additional
figures, together with extracts (in the Tables) from her interviews with musi-
cians that illustrate each component of the model of musical shaping she finds
that they use. Malcolm Bilson provides, over and above his Reflection, a video
of a full-length lecture he gave at the Liszt Academy, Budapest, on the topics
discussed here.
All these items enrich a reading of Music and Shape.
www.oup.com/us/musicandshape
PREFACE
Daniel Leech-Wilkinson and Helen M. Prior
There can be no doubt that concepts of shape are ubiquitous in musical
discourse and music cognition: we use innumerable shape-related
metaphors for most (if not all) features of music such as dynamics,
timbre, harmony, pitch, contour, rhythm, texture, tempo, timing,
expressivity and affective qualities. Also, we encounter shapes in
various music-related images such as in graphical scores, composers’
sketches, music analysis illustrations, as well as in more directly signal-
based shape images as waveforms and spectrograms, and last but not
least, as shape images of music-related body motion. We could thus
speak of widespread and deep-rooted shape cognition in music.
—Godøy (2013: 223)
‘Music and what?!’ people have tended to ask us. But, as Godøy’s remarks sug-
gest, that puzzlement is not shared by musicians: they always seem to know
what we’re referring to.1 In a sense it’s that discrepancy that inspired this book
and the research project from which it has emerged. For although the connec-
tion between shape and sound may seem mystifying to others, Prior (2012)
finds that professional musicians use ‘shape’ to talk and think about how to
perform notes, phrases, melodic lines, melodic patterns, harmonic features,
harmonic patterns, rhythms, movements, compositions, changes in loudness,
tempo and expression; and this applies in classical music, jazz, folk, pop, rock,
urban, world musics and crossover, and for people who originated from thirty-
one countries, 43 per cent of them fluent in a language other than English.
Moreover, for speakers of languages which do not use a simple equivalent to
‘shape’ in discussion of music, the concept was nevertheless immediately recog-
nized from their own musical discourse. The use of the term is also not merely
a current ‘fashion’: there is evidence to show its use by composers, performers
and critics throughout the twentieth century and to some degree earlier (see
Table 0.1).2 Evidently, shape is a concept that is flexible, ubiquitous and very
useful when thinking and speaking about performance and composition.
With so many and such varied uses, it cannot just be visual or tactile shape
that we are dealing with. Shape must be doing much broader metaphorical
work, transferring into different, less tangible domains including time, quan-
tity, intensity, complexity, speed and emotional response, at least.3 One way of
looking at this is to say that shape means so many things in relation to music
that it in effect means nothing at all. But that kind of throwing up of hands in
despair doesn’t lead to very penetrating scholarship; and in any case, its very
imprecision may prove to be its raison d’être. Better, then, to approach shape as
a concept with some unusual and intriguing properties, and to try to find out
what those might be and what they might suggest about its place in the brain’s
responses to music.

TABLE 0.1 Historical examples of the use of shape

Shape as form or structure

W. R. Anderson (critic): ‘Caractacus, the composer’s Op. 35 (Leeds, 1898:
dedicated to Queen Victoria), immediately precedes the Variations, the Sea
Pictures, and Gerontius, and looks strongly onward from the earlier cantatas,
both in shape and idiom.’ (Anderson 1934: 396)

Benjamin Britten (composer/performer): ‘I never, never, start a work without
having a very, very, clear conception of what that work is going to be. Err…
When I say conception, I don’t mean, necessarily tunes, or specific rhythms, or
harmonies, or … old fashioned things like that, but I mean the actual … shape
of the music, the kind of music it’s going to be, rather than the actual notes.’
‘I know that the first drafts for The Turn of the Screw were in what one would
call then the normal three-act form … and … even I think, the libretto was
written in that shape.’ (Britten and Mitchell 1969)

Fryderyk Chopin (composer/performer): ‘Chopin … is at the piano and does not
observe that we are listening to him. He improvises as if haphazardly. He
stops. “What’s this, what’s this?” exclaims Delacroix, “you haven’t finished
it!” “It hasn’t begun. Nothing’s coming to me. … Nothing but reflections,
shadows, shapes that won’t settle. I’m looking for the colour, but I can’t even
find the outline.”’ (George Sand, in Eigeldinger 1997: 240)

Lang Lang (performer): ‘When you’re talking about Mr Barenboim, he can really
bring the knowledge, the structure, how to put every element into one big
shape.’ (Barenboim 2005)

Claus-Steffen Mahnkopf, Frank Cox and Wolfram Schurig
(composers/musicologists): ‘While in common-practice music concepts such as
theme or motive, phenomena such as line or melody and the systems of syntax and
rhythm are generally taken to be self-explanatory, the question concerning
corresponding means in post-traditional music (i.e. new music since 1945)—that
is, sufficient to shaping the musical surface in a potentially meaningful
manner—is rarely reflected upon and more commonly suppressed.’ (Mahnkopf, Cox
and Schurig 2004: 7)

Anthony Marwood (performer): ‘I didn’t have any influence over the structure or
shape of the piece, but I know that he [Thomas Adès] had my playing in mind
when he wrote it.’ (Anonymous 2008: 15)

Michael Quinn (critic): ‘The Delmé find the shape, structure and even the
nobility of Haydn’s Emperor but the detail seems sadly lacking.’ (Quinn
1999: 72)

Stephen Plaistow (critic): ‘… his feeling for the shape of a Bach fugue, and
for part playing and the character and brilliance of Bachian figuration, is
full of finesse, quite the equal of any Bach specialist’s.’ (Plaistow
1965: 114)

Shape in relation to musical expression

Nalen Anthoni (critic): ‘The music breathes a life of its own as he ardently
inflects its phrases to shape the tension of his line.’ (Anthoni 2008: 65)

Dietrich Fischer-Dieskau (performer): ‘Shape the endings of the long phrases in
the recitative in a way that the conductor can easily follow you.’ (Dietrich
Fischer-Dieskau in Monsaingeon 1992)

Trevor Harvey (critic): ‘The orchestral playing is not just good, it is really
outstanding: the conductor knows how to give us flexible and shapely phrases as
well as tightly rhythmic music.’ (Harvey 1954: 59)

Rachel Podger (performer): ‘With Vivaldi there are so many possibilities to
shape the music.’ (Podger 2003: 15)

Stephen Plaistow (critic): ‘Richter doesn’t shape the actual subjects in the
fugues very much, preferring to state them flatly and to let the counterpoint
achieve its own expressiveness.’ (Plaistow 1965: 114)

Alec Robertson (critic): ‘It is a pity this artist has so little feeling for
the shape of a phrase.’ (Robertson 1947: 165)

Stanley Sadie (critic/musicologist): ‘Another thing Podger is specially good at
is the shaping of those numerous passages of Vivaldian sequences, which can be
drearily predictable, but aren’t so here because she knows just how to control
the rhythmic tension and time the climax and resolution with logic and force.’
(Sadie 2003: 51)

Shape in relation to movement or gesture

Aaron Cassidy (composer/musicologist): ‘… the notion that the primary
morphological unit—not only in my music but also in music in general—is not
merely the aural gesture, but far more importantly, the physical gesture. I
would assert that the shapes and local forms that we hear and process as
listeners are at their core the byproducts of physical, visceral activities and
energies, and, further, that the physical motion required to create a
particular sound or set of sounds is the most important component of a
gesture’s morphological identity.’ (Cassidy 2004: 34)

It was with this aim as an ideal, albeit one we could not hope to realize,
that we planned and carried through a three-year research project (2009–12) on
music and shape, funded by the UK’s Arts and Humanities Research Council
within its Research Centre for Musical Performance as Creative Practice.4 In the
event we managed to continue for a further two years, since there was so much
to do and King’s College London continued to provide support. This book
contains some of the results of that project work (the chapters by Küssner,
by Leech-Wilkinson and by Prior). But mainly it consists of contributions by
scholars not involved with the project whose work seemed to us to be dealing
with topics in which, our research suggested, shape might be implicated. This
then is not a conference proceedings. We did hold a highly interesting and fruit-
ful conference on ‘Music & Shape’ in London in July 2013, the result of an
open call for papers, and the studies we attracted are published in three special
issues (forming volume 8) of the journal Empirical Musicology Review. The
chapters in this book, however, were commissioned separately, the choice of
authors reflecting the research areas that seemed most crucial to those engaged
in the music-and-shape project. Authors’ home disciplines include music psy-
chology, music analysis, music therapy, musicology, performance (jazz, clas-
sical and DJ), synaesthesia, and dance (scholarship and performance). That
every author, although none had discussed shape before, found it quite easy to
see how their work might contribute, only confirms the flexibility and ubiquity
of shape as a concept that does some useful work for those who try to under-
stand music and musical practice.
As well as commissioning the eleven chapters, we followed the example set
by the Cambridge Companion to Recorded Music (one of the publications from
our predecessor project, the Centre for the History and Analysis of Recorded
Music).5 There we had included ‘personal takes’ by a wide variety of artists in
different areas linked to recordings. For this volume, we commissioned ‘shape
reflections’ from a similarly wide spread of music practitioners. We approached
a range of performers (wind, strings, keyboards, percussion, guitar) and com-
posers (classical, film, graphic, jazz, popular)—two of whom were also notable
conductors—as well as a record producer, a music painter and a synaesthete.
Their instruction was simply to tell us how they used ‘shape’ in their own musi-
cal work and thinking, or to reject it as a useful concept if in fact they didn’t use
it (none took up that last option).
Performers’ thought about music-making has not always been well under-
stood by musicology, despite being at the heart of much ethnomusicology,
and more recently music sociology and psychology. At their best (for example,
Berliner’s 1994 ethnography of jazz practice), studies of performers’ expe-
rience of what they do can illuminate a whole world of music-making. In
a previous study of violinists and harpsichordists (Leech-Wilkinson and
Prior 2014)—developed here in Chapter 8 on DJs’ practices by Greasley and
Prior—we argued that the way musicians talk about performing details in
scores, however approximate it may look to music theory, reveals a highly
efficient means of enabling the body to generate expressive performances in
real time. The Reflections here offer similar evidence over a wider field. Of
the heuristics we identified, shape proves to be one of the most powerful, for
it summarizes, with a generality that allows it to be implemented and enacted
in a great many ways, the essential characteristics of a ‘musical’ performance.
As Eitan, Timmers and Adler conclude in their Chapter 3, shape is a concept
that is sufficiently flexible to map between domains on any hierarchical level
from a single note to a whole piece of music; it can apply to scores, perfor-
mances and listening experiences, and within those to such varied features as
narrative structure, form, loudness, brightness, tempo, speed, density, register,
intensity, harmonic or interval patterning, pitch direction, sound spectrum,
distance and timbre. As such, it acts as a highly efficient synthesizing tool for
musicians to use in order to negotiate the vast array of musical choices avail-
able to them in performance.
Shape’s flexibility and usefulness are just as clear from the range of other views
that this book offers. In Adam Ockelford’s Chapter 5, shape is seen as a core
property of music that links together its notation, its audible features and our
cognition of musical structure. In Michael Spitzer’s Chapter 4, a sonata by Bach
is compared ‘both to the shape of particular emotional behaviours and to the
expressive shapings of a formal model’ as well as ‘performance styles of “expres-
siveness” ’. For Milton Mermikides and Eugene Feygelson, writing about impro-
visation in Chapter 6, shaping processes are conceived of as strategies through
which material is selected and transformed within musical space. For Philip
Barnard and Scott deLahunta, in Chapter 10, ‘shape’ is a useful concept for
dancers and choreographers not just to describe bodily configuration and move-
ment but also ‘to index the more ineffable meanings and relationships that are
intuited to “make sense” in an artistic context’. For Rolf Inge Godøy (Chapter 1),
‘shape-cognition in music is opening up new areas of musicological, aesthetic
and affective psychological research, as well as providing practical tools in artis-
tic creation, for example in the domains of sonic design and various kinds of
multimedia art’. For the synaesthetes with whom Jamie Ward works, shapes not
only are a means of conceptualizing complex interactions of musical features and
the feelings they seem to trigger, but are experienced ‘at multiple levels in music:
from single notes through to whole compositions and performances’ (Chapter 9)
as sensations automatically generated by hearing music. Among the various
aspects of shape our practitioners discuss in their Reflections, George Benjamin
mentions shape especially in relation to form, Malcolm Bilson to performance
style, Stephen Hough to composition style, Timothy B. Layden to visual impres-
sions, Lucia D’Errico to bodily sensation, Alex Reuben to body movement,
Alice Eldridge to both gesture and visual representations, Richard G. Mitchell
to emotional change, Evelyn Glennie to dynamics (in the fluid sense), David
Amram to musical character, I-Uen Wang Hwang to rhythm and metre, Max
Baillie to harmony, Simon Desbruslais to timbre, Steven Isserlis to narrative,
Steve Savage to sonic landscape, Antony Pitts to initial inspiration, and Julia
Holter to closure. It goes without saying that all of these factors could be written
about separately and in much greater depth, and indeed they have been. But the
point is not that ‘shape’ could always be replaced by a more precise term—one
which varies according to the context in which shape is being used. Rather, what
we need to ask is why shape is so useful in the sample of contexts discussed here,
and by implication in so many others; and why it is so much more useful than
the more precise term that might pin it down in each case.
Concepts very like the ‘synthesizing’ notion we discuss here have been
invoked in the past. Mine Doğantan-Dack has summarized this interestingly in
her essay in the Empirical Musicology Review volume mentioned above:
Christian von Ehrenfels, who is best-known today for his article titled ‘Über
Gestaltqualitäten’, i.e. ‘On Gestalt Qualities’, … published in 1890 …
argued that each experience we have of a Gestalt or form in any sensory
modality is cognized as structurally analogous to the experience of a spa-
tial shape. In other words, spatial Gestalten serve in his view as references
for our comprehension of forms in other modalities. An immediate impli-
cation of this idea is that concepts related to the perception of spatial
shapes can be applied to shapes extended in time—for instance, tonal
patterns. Indeed, the idea that there are similarities of form between dif-
ferent fields of experience is one of the most important conclusions of
Ehrenfels’ article. (Doğantan-Dack 2013: 213–14)
Jin Hyun Kim, in her article in the same collection, notes that:

Delineating the causal relation between bodily aroused states and vocal-
izations, [Friedrich von] Hausegger discusses dynamic forms of sound,
which are experienced as an expression of mental states, in his seminal
monograph Music as Expression (Die Musik als Ausdruck) (1887). …
Hausegger contends that shaped vocal sounds are not only experienced
as expressions of others’ aroused states, but also give rise to the ‘co-
sense (Mitempfindung)’ of arousal (p. 42). He also considers this kind of
phenomenon in the context of non-sentient phenomena such as music
and dance.
In the monograph Shaping and Movement in Music (Gestaltung und
Bewegung in der Musik), [Alexander] Truslit (1938) tackles the coupled
relationship between the shaping of music and musical experience. The
shaping of music is regarded as fundamental to the musical experience,
which takes place during both music-making and music perception; the
latter is characterized by the listeners’ ‘co-shaping (mitgestalten)’ of music
(Truslit 1938, p. 20) through their inward experience of movement (p. 27).
Basing the shaping of a sound on its duration and intensity, Truslit con-
ceives of movement as the primordial element being shaped. Movement in
music is shaped by dynamics—gradations of sound intensity changing the
volume of sound as perceived—in conjunction with agogics—temporal
changes of sound causing its deceleration or acceleration within the given
overall temporal structure—resulting together in spatio-temporal con-
tours of music. According to Truslit, dynamics and agogics act as funda-
mentals of the process of musical shaping. (Kim 2013: 164–5)
Yet neither of these studies was followed up at the time, probably because they
had no points of contact with contemporary musicology, whose concern above
all was to present music as a subject for historical and textual study. Closest
in the intervening years, as Doğantan-Dack points out, was Susanne Langer
(1942), whose interest in how music feels brings some of her work into the same
orbit. And indeed, shape’s re-emergence recently can be understood as part of
a growing interest in those musical practices and responses that draw on feeling
more than on thinking; this is a result of the increasing focus of music stud-
ies on emotion, enabled by the development of music psychology and neuro-
science. In this context, Kim, Doğantan-Dack and Leech-Wilkinson (this last
in our volume) have all (independently) pointed to child-psychiatrist Daniel
Stern’s work on vitality affects (2010), which in a sense (though unknown to
Stern) extends Truslit’s work, as a valuable theoretical base for understanding
musical shape. Interrelations with other work are suggested, too, by Godøy in
the continuation of the quotation that begins this Preface:
We could thus speak of widespread and deep-rooted shape cognition in
music, as well as in human reasoning in general, as suggested by some
directions in the cognitive sciences, foremost by so-called morphodynami-
cal theory and so-called cognitive linguistics. (Godøy 2013: 223)
Much relevant work has been done by researchers studying music and gesture,
outstandingly Godøy himself and Marc Leman (Leman 2007; Godøy and
Leman 2010). Gesture clearly implies shape: it is often considered as includ-
ing performers’ executive and expressive movements—that is, how they move
while they play—but it has also been used extensively to talk about habits in
the forming or performing of short sequences of notes (Gritten and King 2006,
2011). Yet while gesture is closely tied to indicative human movement, shape
seems more abstract and thus more flexible in its application to musical and
other kinds of action.
Another difficulty is hinted at in Leech-Wilkinson’s chapter, where the pos-
sibility is raised that a sense of ‘shape’ arises from a submodal feature common
to all the sense modalities. This extends beyond the cross-domain mapping
that several chapters (Ockelford’s; Eitan, Timmers and Adler’s; and Spitzer’s
in particular) see as crucial to shape’s multiple applications. A submodal role
for shape might explain why our understanding of what shape refers to in
music is at once so multifaceted and so hazy, and why it may always remain
so. For submodal features, as pointed out by Marks (1978), are necessarily
beyond conscious perception: they are components in our sensory experience
but not accessible to consciousness directly through the senses. Alex Reuben’s
impressionistic Reflection on his work as a filmmaker may well be pointing
towards this aspect of shape: in using shape to link feelings in different senses
and art forms, he is not being merely touchy-feely but may be drawing on the
submodal qualities of shape (operating in the recently discovered domain of
multisensory perception) as an aspect of the dynamics of all sensory experience.
What previously seemed fanciful now is beginning to seem simply
correct: feelings aroused by one sense can be linked by the brain to feelings
aroused by others, so that input in one mode can be read in terms of the
impressions arising from others—and not just for synaesthetes, in fact partic-
ularly not for synaesthetes since for them the effect is fixed whereas for others
it varies with context. Synaesthetes, nevertheless, offer particularly interesting
insights into musical shape. For, as Jamie Ward has shown, their experiences,
though remarkably varied, still make better sense to nonsynaesthetes than
artificial alternatives.
It looks, then, as if the kind of work that ‘shape’ does for musicians draws
on some quite fundamental aspects of perception, while at the same time offer-
ing us a host of ways of thinking about the experience and practice of music
on many other levels. The chapters and Reflections are interleaved and ordered
so as to emphasize interconnections. While they are grouped thematically into
five sections—shapes mapped, composed, performed, seen and felt—there is
also a gradual shift of theme so that the borders between sections are fuzzy. To
read from cover to cover, then, should be to take a journey through views of
music and shape. Most contributions speak of multiple facets of this complex
relationship, however: other orderings are possible, and dipping in and out will
often make further connections apparent.
References
Anderson, W. R., 1934: record review of HMV DB 2142, The Gramophone 11 (no. 130): 396.
Anonymous, 2008: interview with Anthony Marwood, Gramophone 85 (no. 1029): 15.
Anthoni, N., 2008: review of Arkiv Production 477 7371–2, Gramophone 86 (no. 1035): 65.
Barenboim, D., 2005: ‘Barenboim on Beethoven: masterclasses’ (EMI DVD 68993).
Berliner, P. F., 1994: Thinking in Jazz: The Infinite Art of Improvisation (Chicago and
London: University of Chicago Press).
Britten, B. and D. Mitchell, 1969: ‘Benjamin Britten in conversation with Donald Mitchell’,
CD booklet accompanying BBC Legends: Britten Mozart Requiem (BBCL 4119–2).
Cameron, L., 2010: ‘What is metaphor and why does it matter?’, in L. Cameron and
R. Maslen, eds., Metaphor Analysis: Research Practice in Applied Linguistics, Social
Sciences and the Humanities (London: Equinox), pp. 3–25.
Cassidy, A., 2004: ‘Performative physicality and choreography as morphological deter-
minants’, in C.-S. Mahnkopf, F. Cox and W. Schurig, eds., Musical Morphology
(Hofheim: Wolke Verlag), pp. 34–51.
Doğantan-Dack, M., 2013: ‘Tonality: the shape of affect’, Empirical Musicology Review 8/
3–4: 208–18.
Ehrenfels, C. von, 1890: ‘Über “Gestaltqualitäten” ’, Vierteljahrsschrift für wissenschaftliche
Philosophie 14: 242–92.
Eigeldinger, J.-J., 1997: ‘Chopin and “La note bleue”: an interpretation of the Prelude Op.
45’, Music & Letters 78/2: 233–53.
Gibbs, R. W., 2008: ‘Metaphor and thought: the state of the art’, in R. W. Gibbs, ed., The
Cambridge Handbook of Metaphor and Thought (Cambridge: Cambridge University
Press), pp. 3–13.
Godøy, R. I., 2013: ‘Shape cognition and temporal, instrumental and cognitive constraints
on tonality. Public peer review of “Tonality: the shape of affect” by Mine Doğantan-
Dack’, Empirical Musicology Review 8/3–4: 223–6.
Godøy, R. I. and M. Leman, 2010: Musical Gestures: Sound, Movement, and Meaning
(New York: Routledge).
Gritten, A. and E. King, eds., 2006: Music and Gesture (Aldershot: Ashgate).
Gritten, A. and E. King, eds., 2011: New Perspectives on Music and Gesture (Aldershot:
Ashgate).
Harvey, T., 1954: record review of Decca LW 5114, The Gramophone 32 (no. 374): 59.
Kim, J. H., 2013: ‘Shaping and co-shaping forms of vitality in music: beyond cognitiv-
ist and emotivist approaches to musical expressiveness’, Empirical Musicology Review
8/3–4: 162–73.
Langer, S., 1942: Philosophy in a New Key (Cambridge, MA: Harvard University Press).
Leech-Wilkinson, D. and H. M. Prior, 2014: ‘Heuristics for expressive perfor-
mance’, in D. Fabian, R. Timmers and E. Schubert, eds., Expressiveness in Music
Performance: Empirical Approaches across Styles and Cultures (Oxford: Oxford
University Press), pp. 34–57.
Leman, M., 2007: Embodied Music Cognition and Mediation Technology (Cambridge,
MA: MIT Press).
Mahnkopf, C.-S., F. Cox and W. Schurig, 2004: Musical Morphology (Hofheim: Wolke
Verlag).
Marks, L. E., 1978: The Unity of the Senses: Interrelations among the Modalities
(New York: Academic Press).
Monsaingeon, B., 1992: The Mastersinger—Lesson III (EMI DVB 3101949).
Plaistow, S., 1965: review of Deutsche Grammophon (S)LPM18950, The Gramophone 43
(no. 507): 114.
Podger, R., 2003: ‘A question to … Rachel Podger, Baroque violinist’, Gramophone 80
(no. 966): 15.
Prior, H. M., 2012: ‘Shaping music in performance: report for questionnaire participants
(revised August 2012)’, http://www.cmpcp.ac.uk/wp-content/uploads/2015/09/Prior_
Report.pdf (accessed 9 April 2017).
Quinn, M., 1999: review of Droffig National Trust NTCC014, Gramophone 76 (no. 914): 72.
Robertson, A., 1947: record review of Decca M 602, The Gramophone 24 (no. 287): 165.
Rothfarb, L., 2001: ‘Energetics’, in T. Christensen, ed., The Cambridge History of Western
Music Theory (Cambridge: Cambridge University Press), pp. 927–55.
Sadie, S., 2003: review of Channel Classics CCS15958, Gramophone 80 (no. 966): 51.
Stern, D., 2010: Forms of Vitality: Exploring Dynamic Experience in Psychology, the Arts,
Psychotherapy, and Development (Oxford: Oxford University Press).
Truslit, A., 1938: Gestaltung und Bewegung in der Musik (Berlin: Chr. Friedrich Vieweg).
PART 1
Shapes mapped
Reflection
Evelyn Glennie, percussionist
The shape of music is constantly fluid because nothing resonates the same
twice. Every sound and shape is born and reborn. When music is printed on the
page it takes shape in my imagination with the eye leading the way.
As a performer, the environment is my instrument and percussion instru-
ments are my tools to deliver the sound. I can provide all the musical ingredi-
ents for the environment I am immersed in. The acoustic will mould the sound
meal which is thus delivered to the audience. The members of the audience will
have differing perspectives on the sound and shape according to where they are
situated and their emotional state at the time.
Listening is ever-present, recognizing that the body is a huge ear that allows
us to experience the sensation of the sound journey, reached far beyond the
capacity of the ear alone. That in turn creates the fluid shapes in music.
1
Key-postures, trajectories and sonic shapes

Rolf Inge Godøy
It seems that we come across expressions of shape everywhere in music-
related contexts. When talking about music, people—with or without musi-
cal training—often tend to use shape metaphors such as ‘thin’, ‘fat’, ‘smooth’,
‘rough’, ‘curved’, ‘flat’ etc., or when listening to music, people often tend to
trace shapes with their hands or other body parts, shapes that reflect sonic fea-
tures of the music. And needless to say, the body motion of musicians and
dancers in performance can be perceived as shapes, as can music notation
and graphical scores (see Ockelford, Chapter 5 below), in more recent times
extended to signal-based graphical representations of musical sound as wave-
forms and spectrograms (see Greasley and Prior, Chapter 8 below).
The ubiquity of shape expressions in music-related contexts seems to be
spontaneous and robust, as well as quite practical, when we talk about music.
But on reflection, the widespread use of shape metaphors and other shape
representations is also enigmatic for the simple reason that audible sound is basi-
cally invisible (unless we use some technology for sound visualization), whereas
‘shape’ is primarily something in the visual domain. ‘Shape’ is defined in the
New Oxford American Dictionary as ‘the external form, contours, or outline of
someone or something’ and as ‘a geometric figure such as a square, triangle, or
rectangle’, yet it can also have more indirect or conceptual visual-geometric sig-
nifications such as ‘the specified condition of someone or something’. Although
this and similar definitions of ‘shape’ may include such general and nonvisual
applications of the term, the question remains as to why and how we so readily
link sonic features with visual shape representations in musical experience.
This is even more enigmatic considering that music unfolds in time, whereas
shape by definition is something that we overview or ‘have in the field of vision’
as an ‘all-at-once’ experience, and hence is something ‘instantaneous’ in our
minds (at least subjectively, although there is a time-dependent scanning and
mental processing going on in the perception and cognition of visual images).
How this ‘temporal-to-atemporal’ transformation in our minds works still
seems not to be well understood in the relevant cognitive sciences, but from our
own and others’ research, we believe the linking of sonic features and visual
shape images has much to do with experiences of music-related body motion.
In what can be broadly called a motor theory perspective on music perception,
it seems that body postures at salient moments in sound production (both
instrumental and vocal), what we call key-postures, and body motion trajecto-
ries between these key-postures, relate to subjectively perceived sonic shapes, as
suggested by the title of this chapter.
The basic tenet of this chapter is therefore that shape in music-related con-
texts is closely related to experiences of something that we do or mentally simu-
late that we do; so after an introductory presentation of some main notions
of shape in musical experience and music-related research this chapter goes
on to develop some ideas of motor cognition in music. Relevant elements of
research on music-related body motion are reviewed, including various kinds
of the sound-producing body motion of musicians and sound-accompanying
body motion that we can observe in music listening. A central issue in this
connection is an assessment of the correspondences between body motion fea-
tures and sonic features: rhythm and pitch contours are often seen to be clearly
reflected in body motion, but other features such as texture, timbre, dynamics
and a number of so-called expressive features may all be related variably to
body motion and thus also to shape images.
One crucial issue in such a listing of links between body motion and sonic
features is that of timescales: in listening, either to a short tune or to a more
extended work of music, we need to segment sound and associated body motion
into meaningful chunks that enable more specific determinations of shape.
Various instrumental, biomechanical, cognitive and musical-aesthetic con-
straints seem to converge in suggesting that we experience fragments of music
at what we call the meso timescale, very approximately in durations ranging
from 0.5 to 5 seconds, as particularly salient with regard to both body motion
and sonic features. After a presentation of relevant research on this phenomenon
of chunking and motion-sound shapes at the meso timescale, the chapter
concludes with some ideas on how principles of key-postures, trajectories and
sonic shapes may be put to use in music-related research and practical contexts.
Shape representations
Needless to say, music and shape is a very extensive topic, with ramifications for
most areas of music and music-related research. Yet out of all this material, it
could be useful to take a brief look at some aspects of western music notation
and more recent instances of shape representations in musical research, to bet-
ter situate our motor theory perspective on shape in music.
FIGURE 1.1 A pianola representation of the first eight bars of J. S. Bach’s Fugue in C major, Well-
Tempered Clavier Book I. This representation highlights the gradually expanding pitch space,
fanning out to several octaves from the initial middle C. The shape of this pitch space expansion is
one of the main architectural elements here (as well as in the rest of J. S. Bach’s works and much other
music for that matter); however, the timescale of this kind of shape is rather slow, i.e. is on what we
call the macro timescale (see p. 14 below).
For one thing, western common practice notation, as well as recent exten-
sions such as MIDI pianola representation (Figure 1.1), can partly be regarded
as a kind of choreographic script, a system for denoting sound-producing body
motion to be realized by performers. Trained score readers may readily see corre-
spondences between the graphical shapes in the score, the required motion shapes
of the performers and the emergent sonic shapes, in particular as pitch contours
and rhythmical-textural shapes. In other cases, there may be less clear relation-
ships between visible shapes in the score and subjectively perceived sonic shapes;
e.g. a tamtam strike may be indicated in the score as a single onset point in time,
perhaps with some dynamic marking and indication of the type of mallet to be
used, yet the result in performance is a protracted and extremely complex sound.
Evidently, timbral features are in general not well represented in western
notation because of its focus on pitch and duration. And as we know, this focus
has tended to leave expressive features of pitch, dynamics and timing outside
the mainstream conceptual apparatus, relegating these to the domain of perfor-
mance practice, a focus that has led to problems when attempting to represent
music of other cultures by western music notation transcriptions. But within
this pitch- and duration-focused western musical culture, we have also seen
some further abstractions from perceptual features, such as at times disregard-
ing octave placement, equating for instance an octave-compressed chord with
a widely spaced chord.1 Similar distortions of perceptually salient pitch shape,
and also of rhythm–shape relationships, are found in twentieth-century serial
and integral serial music, as well as in so-called pitch-class set theory, effectively
resulting in what could be called a ‘spatiotemporal collapse’ of salient percep-
tual features (see Godøy 1997 for a discussion of this).
In the twentieth century, however, we have also seen attempts to develop
more graphical and shape-reflecting representations, such as the Schillinger
system (Sethares 2007) or various kinds of graphical scores, such as those of
Cage, Ligeti, Bussotti, Logothetis and others (Schäffer 1976). One of the most
important music-and-shape efforts of the twentieth century is in the work
of Xenakis, for example in his development of connections between musical
and architectural shapes such as in his well-known composition Metastaseis
and later design of the Philips Pavilion at the Brussels World’s Fair in 1958
(Xenakis 1992).
Since the advent of sound-analysis technologies, we have had the means for
signal-based representations of musical sound as shapes. An early and remark-
able effort in this direction of visualizing the shapes of actual sonic unfold-
ing of music was the work of Cogan (1984), and in the ensuing decades we
have seen a great expansion of signal-based representations of music in the
domain of so-called Music Information Retrieval (MIR). MIR is actually a
matter of going in the opposite direction from western music notation: instead
of making continuous sound from discrete symbols, it tries to extract the dis-
crete pitches and durations from continuous, complex and, we could say, often
messy signals. Confronted with continuous musical sound, we soon realize that
the great difficulty in MIR in making computer-based transcription of music
(in particular of polyphonic music) is that human listening, including shape
perception in music, although seemingly versatile and robust, is dependent on a
number of perceptual cues in combination with extensive prior knowledge and
mental schemas, hence on something that has yet to be implemented in MIR
technologies.
As the universe of continuous sound has been opened up to explorations
by signal processing technologies, in principle giving us access to the above-
mentioned timbral and expressive features, we also need to develop a concep-
tual apparatus for handling these features (see e.g. Peeters et al. 2011). One
pioneering research effort based on continuous sound was that of Pierre
Schaeffer and co-workers (Schaeffer 1966, [1967] 1998; Chion 1983; Godøy
1997, 2006). Schaeffer and co-workers took the subjectively perceived overall
pitch, dynamic and timbre-related shapes of sound fragments, of so-called sonic
objects, as their point of departure, and then successively differentiated more
and more subfeatures as shapes, only at a later stage trying to correlate these
subjectively perceived shape features with physical features of the acoustic signal.
After this pioneering work of Schaeffer and co-workers, there have been
some related projects of exploring musical sound by way of subjective shape
metaphors, for example the Unités Sémiotiques Temporelles (UST) project
(Delalande et al. 1996), which is more oriented towards affective features of
sonic objects. The common point of departure for Schaeffer and the UST
project was the idea that although western musical culture has been good at
conceptualizing features that can be ordered into more abstract symbolic sys-
tems such as those of pitch and duration, it has not been well suited to con-
ceptualizing more continuous, composite and multidimensional features. In
assessing the work of Schaeffer and followers, we find the idea of using various
shape images as a nonsymbolic means for feature representation to have been
an attractive solution, something that we now see has an affinity with body
motion (Godøy 2006).
Shape ontologies
Shape in musical contexts is a multimodal phenomenon because it involves
sound and vision and—our main concern here—also the sense of motion.
Multimodality has in recent years received a lot of attention in the cognitive
sciences, and ‘classical’ notions of the separation of the senses have been chal-
lenged. There is now mounting evidence that the sense modalities work together
and complement one another, sometimes even with one sense modality over-
riding another, resulting in what may be judged as illusions, as in the ‘McGurk
effect’ where visual impressions of a speaker’s mouth motion can change the
subjective interpretation of the sound heard (McGurk and MacDonald 1976).
The sense of motion is now regarded as composite, including kinematic (vis-
ible motion), effort (dynamic, not directly visible), proprioceptive (self monitor-
ing) and haptic (sense of touch) components. Additionally, musical sound is
obviously highly composite and multidimensional, with many features in paral-
lel. This means that we need to be sufficiently precise about which features we
have in mind when we discuss shape in musical contexts so that we do not make
so-called category mistakes, mixing incommensurable features. We thus need to
consider shape ontologies, carefully analysing what features of musical sound
and/or body motion we are referring to, and also whether some instances of
shape can be considered amodal, i.e. more independent of a specific modality
and applicable across modalities.
Considering shape ontologies also means trying to distinguish what is in the
signal (auditory, visual, haptic, etc.) and what is in our minds, regarding men-
tal shape images as just as salient as more physical shapes, provided that these
mental shape images are shared by people. This should mean in turn that we
treat illusions on an equal footing with the ‘real’, as long as they are subjectively
experienced as relevant for experiences of shape in music, as in the well-known
illusions of endless ascending or descending sounds by Jean-Claude Risset,
similar to M. C. Escher’s optical illusions of endless ascending or descending
staircases. The dividing line is to be placed between subjectively comparable
and incomparable features, meaning that there should be a perceivable similarity
between two domains, as is the case with this endless descent or ascent in
Risset and Escher, making auditory and visual shape sensations ontologically
commensurable. On the other hand, abstractions based on western music notation
may lead to category mistakes, for example by transferring numerical features
from one domain to another without reflecting perceivable similarities.
The risk of making such category mistakes is also present in technology-
based shape applications, in so-called sonifications of data from nonauditory
sources, converting a visual domain image to sound. We could use the term
‘mapping’, well known in music technology contexts (Hunt, Wanderley and
Paradis 2003) for keeping track of shape ontologies. Basically, ‘mapping’ means
taking data from one domain and assigning it to features in another domain.
For instance, stock exchange data could be used to control pitch on a musical
instrument so that we could listen to the development of the stock market as
a melodic curve. Or we could use a stream of video or data from other sensors
in mapping body motion to sound generation in various ways, and so listen
to body motion (Jensenius and Godøy 2013). Or we could take a picture of a
cat and use this picture as a spectrogram for generating a sound. The extent
to which the resultant sound would have any ‘cat’-like perceptual features is
doubtful: we could probably call this cat sonification a case of category mistake
in shape mapping, in principle similar to the ontological mismatching of shape,
mentioned above, that we may find in music using western common practice
notation.
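The stock-exchange example above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from the literature cited: the function names, the MIDI note range and the price series are all assumptions invented for the example. It rescales a data series linearly onto a range of pitches, so that the contour of the data becomes a melodic contour.

```python
def map_to_midi_pitch(values, low_note=48, high_note=84):
    """Linearly rescale a data series onto a MIDI note range.

    low_note and high_note are arbitrary choices for this sketch
    (C3 to C6); any other range would serve equally well.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series has no contour: place every point mid-range.
        return [(low_note + high_note) // 2 for _ in values]
    span = high_note - low_note
    return [round(low_note + (v - lo) / (hi - lo) * span) for v in values]


def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note number (A4 = 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


# Invented closing prices, used only to illustrate the mapping.
prices = [101.2, 103.5, 102.1, 107.8, 110.0, 104.4]
melody = map_to_midi_pitch(prices)
frequencies = [midi_to_hz(n) for n in melody]
```

The mapping preserves the rise and fall of the data, which is a perceivable similarity between the two domains. Whether a listener hears anything of the 'stock market' in the result is a separate question, as the cat example suggests.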
Mapping is at the core of all electronic instrument development, and given
the fact that any mapping between input data and sound output is possible
with electronic instruments, the crucial question concerns what kinds of map-
pings make sense to, or could be called ‘intuitive’ by, musicians and audience.
This is a question that can be studied empirically, as has been done in some
recent research projects (see Jensenius 2007 and Nymoen 2013 for overviews).
From this research as well as numerous informal observations over decades of
development in the field of new electronic instruments, the prime candidate
for shape transfer from one domain to another is our sense of body motion,
meaning the mapping of motion along axes in three-dimensional space to vari-
ous perceptually salient sonic feature dimensions, typically pitch, loudness and
spectral centroid.
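A mapping of this kind might be sketched as follows. The assignment of axes to features, the numerical ranges and all names are assumptions made for illustration only; the research cited above does not prescribe them. The position of a hand is taken to be normalised so that each axis runs from 0 to 1.

```python
from dataclasses import dataclass


@dataclass
class SoundParams:
    pitch_hz: float
    loudness_db: float
    centroid_hz: float


def clamp01(x):
    """Keep a normalised coordinate inside the range 0..1."""
    return max(0.0, min(1.0, x))


def map_position(x, y, z):
    """Map a normalised 3D position to three sonic feature values.

    Assumed axis assignment (hypothetical):
      y (height)     -> pitch, three octaves upwards from 110 Hz
      z (depth)      -> loudness, from -40 dB to 0 dB
      x (left-right) -> spectral centroid, from 500 Hz to 4000 Hz
    """
    x, y, z = clamp01(x), clamp01(y), clamp01(z)
    pitch = 110.0 * 2 ** (3 * y)      # exponential, so equal steps give equal intervals
    loudness = -40.0 + 40.0 * z
    centroid = 500.0 + 3500.0 * x
    return SoundParams(pitch, loudness, centroid)
```

Pitch is mapped exponentially because equal distances in space then correspond to equal musical intervals, which is one plausible reading of what an 'intuitive' mapping could mean here.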
Shape cognition
Findings in a number of domains seem to converge in suggesting that notions
of shape are fundamental to much (and perhaps most) human cognition and
behaviour. This means that we should also consider some principles of gen-
eral, amodal shape cognition, as these may be useful when we migrate across
modalities and features as we do here in the context of music and shape.
Providing an ‘all-at-once’ overview image of whatever we perceive or think
about is both the prime attribute of shape cognition and its prime advantage,
as well as its challenge, in our context: if we do not somehow have such over-
views of lived experience and are just submerged in a continuous stream of
sensations we will not be able to make sense of the world in general or of music
in particular, as was pointed out by Edmund Husserl more than a century ago
(Husserl 1991). To Husserl, it was obvious that we need to interrupt the con-
tinuous stream of sensations from time to time, and make overview images of
whatever is being perceived, by a series of intermittent ‘now-points’ (Godøy
2010b). Shape cognition could then be defined as our capacity to capture and
handle the ephemeral and temporally distributed features of music, as well as
other lived experience. And with presently available methods and technologies
for recording and processing both sound and body motion, we have the pos-
sibility of ‘freezing’ transient sound and motion and examining them at leisure
as shapes.
Historically, one of the first and most extensive projects on shape cognition
originated in music with gestalt theory in the last decades of the nineteenth cen-
tury (Smith 1988), with, among other things, a focus on how shapes emerge and
are conserved across different instances, such as melodies across various instru-
mental or vocal guises. Gestalt theory was later extended to other domains,
and is now often primarily associated with the visual. The remarkable insights
of early gestalt theory concerning coherence criteria in shape cognition still
have validity today, both in auditory perception (Bregman 1990) and in human
motor control (Klapp and Jagacinski 2011).
But one of the most extensive recent research efforts on shape cognition is
no doubt that of so-called morphodynamical theory (Thom 1983; Petitot 1985,
1990; Godøy 1997). The gist of morphodynamical theory is that human per-
ception, understanding and reasoning are based on ordering sensory input as
shapes, or in the words of René Thom, the leading figure of this theory, ‘the
first objective is to characterise a phenomenon as shape, as a “spatial” shape. To
understand means first of all to geometrise’ (1983: 6).2
Of interest here is the morphodynamical distinction between the ‘control
space’ and the ‘morphology space’, meaning a distinction between the input
and the perceived results of any generative model (Petitot 1990), be that in
physics, biology, behavioural sciences or other domains such as musical sound.
It is always the perceived shapes—the features of the morphology space—that
are of interest for us here in musical contexts, and the distinction between con-
trol and morphology spaces helps us to determine what are ontologically com-
parable features and avoid various mapping mismatches or category mistakes
as mentioned above.
The distinction between control and morphology spaces is particularly use-
ful for exploring categorical thresholds between shapes. This means making
systematic explorations of perceived shapes by generating incrementally differ-
ent variants through what is often called analysis-by-synthesis (Risset 1991).
A simple but important example of this is the distinguishing of ‘percussive’
and ‘bowed’ sounds by the steepness of the attack segment at the beginning
of the sounds: with a very short attack we get the subjective sensation of a
percussive sound, and when gradually increasing the duration of the attack,
we sooner or later get a ‘bowed’ sound sensation. In other words: we explore
the thresholds between these two sound categories (features in the morphology
space) by incrementally varying the duration of the attack segment (a value in
the control space).
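The attack example can be made concrete with a small sketch. The envelope shape (a linear rise followed by an exponential decay), the sample rate, the frequency and the particular attack durations are all assumptions chosen for illustration. The attack duration is the single value varied in the control space, and each resulting tone is a candidate for listening in the morphology space.

```python
import math


def envelope(attack_s, total_s=1.0, sample_rate=8000):
    """Amplitude envelope: linear attack up to 1.0, then exponential decay."""
    n = round(total_s * sample_rate)
    n_attack = max(1, round(attack_s * sample_rate))
    env = []
    for i in range(n):
        if i < n_attack:
            env.append(i / n_attack)                       # rising attack segment
        else:
            env.append(math.exp(-3.0 * (i - n_attack) / sample_rate))
    return env


def tone(attack_s, freq_hz=220.0, total_s=1.0, sample_rate=8000):
    """Sine tone shaped by the envelope above."""
    env = envelope(attack_s, total_s, sample_rate)
    return [a * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i, a in enumerate(env)]


# Seven incrementally different variants: 5 ms, 10 ms, ... 320 ms.
attack_times = [0.005 * 2 ** k for k in range(7)]
variants = {a: tone(a) for a in attack_times}
```

Listening through the variants in order, one would note the point at which the sensation changes from 'percussive' to 'bowed'. The code only generates the variants; the threshold itself can be found only by listening.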
The analysis-by-synthesis approach enables exploration of perceptually
salient features by comparing incrementally different variants along several
feature axes, for example combining the incremental attack dimension (‘sharp-
ness’) with feature dimensions for spectral centroid (a measure of ‘brightness’
in timbre perception) in a two-dimensional analysis-by-synthesis exploration.
The analysis-by-synthesis approach is actually what people practise in music
production contexts, tweaking the buttons for equalizing, reverberation or
other kinds of effects processing in the mixing studio, or adjusting drum mem-
branes, instrument and microphone placement, and so on in the recording
room, until the ‘right’ sound is found. When individual musicians or conduc-
tors repeatedly try out versions of singular sounds or phrases until they find the
sonic expression they are searching for, they too are practising an analysis-by-
synthesis approach.
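A two-dimensional exploration of this kind could be laid out as a grid of variants, as in the following sketch. The spectral centroid is computed as the amplitude-weighted mean frequency of the partials. The harmonic spectrum with a power-law rolloff, the rolloff values and the attack values are hypothetical choices made for the example.

```python
from itertools import product


def harmonic_amplitudes(rolloff, n_harmonics=8):
    """Amplitudes 1/k**rolloff for harmonics k = 1..n (assumed spectrum shape)."""
    return [1.0 / (k ** rolloff) for k in range(1, n_harmonics + 1)]


def spectral_centroid(f0_hz, amplitudes):
    """Amplitude-weighted mean frequency of a harmonic spectrum."""
    weighted = sum(f0_hz * k * a for k, a in enumerate(amplitudes, start=1))
    return weighted / sum(amplitudes)


attacks = [0.005, 0.02, 0.08, 0.32]   # 'sharpness' axis, in seconds
rolloffs = [2.0, 1.5, 1.0, 0.5]       # 'brightness' axis: smaller rolloff is brighter

# Every combination of the two control values, with its resulting centroid.
grid = [(a, r, spectral_centroid(220.0, harmonic_amplitudes(r)))
        for a, r in product(attacks, rolloffs)]
```

Each cell of the grid describes one variant to be synthesized and compared by ear with its neighbours along either axis.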
In summary, the analysis-by-synthesis approach is holistic in the sense of
allowing us to evaluate perceptual features of a whole chunk of sound, and in a
way it also bridges the symbolic-to-subsymbolic divide, which in the terminol-
ogy of Schaeffer is called the abstract–concrete divide: singular values along an
axis (or a scale) are abstract, whereas sonic objects with multiple features that
are holistically perceived as shapes are concrete (Schaeffer 1966; Chion 1983).
Related to analysis-by-synthesis is the idea of blending two shapes, an idea
that has become popularly known as the ‘morphing’ of visual images (such as
human faces) or of sounds, the latter case also being known as cross-synthesis.
There are various signal processing models for this, but it is also possible to
generate a series of incremental variants, say between sound A and sound B,
and explore the categorical threshold between the two.
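A series of incremental variants between two sounds can be sketched as a simple linear interpolation of their parameters. This is a deliberately reduced stand-in for the signal processing models mentioned above, not an implementation of cross-synthesis: each sound is represented only by a hypothetical list of harmonic amplitudes, and the values are invented.

```python
def interpolate(a, b, t):
    """Blend two equal-length parameter lists; t = 0 gives a, t = 1 gives b."""
    if len(a) != len(b):
        raise ValueError("both sounds need the same number of parameters")
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def morph_series(a, b, steps):
    """Return `steps` variants running from sound a to sound b inclusive."""
    if steps < 2:
        raise ValueError("at least two steps are needed")
    return [interpolate(a, b, k / (steps - 1)) for k in range(steps)]


# Invented harmonic amplitudes standing in for sound A and sound B.
sound_a = [1.0, 0.5, 0.25, 0.125]
sound_b = [1.0, 0.0, 0.5, 0.0]
series = morph_series(sound_a, sound_b, 5)
```

The first and last members of the series are the original sounds. The categorical threshold would be sought by listening to where, along the series, the sound stops resembling A and starts resembling B.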
The inherent challenge with such variant shape methods is that interesting
shapes are multidimensional: they usually cannot be characterized by only
one value axis so choices have to be made as to what aspect(s) are focused on.
The same goes for similarity ratings of differing shapes: Which part of the
shapes are we comparing, or are we going for a more global or cumulative
similarity judgement? In their pioneering work on categorization, Eleanor
Rosch and colleagues suggested that categories may be strongly linked with
motor schemas (Rosch et al. 1976): thus the category ‘chair’ may be difficult
to define from construction features alone (there are too many variants of
design, for example from rococo to modern) but is easier to categorize as
something to sit on.
In sum, we can see that shape cognition in music, as well as in general, has to
do with features and categorical thresholds, and that the shape of body motion
can be an important part of understanding categories and shapes in music. It
follows that motor theory should be a part of shape cognition in music.
Motor theory
One leading idea in several domains of the cognitive sciences during the last
three decades has been to regard human cognition as rooted in bodily experi-
ence, as what has been broadly called embodied cognition. An essential fea-
ture of embodied cognition is that perception, thinking and understanding are
all related to mental simulation of body motion, meaning that we mentally
imitate the actions that we believe are the cause of what we perceive or that
actively trace one or more features of what we perceive. As suggested by Alain
Berthoz, with reference to Cézanne, seeing is a matter not just of passively tak-
ing in visual information with our eyes, but also of mentally tracing the outline
of what we are seeing, as if we are ‘touching’ whatever we see with our gaze
(Berthoz 1997).
In the case of spoken language, this means mentally simulating articula-
tory motions of the vocal apparatus when we listen to speech, and in the case
of music, mentally simulating the sound-producing body motions we believe
musicians are making: hearing ferocious drumming, we might imagine ener-
getic hand motions, while hearing soft string music, we might imagine slow and
protracted bowing motion. Such triggering of sound-producing images in lis-
tening means that we associate the shape of the sound-producing body motion
with the shape of the sound that we hear.
This theory of associations of sounds that we hear (or merely imagine) with
some kind of sound-producing body motion is known as the motor theory
of perception, sometimes referred to in the plural—‘motor theories’—because
several versions have been proposed. Originating in the 1960s in linguistics
(Liberman and Mattingly 1985), this can now be regarded as a more general
theory, also including other areas of human cognition (Galantucci, Fowler
and Turvey 2006). The gist of motor theory is that perception is production in
reverse, meaning that when we listen, we project motor images onto what we
are hearing and use these motor images as mental schemas to make sense out
of what we are hearing.
Motor theory concerns learning and expertise: if we are familiar with a lan-
guage or type of music, we probably know in more detail what body motion
goes into producing the sound; yet we may also have sketchier or vaguer motor
images of sounds we are not so familiar with. Although I myself speak nei-
ther Korean nor Polish, I believe I can distinguish these two languages by what
I perceive as their respective required phonological gestures. Having some
approximate image of sound production, which we have called motormimetic
sketching, is better than having no image at all, and we believe this applies to
music too, as we have found in our observation studies of so-called air instrument
performance (Godøy, Haga and Jensenius 2006). In these studies, we found that
most listeners, including those with no musical training, seemed able to repro-
duce sound-producing body motion that fitted the music they heard, reflecting
in their air performances overall pitch and rhythmic features and, more vari-
ably so, details of musical textures and articulations. We can see manifestations
of the motor theory in other cases of imitative behaviour, such as in scat sing-
ing and in beatboxing, with some people demonstrating a truly astonishing
capacity to imitate nonvocal sounds with their vocal apparatus.
One crucial feature of motor theory as applied to music is that all sonic
events are included in some kind of body-motion trajectory, trajectories that
will typically start before the onset of the sound(s), encompass the sound(s) and
often continue after the sound(s), for example moving the hand/mallet towards
the drum, making an impact on the drum membrane, moving the hand/mallet
back to the initial position. In the motor theory perspective, this drumming
body-motion shape will contribute to our shape images of drum sound: in the
words of Berthoz (1997), ‘Perception is simulated action’. Although we now see
increasing support for the motor theory perspective on perception from brain
observation studies as well as from behavioural studies, we still have several
challenges in finding out more about the links between sound and body motion
in perception of the various musical features, something that we believe also
necessitates an exploration of the timescales at work in musical experience.
Shape timescales
The basic tenet of this chapter is that most features of music, ranging from
low-level acoustic and body-motion features to high-level affective and aes-
thetic features, are time-dependent, yet can also be thought of as shapes. Shape
images are in a sense ‘outside time’, to use the expression of Xenakis (1992):
they are ‘snapshots’ of what has unfolded or is about to unfold in time. This
raises issues of continuity versus discontinuity in musical experience (and other
time-related experiences for that matter), issues much focused on by philoso-
phers and psychologists in the nineteenth and twentieth centuries, in particu-
lar by Husserl as mentioned above (see Godøy 2008, 2010b, 2011, 2013). One
approach to this enigma of the temporal versus the atemporal may be to look
at constraints at work in our perception of sound and motion, in particular to
try to single out qualitative differences at the various timescales involved here.
As we know, human hearing is situated in the region of approximately 20
to 20,000 Hz (for healthy young people), with a threshold at around 20 Hz
for fused versus distinct features. This means that the timescale above 20 Hz
is mostly concerned with shapes of frequency relationships, meaning spectral
or formantic shapes (such as vowels and other stationary tone colour compo-
nents) and pitch relationships (intervals), whereas the timescale below 20 Hz is
concerned with all the other shape features of music and music-related body
motion. But there are also some qualitative timescale thresholds at work in
the region below 20 Hz. In our own research we have found it useful to discern
three main timescales that apply to both sonic and body-motion features:
1. Micro timescale features: basically stationary or continuous features
of sound and motion: stationary pitch, loudness and timbre, and the
corresponding stationary postures, and continuous, smooth body
motions as in sustained bowing or blowing. This will include what is
often referred to as ‘sound’ in popular music research, meaning the
overall subjective impression, readily recognized in even very short
fragments of music (Gjerdingen and Perrott 2008).
2. Meso timescale features: sound and motion features typically
unfolding in approximately the 0.5 to 5 seconds duration-range and
holistically perceived as motion-sound chunks, in the same duration-
range as the so-called ‘sonic objects’ of Schaeffer and co-workers.
The meso timescale is usually sufficient for perceiving most salient
sonic features such as rhythmical and textural patterns, melodic,
harmonic and modal/tonal features, as well as expressive, style,
genre, overall aesthetic and affective features, and the corresponding
body-motion features. A number of research findings converge on
the meso timescale as the most important in several areas of human
cognition, in particular that motion duration, attention spans, short-
term memory and meaning formation all seem to be attuned to this
timescale (see Godøy 2013 for an overview).
3. Macro timescale features: the timescale of sections, movements,
whole works and various other long-lasting music-related events.
The perceptual workings of the macro timescale seem not to be
well researched, but we would hypothesize that it also concerns the
overlap and/or lingering memories of successive chunks from the
meso timescale.
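As a rough illustration of the three timescales, the sketch below sorts an event duration into a micro, meso or macro category. Only the approximate 0.5 to 5 second meso range comes from the text. Treating everything shorter as micro and everything longer as macro, and the function name `timescale_of`, are simplifying assumptions made for the example, since the timescales coexist and overlap in actual listening.

```python
# Illustrative only: the hard boundaries are an assumption; the chapter
# describes coexisting timescales rather than strict cut-offs.
MESO_MIN_S = 0.5  # approximate lower bound of the meso range (from the text)
MESO_MAX_S = 5.0  # approximate upper bound of the meso range (from the text)


def timescale_of(duration_s: float) -> str:
    """Return 'micro', 'meso' or 'macro' for a duration in seconds."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    if duration_s < MESO_MIN_S:
        return "micro"
    if duration_s <= MESO_MAX_S:
        return "meso"
    return "macro"


if __name__ == "__main__":
    # A very short fragment, a typical motion-sound chunk, a whole movement.
    for duration in (0.05, 2.0, 240.0):
        print(duration, timescale_of(duration))
```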
Although these three timescales coexist in musical experience, it is possible
to zoom in and out of various timescales, intentionally shifting our atten-
tion, something designated by Schaeffer as the ‘context-contexture’ perspec-
tives (Schaeffer 1966; Chion 1983). This means that any sonic object may be
included in some larger-scale context yet also have its own internal context
called ‘contexture’. The essential principle for Schaeffer and for us here is that
at all these timescales we can conceptualize features as shapes.
However, of these three main timescales, the meso timescale is clearly the
most important when it comes to the experience of various salient musical
features, as mentioned above. Furthermore, the most important attribute of
the meso timescale here is that motion-sound chunks are holistically perceived,
that they are somehow kept in consciousness as whole units; because of this,
the prime sources of shape cognition in music are ‘instantaneous’ overview
images of both sound and body motion.
Sonic features
Seeing evidence from various research fields converging on the importance of
the meso timescale, we might find it useful to take a closer look at sonic features
at this level. Inspired by the work of Schaeffer and co-workers on sonic objects,
we adopt a subjective-perceptual top–down approach of listening and differen-
tiating sonic features at this meso timescale. This method originated in the early
days of musique concrète, when, for practical reasons, composers used looped
sound fragments on phonograph discs, called sillon fermé (‘closed groove’), in
the mixing of sounds when composing electroacoustic music. With repeated
listening to these looped sound fragments, Schaeffer and co-workers noticed
that their attention shifted from the everyday significations of the sound frag-
ments to the more subjectively perceived overall shapes of the sounds. This led
to developing a scheme for classifying the sonic objects, called the typology of
sonic objects, by their overall dynamic shapes and their overall pitch-related
shapes. The three main dynamic shapes are as follows:
1. Sustained: a protracted sound, such as in bowing and blowing
2. Impulsive: a short sound with a sharp attack as in percussive and
plucked sounds
3. Iterative: a sound with rapid fluctuations such as in a tremolo
These three main types have clear correlates in body motion: the sustained
sonic objects imply a continuous transfer of energy from the body, hence a
continuous effort such as bowing or blowing; the impulsive implies an abrupt,
discontinuous type of body motion, so-called ballistic motion, as in hitting
or kicking; and the iterative implies a rapid back-and-forth or shaking body
motion.
Furthermore, there are categorical thresholds in this typology, and we can
explore these thresholds by producing incremental variants as presented ear-
lier. If a sustained sound is shortened below a certain duration threshold, it
will be perceived as an impulsive sound, and conversely, if an impulsive sound
is extended beyond a certain duration threshold, it will be perceived as a sus-
tained sound. Likewise, if an iterative sound is slowed down to a certain rate,
it will turn into a series of distinct impulsive sounds, and conversely, if a series
of distinct impulsive sounds is accelerated beyond a certain rate, it will change
into an iterative sound. As we shall see later, these category changes are related
to so-called phase-transitions in body motion: changes in the morphology space
resulting from incremental changes in the control space, as we would say in
morphodynamical theory.
In the sonic object typology, there is furthermore an analogous coarse clas-
sification of the overall pitch-related features of a sonic object:
• Definite pitch: more or less stationary throughout the sonic object
• Complex pitch: inharmonic or various noise band sounds
• Variable pitch: pitch changing in the course of the sonic object, for
example by glissando
These two typological classifications (dynamic-related and pitch-related) were
combined into a 3 x 3 matrix, and could be applied as a first and coarse, yet very
useful, classification of overall sonic features as shapes. Other criteria were also
added to this rudimentary typology, and zooming into the micro features of the
sound, we could then elaborate a classificatory scheme called the ‘morphology
of sonic objects’.
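The 3 x 3 typology described above can be written out as a small data structure. The following is a minimal sketch, not Schaeffer's own formalization: the class names, the function `typology_matrix` and the three example sounds are hypothetical and serve only to show how the dynamic and pitch-related categories combine into nine cells.

```python
from enum import Enum
from itertools import product


class DynamicShape(Enum):
    SUSTAINED = "sustained"
    IMPULSIVE = "impulsive"
    ITERATIVE = "iterative"


class PitchShape(Enum):
    DEFINITE = "definite pitch"
    COMPLEX = "complex pitch"
    VARIABLE = "variable pitch"


def typology_matrix():
    """Return the nine (dynamic, pitch) cells of the coarse typology."""
    return list(product(DynamicShape, PitchShape))


# Hypothetical placements of single sounds in the grid (assumed examples).
EXAMPLES = {
    "bowed violin tone": (DynamicShape.SUSTAINED, PitchShape.DEFINITE),
    "snare drum hit": (DynamicShape.IMPULSIVE, PitchShape.COMPLEX),
    "tremolo glissando": (DynamicShape.ITERATIVE, PitchShape.VARIABLE),
}

if __name__ == "__main__":
    # Print one row per dynamic shape, one column per pitch-related shape.
    for dynamic in DynamicShape:
        print(" | ".join(f"{dynamic.value}/{pitch.value}" for pitch in PitchShape))
```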
In the morphology of sonic objects there is a similar top–down shape-related
classification of sonic features, including perceptually salient spectral features,
both quasi-stationary and more fluctuating, as well as profiles, rate, ampli-
tude and patterns of these fluctuations. Prominent morphological features are
found in the so-called grain and gait (‘allure’ in French, sometimes rendered in
English as ‘motion’, but also as ‘allure’) categories, where grain denotes vari-
ous fast fluctuations in the sound (of pitch, dynamics, timbre) and gait slower
fluctuations. As an example of this, consider the burring of a deep double bass
sound, readily evoking the metaphor of a kind of grain surface, and a slower
gait such as in the opening and closing of a wahwah mute as we can see in the
spectrogram representations in Figure 1.2.
The various morphology features in turn have further qualifications, denot-
ing the amplitude, rate, regularity and so on of grain- or gait-type fluctua-
tions, all the time with clearly shape-related labels.3 Importantly, what we see in
Schaeffer’s classificatory scheme is an attempt to single out and give names to
previously unnamed, yet perceptually salient, features of musical sound, some-
thing that is still an important challenge for psychoacoustic research (Peeters
et al. 2011).
Furthermore, advances in signal-based music research of the last couple
of decades have enabled research on various expressive features, both at the
subnote level and at the supranote level, in turn enabling research on musical
performance with two shape-related sonic features (see Goebl et al. 2006 for an
overview):
1. Timing/groove, including tempo curves as shapes
2. Expressivity, representing various minute inflections as shapes
[Figure 1.2: two spectrogram panels; axes: Time (s), 0 to 4.026; Frequency (Hz), 0 to 5000]
FIGURE 1.2 The spectrogram of a sustained deep C double bass tone (top) and the spectrogram of
the same tone passed through a time-varying wahwah filter (bottom). The double bass tone has a
distinct burring sound, what could be referred to as a grain morphology feature in the terminology of
Schaeffer (1966), and the wahwah-filtered version of this double bass tone additionally has a slower
open-close-open-close pattern, a gait (or allure) morphology feature in the terminology of Schaeffer
(1966). At two different timescales, both grain and gait are clearly body-motion shape-related features,
i.e. grain making a fast shaking motion and gait making a slower opening and closing motion (cf. the
onomatopoetic associations of opening/closing the mouth in pronouncing ‘wahwah’).
Needless to say, we also often find uses of shape expressions designating more
traditional western music-theory-related sonic features in innumerable writings
on musical analysis, such as:
• Melodic features, such as contours, various kinds of patterns
• Harmonic features, both single chords and composite chord
progressions
• Modality, not as abstract pitch space (or scales) but as shapes of
interval constellations, referred to as ‘physiognomy’ by Lutosławski
(Norwald 1969)
• Rhythmical patterns and textures as shapes

In summary, we could say that most sonic features of musical experience could
be represented as a shape, bearing in mind the idea presented earlier that shape
is a fundamental cognitive strategy for making sense of the world. Yet there are
also a number of sonic features that are so close to body-motion features, as is
the case for rhythm and texture, that we need to have a look at what is sound
and what is body motion here.
Body-motion features
We can observe a great variety of music-related body motion in dance, concert
and everyday listening situations, and in the course of several years of inter-
national research collaboration in this area we have come to suggest a basic
classification scheme for music-related body motion (Godøy and Leman 2010):
• Sound-producing body motion, related to all the sonic features
mentioned above, but more specifically excitatory, meaning energy
transfer from the body to the instrument (including the vocal
apparatus) such as in bowing, blowing, hitting and stroking; and
modulatory, meaning changing the effects of the energy transfer, such
as left-hand finger motion on string instruments and mute opening
or closing on brass instruments. There are also various types of
ancillary body motion here, to avoid fatigue or strain injury, to help in
articulation and expressivity, to communicate with other musicians,
or to make theatrical impressions on the audience. Although not
strictly sound-producing, conducting could also be included here
because of its role in guiding the musicians by beating time signatures,
and by postures, facial expressions and various motion trajectories,
expressing sonic features as shapes. Eminently shape-related are also
sol-fa and chironomy (in Jewish and Christian sacred music), and
other kinds of gestural visualization of musical features used in
various improvisational contexts.
• Sound-accompanying body motion includes all kinds of body motion
that listeners make to music, such as in dancing, walking, nodding
and gesticulating. Common to all sound-accompanying body motion
is that it is somehow related to one or more perceived sonic features
such as the predominant beat or melodic contour of the music.
Although we may see differing sound-accompanying body motions
made to the same music, so that the music has multiple gestural
affordances (Godøy 2010a), there is often a clear reflection of the
overall subjectively perceived energy of the music in the body motion,
as we can see in Figure 1.5.
There will be overlaps in many (perhaps most) cases between these catego-
ries, meaning that music-related body motion will also often be multifunc-
tional: some motion by a musician may, for instance, be both sound-producing
and communicative, such as an upward hand motion to prepare a fortissimo
chord on the keyboard, at the same time serving as an upbeat signal to the
other musicians, in addition to demonstrating a high level of energy to the
audience.
Besides observing body motion in performance that is not strictly sound-producing,
we may also readily observe body motion that imitates sound-producing motion
when people listen to music, something that we have seen in so-called
sound-tracing, when listeners spontaneously draw (on a digital tablet or in the air) the
shape of sounds that they hear, an example of which can be seen in Figure 1.3.
More extensive study of sound-tracings, including statistical processing of cor-
relations between tracings and sound features, suggests that pitch contours are
quite robustly perceived as shapes, but also that dynamic and timbral features
may likewise be spontaneously traced as shapes as long as there is not too much
competition between the features (Nymoen 2013).
[Figure 1.3: spectrogram; axes: Time (s), 0 to 4.841; Frequency (Hz), 0 to 1.2·10^4]
FIGURE 1.3 The spectrogram of a distortion guitar sound with a downward glissando followed by a
slow upward expansion (top), and so-called sound-tracings of this sound by nine listeners (bottom).
The sound-tracings were made on a digital tablet by the listeners immediately after hearing the sound
for the first time, and should reflect something of how they spontaneously perceived the overall shape
of this sound.
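The kind of correlation between a tracing and a sound feature mentioned above can be illustrated with a plain Pearson correlation between a pitch contour and the vertical position of a drawn line. This is a minimal sketch under stated assumptions: it does not reproduce the statistical processing used in the cited study, and the function name `pearson` and all sample values are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Invented data: a falling-then-rising pitch contour (Hz) and the vertical
# pen position (arbitrary units) of one hypothetical tracing.
pitch_hz = [440.0, 330.0, 220.0, 260.0, 300.0]
pen_height = [0.90, 0.60, 0.20, 0.30, 0.45]

if __name__ == "__main__":
    print(round(pearson(pitch_hz, pen_height), 3))
```

A value close to 1 would indicate that the drawn shape follows the pitch contour closely; competing features would be expected to lower it.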
The point here is that listening to or imagining music activates mental
images of some kind of music-related body-motion, and that these images are
one of the main sources for shape concepts in musical experience. Taking the
consequences of such close links between sonic and body-motion features, the
question arises as to the true nature of musical features such as rhythmical and
textural patterns: Are they sonic or body-motion patterns? For instance, is a
dance pattern (waltz, tango, samba) a sonic or body-motion pattern? Similarly,
is chunking in music based on sonic cues (sometimes referred to as qualitative
discontinuities in the sound) or on body-motion patterns? Our understanding
is that music includes both sonic and body-motion features, and that these fea-
tures are united in multimodal shape images although they actually emerge
from various constraints at work in the production of musical sound.
Constraint-based shapes
The fact that before the advent of electronic music technology music tradition-
ally was made by body motion in interaction with physical instruments or the
human vocal apparatus means that, in addition to body-motion constraints,
various instrument constraints imposed by physics are reflected in the resultant
sonic shapes. Observing that musical expression is ‘on top of’ instrumental and
body-motion constraints by no means diminishes the endless volitional expres-
sive capacities of music, but it should remind us to take various constraints on
sound production into account when we talk about shape in music.
To begin with, musical instruments have constraints, both in the mode of
excitation and in the subsequent energy dissipation: hitting a metal plate with
a hammer is an impulsive type of body motion, resulting in a sound with short
attack followed by a long decay. The perceived sonic shape is constrained here
by the size, shape and material of the metal plate and the hammer, and by the
force and duration of the impact. Instrumental and vocal sounds typically
have such overall envelope shapes, but may also have various internal textural
features as a direct physical response to excitations, for example the rough or
grainy sound of a deep double bass (bearing in mind the presentation of grain
earlier), or the hollow smooth sound of a high harmonic (flageolet) tone on
a violin.
In our music and shape context, it is interesting to consider so-called physi-
cal model sound synthesis as a way of thinking that takes physical constraints
into account, such as in a mathematical model that simulates the physical exci-
tation and resonance features of ‘real’ instruments or the human voice where
the resultant sonic shapes are constrained by the physical parameters of the
model. The point is that the behaviour of the physical model results in ‘real
world’ emergent sonic shapes, fitting with our ecological schemas of how sound
unfolds, in contrast to an abstract synthesis model such as additive synthesis,
where in principle any number of sinusoid components, with any frequency,
duration, fluctuations and so on, may be combined, and where there is really
no connection to the outside world except via those images we might project
onto the sound from previous experiences of similar features, by what is called
‘anthropomorphic projection’.
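The abstract character of additive synthesis can be made concrete with a few lines of code. The sketch below simply sums sinusoids with arbitrary frequencies and amplitudes; the function name `additive_synthesis`, the sample rate and the two partial lists are assumptions chosen for illustration, and nothing in the model constrains the result to resemble a physically produced sound.

```python
import math


def additive_synthesis(partials, duration_s, sample_rate=8000):
    """Sum sinusoidal partials given as (frequency_hz, amplitude) pairs."""
    n_samples = round(duration_s * sample_rate)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        samples.append(
            sum(amp * math.sin(2 * math.pi * freq * t) for freq, amp in partials)
        )
    return samples


# A harmonic spectrum with amplitudes falling as 1/k (assumed example) ...
harmonic = [(110.0 * k, 1.0 / k) for k in range(1, 6)]
# ... and an equally valid set of unrelated components (assumed example).
arbitrary = [(110.0, 1.0), (173.0, 0.8), (941.5, 0.5)]

if __name__ == "__main__":
    for name, partials in (("harmonic", harmonic), ("arbitrary", arbitrary)):
        signal = additive_synthesis(partials, duration_s=0.5)
        print(name, len(signal), max(abs(s) for s in signal))
```

Both partial lists are accepted without complaint, which is the point of the contrast: a physical model would restrict the available combinations, whereas the additive model does not.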
Instrumental or vocal performances in turn have their sets of constraints,
not just those we typically associate with different instruments—their idioms
or clichés (the things that are easy to play and sound well on an instrument)—
but also more general body-motion constraints that we believe contribute to
the shape of musical sound. Body-motion constraints, both biomechanical and
more neurocognitive (sometimes difficult to tell apart), effectively limit possible
body-motion range, speed and duration, and also necessitate rests and shifts
in posture to avoid fatigue and/or strain injury. Also, the fact that all human
body motion takes time, because it is not possible to move instantly from one
position to another, means that there always will be transition time between
positions. This in turn means that music-related body motion is continuous
(although it may at times appear as abrupt) and hence may result in fusion or
contextual smearing of otherwise singular sound onsets, apparent as so-called
phase-transitions and coarticulations.
Phase-transition designates changes in behaviour due to changes in some
parameter such as the speed and/or amplitude of body motion (Haken, Kelso
and Bunz 1985). In our context this means that otherwise singular motion-
units may fuse into a superordinate unit if the speed is increased, and con-
versely, a rapid motion may become split into distinct units if the speed is
decreased, as would be the case of a 3/4-time waltz pattern going from three
beats per measure to one beat per measure with increasing tempo, and con-
versely, from one beat per measure to three beats per measure with decreasing
tempo, similar to the transitions between sustained, impulsive and iterative
sounds mentioned above.
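The waltz example can be sketched as a categorical change driven by an incremental change in tempo. The code below is a deliberately crude illustration, not the model of Haken, Kelso and Bunz: the fusion threshold of 0.3 seconds per beat is a hypothetical value (the text gives none), and `felt_pulses_per_measure` is a name introduced only for this example.

```python
# Hypothetical threshold: below this beat period, the three beats of the
# measure are assumed to fuse into a single felt pulse.
FUSION_BEAT_PERIOD_S = 0.3


def felt_pulses_per_measure(tempo_bpm: float, beats_per_measure: int = 3) -> int:
    """Return how many pulses per measure are felt at a given tempo."""
    if tempo_bpm <= 0:
        raise ValueError("tempo must be positive")
    beat_period_s = 60.0 / tempo_bpm
    if beat_period_s < FUSION_BEAT_PERIOD_S:
        return 1  # beats fuse into one superordinate unit per measure
    return beats_per_measure  # beats remain distinct units


if __name__ == "__main__":
    # Increase the control parameter (tempo) step by step and watch the
    # felt grouping switch category at some point along the way.
    for tempo in range(90, 301, 30):
        print(tempo, felt_pulses_per_measure(tempo))
```

A more faithful model would probably include hysteresis, so that the switch occurs at different tempi when accelerating and when slowing down; that refinement is omitted here.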
Coarticulation means that there is a fusion and contextual smearing of body
motion so that otherwise singular actions fuse into more superordinate trajec-
tories; in other words, body motion creates a context where the present state
of an effector (finger, hand, vocal tract) is determined by what was just done
as well as what is to be done next (Rosenbaum 1991). This means that there
are so-called carryover and anticipatory effects at work in sound production,
something that has been quite extensively studied in linguistics (Hardcastle
and Hewlett 1999) but less so in music (Godøy, Jensenius and Nymoen 2010;
Godøy 2014). This coarticulatory fusion also has consequences for the sound
produced, contributing to a similar contextual smearing of sound and of
motion, resulting in continuous trajectories that in turn are one of the sources
of shape experience in music.
Another element of motor control is that body motion seems to be organized
hierarchically by a series of goals (Grafton and Hamilton 2007). Following
findings from motor control research, this can be understood as a series of
postures, with continuous motion between them (Rosenbaum et al. 2007). For
convenience, we have chosen to use the terms key-postures and trajectories in
our publications, ‘key-postures’ denoting the shape and position of the sound-
producing effectors (fingers, hands, arms, tongue, lips, vocal tract and so on)
and ‘trajectories’ denoting the continuous motion of the effectors between
these key-postures.
One aspect here is that of continuous versus intermittent motor control, a
much-debated topic for more than a century (Elliott, Helsen and Chua 2001).
Classical control theory, be that in human motion or machines, stipulates two
basic control schemes, closed loop with continuous feedback adjustment (as
in a thermostat) and open loop with only intermittent control, typically lim-
ited to initiating the motion, as in hitting a golf ball. Closed loop seems plau-
sible enough from everyday experience, in that we adjust our body motion in
response to the effects of our body motion, as in balancing, singing a tone,
bowing on a string instrument and so on. Yet the difficulty here is that all such
adjustment takes time. To avoid delays, there must be some kind of anticipa-
tory cognition at work: we somehow have to have an ‘all-at-once’ image of the
ensuing motion trajectory, which is a feature of open loop control, as in hitting
the golf ball. There is mounting evidence for this kind of anticipatory cognition
at work in human motor control, leading to the idea of action gestalts, where
human motion is seen as a series of pre-programmed motion shapes (Klapp
and Jagacinski 2011).
One result of going deeper into shape cognition is the realization that atten-
tion and effort are unequally distributed, that there is an intermittency of both
attention/control and effort/energy influx in body motion. Intermittency in
human motor control is now gaining support from a number of observations
such as the work on action gestalts (Klapp and Jagacinski 2011) mentioned
already and more general human motion control theory (Loram et al. 2011),
and it supplements the evidence for key-posture-based action planning and
control (Rosenbaum et al. 2007).
Adapted to our context, we believe music can also be understood as cen-
tred on certain salient moments in time in the form of downbeats and other
accents—on what we call ‘goal-points’ in music—and that the key-postures
are situated at these goal-points. These key-postures and goal-points in the
music are intermittent, and so there is a fundamental discontinuity at work
in music, albeit a discontinuity that may be forgotten in the face of the con-
tinuous motion trajectories and sound between these goal-points, as well as
through continuous series of often overlapping chunks in succession, as we
hypothesize is the case at macro timescales in music. Furthermore, we hypoth-
esize that the shape of these postures and trajectories also forms the basis for
sonic shapes. We can see a short example of this in Figure 1.4, where we have
the key-postures at the goal-points of the downbeats and continuous motion
trajectories between these key-postures.
FIGURE 1.4 The score of the first two bars of the last movement of Beethoven’s Piano Concerto
No. 1 (top), and graphs showing the position, velocity and acceleration of the vertical motion of the
right-hand knuckles, wrist (RWRA) and elbow (RELB) in the performance of these two bars. We
clearly see the up–down motion at the downbeats, i.e. at what we call the goal-points, as well as the
relative high velocity at these points, typical of so-called ballistic motion.
Motion-sound chunks
On the basis of our own and others’ research, then, we believe that there are sev-
eral elements of musical instruments, body motion and human cognition that
converge in singling out meso timescale motion-sound chunks as primordial for
the experience of shape in music, elements that may be summarized as follows:
• A number of findings in research on human motor control, memory
and attention point to the meso timescale as special in terms of
meaning in both perception and action.
• More specifically in music, the meso timescale is also sufficient for
perceiving a number of musically salient features such as rhythm,
texture, dynamics, timbre, melodic, harmonic and modal features,
style and genre, and sense of motion and affect.
In the context of music and shape, the meso timescale motion-sound chunks
are clearly carriers of salient shape experiences in music:
• All sounds are included in some action trajectory, with various
principles of human motion such as phase-transition and
coarticulation contributing to emergent effects of fused body-motion
and sonic shapes; thus, there is a contextual smearing of otherwise
singular motion and sound elements within the fused chunk.
• This contextual fusion is evident in most musical features, but
in particular in tightly welded units such as various ornaments
(Pralltriller, mordent, turn, etc.) and other figures (all kinds of
rhythmical patterns such as waltz, tango, samba and so on) where
the speed and density of motion and sonic events typically are so
high that anticipatory cognition is required, so that these figures are
conceived and performed as singular, holistic body-motion shapes.
In a phenomenological perspective, motion-sound chunks may be understood
in this way:
• The ‘all-at-once’ and the ‘now-points’ were basically epistemological
arguments of Husserl (and several of his contemporaries) but now can
be understood as grounded in intermittent, serial ballistic, anticipatory
cognition (Husserl 1991; Godøy 2008, 2010b, 2011, 2013).
• Singling out the fusion features of the meso timescale, and
assessing the available evidence here, we hypothesize that musical
experience combines discontinuity with continuity by concatenating
meso-timescale chunks into macro-timescale experiences of
continuous music.
Motion-sound scripts
Although there is converging evidence that the meso timescale is crucial for per-
ceiving very many musical features, we also clearly experience music at longer
timescales: people go to performances of symphonies and operas, participate
in various long-lasting music-related events and rituals, or report long-endur-
ing trance-like experiences of music. Yet the perception of large-scale forms in
music seems not to be a well-researched topic. What we have is a substantial
number of western music analysis texts that assume the efficacy of large-scale
forms, but the little perceptual-empirical material that we have come across
suggests that we should be rather sceptical of such claims until further notice
(see for example Eitan and Granot 2008).
Lacking more systematic research in this area, we could assume from our
motor theory perspective that general principles of goal-directed motor cogni-
tion apply here, so that we understand long sequences as a series of key-postures
with intervening continuous motion trajectories and may also mentally quickly
run through a long stretch of music, just as we mentally run through a long walk
or a whole journey by a series of landmarks or junctions. This would essentially
amount to understanding large-scale musical works as extended motion-sound
scripts, as a series of concatenated and/or overlapping motion-sound chunks,
creating a sense of long-range continuity in musical experience. In addition to
the features of meso-timescale chunks, the macro timescale may often, by its
longer extension, have new dramaturgical and/or narrative features. We could
also speculate that such macro-level motion-sound scripts in turn could be
envisaged as having shapes, shapes that we could glimpse in an instant, just
as we could envisage a long walk or journey; in other words, the same prin-
ciple of ‘all-at-once’ overview images applies here too, as a kind of compressed
‘trailer’ or ‘story board’ for the whole work, as in the famous statement by Paul
Hindemith that ‘If we cannot, in the flash of a single moment, see a composi-
tion in its absolute entirety, with every pertinent detail in its proper place, we
are not genuine creators’ (2000: 61).
What we do know from our research on music-related body motion is that
we can see some salient global features over longer stretches, such as quan-
tity of motion (essentially a physical measure based on total displacement of
the body or parts of the body within a unit of time), recurrent patterns of
body motion, as well as amplitude, velocity and degree of calmness or agitation
by extracting measures of ‘jerkiness’ in the recorded body-motion shapes
(Hogan and Sternad 2007). Such global features of body motion can in turn
be correlated with various other qualitative observations of affect, style and
genre, providing us with important shape insights also at the macro timescale.
In Figure 1.5 we see such an example of three variant versions of a twenty-
second dance sequence to music by Ligeti, each variant having similar overall
motion qualities, although local details vary.
[Figure 1.5: motiongrams (top) and spectrograms (bottom); axes: Time (s), 0 to 62.84; Frequency (Hz), 0 to 5000]
FIGURE 1.5 The top part shows motiongrams (i.e. video-based summary images of motion
trajectories; see Jensenius 2013 for details) of three different successive dance performances by
the same dancer to a twenty-second excerpt from Lento from György Ligeti’s Ten Pieces for Wind
Quintet (Ligeti 1998), and the bottom part shows for the purpose of reference three repetitions of the
spectrograms of this excerpt. From this macro timescale view of the dancer’s body motion, we can
clearly see the overall shape (curve out from initial position and back) and mode of motion (mostly
calm but with a few abrupt elements).
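Two of the global measures named above, quantity of motion and jerkiness, can be approximated from sampled position data. The following is a minimal sketch under stated assumptions: quantity of motion is taken literally as total displacement per unit of time, and jerkiness as the mean absolute third finite difference of a one-dimensional position series. The function names are hypothetical, and the jerk measure is a simplification rather than the dimensionless measures discussed by Hogan and Sternad.

```python
import math


def quantity_of_motion(positions, dt):
    """Total displacement per unit time for (x, y, z) samples spaced dt apart."""
    if len(positions) < 2:
        raise ValueError("need at least two position samples")
    total = sum(math.dist(p, q) for p, q in zip(positions, positions[1:]))
    return total / (dt * (len(positions) - 1))


def finite_difference(values, dt):
    """First-order forward difference of a sampled series."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def mean_abs_jerk(series, dt):
    """Mean absolute third derivative of a one-dimensional position series."""
    if len(series) < 4:
        raise ValueError("need at least four samples to estimate jerk")
    velocity = finite_difference(series, dt)
    acceleration = finite_difference(velocity, dt)
    jerk = finite_difference(acceleration, dt)
    return sum(abs(j) for j in jerk) / len(jerk)


if __name__ == "__main__":
    # Invented data: a marker moving steadily along one axis.
    marker = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    print(quantity_of_motion(marker, dt=0.5))
    # A steady motion has zero jerk; a cubic trajectory has constant jerk.
    print(mean_abs_jerk([0, 1, 2, 3, 4], dt=1.0))
    print(mean_abs_jerk([0, 1, 8, 27, 64], dt=1.0))
```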
While much remains to be done in this area, the idea of shape cognition
seems to be both applicable and useful at the macro timescale, provided that at
this timescale we also succeed in making ‘all-at-once’ or ‘instantaneous’ over-
view images of body motion and sonic features.
Thinking shapes in music
The observation that shape metaphors and graphical shape representations
are ubiquitous in music-related contexts should by itself suggest that there is
a close relationship between shape cognition and sound in musical experience.
But given presently available methods and technologies for recording, analysing
and correlating all kinds of sonic and motion feature data, it should be possible
to make much more systematic explorations of shape cognition in music (see
also Küssner, Chapter 2 below). We now have possibilities for bypassing the
restrictions of western music notation in music research and working directly
with shapes as holistic, nonsymbolic entities in music.
Yet in the face of such optimism, we still face many challenges, first of all
to develop less obtrusive and ecologically valid observation settings for music-
related body motion, and also to develop better means of data processing,
both of input signals and for exploring various patterns and correlations.
Additionally, there are the enigmas, already mentioned, of how our minds are
able somehow to extract information from a continuous stream of sensations,
to break out of the continuous flux of time and generate more or less stable
overview images, and also to integrate sense modalities—enigmas that we hope
the cognitive sciences can shed light on in the coming decades.
Despite such challenges of method as well as basic cognitive issues, the great
advantage of shape-cognition in music is in opening up new areas of musi-
cological, aesthetic and affective psychological research, as well as providing
practical tools in artistic creation, for example in the domains of sonic design
and various kinds of multimedia art. In this connection, thinking and actively
working with shapes in music as was practised several decades ago by Schaeffer
and co-workers, by what we have called motormimetic sketching of sonic fea-
tures, means embarking on what is essentially a hermeneutical circle of drawing
(mentally, on paper, digitally), listening, drawing, listening, each time creating
a greater awareness of sound and body-motion features as shapes, and in this
process enhancing our understanding of music and other multimedia arts.
References
Berthoz, A., 1997: Le sens du mouvement (Paris: Odile Jacob).
Bregman, A., 1990: Auditory Scene Analysis (Cambridge, MA, and London: MIT Press).
Chion, M., 1983: Guide des objets sonores (Paris: INA/GRM Buchet/Chastel).
Cogan, R., 1984: New Images of Musical Sound (Cambridge, MA, and London: Harvard
University Press).
Delalande, F., M. Formosa, M. Frémiot, P. Gobin, P. Malbosc, J. Mandelbrojt and
E. Pedler, 1996: Les Unités Sémiotiques Temporelles: Éléments nouveaux d’analyse musicale
(Marseille: Éditions MIM – Documents Musurgia).
Eitan, Z. and R. Y. Granot, 2008: ‘Growing oranges on Mozart’s apple tree: “inner form”
and aesthetic judgment’, Music Perception 25/5: 397–417.
Elliott, D., W. Helsen and R. Chua, 2001: ‘A century later: Woodworth’s (1899) two-
component model of goal-directed aiming’, Psychological Bulletin 127/3: 342–57.
Galantucci, B., C. A. Fowler and M. T. Turvey, 2006: ‘The motor theory of speech percep-
tion reviewed’, Psychonomic Bulletin & Review 13/3: 361–77.
Gjerdingen, R. and D. Perrott, 2008: ‘Scanning the dial: the rapid recognition of music
genres’, Journal of New Music Research 37/2: 93–100.
Godøy, R. I., 1997: Formalization and Epistemology (Oslo: Scandinavian University Press).
Godøy, R. I., 2006: ‘Gestural-sonorous objects: embodied extensions of Schaeffer’s con-
ceptual apparatus’, Organised Sound 11/2: 149–57.
Godøy, R. I., 2008: ‘Reflections on chunking in music’, in A. Schneider, ed., Systematic
and Comparative Musicology: Concepts, Methods, Findings (Frankfurt: Peter Lang),
pp. 117–32.
Godøy, R. I., 2010a: ‘Gestural affordances of musical sound’, in R. I. Godøy and M.
Leman, eds., Musical Gestures: Sound, Movement, and Meaning (New York: Routledge),
pp. 103–25.
Godøy, R. I., 2010b: ‘Thinking now-points in music-related movement’, in R. Bader,
C. Neuhaus and U. Morgenstern, eds., Concepts, Experiments, and Fieldwork: Studies
in Systematic Musicology and Ethnomusicology (Frankfurt am Main: Peter Lang),
pp. 245–60.
Godøy, R. I., 2011: ‘Sound-action awareness in music’, in D. Clarke and E. Clarke, eds.,
Music and Consciousness (Oxford: Oxford University Press), pp. 231–43.
Godøy, R. I., 2013: ‘Quantal elements in musical experience’, in R. Bader, ed.,
Sound – Perception – Performance (Berlin: Springer), pp. 113–28.
Godøy, R. I., 2014: ‘Understanding coarticulation in musical experience’, in Sound,
Music, and Motion: 10th International Symposium, CMMR 2013, Marseille,
France, 15–18 October 2013, Revised Selected Papers (Berlin: Springer),
pp. 535–47.
Godøy, R. I. and M. Leman, 2010: Musical Gestures: Sound, Movement, and Meaning
(New York: Routledge).
Godøy, R. I., E. Haga and A. Jensenius, 2006: ‘Playing “air instruments”: mimicry of
sound-producing gestures by novices and experts’, in S. Gibet, N. Courty and J.-F.
Kamp, eds., Gesture in Human-Computer Interaction and Simulation: 6th International
Gesture Workshop, Lecture Notes in Artificial Intelligence 3881 (Berlin: Springer), pp.
256–67.
Godøy, R. I., A. R. Jensenius and K. Nymoen, 2010: ‘Chunking in music by coarticulation’,
Acta Acustica united with Acustica 96/4: 690–700.
Goebl, W., S. Dixon, G. De Poli, A. Friberg, R. Bresin and G. Widmer, 2006: ‘ “Sense” in
expressive music performance: data acquisition, computational studies, and models’, in
P. Polotti and D. Rocchesso, eds., Sound to Sense, Sense to Sound: A State of the Art in
Sound and Music Computing (Berlin: Logos Verlag), pp. 195–242.
Grafton, S. T. and A. F. Hamilton, 2007: ‘Evidence for a distributed hierarchy of action
representation in the brain’, Human Movement Science 26: 590–616.
Haken, H., J. Kelso and H. Bunz, 1985: ‘A theoretical model of phase transitions in human
hand movements’, Biological Cybernetics 51/5: 347–56.
Hardcastle, W. J. and N. Hewlett, eds., 1999: Coarticulation: Theory, Data and Techniques
(Cambridge: Cambridge University Press).
Hindemith, P., 2000: A Composer’s World: Horizons and Limitations (Mainz: Schott).
Hogan, N. and D. Sternad, 2007: ‘On rhythmic and discrete movements: reflections,
definitions and implications for motor control’, Experimental Brain Research 181/1:
13–30.
Hunt, A., M. Wanderley and M. Paradis, 2003: ‘The importance of parameter
mapping in electronic instrument design’, Journal of New Music Research 32/4: 429–40.
Husserl, E., 1991: On the Phenomenology of the Consciousness of Internal Time, 1893–1917,
trans. J. B. Brough (Dordrecht: Kluwer Academic).
Jensenius, A. R., 2007: ‘Action–sound: developing methods and tools to study music-related
body movement’ (PhD dissertation, University of Oslo).
Jensenius, A. R., 2013: ‘Some video abstraction techniques for displaying body movement
in analysis and performance’, Leonardo: Journal of the International Society for the Arts,
Sciences and Technology 46/1: 53–60.
Jensenius, A. R. and R. I. Godøy, 2013: ‘Sonifying the shape of human body motion using
motiongrams’, Empirical Musicology Review 8/2: 73–83.
Klapp, S. T. and R. J. Jagacinski, 2011: ‘Gestalt principles in the control of motor action’,
Psychological Bulletin 137/3: 443–62.
Liberman, A. M. and I. G. Mattingly, 1985: ‘The motor theory of speech perception
revised’, Cognition 21: 1–36.
Ligeti, G., 1998: Ten Pieces for Wind Quintet, on London Winds, György Ligeti Edition,
Vol. 7: Chamber Music (Sony SK 62309).
Loram, I. D., H. Gollee, M. Lakie and P. J. Gawthrop, 2011: ‘Human control of an inverted
pendulum: is continuous control necessary? Is intermittent control effective? Is intermit-
tent control physiological?’, The Journal of Physiology 589/2: 307–24.
McGurk, H. and J. MacDonald, 1976: ‘Hearing lips and seeing voices’, Nature 264: 746–8.
Norwald, O., 1969: Lutosławski (Stockholm: Norstedt).
Nymoen, K., 2013: ‘Methods and technologies for analysing links between musical sound and
body motion’ (PhD dissertation, University of Oslo).
Peeters, G., B. L. Giordano, P. Susini, N. Misdariis and S. McAdams, 2011: ‘The timbre
toolbox: extracting audio descriptors from musical signals’, Journal of the Acoustical
Society of America 130/5: 2902–16.
Petitot, J., 1985: Morphogenèse du Sens I (Paris: Presses Universitaires de France).
Petitot, J., 1990: ‘Forme’, in Encyclopædia Universalis (Paris: Encyclopædia Universalis).
Risset, J.-C., 1991: ‘Timbre analysis by synthesis: representations, imitations and variants
for musical composition’, in G. De Poli, A. Piccialli and C. Roads, eds., Representations
of Musical Signals (Cambridge, MA, and London: MIT Press), pp. 7–43.
Rosch, E., C. B. Mervis, W. D. Gray, D. M. Johnson and P. Boyes-Braem, 1976: ‘Basic
objects in natural categories’, Cognitive Psychology 8: 382–436.
Rosenbaum, D., 1991: Human Motor Control (San Diego, CA: Academic Press).
Rosenbaum, D., R. G. Cohen, S. A. Jax, D. J. Weiss and R. van der Wel, 2007: ‘The problem
of serial order in behavior: Lashley’s legacy’, Human Movement Science 26/4: 525–54.
Schaeffer, P., 1966: Traité des objets musicaux (Paris: Éditions du Seuil).
Schaeffer, P. (with sound examples by G. Reibel and B. Ferreyra), [1967] 1998: Solfège de
l’objet sonore (Paris: INA/GRM).
Schäffer, B., 1976: Introduction to Composition (Warsaw: PWM Edition).
Sethares, W. A., 2007: Rhythm and Transforms (Berlin: Springer).
Smith, B., ed., 1988: Foundations of Gestalt Theory (Munich and Vienna: Philosophia
Verlag).
Thom, R., 1983: Paraboles et catastrophes (Paris: Flammarion).
Xenakis, I., 1992: Formalized Music, rev. edn (Stuyvesant, NY: Pendragon Press).
Reflection
Lucia D’Errico, guitarist and graphic designer
There is no optical space in my experience of music. If I leave aside a spontaneous
association of pitches with fields of colour (so flat and vibrant, though,
that they acquire almost a haptic quality), the role of sight is relegated to the
preliminary and purely intellectual moment of musical notation. The shape
that delineates itself when listening to or making music is rather the blind den-
sity of my own body. It is a body subjected to forces of different magnitude that
act from both inside and outside itself.
This shape is kept in dynamic tension by four force lines: the first (dis-
charge) anchors it to the ground, the second (charge) keeps it upright, the third
(advance) propels it forwards, and the fourth (recoil) backwards. Synchronic
musical elements organize themselves around these lines in a way that is sche-
matized in Figure R.1. Thus, whereas the bass has the role of a hidden region
where both balance and drive are located, melody is the recognizable and com-
municating part, as are the face and the hands. Harmony connects and reg-
ulates these regions like an organ system, and rhythmical elements fulfil the
motoric function. These tensions/elements can amalgamate, as well as inter-
change functions, as in a harmoniously working human body; but a musician
can also choose to dissociate or to omit some of them. It is in one such case that
we experience the harrowing beauty of the aria ‘Aus Liebe will mein Heiland
sterben’, from Johann Sebastian Bach’s St Matthew Passion. The accompani-
ment of the voice is restricted to high-sounding instruments only: our breath is
reduced to the length of our air tube, and whatever stands beneath is paralysed,
forgotten.
What is a more wonderful example of this body-like musical shape than the
song ‘Das ist ein Flöten und Geigen’, from Robert Schumann’s Dichterliebe,
based on a text by Heinrich Heine? A wedding feast is taking place, but that
of the poet’s beloved; he is, so to speak, peeping in through the window. On a
gauche waltz rhythm (the dance of sexual liberation at the time) that reproduces
the musical frenzy of the party, the right hand of the piano weaves a suspended,
almost religious obbligato: ‘Dazwischen schluchzen und stöhnen/Die lieblichen
Engelein’ (‘in between, sobbing and groaning,/the lovely little angels’). One
single sonic sensation contains the bodily giddiness of the happy couple and
the dejected inertia of the onlooker.
FIGURE R.1 Schematization of bodily music-shape forces (in colour at ). Force lines: CHARGE, DISCHARGE, ADVANCE, RECOIL. Musical elements: MELODIC, HARMONIC, BASS, RHYTHMIC.
These four force lines need not be intended as vectors that cause a move-
ment throughout time, but rather as internal potentialities. Advance and recoil
are not forces that establish a chronological order; they interact with it, gen-
erating micro variations and perturbations inside a steady sequential grid.
Advance is not acceleration, but longing. Recoil is not ritardando, but lingering.
Additionally, this bodily shape, so complex and changing in itself, moves inside
another shape, which I would call architectural: the diachronic dimension of
music. Again, it is not an architecture one can see, but rather a space to cross
with blind eyes. This, depending on the levels of complexity, might resemble a
palace, a hut, or even a garden or a desert; it might have varying temperature
and light (but no optical shapes!). As a listener, I am led to move in unexplored
spaces. As a performer, it is I who is trying to lead someone else through an
architecture I know well. As a composer, I conceive this architecture first and
then try to inhabit it until I am ready to distinguish and remember all of its
details.
Strange as it may sound, something very similar happens in my work as
a graphic designer. There are no optical shapes beforehand: there are forces,
which organize themselves on the empty canvas. The result is not predeter-
mined, but issues from the coagulation of these physical drives into visual ele-
ments. It is not a question of reproducing the visible, but of making visible
(Paul Klee). I do not know the subject I want to design, since it is dictated afterwards
by the arrangement of vectors I perceive somatically. For the same reason, the
habit of organizing music in an optical way as a timeline is as serviceable as it
is misleading. A musical experience is not the sonic rendering of a linear score.
On the contrary, a score should be nothing but the code, the deciphering of
which might recreate a planned spatial and haptic experience in the listener
through sound.
2
Shape, drawing and gesture
EMPIRICAL STUDIES OF CROSS-MODALITY
Mats B. Küssner
Human processing of sound and music as a multimodal phenomenon
Music—as pertaining to the very act of shaping sounds over time during a
performance—engages most of our senses. As audience members in a concert,
we hear the musical sounds, we see the musicians on stage, and we feel the
rhythmic beat, only to realize that we have been tapping our finger to it, and
perhaps we taste a moment of sweetness during an intensely emotional passage.
Although the latter seems metaphorical, it also seems an apt description, sug-
gesting an underlying mapping from sound experience to taste (Knöferle and
Spence 2012). Even sitting at home and listening to a record in solitude with
eyes closed necessarily entails a multimodal experience as we map features of
the musical sound onto other domains, particularly the spatial and visual. That
is the central argument of the chapter. We feel the melodic line ascending and
descending; we feel we are moving or being moved forward, gently at times or
with sudden force; we sense the brightness or gloominess of some passages; or
perhaps we conjure up internal images that the music invoked in us and that
now become an integral part of our listening experience. How do we map music
onto other domains and why do we do it so readily? In this chapter, I address
the former question in some depth by reviewing studies on individuals’ draw-
ings and gestures in response to musical sounds. I introduce these multifaceted
shapes of sound and music as a way of studying music perception and cogni-
tion empirically, and outline methodological issues and challenges. To begin
with, however, I take a very brief look at some potential explanations for why
these cross-modal mappings may exist in the first place.
It is possible that our brains have evolved to be equipped with an innate
capacity for auditory-visual correspondences (Walker et al. 2010), though such
a view is currently contested (Lewkowicz and Minar 2014). What appears to be
undisputed, however, is that learning plays a crucial role in shaping cross-modal
correspondences (Spence and Deroy 2012). From an evolutionary perspective,
by far the most common mode of music listening is the experience of musical
sound emerging from social contexts. In communal activities—perhaps origi-
nally serving the purpose of group cohesion and bonding (Roederer 1984)—
we see and hear sounds being produced by our conspecifics who use various
gestures, postures and possibly instruments to create and/or accompany musi-
cal sounds. Indeed, the earliest couplings of visuo-spatial and auditory cues
are likely to happen in parent–infant interactions such as mothers singing to
their child (Trehub and Trainor 1998), displaying a wide range of (exaggerated)
expressive behaviour (e.g. facial expressions, gestures, etc.). And while we form
cross-modal associations by observing others making sounds and music, we
are also sensitive to cross-modal mappings of music in our own bodies, for
instance when we sing. Through proprioceptive feedback, we are able to feel
the rise of our larynx when producing a high-pitched sound with our voice
(Parkinson et al. 2012), or we might notice the raising of our eyebrows (Huron,
Dahl and Johnson 2009; Huron and Shanahan 2013). Through repeated expo-
sure to such couplings of perception and action, we form stable associations
between the actions performed and the sounds being heard, to the point where
both form a common representation (Prinz 1990).
Many of the effects found in cross-modal perception of music, and indeed
perception in general, have their origin in speech perception and cognitive lin-
guistics. The idea that the perception of speech is not merely the processing of
physical properties of the sound but largely based on an internal simulation of
the actions that produced the sound—formalized in the motor theory of speech
perception (Galantucci, Fowler and Turvey 2006; Liberman and Mattingly
1985)—has been highly influential in the cognitive sciences. But, of course,
apart from the biological mechanisms, the impact of culture (e.g. through
language) is evident and manifested in cross-cultural differences in mappings
of pitch, for instance (Dolscheid et al. 2013; Eitan and Timmers 2010).
Influentially, Lakoff and Johnson (1980) argued that conceptual metaphors are
based on our experiences and interactions within a cultural environment, shap-
ing the way we think and perceive the world. That is, we may use our experi-
ence of MORE IS UP, LESS IS DOWN—originally referring to the numerous
instances in the physical world (i.e. the so-called source domain)—and map it
onto an abstract domain (i.e. the target domain) such as the pitch space where
MORE refers to higher pitches (see Zbikowski 2002).
All of these accounts have in common cross-modal experiences shaped by
bodily experiences within a particular cultural environment. In terms of cross-
modal mappings of music, my account is in line with scholars arguing that the
interaction of modalities is the primordial mode of music listening (Godøy
2003) and that music perception is a multimodal phenomenon rooted in our
bodies as a natural mediator between the physical world and musical experi-
ence (Leman 2007). If the body plays a central role in making sense of music,
then studying music perception through overt bodily responses such as draw-
ings and gestures should tell us something about this meaning-making process.
Traditional experimental paradigms of cross-modal correspondences
How sound and music are mapped onto the visual and visuo-spatial domains—
with paradigms other than drawing or gesturing—has been reviewed at length
elsewhere (Eitan 2013; Spence 2011) and is not discussed here. However, it is
important to review the experimental paradigms underlying the vast majority
of empirical findings to date to be able to put drawing and gesturing approaches
into context.
To a large extent, increasing knowledge of cross-modal correspondences is
based on reaction-time paradigms that were developed by Garner in the 1960s
around the same time that the cognitive revolution gained momentum, with the
underlying metaphor of the human mind as a computer processing incoming
information.1 According to this view, sensory input from different modalities
is integrated at various levels of processing ranging from early sensory/percep-
tual levels to late semantic levels (for a review see Marks 2004). The speed with
which this processing occurs can be measured in behavioural experiments in
which participants respond to features of a dimension of a modality by press-
ing buttons which have been assigned certain feature values. In the simplest
case, there is only one modality involved, and features are varied only along
one dimension. For instance, participants may be asked to indicate as quickly
as possible whether the pitch (i.e. the relevant dimension) of a sound is high
or low, while the loudness (i.e. the irrelevant dimension) is kept constant. This
task—which has been termed ‘speeded identification’—often serves as a base-
line condition, involving two possible stimuli and two possible responses. If
the irrelevant dimension is varied as well (e.g. loudness: soft and loud), we get
four possible stimuli (high/soft, high/loud, low/soft, low/loud) while the num-
ber of possible responses is still two. In the latter scenario—‘speeded classifica-
tion’—participants’ task is to ignore the variation in the irrelevant dimension
(i.e. loudness) and indicate the feature value (high versus low) of the relevant
dimension (i.e. pitch). While these examples concern a single modality, there is
extensive research combining dimensions from several modalities (for a review
see Spence 2011). Whenever there are longer reaction times in comparison to
a baseline condition due to the variation of features in an irrelevant dimen-
sion or stimulus, this is referred to as ‘Garner interference’. On the other hand,
whenever features from two dimensions, whether within a single modality or
across modalities, are aligned congruently (e.g. high pitch, high elevation)
such that the pairing gives rise to shorter reaction times in comparison to
incongruently aligned features from the same two dimensions (e.g. high pitch,
low elevation), this is referred to as a ‘congruence effect’.
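The two effects defined above reduce to simple differences between mean reaction times, which a minimal sketch can make concrete. The reaction times, variable names and function names below are hypothetical, invented purely for illustration; they are not data or procedures from any of the studies cited, and real analyses would typically involve many trials per participant and inferential statistics.

```python
from statistics import mean

# Hypothetical reaction times in milliseconds. These values are invented
# for illustration and are not data from any study cited here.
baseline_rts = [420, 435, 410, 445]     # speeded identification: loudness held constant
filtering_rts = [470, 455, 480, 465]    # speeded classification: loudness varies but must be ignored
congruent_rts = [400, 415, 405, 420]    # e.g. high pitch paired with high elevation
incongruent_rts = [450, 440, 460, 455]  # e.g. high pitch paired with low elevation


def garner_interference(baseline, filtering):
    """Mean slowing (ms) caused by variation in the irrelevant dimension."""
    return mean(filtering) - mean(baseline)


def congruence_effect(congruent, incongruent):
    """Mean advantage (ms) of congruent over incongruent pairings."""
    return mean(incongruent) - mean(congruent)


if __name__ == "__main__":
    print(f"Garner interference: {garner_interference(baseline_rts, filtering_rts):.2f} ms")
    print(f"Congruence effect: {congruence_effect(congruent_rts, incongruent_rts):.2f} ms")
```

A positive value from the first function would indicate Garner interference (slower responses when the irrelevant dimension varies); a positive value from the second would indicate a congruence effect (faster responses to congruent pairings).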
In such reaction-time experiments it is important either to balance the
position of the response buttons across participants or to manipulate it
deliberately as a further independent variable due to the well-studied effects
of stimulus–response compatibility (Fitts and Seeger 1953). These repre-
sent another classic paradigm within which one may study cross-modal cor-
respondences. Crucially, the role of the participants’ actions, in the form
of button presses, becomes an integral part of the cross-modal mapping.
For instance, in an experimental setting where the two response buttons for
high and low pitch are arranged vertically, a high pitch is classified more quickly as
‘high’ when the corresponding button is the upper rather than the lower one
(Rusconi et al. 2006).
Besides the development and refinement of tasks involving speeded
responses, there is an even older type of paradigm concerned with unspeeded
responses. In fact, most of the early cross-modal mapping experiments con-
sisted of unspeeded tasks, asking participants to locate sounds with different
discrete pitches in space (e.g. Pratt 1930; Trimble 1934).
Another commonly observed unspeeded task is forced-choice matching.
When employing such a paradigm, individuals are asked to choose from a lim-
ited set of responses—there may be several but in some cases as few as two—
the one they think fits best with a stimulus presented. In a series of experiments,
Walker (1987) asked people to match pure tones which varied in frequency,
amplitude, waveform and duration with abstract visual figures which varied in
vertical and horizontal arrangement, size, pattern and shape. But ‘real’ musi-
cal excerpts and prints of paintings have also been used in one of the earliest
empirical studies in which participants were asked to match musical sound to
pictorial representations (Cowles 1935).
All paradigms described thus far have in common that participants’
responses are fairly restricted. While this allows researchers to investigate cross-
modal mappings rigorously by refining their paradigms and manipulations fur-
ther and adding to an ever-increasing body of evidence, the rigour comes at
the cost of richer, qualitative data which provide another fruitful angle on the
object of study: this is why researchers have applied paradigms involving open-
ended responses. Studying cross-modal mappings of sound and music with
free drawings and other bodily gestures opens up new pathways for enquiry. In
the following two sections, I provide an in-depth summary of studies applying
drawing and gesturing paradigms in order to investigate the perceived shape of
sound and music.
Drawings of sound and music
Children’s drawings of sound and music have been studied extensively, creat-
ing a large body of empirical evidence and proving influential for studies with
adults. They are thus reviewed here in some depth before moving on to adults’
drawings of sound and music.
CHILDREN’S VISUAL REPRESENTATION OF SIMPLE SOUND STIMULI AND MUSICAL EXCERPTS
Children’s drawings have played an important role in psychology as it has
been argued that they form a window onto a child’s cognitive development
(Hargreaves 1978; Olson 1970; Piaget and Inhelder 1973; Werner 1980). In a
musical context, drawings of simple sound stimuli and musical excerpts might
thus be seen as insights into music cognition and the development of musi-
cal thinking (Davidson and Scripp 1988). Even though it is a moot question
exactly what these drawings represent—windows onto, or rather reflections of,
musical thinking (Barrett 2000)—they have been studied extensively since the
end of the 1970s, owing to two broadly shared assumptions among researchers
(Barrett 2005: 125): first, young children may not yet have developed the
language to express their musical thinking adequately, and second, some musical
experiences may defy linguistic descriptions and would be better and more
revealingly described nonverbally.
In a series of seminal experiments investigating visual representations of
simple rhythmic fragments, Bamberger (1980, 1982) paved the way for numer-
ous studies investigating children’s, as well as adults’, invented notations
of music. On the basis of shapes produced in her experiments, in which she
asked children aged four to twelve years first to clap a simple rhythm and then
to draw it, Bamberger (1982) proposed a developmental trajectory from ‘rhyth-
mic scribbles’ mimicking the clapping action with the pen, through figural
representations capturing perceptual groupings of the sounds, to metric rep-
resentations displaying the awareness of an underlying metric pulse by assign-
ing each symbol a particular duration. However, this Piagetian view, in which
each stage is replaced by the next, has been challenged by evidence showing
that children acquire a ‘database of strategies’ (Barrett 2005: 130), using one
or several approaches that seem most appropriate given the nature of the task
and the stimuli (Reybrouck, Verschaffel and Lauwerier 2009). For example,
Upitis (1987), among others who extended the work on visual representations
of rhythmic sequences (Davidson and Colley 1987; Davidson and Scripp 1988;
Smith, Cuddy and Upitis 1994), found that, regardless of musical training,
children aged seven to twelve years are all able to make sense of rhythm by
using figural or metrical representations or a combination of both types. Upitis
used various active and passive rhythm tasks, including clapping rhythms,
drawing (and recognizing drawn) rhythms, verbal interpretations and tapping
along, and found that children draw on a large pool of representational strategies.
Importantly, she also emphasized the role of context, and was able to
show in subsequent studies that children are much less likely to represent the
rhythmic structure if it is embedded within an unknown melody (Upitis 1992).
Only when the pitch structure is fairly simple (e.g. an ascending scale) and the
rhythmic structure more complex do children show a more elaborate visual
representation of the rhythmic structure (Upitis 1990).
These findings are echoed by Davidson and Scripp (1988: 222), who call for
‘increasingly divergent paths of rhythm and pitch in representational devel-
opment’, seeing ‘rhythm and pitch in a figure-ground relationship, that is, the
rhythmic “figure” in isolation becomes “ground” when pitch is introduced into
the context of the phrase’ (ibid.: 226). In a musical culture based largely on
pitch, it is perhaps not surprising that children prefer, and find it easier, to draw
the pitch rather than the rhythm of a melody. Recent findings support this ten-
dency: Verschaffel et al. (2010) found that stimuli whose salient feature is the
pitch or the melody give rise to more differentiated visualizations than stimuli
whose salient feature is related to either rhythm or dynamics.
These are all examples of individual musical parameters studied either in
isolation or within the context of simple musical fragments. The question of
whether findings from such studies can and should be generalized to ‘real’
musical excerpts is currently debated (Elkoshi 2002; Reybrouck et al. 2009;
Verschaffel et al. 2010). Asking more than one hundred children aged seven to
eight-and-a-half years to draw rhythmic sequences that were either produced
(in isolation) by the children or part of a musical excerpt they listened to,
Elkoshi (2002) found no correlation between the visualizations of short sound
fragments and the musical excerpt, arguing that this gap cannot be closed and
that one may not infer from one to the other (see also Reybrouck et al. 2009).
On the other hand, more recent evidence suggests that such a correlation may
well exist (Verschaffel et al. 2010). Testing a comparably large group of
eight-to-nine- and eleven-to-twelve-year-olds, with and without musical training,
revealed that the quantity of differentiated visualizations2 in response to short
simple sound stimuli, each of which had been designed to highlight one spe-
cific musical parameter (pitch, duration and loudness), correlated positively
with the quantity of differentiated visualizations in response to real musical
excerpts, chosen to highlight three corresponding musical features (melody,
rhythm and dynamics).
Since proponents of Gestalt approaches may well have a point, it is important
to look at some of the evidence from real/complex musical excerpts. Gromko
(1994) asked sixty children aged four to eight years to sing or play a short
folk song provided by the author and then to ‘write the way the song sounds’
(ibid.: 139). Moreover, the children’s perceptual discrimination was tested in
a standardized rhythm and tonal task. Results revealed a positive correlation
between the musical understanding rating, computed on the basis of the performance
in the singing/playing and the perceptual discrimination task, and the
depiction of rhythmic and tonal elements in their invented notations, suggest-
ing that representation—alongside the more traditional measures of produc-
tion and perception—may indeed reflect the development of children’s musical
understanding. Comparing invented notations of familiar and unfamiliar
melodies of fifty children aged six to nine years with no formal musical train-
ing outside school, Upitis (1990: 94) found that the most commonly produced
shapes and symbols are ‘(a) icons, (b) words, (c) discrete marks for pitches and/
or durations, and (d) continuous lines for pitch and/or mood’. While there was
no apparent effect of age, an effect of familiarity showed that words and pic-
tures were more common for familiar songs—according to the children, that is
enough to recognize the tune—whereas discrete symbols for pitch were more
common for unfamiliar songs. Using the same familiar song as Upitis, ‘Twinkle,
Twinkle, Little Star’, and testing twenty Suzuki-trained3 children aged five to
ten years (duration of training varying from seven months to four years), Hair
(1993) found that apart from the youngest children who used pictures only,
the choice of pictures, icons, music symbols, and abstract lines and shapes was
similarly distributed across levels of age and musical training.
I have shown already that there is no clear developmental trajectory of the
strategies for representing music, but evidence pertaining to the influence of
musical training is contradictory: some researchers have found that increased
levels of musical training in children lead to more differentiated visualizations
of sound and music (Reybrouck et al. 2009; Verschaffel et al. 2010), while oth-
ers have found no effect of training (Hair 1993; Upitis 1987), suggesting that a
great deal depends on the nature of the task and the stimuli.
In one case, musical training even appeared to be detrimental to the accuracy
of the visual representations (Davidson, Scripp and Welsh 1988). The authors
asked more than four hundred musically trained and untrained children, ado-
lescents and adults to notate the two songs ‘Row, Row, Row Your Boat’ and
‘Happy Birthday’. More than 90 per cent of the trained participants aged
twelve to eighteen years were unable to produce a correct conventional nota-
tion for the pitch of ‘Happy Birthday’, while their invented notations showed
fewer errors. In what the authors called ‘concept-driven errors’, many
trained participants assumed that the first and last notes of ‘Happy Birthday’
had to be the same and erroneously ‘corrected’ their invented notations too.
However, a group of trained participants who had focused exclusively on learn-
ing to sight-read songs relied more on their perceptual abilities and made no
conceptual errors, providing evidence that the kind of musical training children
receive significantly affects their musical understanding.
Finally, a study investigating both visual and kinaesthetic responses to
music is particularly pertinent here (Kerchner 2000). Asking twelve musically
trained and untrained children aged seven to eight years and ten to eleven years
to listen to the first movement of Bach’s Brandenburg Concerto No. 2 and to
describe their listening experience verbally, visually (by creating a ‘listening map’)
and kinaesthetically (by moving their body) revealed that the most commonly
addressed ‘perceptual topics’ included ‘instrument, register, continuous motion,
formal sections, repetition, dynamics, tempo, contour, and pattern’ (ibid.: 36–7).
The type of visualization was dependent on age: the younger group created less
differentiated mappings—drawing pictures, the contour or the instruments—
whereas the older group used words and combinations of shapes to represent
both extramusical properties (e.g. mood) and musical parameters such as the beat.
Regarding the kinaesthetic responses, both groups depicted a broad variety of
musical parameters such as ‘beat, subdivided beat, articulation, melodic rhythm,
embellishment, duration, style, phrase, subphrases and motivic fragments, con-
tour, form, and pattern’ (ibid.: 42). Perhaps expectedly, both the visual and the
kinaesthetic responses were more differentiated than the verbal responses.
If the assumption that some musical experiences defy linguistic descriptions
is correct, the same should hold for adults. Indeed, some of the studies aimed
at uncovering aspects of children’s musical understanding through visual rep-
resentations have included adult participants as well. Davidson et al. (1988)
reported that invented notations of ‘Happy Birthday’ by seven-year-olds are
comparable to those of ten-year-olds and untrained adults. Moreover, it was
revealed that children older than nine years, as well as musically untrained
adults, show very stable figural representations, while only participants able to
read music display fully developed metric representations (Bamberger 1982).
Smith et al. (1994) found similar drawings of rhythmic sequences across groups
of musically untrained children and trained and untrained adults. In the next
section I focus on adults’ drawings of sound and music in more detail.
ADULTS’ VISUAL REPRESENTATION OF SIMPLE SOUND STIMULI AND MUSICAL EXCERPTS
Compared to the amount of evidence accumulated from children’s drawings,
that available from adults is considerably smaller, although what evidence does
exist is motivated by a greater variety of research questions. As in studies with
children, there exists a distinction between simple sound stimuli and more com-
plex musical excerpts. Regarding the former type, few studies have been carried
out thus far, approaching the subject from a number of angles.
Influenced by the theorizing of composer and pioneer of musique concrète,
Pierre Schaeffer (1966), work from the fourMs research group4 is pertinent here
(Godøy et al. 2006; Haga 2008). Schaeffer proposed that through repeated expo-
sure to sound segments listeners should disengage from the sound source and
focus entirely on the sonic event, something referred to as acousmatic listening.
By drawing a sonic event on paper over and over again (or simply imagining
the shape of it in one’s mind), otherwise hidden or inaccessible features of the
musical object are supposed to be revealed. Godøy and colleagues (Godøy, Haga
and Jensenius 2006; Godøy 2010) and Haga (2008) tested this in an exploratory
study in which they asked nine participants with varying degrees of musical
training to represent short sound fragments (2–6 seconds long) with a pen on
an electronic graphics tablet. The sound stimuli—produced with traditional and
electronic instruments, as well as taken from the environment—were categorized
according to a typology proposed by Schaeffer (1966), and comprised impulsive,
continuous and iterative sounds, whereby both pitch and timbre were classi-
fied into ‘stable’, ‘unstable/changing’ and ‘undefined’. Although this study was
more concerned with the hand gestures and participants were unable to see the
trace they were creating, the analysis was based on the resultant drawings. It
was revealed that, regardless of their level of expertise, individuals are fairly
consistent, for example in representing pitch with height and the decay of a
percussive sound with a descending line, but differ in respect of sound segments
with multiple features such as a constant pitch and changing timbre, which some
participants represented with a horizontal line while others drew curved shapes.
Küssner and Leech-Wilkinson (2014) were particularly interested in the
influence of musical training on such ‘sound-tracings’, and they asked forty-
one musically trained and thirty untrained individuals to represent visually a
set of pure tones varied in pitch, loudness and tempo. Because the authors were
concerned particularly with the visual representations of the sound stimuli, the
participants could see their drawings (also carried out with a pen on an
electronic graphics tablet) on a screen in front of them. Unlike the experimental procedure by Godøy and
colleagues in which participants drew after they heard the sounds, in this study
individuals were asked to draw along with the sound as it was played. Küssner
and Leech-Wilkinson found that, overall, pitch is represented on the vertical
axis and loudness with size. However, representational strategies chosen
by untrained participants were more varied than those of trained participants.
On the other hand, a comparison of the subgroups of trained and untrained
participants who explicitly stated that they used height for pitch and size for
loudness revealed that musically trained participants are more accurate than
untrained ones, possibly because trained participants’ perception–action cou-
plings have been shaped more extensively.
Another study worth noting here focused on a cross-cultural comparison
of visual representations of sound between the UK, Japan and Papua New
Guinea. Using simple sound stimuli varying in pitch contour and asking par-
ticipants to create marks on a sheet of paper so that other community members
could associate them with the sound heard, Athanasopoulos and Moran (2013)
found that UK participants and Japanese participants familiar with western
notation used the y axis for pitch and the x axis for time, proceeding from left
to right. Participants from a traditional Japanese music background depicted
time vertically, starting at the top and moving down, which probably relates to
traditional Japanese writing. While both UK and Japanese participants used
symbolic representations, Papua New Guineans showed iconic representations,
depicting aspects not deliberately manipulated by the authors such as timbre
(e.g. flute sound) or loudness.
These three studies already give an idea of how drawing paradigms can be
applied in a variety of contexts to address important questions related to music
perception and cognition. But, of course, it is vital to consider drawings of
‘real’ musical excerpts too.
Possibly the earliest study examining listeners’ drawings of music is that
of Hooper and Powell (1970), who sought to shed light on the influence of
the type of music (absolute versus programme), the activity during listening
(accompanying with rhythm instruments versus sitting still and listening care-
fully or for pleasure) and the presentation mode (live versus recorded). Their
results revealed that participants’ drawings were more elaborate when the music
was ‘absolute’, the participants rhythmically engaged and the presentation live.
Discussing their findings in terms of music education, the authors suggest that
especially liveness and participation may give rise to increased visual imagery.
Gromko (1995) investigated the extent to which drawing responses of adults
without formal musical education reflect their musical understanding. To that
end, she presented her 127 participants with various excerpts of classical music
and asked them to ‘create an iconographic representation of the musical sound,
using lines, shapes, or graphics’ (ibid.: 34) and to provide verbal descriptions
of the excerpts. Results revealed that fewer than 50 per cent indicated musical
properties, such as melodic lines, rhythmic groupings or dynamics in their draw-
ings and fewer than 25 per cent in their verbal descriptions. Of those who did,
fewer than 5 per cent represented more than only rhythmic elements (‘enactive
scribbles’). Given that 60 per cent reported some involvement in musical perfor-
mance activities in high school, Gromko concluded that visual representations
reveal little about individuals’ musical understanding.
According to the work of Tan and Kelly (2004), however, there is a clear
difference between musically trained and untrained participants. Unlike other
researchers, the authors presented individuals with short but complete musical
compositions and took great care to suggest as little as possible in their instruc-
tion, since mentioning ‘shapes’, ‘lines’ etc. might have influenced individuals’
choices. It was revealed that trained participants by and large show abstract
representations, focusing on musical properties such as melodic themes, repeti-
tion or timbre. On the other hand, musically untrained participants chose to
depict extramusical ideas such as associative pictures including narratives and
emotions. Their drawings also often included the listener as an agent or nar-
rator. This trend was confirmed in the study by Küssner and Leech-Wilkinson
(2014) in which two short musical excerpts led musically trained participants
to represent their pitch contour, while some untrained participants changed
the strategy they had applied for the pure tones and chose to create pictorial
representations based on associative ideas.
Finally, a drawing study within a clinical setting is worth mentioning here.
De Bruyn and colleagues (2012) asked a group of participants with Autism
Spectrum Disorder (ASD) and a group of controls to draw along with vari-
ous musical excerpts on an electronic graphics tablet, focusing on either the
rhythmic structure or the melodic contour. Results revealed that both groups
performed equally well in the rhythm condition, but participants with ASD
performed slightly better in the melody condition. Overall, the results are inter-
preted as evidence that patients with ASD have no difficulty imitating struc-
tural aspects of the music.5
All of the drawing studies reviewed here can be regarded as involving special
(two-dimensional) types of gestures as well—gestures with the side effect of
creating a visible trace on paper or screen. Next, I turn to empirical evidence of
‘proper’ three-dimensional gestures in response to sound and music.
Gestural representations of sound and music
Empirical research on gestural representations of sound and music is still in
its infancy. To clarify, by ‘gestural representation’ I refer here to experimen-
tal paradigms in which individuals were specifically asked to represent or
depict aspects of (musical) sound with hand gestures.6 Similar to drawing
approaches—though far more sparse—the first studies on gestural representa-
tions of music were carried out with children in a music educational context to
explore new methods of music listening and to shed light on the development
of musical understanding (Espeland 1987; Kerchner 2000). In a more recent
study, Kohn and Eitan (2009) investigated five- and eight-year-old children’s
gestural responses to sound stimuli varied in pitch, loudness and tempo. Their
analysis was based on a procedure in which independent observers trained in
Laban Movement Analysis7 were asked to rate the children’s movements along
the three spatial axes, as well as their muscular energy and speed. Results
revealed that pitch was associated with the vertical axis, loudness with the ver-
tical axis and muscular energy, and tempo with speed and muscular energy.
More specifically, increase in loudness gave rise to upward movements and
heightened muscular energy, while decrease in loudness resulted in down-
ward movements and lowered muscular energy. Pitch was represented with an
upward–downward movement when the pitch contour was increasing–decreas-
ing. However, decreasing–increasing pitch contour did not lead to consistent
downward–upward movements, a result which is interpreted in light of a com-
monly observed bias for increasing–decreasing contours (Eitan and Granot
2006; Küssner et al. 2014).
Küssner et al. (2014) ran a similar experiment with adult musically trained
and untrained participants who were presented with a series of pure tones con-
currently varying in pitch, loudness and tempo. The authors found that—just
as with drawing approaches—musically trained participants show more accu-
rate pitch–height mappings. It was also revealed that the bias for increasing–
decreasing contours does not hold for musically trained participants. There
were multiple strategies for representing the loudness, depending on the com-
plexity of the stimulus: if only loudness was changed, participants associated
the loudness with both the y axis and the z axis; in more complex stimuli, loud-
ness was associated with muscular energy, operationalized as fast shaking-
hand movements. Tempo was associated with the speed of hand movement
and elapsed time with the x axis (see also Küssner 2014). Moreover, this study
revealed interaction effects between the concurrently manipulated auditory
features—such as pitch and loudness affecting the association between tempo
and speed of hand movement—suggesting that gestural mappings of isolated
musical parameters should not automatically be generalized to more complex
auditory stimuli such as music.
Another study investigating isolated musical parameters has been carried
out by Nymoen et al. (2011), in which they asked participants to move a rod in
response to pitched and nonpitched sounds. It was revealed that pitch was most
strongly associated with the vertical axis and loudness with speed and hori-
zontal movements. Using a more restricted instruction, Kozak, Nymoen and
Godøy (2012) asked their participants to carry out either smooth or discon-
tinuous circular hand movements in response to sound stimuli manipulated in
rhythmic complexity, attack envelope, pitch, loudness and brightness. Focusing
on individuals’ ability to synchronize with the sound stimuli, they found that
discontinuous movement patterns resulted in better synchronization, with
musically trained participants performing more accurately in some trials only,
but never worse than untrained participants. Moreover, smooth attack enve-
lopes resulted in more motion, regardless of musical training.
Investigating gestural responses to everyday sounds (more specifically,
action- and nonaction-related sounds), Caramiaux et al. (2014) tested
the hypothesis that action-related sounds would give rise to sound-producing
gestures, whereas nonaction-related sounds would entail the representation
of their sonic shape. Confirming their hypothesis, the authors discovered that
speed profiles of participants’ movements were more similar for nonaction-
than for action-related sounds. It was suggested that the identification of the
source of an action-related sound (e.g. pouring cereal into a bowl) leads to
more idiosyncratic hand gestures than tracing the sonic shapes of nonaction-
related sounds.
There are very few studies investigating how adults represent music—that is,
‘real’ musical excerpts as opposed to a set of musical features (pitch, loudness,
timbre, etc.)—with three-dimensional hand gestures. Haga (2008) asked three
trained dancers and three untrained individuals to respond with spontaneous
gestures to various musical excerpts including pieces by Vivaldi and Ligeti and
one electronic piece of music composed for the purposes of the study. The
results of this observational study showed that there was broad consensus
among trained and untrained participants. The more detailed and complex the
musical excerpt, the more variation was observed in the gestures. Interestingly,
the dancers were often seen adding their own interpretative gestures to fill parts
in the musical excerpts in which a pulse was missing (see also Küssner 2013).
Moreover, it was observed that dancers developed their gestures on repeated
presentation of a musical excerpt, remembering what they had done previously
and exploring further gestural shapings of the music.
In a study using more restricted hand gestures, (western) participants
were asked to move a joystick in response to three pieces of traditional Chinese
guqin music (Leman et al. 2009). Participants repeatedly listened and gestured
along with the music over four sessions in which each piece was presented
twice consecutively. The findings revealed that the relative number of consistent
responses—i.e. similar velocity patterns—grew over the course of the experi-
ment, especially for the two more melodic pieces of the experimental stimuli.
These two melodic pieces also led to progressively similar movement responses
across participants, while the third piece—described by the authors as having ‘a
more narrative character with less fluent melodic line’ (ibid.: 264)—gave rise to
increasingly idiosyncratic movement responses. Besides recording participants’
movements, Leman and colleagues also recorded the movements of the musi-
cian and correlated them with the listeners’ movements. It was found that the
correlation between the musician’s shoulder movement and the participants’
arm movement strengthened over the course of the experiment for the two
melodic pieces, suggesting that the movement velocity patterns are shared not
only between listeners but also to some extent between musician and listener.
In a more recent study by the same group (Maes et al. 2014), the relationship
between music and movement was investigated by comparing listeners’ free
movement responses to music with their linguistic descriptions of the expres-
sive qualities of the music. The musical excerpt used in this study was the begin-
ning of the first movement of Brahms’ First Piano Concerto. The participants
were told: ‘[t]ranslate your experience of the music into free full-body movement.
Try to become absorbed in the music that is presented and express your
feelings into body movement. There is no good or wrong way of doing it. Just
perform what comes up in you’ (ibid.: 71). While the participants moved during
the whole length of the excerpt, the authors identified three ‘heroic’ and three
and ‘lyric’ passages, each thirty seconds long, for the purpose of their analyses.
On the basis of Laban’s Effort–Shape model, participants rated the expressive
qualities of the excerpts on a bipolar scale consisting of twenty-four adjectives,
sixteen pertaining to effort and eight to shape. Using a motion-capture system,
the researchers extracted seven movement features and matched them to the
effort and shape categories. Results revealed that all the movement features
clearly differentiated between the two types of excerpt. For instance, if the
average value for ‘acceleration’ was high for the heroic passages, it was low for
the lyric passages. Moreover, there was an effect of musical training, show-
ing that trained participants achieved higher values for the movement features
‘size’ and ‘height’. This suggests that they moved more and filled more space
with their gestures during the experiment, possibly because they were more
familiar (and comfortable) with the music. Regarding the analysis of the lin-
guistic expressions, there was much agreement among the participants as to
how well a particular adjective described the expressive qualities of the music.
Furthermore, it was found that the extremes of the movement features corre-
lated with the extremes of the adjective scales such that an excerpt which was
rated, for instance, as conveying the expressive qualities ‘big’, ‘broad’, ‘thick’
and ‘exalting’ also gave rise to a high value for the movement feature ‘size’. The
authors interpret their findings as evidence for the sharing of expressive quali-
ties of music in linguistic expressions and body movements.
Having reviewed both drawing and gesture studies and shown the diversity
of contexts in which they were carried out, I now focus on some methodologi-
cal issues and how they can be addressed in future studies.
Methodological issues
When cross-modal mappings of auditory stimuli are studied, the outcome will
depend to a large extent on the specifics of the experiment such as the choice
of stimuli, the experimental setting and the instruction given to participants.
By discussing some of the issues involved I hope to provide a helpful, if by no
means exhaustive, overview for researchers who wish to carry out experiments
on cross-modal mappings of sound and music.
MUSIC VERSUS SOUND
This dichotomy is not specific to the study of cross-modal mappings but can
be found in any other field in which researchers have to face the problem of the
whole versus its parts. Unlike psycho-acousticians who exclusively work with
highly controlled, synthesized sound stimuli, music researchers are particularly
concerned with the unravelling of cross-modal mappings of real music, and a
broadly accepted way to study these is to investigate its constituent parts such
as pitch, loudness or timbre. As I have shown above, the problem with study-
ing characteristics of musical sounds in isolation (e.g. change in pitch) is the
creation of an ontological gap: we cannot be sure that findings from studies
using synthesized pure tones in order to investigate cross-modal mappings of
pitch apply equally to situations in which we listen to the changing pitches of
a musical performance. There are too many other factors involved in the latter
that render generalizations problematic. On the other hand, the choice of ‘real’
musical excerpts as experimental stimuli gives rise to a number of confounding
variables since it is unavoidable that other musical qualities such as dynamics
or articulation—or at least timbral qualities—will be co-varied with pitch. This
makes it difficult, if not impossible, to study causal links. I therefore suggest
that researchers should, whenever possible, include both types of stimuli in
their experiments (e.g. Eitan and Timmers 2010; Küssner and Leech-Wilkinson
2014) in order to get a better idea of the extent to which findings from highly
controlled psycho-acoustical stimuli hold true for musical excerpts, and also of
the extent to which findings from studies using musical excerpts can be repli-
cated by manipulating the musical sound feature of interest in isolation.
PURE TONES VERSUS MIDI
A further option, which might be seen as an attempt to bridge that ontological
gap, may be to synthesize auditory stimuli that resemble ‘real’ musical
sounds, as can be achieved through the use of MIDI. For instance, Eitan
and Granot (2006) used synthesized piano sounds to study cross-modal map-
pings of various musical features such as pitch, dynamics and speed. While
such an approach has the advantage of presenting participants with more
‘natural’ stimuli in comparison with pure tones, it leaves open the question of
whether the same results would have been obtained with, say, guitar or trom-
bone sounds. Any step towards a more ecological musical stimulus comes at the
cost of introducing new variables that need to be controlled for in an experi-
ment aiming to uncover causal relationships. And while advances in music syn-
thesizing software allow features such as ‘expression’ or rubato to be switched
on, the gap between this and human musical performance—though gradually
shrinking—is still very audible. When designing an experiment, researchers
thus need to consider carefully the advantages and disadvantages of employing
MIDI-based sound stimuli.
ISOLATED VERSUS CONCURRENTLY VARIED
Although arguably simplistic compared to real musical excerpts, pure tones can
be synthesized with varying degrees of complexity. However, most studies so
far—at least those concerned with music cognition—have included pure tones
whose features were manipulated in isolation. There is scope for many more
studies using controlled pure tones (or more naturally sounding ones, such as
MIDI sounds) whose features are concurrently varied in a systematic manner
(for recent examples see Eitan and Granot 2011; Küssner et al. 2014). As men-
tioned above, in most cases music consists of the dynamic co-variation of sev-
eral musical parameters. These co-variations may, to some extent, be recreated
in the synthesis of pure tones, achieving more ecologically valid stimuli while
keeping possible confounding variables at a minimum.
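As a minimal sketch of what such a concurrently varied stimulus could look like, the following Python fragment generates the samples of a pure tone whose pitch rises while its amplitude falls. It is an illustration only and does not reproduce the procedure of any study cited here; the function name glide_tone, the sampling rate and all parameter ranges are hypothetical choices.

```python
import math

def glide_tone(duration_s=2.0, f_start=220.0, f_end=440.0,
               a_start=1.0, a_end=0.2, sample_rate=8000):
    """Return samples of a sine tone with concurrent pitch and loudness change.

    Frequency moves exponentially from f_start to f_end (a steady glide in
    pitch terms) while amplitude moves linearly from a_start to a_end.
    All default values are illustrative assumptions.
    """
    n = int(duration_s * sample_rate)
    samples = []
    phase = 0.0
    for i in range(n):
        t = i / (n - 1) if n > 1 else 0.0          # normalized time, 0..1
        freq = f_start * (f_end / f_start) ** t     # current frequency in Hz
        amp = a_start + (a_end - a_start) * t       # current amplitude
        samples.append(amp * math.sin(phase))
        # Accumulate phase so that the frequency change produces no clicks.
        phase += 2.0 * math.pi * freq / sample_rate
    return samples

tone = glide_tone()  # two seconds: pitch rises an octave, loudness fades
```

Because each feature has its own start and end value, one feature can be held constant while the others change, which would allow isolated and concurrent manipulations to be compared within a single stimulus set.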
MUSICAL EXCERPTS VERSUS WHOLE COMPOSITIONS
In almost all studies investigating cross-modal mappings of music, researchers
have used relatively short excerpts from longer musical compositions.
One notable exception is the study by Tan and Kelly (2004) in which musi-
cally trained and untrained participants were asked to depict graphically whole
musical compositions. The authors raised the important issue that short musi-
cal excerpts, when taken out of context within a piece, may lead to varying
visualizations and cross-modal mappings. I agree that the context plays an
important role, perhaps not so much for basic mappings of sound features such
as pitch and loudness but for more elaborate (visual) representations of music
that take into account instrumentation, texture, harmony, repetition and so on.
Even though it is probably hardly ever feasible to include recordings of whole
symphonies in an experiment, there is scope for studying the effects of shorter,
yet complete musical compositions on people’s visual representations, and for
comparing them with responses to shorter, out-of-context, musical excerpts.
LIVE VERSUS RECORDED
The ‘liveness’ aspect of a musical performance has recently attracted increased
attention, relating to topics such as audience engagement (Sloboda 2013),
performer–audience interaction (Whitney 2013) and emotional responses in
the listener (Egermann et al. 2013). Being physically present at a concert might
indeed give rise to quite different visualizations and representations of music
from those engendered when listening to a recording in a laboratory setting.
While one pioneering study (Hooper and Powell 1970) revealed that pictorial
representations of music in a live context led to more elaborate responses, there
is scope for more research of that kind. It should make intuitive sense that the
visual presence of musicians, their body movements and instruments, as well
as the presence of other audience members, may lead one to associate different
shapes from those generated during solitary listening.
ACTIVE VERSUS PASSIVE LISTENING
Apart from the ‘liveness’ aspect, there is evidence that individuals’ motor activ-
ity during listening affects their cross-modal mappings of music. For instance,
it has been suggested that the motor behaviour during listening influences chil-
dren’s visual representations of musical excerpts (Fung and Gromko 2001). A
group of children allowed to move with props or in sand while listening to the
music produced visualizations that included more detailed representations
of rhythm, beat and groupings of notes compared to a group of children who
were asked to sit still. Hooper and Powell (1970) reported similar results for
adults who were accompanying musical excerpts rhythmically: they showed
more elaborate visual representations than groups of adults who were told
either to listen carefully or to listen for enjoyment. It is therefore plausible that
the overt engagement in motor activities shifts our attention to rhythmic prop-
erties, which—during suppression of motor activity—might not have reached
the threshold of consciousness.
NATURE OF TASK: SPONTANEOUS – MANDATORY – ELABORATE
The nature of the task, including experimental stimuli but also the exact word-
ing of the instruction and participants’ interpretation of it, determines what
is being assessed during an experiment. As Rusconi and colleagues (2006)
pointed out in a critique of some classic psychophysical experiments investi-
gating pitch–height mappings, there is a crucial difference between spontane-
ous and mandatory mappings. Spontaneous cross-modal mappings are seen as
occurring automatically, independent of the context and possibly without our
being aware of it, whereas mandatory mappings require our full consciousness
and deliberate action. At best, the latter are used to refine some finding well
supported by empirical evidence; at worst, they introduce highly artificial cat-
egories to an experiment, leading to meaningless responses.
Besides mandatory cross-modal mappings, which are restrained by a lim-
ited choice of response categories, there are also what might be called elabo-
rate responses. Whether spontaneous or not,8 they constitute free, unrestricted
responses to some stimulus, for instance by drawing a sound or a piece of
music. While such paradigms provide richer data than, for instance, reaction
time measures, they often also require some unstandardized analysis proce-
dure, complicating the comparisons between studies. Whichever paradigm
researchers apply after weighing advantages and disadvantages, it is important
that they are aware of the kind(s) of cross-modal mappings they are measuring.
SYNCHRONOUS VERSUS ASYNCHRONOUS
Thanks to the availability of adequate experimental tools such as electronic
graphics tablets (Küssner et al. 2011), researchers investigating visualizations
of sound and music have been able to study ‘sound-tracings’ (Godøy et al.
2006) or the process of visualizing sounds (Küssner and Leech-Wilkinson
2014). This approach might offer an additional angle different from the focus
in most previous studies, i.e. the final product of the sound/music visualization.
Asking participants to draw or gesture along with the sound enables research-
ers to study not only how they map them cross-modally but also the degree to
which they are in synchrony with various sound/musical features. Particularly
for researchers regarding perception as an active process based on action–
perception cycles, paying attention to the action that creates a certain sound
visualization appears to be overdue.
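One simple way in which the degree of synchrony might be quantified is to shift the drawn or gestured trace against the sound feature and look for the lag at which the two correlate most strongly. The sketch below assumes that both have already been resampled to a common rate; the function names pearson and best_lag and the toy data are hypothetical, and the published studies cited above use their own, more elaborate analysis techniques.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def best_lag(feature, trace, max_lag):
    """Return (lag, r) for the lag in samples that maximizes the correlation.

    A positive lag means that the trace follows the sound feature.
    """
    best = (0, -2.0)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            f, g = feature[:len(feature) - lag], trace[lag:]
        else:
            f, g = feature[-lag:], trace[:len(trace) + lag]
        n = min(len(f), len(g))
        r = pearson(f[:n], g[:n])
        if r > best[1]:
            best = (lag, r)
    return best

# Toy data: a rising and falling pitch contour, and a trace that reproduces
# it with a delay of five samples.
feature = [math.sin(2 * math.pi * i / 50) for i in range(100)]
trace = [0.0] * 5 + feature[:-5]
lag, r = best_lag(feature, trace, max_lag=20)
```

The search window max_lag has to be chosen with care: for periodic stimuli, a window longer than one period would admit several equally good lags.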
Future directions
Cross-modal mappings of sound and music have been studied for a long time,
dating back to the work of Carl Stumpf (1883), who investigated metaphori-
cal mappings of pitch, for instance. Matching tasks and reaction-time para-
digms dominated psychological studies on cross-modal correspondences in the
twentieth century and have given rise to an impressive amount of empirical
evidence (Spence 2011), with ample opportunity for future studies. However,
the advent of the embodied cognition research programme (Shapiro 2007) has
led to a rethinking of cognition and put considerable emphasis on the role of
the body and its interaction with the physical environment in cognitive pro-
cesses. Consequently, epistemologies are changing and new paradigms are
being developed that consider more carefully the role of the body in psycho-
logical experiments. In musicology, the formalization of an embodied music
cognition theory (Leman 2007) has given a new impulse to studying music cog-
nition with the direct involvement of the body. Leman’s ‘graphical attuning’
and Godøy’s (2006) ‘sound-tracing’ represent new ways of studying sound and
music cross-modally.
This progress would not have been possible, of course, without the devel-
opment of new technologies. Electronic graphics tablets and motion-capture
systems allow researchers to measure participants’ responses to sound and
music with unprecedented precision. Importantly, and in line with the notion
of embodied, goal-directed actions in real time, they provide insights into
the shaping of cross-modal correspondences rather than its final products—the
shapes. However, such approaches come with issues pertaining to both the equip-
ment and the analysis techniques. For instance, some motion-capture systems
are still very expensive or require custom-made software (Küssner et al. 2011).
Another problem is that participants have to wear markers (Maes et al. 2014),
move large joysticks (Leman et al. 2009) or hold a remote controller in their
hand (Küssner et al. 2014) in order to indicate the shapes of auditory stimuli
with their bodies. Thus, one of the challenges for (music) researchers will be to
develop tools that are both less intrusive and less costly.
Crucially, techniques for analysing tracings of sound and music need to be
developed further. First attempts have been made for the analysis of drawings
(Noyce, Küssner and Sollich 2013), as well as for free three-dimensional gestures
(Nymoen et al. 2013), identifying techniques such as Gaussian Processes, non-
parametric and canonical correlations, and pattern recognition classifiers. Due
to music’s unfolding nature over time, the issue of analysing time-dependent
data is a well-known problem in music psychology and has been discussed
before by scholars concerned with emotional responses to music (Levitin et al.
2007; Schubert and Dunsmuir 1999). Only joint efforts by researchers from
various disciplines such as musicology, psychology, mathematics and computer
science make such endeavours possible nowadays, and it will be pivotal to (con-
tinue to) share testing software and detailed insight into analysis techniques
between researchers in the future.
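As a rough illustration of the non-parametric correlations mentioned above, the following minimal sketch computes a Spearman rank correlation between a pitch contour and the vertical position of a drawn trace sampled at the same time points. It is not the procedure used in the studies cited: the function names and the example data are hypothetical, and a real analysis would also have to address resampling, time lags and autocorrelation in time-dependent data.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    return pearson(ranks(x), ranks(y))


# Hypothetical data: pitch of a stimulus (Hz) and the normalized height of a
# participant's pen on a graphics tablet, sampled at the same time points.
pitch_hz = [220.0, 247.0, 262.0, 294.0, 330.0, 294.0, 262.0, 220.0]
pen_height = [0.10, 0.18, 0.25, 0.40, 0.55, 0.42, 0.30, 0.12]

rho = spearman(pitch_hz, pen_height)
print(f"Spearman correlation between pitch and pen height: {rho:.2f}")
```

A rank-based measure is used here because it assumes only a monotonic relationship between sound feature and drawn position, not a linear one; this is a design choice of the sketch rather than a recommendation drawn from the sources cited.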
Conclusion
I have provided an overview of studying the perceived shapes of sound and
music from various methodological and epistemological angles. Traditional
reaction-time paradigms, among other approaches, have revealed how people
map auditory features onto the visual or visuo-spatial domain. Recent stud-
ies involving people’s overt bodily representations of sound and music are an
opportunity to develop a fresh perspective on a familiar subject. The exten-
sive empirical work carried out in the realm of developmental psychology has
shown that children’s drawings can reveal a great deal about how they make
sense of sound and music. And there is no reason to believe that there should
be any less revelatory potential for adults’ drawings. Indeed, first empirical
investigations with adults have proven useful in illuminating the role of musi-
cal training in music cognition, the effect of literacy in cross-cultural compar-
isons of sound shapes and the role of cognitive skills in a clinical setting, to
name but a few contexts. Even more so, free bodily gestures provide scope for
studies investigating cross-modal mappings of sound and music. Although
gestures are ubiquitous in everyday life and often observed in response
to music (from finger tapping to dancing), only recent developments in
motion-capture technologies have turned them into a serious alternative for
studying sound and music cross-modally. Leman’s (2007) ‘second-person
descriptions’—subjective responses to music articulated through verbal or
nonverbal descriptions of bodily phenomena—provide a theoretical basis on
which more research into cross-modal perception and cognition by means of
drawings and gestures can be carried out. Godøy and colleagues realized the
potential of carrying out ‘systematic and large-scale studies of sound-gesture
relationships’ at least a decade ago, if not before (Godøy 1997, 2006). First
attempts have been made—as shown in this chapter—but there is still a long
way to go. In conclusion, perhaps only one thing is clear: if multimodal per-
ception of music is indeed essentially based on our bodies interacting with
the environment, using appropriate body-centred experimental paradigms
and analysis techniques to investigate cross-modal mappings of music will be
a necessary step on our mission to capture the full breadth of human musical
experience.
Acknowledgements
This work was supported by King’s College London and by the AHRC
Research Centre for Musical Performance as Creative Practice (grant number
RC/AH/D502527/1).
References
Athanasopoulos, G. and N. Moran, 2013: ‘Cross-cultural representations of musical
shape’, Empirical Musicology Review 8/3–4: 185–99.
Bamberger, J., 1980: ‘Cognitive structuring in the apprehension and description of simple
rhythms’, Archives de psychologie 48: 171–99.
Bamberger, J., 1982: ‘Revisiting children’s drawings of simple rhythms: a function for
reflection-in-action’, in S. Strauss, ed., U-shaped Behavioral Growth (New York: Academic
Press), pp. 191–226.
Barrett, M. S., 2000: ‘Windows, mirrors and reflections: a case study of adult constructions
of children’s musical thinking’, Bulletin of the Council for Research in Music Education
145: 43–61.
Barrett, M. S., 2005: ‘Representation, cognition, and communication: invented notation in
children’s musical communication’, in D. Miell, R. MacDonald and D. Hargreaves, eds.,
Musical Communication (Oxford: Oxford University Press), pp. 117–42.
Burger, B., 2013: ‘Move the way you feel: effects of musical features, perceived emo-
tions, and personality on music-induced movement’ (PhD dissertation, University of
Jyväskylä).
Caramiaux, B., F. Bevilacqua, T. Bianco, N. Schnell, O. Houix and P. Susini, 2014: ‘The role
of sound source perception in gestural sound description’, ACM Transactions on Applied
Perception 11/1: Article 1.
Claxton, G., 1980: ‘Cognitive psychology: a suitable case for what sort of treatment?’, in
G. Claxton, ed., Cognitive Psychology: New Directions (London: Routledge & Kegan
Paul), pp. 1–25.
Cowles, J. T., 1935: ‘An experimental study of the pairing of certain auditory and visual
stimuli’, Journal of Experimental Psychology 18/4: 461–9.
Davidson, L. and B. Colley, 1987: ‘Children’s rhythmic development from age 5 to 7: per-
formance, notation, and reading of rhythmic patterns’, in J. C. Peery, I. W. Peery and T.
W. Draper, eds., Music and Child Development (New York: Springer), pp. 107–36.
Davidson, L. and L. Scripp, 1988: ‘Young children’s musical representations: windows on
music cognition’, in J. A. Sloboda, ed., Generative Processes in Music: The Psychology
of Performance, Improvisation, and Composition (New York: Oxford University Press),
pp. 195–230.
Davidson, L., L. Scripp and P. Welsh, 1988: ‘ “Happy Birthday”: evidence for conflicts of
perceptual knowledge and conceptual understanding’, Journal of Aesthetic Education
22/1: 65–74.
Davies, E., 2006: Beyond Dance: Laban’s Legacy of Movement Analysis (New York:
Routledge).
De Bruyn, L., D. Moelants and M. Leman, 2012: ‘An embodied approach to testing musi-
cal empathy in participants with an autism spectrum disorder’, Music and Medicine
4/1: 28–36.
Dolscheid, S., S. Shayan, A. Majid and D. Casasanto, 2013: ‘The thickness of musical pitch:
psychophysical evidence for linguistic relativity’, Psychological Science 24/5: 613–21.
Egermann, H., M. Pearce, G. Wiggins and S. McAdams, 2013: ‘Probabilistic models of
expectation violation predict psychophysiological emotional responses to live concert
music’, Cognitive, Affective, & Behavioral Neuroscience 13/3: 533–53.
Eitan, Z., 2013: ‘How pitch and loudness shape musical space and motion: new findings
and persisting questions’, in S.-L. Tan, A. Cohen, S. Lipscomb and R. Kendall, eds.,
The Psychology of Music in Multimedia (Oxford: Oxford University Press), pp. 161–87.
Eitan, Z. and R. Y. Granot, 2006: ‘How music moves: musical parameters and listeners’
images of motion’, Music Perception 23/3: 221–48.
Eitan, Z. and R. Y. Granot, 2011: ‘Listeners’ images of motion and the interaction of
musical parameters’, paper presented at the 10th Conference of the Society for Music
Perception and Cognition (SMPC), Rochester, NY, USA, 11–14 August 2011.
Eitan, Z. and R. Timmers, 2010: ‘Beethoven’s last piano sonata and those who follow croc-
odiles: cross-domain mappings of auditory pitch in a musical context’, Cognition 114/3:
405–22.
Elkoshi, R., 2002: ‘An investigation into children’s responses through drawing, to short
musical fragments and complete compositions’, Music Education Research 4/2: 199–211.
Espeland, M., 1987: ‘Music in use: responsive music listening in the primary school’, British
Journal of Music Education 4/3: 283–97.
Fitts, P. M. and C. M. Seeger, 1953: ‘SR compatibility: spatial characteristics of stimulus
and response codes’, Journal of Experimental Psychology 46/3: 199–210.
Fung, C. V. and J. E. Gromko, 2001: ‘Effects of active versus passive listening on the qual-
ity of children’s invented notations and preferences for two pieces from an unfamiliar
culture’, Psychology of Music 29/2: 128–38.
Galantucci, B., C. A. Fowler and M. T. Turvey, 2006: ‘The motor theory of speech percep-
tion reviewed’, Psychonomic Bulletin & Review 13/3: 361–77.
Godøy, R. I., 1997: ‘Knowledge in music theory by shapes of musical objects and sound-
producing actions’, in M. Leman, ed., Music, Gestalt, and Computing: Studies in
Cognitive and Systematic Musicology (Berlin: Springer), pp. 89–102.
Godøy, R. I., 2003: ‘Motor-mimetic music cognition’, Leonardo 36/4: 317–19.
Godøy, R. I., 2006: ‘Gestural-sonorous objects: embodied extensions of Schaeffer’s con-
ceptual apparatus’, Organised Sound 11/2: 149–57.
Godøy, R. I., 2010: ‘Gestural affordances of musical sound’, in R. I. Godøy and M. Leman,
eds., Musical Gestures: Sound, Movement, and Meaning (New York: Routledge).
Godøy, R. I., E. Haga and A. R. Jensenius, 2006: ‘Exploring music-related gestures by
sound-tracing: a preliminary study’, paper presented at the 2nd ConGAS International
Symposium on Gesture Interfaces for Multimedia Systems, Leeds, UK, 9–10 May 2006.
Gromko, J. E., 1994: ‘Children’s invented notations as measures of musical understanding’,
Psychology of Music 22/2: 136–47.
Gromko, J. E., 1995: ‘Invented iconographic and verbal representations of musical sound:
their information content and usefulness in retrieval tasks’, The Quarterly Journal of
Music Teaching and Learning 6: 32–43.
Haga, E., 2008: ‘Correspondences between music and body movement’ (PhD dissertation,
University of Oslo).
Hair, H. I., 1993: ‘Children’s descriptions and representations of music’, Bulletin of the Council
for Research in Music Education 119: 41–8.
Hargreaves, D. J., 1978: ‘Psychological studies of children’s drawing’, Educational Review
30/3: 247–54.
Hooper, P. P. and E. R. Powell, 1970: ‘Influences of musical variables on pictorial connota-
tions’, Journal of Psychology 76/1: 125–8.
Huron, D. and D. Shanahan, 2013: ‘Eyebrow movements and vocal pitch height: evidence
consistent with an ethological signal’, The Journal of the Acoustical Society of America
133/5: 2947–52.
Huron, D., S. Dahl and R. Johnson, 2009: ‘Facial expression and vocal pitch height: evi-
dence of an intermodal association’, Empirical Musicology Review 4/3: 93–100.
Kerchner, J. L., 2000: ‘Children’s verbal, visual, and kinesthetic responses: insight into their music
listening experience’, Bulletin of the Council for Research in Music Education 146: 31–50.
Knöferle, K. and C. Spence, 2012: ‘Crossmodal correspondences between sounds and
tastes’, Psychonomic Bulletin & Review 19/6: 992–1006.
Kohn, D. and Z. Eitan, 2009: ‘Musical parameters and children’s movement responses’, in
J. Louhivuori, T. Eerola, S. Saarikallio, T. Himberg and P. S. Eerola, eds., 7th Triennial
Conference of the European Society for the Cognitive Sciences of Music (Jyväskylä: ESCOM).
Kozak, M., K. Nymoen and R. I. Godøy, 2012: ‘Effects of spectral features of sound
on gesture type and timing’, in E. Efthimiou, G. Kouroupetroglou and S.-E. Fotinea,
eds., Gesture and Sign Language in Human–Computer Interaction and Embodied
Communication (Berlin: Springer), pp. 69–80.
Küssner, M. B., 2013: ‘Music and shape’, Literary and Linguistic Computing 28/3: 472–9.
Küssner, M. B., 2014: ‘Shape, drawing and gesture: cross-modal mappings of sound and
music’ (PhD dissertation, King’s College London).
Küssner, M. B. and D. Leech-Wilkinson, 2014: ‘Investigating the influence of musical
training on cross-modal correspondences and sensorimotor skills in a real-time drawing
paradigm’, Psychology of Music 42/3: 448–69.
Küssner, M. B., N. Gold, D. Tidhar, H. M. Prior and D. Leech-Wilkinson, 2011:
‘Synaesthetic traces: digital acquisition of musical shapes’, presented at the Supporting
Digital Humanities Conference: Answering the unaskable, Copenhagen, Denmark, 17–
18 November 2011.
Küssner, M. B., D. Tidhar, H. M. Prior and D. Leech-Wilkinson, 2014: ‘Musicians are
more consistent: gestural cross-modal mappings of pitch, loudness and tempo in real-
time’, Frontiers in Psychology 5/789, https://doi.org/10.3389/fpsyg.2014.00789 (accessed
9 April 2017).
Lakoff, G. and M. Johnson, 1980: Metaphors We Live By (Chicago: University of Chicago
Press).
Leman, M., 2007: Embodied Music Cognition and Mediation Technology (Cambridge,
MA: MIT Press).
Leman, M., F. Desmet, F. Styns, L. Van Noorden and D. Moelants, 2009: ‘Sharing musical
expression through embodied listening: a case study based on Chinese Guqin music’,
Music Perception 26/3: 263–78.
Levitin, D. J., R. L. Nuzzo, B. W. Vines and J. O. Ramsay, 2007: ‘Introduction to functional
data analysis’, Canadian Psychology/Psychologie canadienne 48/3: 135–55.
Lewkowicz, D. J. and N. J. Minar, 2014: ‘Infants are not sensitive to synesthetic cross-modality
correspondences: a comment on Walker et al. (2010)’, Psychological Science 25/3: 832–4.
Liberman, A. M. and I. G. Mattingly, 1985: ‘The motor theory of speech perception
revised’, Cognition 21/1: 1–36.
Maes, P.-J. and M. Leman, 2013: ‘The influence of body movements on children’s percep-
tion of music with an ambiguous expressive character’, PLoS ONE 8/1: e54682.
Maes, P.-J., E. Van Dyck, M. Lesaffre, M. Leman and P. M. Kroonenberg, 2014: ‘The
coupling of action and perception in musical meaning formation’, Music Perception
32/1: 67–84.
Marks, L. E., 2004: ‘Cross-modal interactions in speeded classification’, in G. A. Calvert, C.
Spence and B. E. Stein, eds., Handbook of Multisensory Processes (Cambridge, MA: MIT
Press), pp. 85–105.
Noyce, G. L., M. B. Küssner and P. Sollich, 2013: ‘Quantifying shapes: mathematical tech-
niques for analysing visual representations of sound and music’, Empirical Musicology
Review 8/2: 128–54.
Nymoen, K., B. Caramiaux, M. Kozak and J. Torresen, 2011: ‘Analyzing sound tracings:
a multimodal approach to music information retrieval’, paper presented at the 1st
International ACM Workshop on Music Information Retrieval with User-Centered and
Multimodal Strategies (MIRUM), Scottsdale, AZ, USA, 28 November–1 December 2011.
Nymoen, K., R. I. Godøy, A. R. Jensenius and J. Torresen, 2013: ‘Analyzing corre-
spondence between sound objects and body motion’, ACM Transactions on Applied
Perception 10/2: Article 9.
Olson, D. R., 1970: Cognitive Development: The Child’s Acquisition of Diagonality
(New York: Academic Press).
Parkinson, C., P. J. Kohler, B. Sievers and T. Wheatley, 2012: ‘Associations between audi-
tory pitch and visual elevation do not depend on language: evidence from a remote
population’, Perception 41/7: 854–61.
Piaget, J. and B. Inhelder, 1973: Memory and Intelligence (London: Routledge & Kegan Paul).
Pratt, C. C., 1930: ‘The spatial character of high and low tones’, Journal of Experimental
Psychology 13/3: 278–85.
Prinz, W., 1990: ‘A common coding approach to perception and action’, in O. Neumann and
W. Prinz, eds., Relationships between Perception and Action (Berlin: Springer), pp. 167–201.
Reybrouck, M., L. Verschaffel and S. Lauwerier, 2009: ‘Children’s graphical notations
as representational tools for musical sense-making in a music-listening task’, British
Journal of Music Education 26/2: 189–211.
Roederer, J. G., 1984: ‘The search for a survival value of music’, Music Perception 1/3: 350–6.
Rusconi, E., B. Kwan, B. L. Giordano, C. Umiltà and B. Butterworth, 2006: ‘Spatial repre-
sentation of pitch height: the SMARC effect’, Cognition 99/2: 113–29.
Schaeffer, P., 1966: Traité des objets musicaux (Paris: Editions du Seuil).
Schubert, E. and W. Dunsmuir, 1999: ‘Regression modelling continuous data in music psychol-
ogy’, in S. W. Yi, ed., Music, Mind, and Science (Seoul: National University Press), pp. 298–352.
Shapiro, L., 2007: ‘The embodied cognition research programme’, Philosophy Compass
2/2: 338–46.
Sloboda, J. A., 2013: ‘How does it strike you? Obtaining artist-directed feedback from the
audience at a site-specific performance of a Monteverdi opera’, paper presented at the
Performance Studies Network Second International Conference, Cambridge, UK, 4–7
April 2013.
Smith, K. C., L. L. Cuddy and R. Upitis, 1994: ‘Figural and metric understanding of
rhythm’, Psychology of Music 22/2: 117–35.
Spence, C., 2011: ‘Crossmodal correspondences: a tutorial review’, Attention, Perception, &
Psychophysics 73/4: 971–95.
Spence, C. and O. Deroy, 2012: ‘Crossmodal correspondences: innate or learned?’, i-Perception
3/5: 316–18.
Stumpf, C., 1883: Tonpsychologie (Leipzig: S. Hirzel).
Suzuki, S., E. Mills and T. C. Murphy, 1973: The Suzuki Concept: An Introduction to a
Successful Method for Early Music Education (Berkeley, CA: Diablo Press).
Tan, S.-L. and M. E. Kelly, 2004: ‘Graphic representations of short musical compositions’,
Thompson, M., 2012: ‘The application of motion capture to embodied music cognition
research’ (PhD dissertation, University of Jyväskylä).
Trehub, S. E. and L. Trainor, 1998: ‘Singing to infants: lullabies and play songs’, in C. Rovee-
Collier, L. P. Lipsitt and H. Hayne, eds., Advances in Infancy Research, Vol. 12 (Stamford,
CT: Ablex), pp. 43–78.
Trimble, O. C., 1934: ‘Localization of sound in the anterior-posterior and vertical dimen-
sions of “auditory” space’, British Journal of Psychology: General Section 24/3: 320–34.
Upitis, R., 1987: ‘Children’s understanding of rhythm: the relationship between develop-
ment and music training’, Psychomusicology: Music, Mind & Brain 7/1: 41–60.
Upitis, R., 1990: ‘Children’s invented notations of familiar and unfamiliar melodies’,
Psychomusicology: A Journal of Research in Music Cognition 9/1: 89–106.
Upitis, R., 1992: Can I Play You My Song? The Compositions and Invented Notations of
Children (Portsmouth, NH: Heinemann).
Van Dyck, E., 2013: ‘The influence of music and emotion on dance movement’ (PhD dis-
sertation, Ghent University).
Verschaffel, L., M. Reybrouck, M. Janssens and W. Van Dooren, 2010: ‘Using graphical
notations to assess children’s experiencing of simple and complex musical fragments’,
Walker, P., J. G. Bremner, U. Mason, J. Spring, K. Mattock, A. Slater and S. P. Johnson, 2010:
‘Preverbal infants’ sensitivity to synaesthetic cross-modality correspondences’, Psychological
Science 21/1: 21–5.
Walker, R., 1987: ‘The effects of culture, environment, age, and musical training on choices
of visual metaphors for sound’, Perception & Psychophysics 42/5: 491–502.
Werner, H., 1980: Comparative Psychology of Mental Development, 3rd edn (New York:
International Universities Press).
Whitney, K., 2013: ‘Singing in duet with the listener’s voice: a dynamic model of the joint shap-
ing of musical content in live concert performance’, paper presented at the Performance
Studies Network Second International Conference, Cambridge, UK, 4–7 April 2013.
Zbikowski, L. M., 2002: Conceptualizing Music: Cognitive Structure, Theory, and Analysis
(New York: Oxford University Press).
Reflection
Anna Meredith, composer
Shape is both the most important aspect of my composing and the hardest
thing to describe. Before I write any piece, whether a piece for orchestra or an
electronic track, I draw a sketch of its contour along a timeline; so my drawers
are stuffed with pages of jaggy lines, builds and cuts which help me control my
pacing—one of the most important things to me in my music. One of these
sketches and its associated composition can be accessed at .
As to what the lines mean, that’s harder to pin down. At the risk of sounding
flaky, I think the best description might be that they are tracing the energy of
a piece. So a big diagonal build on my sketch might not necessarily mean ‘get
louder’ or ‘get faster’ but could suggest a way of controlling the musical energy
of an idea as my way of showing the trajectory of a line or fragment I’ve come
up with.
When I’m writing a piece, it feels like this drawing/sketching process is my
way of auditioning my ideas: so if I’ve got something, no matter how little,
I then imagine it going through the dramatic shapes I need for the piece to
see if the material will be appropriate. This involves keeping half an eye on a
stopwatch while striding round my studio tunelessly singing bits of the mate-
rial and muttering things like ‘idea breaks apart and glitches here’ or ‘melodic
line builds until it takes over whole ensemble’, to see if I think it’ll work. Once
I’ve got the right ideas, and am confident that they’ll stand up to the drama I’ve
got planned for them, my next step becomes more of a zooming in, looking at
part of the shapes, working out exactly how I get from A to B and filling in the
detail.
3
Cross-modal correspondences and affect in a Schubert song
Zohar Eitan, Renee Timmers and Mordechai Adler
Western music is imbued with conventional mappings of musical features onto
aspects of the human and natural world. Some such correspondences have
become well-established musical symbols. Melodic fall and rise, for instance,
have represented both physical and metaphorical descents and ascents at least
since the ninth century C.E., as settings of ‘descendit de caelis’ and ‘ascendit in
caelum’ in the Credo of the Latin mass attest. Experimental work in psycho-
physics, perception and cognition, however, suggests that such mappings are
not mere conventions of musical style, since mappings of sound dimensions
like pitch and loudness onto nonauditory features (e.g. visual brightness, object
size or height) consistently and automatically occur outside musical contexts.
There is abundant evidence that cross-modal associations involving auditory
features may be activated automatically and implicitly, in particular in the con-
text of simultaneous stimulation. If visual and auditory stimuli, for instance,
co-occur, and participants are asked to respond to one dimension only, the
second dimension nevertheless influences processing of the first dimension. In
particular, presentations of visual stimuli that are congruent or incongruent
with auditory stimuli (e.g. spatial height and pitch ‘height’, visual brightness
and auditory loudness) facilitate or interfere with the processing of the audi-
tory stimuli and vice versa (for reviews of relevant empirical research see Eitan
2013; Eitan and Timmers 2010; Marks 2004; Spence 2011). Furthermore, while
cross-modal mappings of sound are widely reflected in language (e.g. ‘high’
and ‘low’ pitch, ‘bright’ and ‘dark’ sound) and in other conventional symbolic
idioms, such as western music notation, ample research suggests that they may
originate from sources other than language or culturally ingrained convention.
For instance, some audiovisual mappings may be discerned in preverbal infants
and even in nonhuman species. These include correspondences of pitch and
spatial height (Walker et al. 2010; Wagner et al. 1981; see Lewkowicz and Minar
2014 for a critique), pitch and visual shape (e.g. round versus sharp; Walker et
al. 2010), pitch and luminance (Ludwig, Adachi and Matsuzawa 2011), pitch
and physical size (Morton 1994; see also Tsur 2006), and loudness and lumi-
nance (Lewkowicz and Turkewitz 1980). As can be seen from these examples,
mappings of auditory features onto visual-spatial dimensions in particular are
frequent, highlighting a possible central role of notions related to shape.
While experimental studies can suggest the kind of mappings expected to
play a role in music listening, the actual manifestation of cross-modal interac-
tion in music may be confounded by the diversity of mappings that might be acti-
vated simultaneously, and by contextual factors that influence the connotations
activated. We aim to demonstrate that, multiplicity and context-dependency
notwithstanding, an analysis of cross-domain mappings in music, informed by
experimental findings in cross-modal research, can elucidate important aspects
of musical meaning and reference. In particular, we examine the interrelation-
ship between two central pillars of musical meaning: cross-modal and emo-
tional mappings of musical features. Furthermore, we aim to demonstrate how
multiplicity of cross-modal interaction is instrumental in generating complex,
multilayered musical meanings, which in combination may often be most easily
and efficiently summarized by the metaphor of shape. Investigating a musical
setting of a text permeated with references to nonauditory sensory domains
may serve as a useful point of departure for such an endeavour. We chose to
concentrate on Schubert’s well-known (and oft-discussed) setting of Heine’s ‘Die
Stadt’ (from Schwanengesang D. 957), examining both score-based (composi-
tional) and performance-based features, the latter grounded on quantitative
analysis of recorded music.
MULTIPLICITY OF MAPPINGS
An important and under-investigated issue concerning cross-modal correspondences
of musical features is the one-to-many relationships these correspondences
often present: a feature of musical sound or structure may correspond
with diverse nonauditory dimensions. Higher pitch, for instance, corresponds
perceptually with smaller object size, higher spatial location, lighter colour
or sharper (pointed) shape, among other features (Eitan and Timmers 2010).
Likewise, different attributes of music or musical sound may conspire in rep-
resenting a single nonauditory feature. For instance, lower pitch, increased
loudness and longer duration all correspond with larger, heavier objects (for
research reviews see Eitan 2013; Eitan and Timmers 2010; Marks 2004; Spence
2011). Only a few studies, however, have investigated how concurrent variations
in multiple auditory dimensions (ubiquitous in music) affect cross-modal cor-
respondences (but see Adler 2014 for relevant experimental work).
Importantly, in musical contexts, auditory features that map onto features
of other sensory domains (such as vision, touch or motion) also associate regu-
larly with dimensions of emotion, like valence and activity, and with specific
basic emotions. For instance, low pitch, corresponding perceptually with dark
colour and dim light (Ludwig et al. 2011; Marks 1989, 2004; Melara 1989;
Spence 2011), also suggests negative emotional valence, particularly sadness
(Collier and Hubbard 2001, 2004; Eitan and Timmers 2010; Hevner 1937).
These are nonarbitrary mappings, as indicated by implicit cross-modal effects
of emotional associations. For example, musical features associated with emo-
tion influence the emotional processing of visual scenes (Cohen 2001; Boltz
2004). Correspondingly, emotions associated with visual (Boltz 2013; Timmers
and Crook 2014) or verbal stimuli (Weger et al. 2007) influence the perception
of music presented concurrently, such that, for instance, positive or negative
valence associated with words or images may enhance the perception of high
and low pitches respectively.
Furthermore, dimensions of nonauditory modalities that often map onto
features of sound (e.g. luminance or height) may themselves be associated
with emotion. For instance, the dichotomies dim-bright and dark-light, which
map onto pitch and loudness, also associate with emotional valence, such that
brighter light and lighter colour correlate with positive valence. This is evident
both in language (e.g. ‘dark’ and ‘bright’ moods) and in nonverbal measures
of emotion, often expressed implicitly. For instance, positively valenced words
were processed faster when printed in white, rather than black; the opposite was
true for negative words (Meier, Robinson and Clore 2004). Correspondingly,
evaluation words positively or negatively affected brightness perception (Meier
and Robinson 2005; Meier et al. 2007), suggesting that the origin of the valence
attribution may be related to the evaluation of day versus night. Similarly,
spatial height and spatial rise correlate with positive emotion, as suggested
by both language metaphors (feeling high or low, high-spirited) and implicit
nonverbal measures. For instance, the valence of words presented to partici-
pants affects spatial–visual attention, such that positive words shift attention
upwards, and negative words shift attention downwards (Meier and Robinson
2004). Analogously, moving objects up or down enhances recall of positive and
negative episodic memories respectively (Casasanto and Dijkstra 2010; see also
Freddi, Cretenet and Dru 2013, and Meier and Robinson 2005).
In interpreting cross-domain mappings in music, then, an interrelated
triad of mappings should be considered: between sound and other sensory
modalities (in particular visual–spatial), sound and emotion, and nonauditory
modalities and emotion (e.g. low pitch–dark colour, low pitch–sadness, dark
colour–sadness); each of these three types of correspondence itself suggests
multiple mappings (e.g. low pitch may be dark, but also large and spatially
low). Furthermore, some cross-modal mappings of music may be mediated by
shared emotional associations. For instance, Palmer et al. (2013) show, in a
cross-cultural study, that listeners’ colour and emotional associations of musical
pieces are strongly correlated. Thus, for instance, slower music in the minor
mode may be perceived as ‘darker’ since musical features such as minor mode
or slow tempo and visual features (e.g. dark colour) are both associated with
similar (e.g. ‘sad’) emotional quality. Contributions to and modulations of this
triadic mapping between emotion, sound and visual-spatial metaphors lie in
the realm of both performers and composers, and it is something that unfolds
dynamically over time.
As an example that has a rich tradition of performance and commentary,
we explore the interrelationships between cross-modal and affective mappings
of musical features through both score-based and performance analyses of
Schubert’s ‘Die Stadt’, his setting of Heine’s ‘Am fernen Horizonte’. We inves-
tigate how Schubert’s text-setting employs both types of mappings and how
different performances of this lied modulate such mappings. We analyse points
in the music where cross-modal and affective features are aligned, and where
they seem noncongruent (e.g. a sunrise revealing a ‘dark’ emotional state). In
exploring points where different musical features may suggest contradictory
mappings, we investigate both the composer’s and the performers’ strategies in
dealing with such complexities.
We begin with a brief commentary on Heine’s text, examining interrelation-
ships, congruencies and incongruences between its visual, kinaesthetic and
emotional features. We show how a set of contrasts and parallelisms between
the poem’s three stanzas are suggested by these interrelations. These observa-
tions lead to an analysis of the musical structure of Schubert’s setting, reflect-
ing the structures observed in the text analysis. In particular, the musical
analysis emphasizes cross-modal mappings and their relationships with emo-
tion and affect. We compare (using quantitative analysis of acoustic data) three
recorded performances of the lied, by Dietrich Fischer-Dieskau (henceforth
DFD), Ian Bostridge (IB) and Thomas Quasthoff (TQ), examining how the
performed interpretations reflect or modify the interrelationships of cross-
modal and affective mappings in Schubert’s Heine setting. Finally, we discuss
the contribution of the notion of shape in the analysis and its relevance for
coherence of perception within an ever-varied multidimensional context.
Heine’s ‘Am fernen Horizonte’: perception, emotion and narrative structure
A retrospective précis of Heine’s ‘Am fernen Horizonte’ (renamed ‘Die Stadt’ in
Schubert’s setting) may communicate the following: a broken-hearted narrator
travels by boat from dusk to sunrise, gazing at the city in which he has lost his
beloved. As the sun rises, the city is radiantly revealed and with it the narrator’s
glowing heartbreak.
TABLE 3.1 Original text and English translation of ‘Am fernen Horizonte’
Am fernen Horizonte               On the far horizon
Erscheint, wie ein Nebelbild,     Appears, like a misty vision,
Die Stadt mit ihren Türmen,       The town, with its turrets
In Abenddämmrung gehüllt.         Shrouded in dusk.

Ein feuchter Windzug kräuselt     A damp wind ruffles
Die graue Wasserbahn;             The course of the grey water;
Mit traurigem Takte rudert        With mournful strokes
Der Schiffer in meinem Kahn.      The boatman rows my boat.

Die Sonne hebt sich noch einmal   The sun rises once more,
Leuchtend vom Boden empor         Glowing upwards from the earth
Und zeigt mir jene Stelle,        And shows me that place
Wo ich das Liebste verlor.        Where I lost my beloved.
Translation by Richard Wigmore
In the poem itself, however (Table 3.1), this narrative is revealed only at
the very end. The first stanza describes a city seen from afar at dusk. We are
given no explicit information about the narrator’s identity, emotions, actions or
whereabouts (indeed, a naïve reading of this stanza could ascribe it to a third-
person narrator, gazing impartially at a remote view). We know nothing of a
water-trip or of the grief of lost love. We don’t know who gazes at the town or
what (if anything) it means to him.
What we do obtain is considerable visual information. We know that the
town is seen from afar, at the horizon (Am fernen Horizonte). We know that
its outlines are veiled as a foggy image (wie ein Nebelbild) and rather dark,
shrouded in dusk (In Abenddämmrung gehüllt). We know quite a bit about
space and light, but little (at least explicitly) about anything else that matters.
While the first stanza presents a gaze at a remote and static object (the town),
the second stanza is a close-up shot of the narrator’s immediate surroundings
(water, boat, boatman’s rowing), involving both motion (Windzug kräuselt, rud-
ert) and emotion (traurigem). Furthermore, at the end of this stanza it is clear
that the poem is narrated by its own protagonist in a first-person narration (in
meinem Kahn) and that this protagonist is neither objective nor impartial. Even
the oar strokes are described as ‘mournful’ (traurigem Takte), though we can,
at this stage, only guess what the mourning is about.
In perceptual terms, then, the second stanza contrasts with the first with
regard to distance (far–near) and motion (dynamic–static; Table 3.2). Note
that in addition to dimensions of motion and colour (graue Wasserbahn), this
stanza also involves the tactile modality (feuchter Windzug), consistent with the
close-by perspective it presents. These changes in the depiction of perceptual
realms are in line with the changes in narrative perspective, stressing first-per-
son narrative and strong (though still subdued) emotions.
TABLE 3.2 ‘Die Stadt’, stanza 1 versus 2: contrasting and parallel dimensions
Dimension            1st stanza       2nd stanza
Distance             Far              Near
Motion               Static/passive   Dynamic (oar strokes, wind)
Light                Dark, misty      Grey
Sensory modalities   Visual           Kinaesthetic, tactile, visual
Emotion              Implicit         Explicit (‘mournful’)
Narration mode       Yet unknown      First person
It is only in the final (third) stanza, indeed only in its last line, that the
crux of the poem is revealed: we now know what the town means to the narrator
and why he keeps gazing at it from dusk to sunrise.1 Appropriately, the
agent of this narrative and its emotional revelation is the very source of
clarity: the rising sun itself.
TABLE 3.3 ‘Die Stadt’, stanza 1 versus 3: contrasting and parallel dimensions
Dimension            1st stanza    3rd stanza
Object described     Town          Town
Distance             Far           Apparently nearer
Motion               Static        Dynamic, upward (sunrise)
Luminosity           Dark          Bright
Lucidity             Misty         Clear
Sensory modalities   Visual        Visual
Emotion              Implicit      Explicit (‘loss of beloved’)
Narration mode       Yet unknown   First person
The third stanza both parallels and contrasts with the first (see Table 3.3);
no less importantly, it complements it. Both stanzas involve viewing the same
object—the town—and both emphasize the perceptual dimension of visual
brightness. However, the two stanzas contrast with regard to the view itself,
as well as its emotional underpinnings. Visually, the scene is now bright and
painfully clear, highlighted by the glowing, rising sun, and thus contrasted with
the darker, dim view of the opening stanza. Moreover, the last stanza involves
motion and change (particularly upward motion, associated with positive and
active emotions), rather than stasis: the sun is ‘rising from the earth’ (vom
Boden empor).2
These ‘perceptual’ contrasts between the stanzas are accompanied by
emotional and narrative correlates, as the previously veiled connotations
of the distant town now become painfully clear to both reader and pro-
tagonist. However, from another perspective, perceptual metaphor and
emotional import are strikingly incongruous here. As mentioned, visual
brightness (luminosity) and lightness widely serve as metaphors for emo-
tional valence, such that brighter light and lighter colour correlate with
positive valence. This association, evident in verbal metaphor (Stimmung
hellt sich auf—literally, mood brightens up),3 also affects behaviour and
cognition implicitly and automatically, as evidenced in diverse empirical
work (Meier and Robinson 2005; Meier et al. 2007). Similarly, spatial rise
correlates with active and positive emotion, as suggested by both language
metaphors (e.g. Die Stimmung steigt—the mood rises) and nonverbal exper-
imental measures (Casasanto and Dijkstra 2010; Freddi et al. 2013; Meier
and Robinson 2004).
In particular, the rising sun serves as a metaphor for ‘elevated’—hopeful,
cheerful and active—emotions:
• … he who kisses the joy as it flies /Lives in eternity’s sun rise (Blake,
‘Eternity’)
• But soft! What light through yonder window breaks? It is the east,
and Juliet is the sun! (Shakespeare, Romeo and Juliet, II/ii)
Such hopeful or joyful, ‘sunny’ emotions are often associated with renewal or
creation:
• The sunrise is a glorious birth (Wordsworth, ‘Intimations of
Immortality’)
• Was it light that spake from the darkness /Or music that shone from
the word /When the night was enkindled with sound /of the sun or
the first-born bird? (Swinburne, ‘Music: An Ode’)
• Himmelhoch jauchzend, zum Tode betrübt (heavenly rejoicing, then
deathly sorrowing; Goethe, Egmont)
In ‘Die Stadt’, in apparent incongruity with such metaphors and with a host
of empirical studies of the association of light and mood mentioned above, the
rising sun (which, importantly, is the subject and active agent of this stanza: it
‘shows’ the protagonist the town) evokes a ‘dark’ memory and a mood of
mournful, hopeless despair. In the next sections, we investigate how both
Schubert and some of his most prominent present-day performers encounter
this seeming contradiction, as well as other aspects of Heine’s imagery.
Schubert’s reframing of Heine’s narrative
Schubert’s setting of Heine’s evocative text (see Appendix 3.1) has been
extensively analysed from diverse perspectives (e.g. Clark 2002; Hascher
2008; Kerman 1962; R. Kramer 1994; L. Kramer 2003, 2004; Morgan 1976;
Schwarz 1986; Youens 2007). As noted, our main goal in the present analysis is
elucidating how Schubert’s musical setting encounters cross-modal mappings
and their interactions with emotional expression, as suggested by the text. To
lay the ground for that analysis, however, we first present some observations
concerning the structure of Schubert’s song as it relates to Heine’s text.
The vocal sections of ‘Die Stadt’ present an ABA’ design, consistent with
the text’s structure, as described above. The third stanza repeats and comple-
ments the first, while the second stanza contrasts with both. The outer stanzas
(bars 6–14, 27–35), harmonically and melodically closed, are similar to each
other in their harmonic and melodic structures. They also present a similar
rhythmic structure (including the conspicuous dotted rhythms in the piano
accompaniment) and piano texture. The third stanza, however, contrasts with
the first in several conspicuous ‘surface’ features, particularly dynamics (f to ff,
contrasting with the overall pp of the first two stanzas) and register (the piano
accompaniment rises an octave and the bass is doubled). The vocal line also
rises higher than in the opening stanza (to g2, the highpoint of the entire song,
on ‘Liebste’, bar 34) and is more disjunct and angular, presenting the largest
melodic intervals in the song (fourths, fifths, minor sixth, octave; bars 29–31,
33–35). In contrast, the vocal line of both stanzas 1 and 2 presents only seconds
and thirds. Stanzas 1 and 3, then, while structurally almost identical,4 contrast
in conspicuous expressive aspects (dynamics, register, vocal contour, interval
size), the last stanza achieving a more dramatic, decisive closure, complement-
ing the tonally stable yet muted opening.
The musical setting of stanza 2 (bars 14–27), like its text, strikingly contrasts
with those of both outer stanzas. While the settings of both stanzas 1 and 3
depict a harmonically closed structure, a homophonic texture and an arched,
mostly ascending melodic contour, stanza 2 introduces a static yet dissonant
harmony throughout (the ambiguous diminished-seventh chord C–E♭–F♯–A,
which also shapes the melodic line), a florid, arpeggiated accompaniment fig-
ure and a continuously falling vocal contour (descending from the previously
established high point, e♭2, to c1). Together, these features embody a paradoxi-
cal combination of several metaphorical movements: rapid, repetitive surface
motion (the piano figuration), which is yet static (unchanging harmony) and
aimless overall, going nowhere (diminished-seventh chord, harmony devoid of
any clear tonal ‘direction’). This notwithstanding, a constant, steep fall under-
lies the entire stanza (the vocal contour).
Figures 3.1–3.3 quantitatively plot some of the relationships among the
three vocal stanzas, as described above. Figure 3.1 depicts the contour of the
vocal line (top, black line) expressed in terms of the weighted average pitch
per two-bar phrase (weighted according to note duration). Additionally, it
shows the mean absolute melodic interval per two-bar phrase (bottom, grey
line), which indicates how much the pitch of the vocal line varies in successive
two-bar phrases. Figure 3.2 plots the mean intensity (left) and the maximum
intensity (right) per two-bar phrase for three performances of ‘Die Stadt’ (to be
discussed separately later).5 Figure 3.3 plots the mean and standard deviation
of the rhythmic durations present in the vocal line per two-bar phrase.
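The two melodic measures plotted in Figure 3.1 can be made concrete with a minimal sketch. It assumes, purely for illustration, that each note of a two-bar phrase is encoded as a MIDI note number with a duration in crotchets; the function names and the sample phrase are hypothetical and are not taken from the analysis reported here.

```python
def weighted_mean_pitch(notes):
    """Duration-weighted average pitch of one phrase.

    notes: list of (pitch, duration) pairs; pitch as a MIDI note number
    (an assumed encoding), duration in crotchets.
    """
    total_duration = sum(duration for _, duration in notes)
    return sum(pitch * duration for pitch, duration in notes) / total_duration


def mean_absolute_interval(notes):
    """Average size, in semitones, of the steps between successive notes."""
    pitches = [pitch for pitch, _ in notes]
    steps = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    return sum(steps) / len(steps)


# Hypothetical phrase: c1 and d1 (one crotchet each), then e1 (two crotchets).
phrase = [(60, 1.0), (62, 1.0), (64, 2.0)]
print(weighted_mean_pitch(phrase))     # 62.5
print(mean_absolute_interval(phrase))  # 2.0
```

Weighting by duration means that a long held note pulls the phrase average towards its own pitch, whereas the interval measure ignores duration and responds only to how far the line moves from note to note.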
FIGURE 3.1 Mean weighted pitch (black line) and mean absolute pitch interval
(grey line) per two-bar phrase
[Plot not reproduced. Axes: two-bar phrase (horizontal); mean weighted pitch
and mean absolute interval (vertical).]
The figures suggest a complex web of similarities and contrasts between the
three stanzas. Stanzas 1 and 3 are similar in melodic contour, both presenting
an ascending contour (with stanza 3 rising higher), which contrasts with the
descending contour of stanza 2 (Figure 3.1, top). With regard to melodic inter-
vals, however, it is stanza 3, presenting larger intervals, which contrasts with
both stanzas 1 and 2. This pairing is also depicted by intensity (Figure 3.2):
stanza 3 presents (in all three performances) considerably higher intensity than
both earlier stanzas (intensity contours, which differ for all three stanzas, also
vary with performance, which will be discussed later). The rhythm of the vocal
line, on the other hand, shows a process of change (Figure 3.3), in which the
stanzas become more rhythmically diverse, in particular through the presence
of longer durations.
These complex interrelationships notwithstanding, the three vocal stan-
zas could present a fairly conventional narrative structure, in which the outer,
stable stanzas frame a central unstable one, with the last stanza intensifying
and dramatizing the concluding tonal closure through louder dynamics, higher
register and larger melodic intervals. Yet Schubert turns this ‘reasonable’ form
upside-down (or rather, inside-out): he frames the vocal sections with introduc-
tory and concluding sections, both identical to the piano part of the central
second stanza, with its harmonically ambiguous diminished-seventh harmony
and florid arpeggiations.
FIGURE 3.2 Mean intensity (left) and maximum intensity (right) per two-bar
phrase for three performers (DFD, IB, TQ). Intensity was measured from commercial
recordings combining the piano and the vocal line. Interruptions in lines indicate
bars that are separated by piano accompaniment intermezzi.
[Plots not reproduced. Axes: two-bar phrase (horizontal); intensity in dB (vertical).]
FIGURE 3.3 Average rhythmic durations (black line, AvDur) of the vocal line and
standard deviation of rhythmic durations (grey line, StdDur) within successive
two-bar phrases
[Plot not reproduced. Axes: two-bar phrase (horizontal); duration in crotchets
(vertical).]
The expressive and structural implications of this framing have been fre-
quently observed and debated in the critical and analytical literature (e.g. Clark
2002; Kerman 1962; Kramer 2003, 2004; Morgan 1976; Schwarz 1986; Youens
2007), and we do not address them at length. Two related outcomes of this
gambit should be pointed out, however. Structurally, it turns the song from
what could have been a tonally and narratively closed entity (as described
above) to an open one—perhaps (as Morgan 1976 and others suggest) as a
link to other songs in Schwanengesang.6 Narratively, rather than an intermedi-
ate stage connecting dusk (first stanza) and sunrise (third stanza), the nightly
rowing scene of the second stanza is also a frame for the entire song, sup-
plying the material for its opening and closing piano figuration. Due to this
framing, Schubert’s song now takes place in a constant, perhaps eternal limbo,
accompanied by Charon’s constant rowing, leading nowhere; thus, Schubert’s
foggy framing perhaps suggests who the boatman is and what ancient tale—the
Orphean tale of love lost—the narrator (and Heine) is trying to retell.
Structure and cross-domain mappings in Schubert’s ‘Die Stadt’
Having discussed general characteristics of the structure of ‘Die Stadt’, we now
turn to some of the main cross-domain mappings that play a role in connecting
sound and images evoked by the text and by Schubert’s music, discussing in
particular associations with light, distance, motion and emotion.
LIGHT
As noted above, ample experimental research in perception and psychophysics
(for reviews see Eitan and Timmers 2010; Marks 2000, 2004; Spence 2011)
suggests that visual brightness corresponds with auditory loudness
(louder/brighter), pitch height (higher/brighter) and pitch direction (rising
pitch/brighter). Visual brightness is also associated with aspects of the sound
spectrum, particularly spectral centroid (higher/brighter). Research also suggests
associations of colour lightness or brightness with modality (major is lighter
and brighter than minor; Bresin 2005, Palmer et al. 2013), tempo (faster music
associates with lighter colours; Palmer et al. 2013), and interval size (larger
melodic intervals associated with more extreme degrees of brightness or dark-
ness; Hubbard 1996).
Schubert’s ‘Die Stadt’ uses the most conspicuous of these correspondences
unequivocally. Thus, the dimensions contrasting the first and second stanzas,
set in dusk, and the third stanza, depicting sunrise, are those most widely and
conspicuously associated with brightness: sound intensity (which has been
associated with visual brightness even in newborns; Lewkowicz and Turkewitz
1980) and pitch height (associated with colour lightness and brightness both in
humans and in other primates; Ludwig et al. 2011).
Sound intensity also affects the spectral structure of the musical sound
(both piano and vocal), such that louder sound emphasizes higher, ‘brighter’
spectral components; hence, loudness contrasts between the third stanza and
the preceding stanzas entail corresponding differences in spectral ‘brightness’
associated with visual brightness (Griscom and Palmer 2012). To examine
whether the analogy of visual brightness and spectral structure is expressed in
performances of ‘Die Stadt’, we calculated the median spectral centroid for the
three stanzas (piano solo sections excluded) for each performance (Figure 3.4).
Spectral centroid was measured using the Libxtract plugin available in Sonic
Visualiser.7 As the figure shows, the increase in intensity in the third stanza is
indeed accompanied by a rise in spectral centroid (compared to the second stanza)
for all performances, and for two of the three performances (TQ and DFD)
median spectral centroid in the third stanza is also considerably higher than in
the first stanza.
FIGURE 3.4 Median spectral centroid (Hz) per stanza for three performances (DFD,
IB, TQ) of Schubert’s ‘Die Stadt’. Spectral centroid was measured from commercial
recordings combining piano and vocal line.
[Plot not reproduced. Axes: stanza (horizontal); median spectral centroid in Hz
(vertical).]
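As an illustration of the measure summarized in Figure 3.4, the sketch below computes a spectral centroid as the magnitude-weighted mean frequency of a single analysis frame and then takes the median over the frames of a section. The weighting by magnitude (rather than power), the function names and the sample values are assumptions made for this example; the exact definition used by the Libxtract plugin is not stated in the text.

```python
import statistics


def spectral_centroid(frequencies_hz, magnitudes):
    """Magnitude-weighted mean frequency (Hz) of one analysis frame."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: centroid undefined, reported here as 0
    return sum(f * m for f, m in zip(frequencies_hz, magnitudes)) / total


def median_centroid(frames):
    """Median centroid over the frames of one section (e.g. a stanza).

    frames: list of (frequencies_hz, magnitudes) pairs, one per frame.
    """
    return statistics.median([spectral_centroid(f, m) for f, m in frames])


# Hypothetical three-bin spectra; real frames would come from an FFT.
bins = [100.0, 200.0, 300.0]
stanza = [(bins, [1.0, 1.0, 2.0]), (bins, [2.0, 1.0, 1.0]), (bins, [1.0, 2.0, 1.0])]
print(median_centroid(stanza))  # 200.0
```

Shifting energy towards the upper bins raises the centroid, which is the sense in which a louder, spectrally richer sound counts as ‘brighter’.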
Additionally, larger melodic intervals emphasize cross-modal mappings
of pitch and brightness, producing more extreme (bright or dark) mappings
(Hubbard 1996). Hence, the concentration of the largest melodic intervals in
the setting of ‘Leuchtend vom Boden empor’ (literally, ‘glowing upwards from
the earth’) is telling.
A different type of allusion to light quality (yet unaccounted for by cross-
modal empirical research) involves the diminished-seventh sonority, which
frames the song and underlies its central stanza. Due to its symmetrical struc-
ture, the diminished-seventh chord is the most ambiguous sonority in the tonal
harmonic palette, and may be associated (in enharmonic interpretations) with
virtually every tonal centre. Though in its present context this ambiguity is
not exploited, the chord may serve as an apt symbol of the foggy visual (as
well as emotional) quality shrouding the song. Whether this high-level sym-
bolic association (grounded in tonal syntax, rather than basic perceptual cor-
respondences) also affects listeners’ perception is an intriguing question, which
remains to be empirically explored.
DISTANCE
Acoustically, sound intensity (the main physical determinant of perceived
loudness) is the strongest correlate of physical distance, decreasing by approxi-
mately 6 dB with the doubling of the distance from a sound source. Another
acoustical cue for distance is spectral filtering in the upper spectral regions,
which increases with distance (Blauert 1997). In other words, softer sounds, as
well as duller sounds (possessing lesser energy in the higher spectral regions)
are both associated with greater physical distance. As noted, for both vocal
and most instrumental musical sound, higher spectral components tend to be
emphasized as sound intensity, and with it loudness, its main perceptual
correlate, increases (Sundberg 1999; see also Melara and Marks 1990 regarding
interactions of timbre, loudness and pitch).
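The figure of approximately 6 dB per doubling of distance follows from the inverse-square law. The sketch below works this out under the idealizing assumption of a point source in a free field, with no reflections or air absorption; the function name is hypothetical.

```python
import math


def level_change_db(d_near, d_far):
    """Change in sound level (dB) when moving from d_near to d_far.

    Assumes a point source in a free field (inverse-square law), so that
    sound pressure falls in proportion to 1/distance.
    """
    return -20.0 * math.log10(d_far / d_near)


print(round(level_change_db(1.0, 2.0), 2))  # -6.02
print(round(level_change_db(1.0, 4.0), 2))  # -12.04
```

In a reverberant room the drop per doubling is smaller, so the value is best read as an upper bound on how strongly level alone signals distance in a recording.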
Pitch direction and distance change are also associated. An association
between pitch rise and looming (rapidly approaching) motion was found
even for nonhuman primates (rhesus monkeys; Ghazanfar and Meier 2009).
Such association in humans is suggested by a tendency to hear rising pitch
with unchanging intensity as increasing in loudness (and thus as approach-
ing: Neuhoff, McBeath and Wanzie 1999). There also seems to be an acoustical
basis for perceptual correspondences of pitch height or pitch direction with
spatial distance: the Doppler effect, in which frequency is shifted down for a
receding source. Thus, a lower or descending pitch would be associated with
greater or increasing distance. Note, however, that pitch–distance relationships
are not unequivocal: an association between pitch rise and increasing distance
was found in music-related imagery tasks (Eitan and Granot 2006).
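The Doppler relation invoked above can be sketched for the simplest case of a stationary listener and a source moving directly towards or away from the listener. The speed of sound (343 m/s, roughly air at 20 °C), the function name and the example speeds are assumptions chosen for illustration.

```python
def observed_frequency(source_hz, source_speed, sound_speed=343.0):
    """Frequency heard by a stationary listener.

    source_speed > 0: the source recedes; source_speed < 0: it approaches.
    Speeds in m/s; the default sound speed is an assumed value for air.
    """
    return source_hz * sound_speed / (sound_speed + source_speed)


print(round(observed_frequency(440.0, 34.3), 1))   # 400.0 (receding: lower)
print(round(observed_frequency(440.0, -34.3), 1))  # 488.9 (approaching: higher)
```

A source receding at constant speed is heard at a steady lowered pitch; the descending glide familiar from passing vehicles arises as the source moves from approach to retreat.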
Additionally, temporal and spatial distances are strongly associated in per-
ception and cognition, such that shorter duration is congruent with shorter
spatial distance (Merritt, Casasanto and Brannon 2010; Walsh 2003). This sug-
gests that manipulation of tempo or inter-onset intervals (IOI) may also serve
to suggest changes from far (larger distance–slow tempo or longer IOI) to near
(shorter distance–faster tempo or shorter IOI).
For a song beginning with a gaze at the distant horizon (Am fernen
Horizonte), manipulation of such distance-related attributes is of particular
interest. Comparing the setting of the first and third stanzas suggests a far–
near dichotomy through loudness contrasts (pp versus f / ff; see mean and max-
imum intensities in performances of the song, Figure 3.2). This also generates
differences in spectral energy, although the two do not need to be in a direct
linear relationship in performances of the score (compare Figures 3.2 and 3.4).
The low, muted pitch register of the piano accompaniment in the first stanza
emphasizes the sense of vast distance; the higher register in the third stanza is
thus another correlate of the protagonist’s approach to the town and, meta-
phorically, to his own anguish and pain, distant and veiled in the first stanza.
This approach is further indicated by the ascending vocal line, which suggests
a ‘close up’ on the nearing town at the song’s highpoint (‘Liebste’, bar 34), a
highpoint in pitch, loudness and emotional intensity, and the point where the
narrative is finally revealed.
While loudness is the most conspicuous attribute generating the far–near
contrast between the first and the third stanzas, in the second stanza (and in the
framing piano introduction and conclusion, which allude to it) Schubert rather
uses timbre and temporal density (i.e. IOI) to convey the dimension of spatial
distance. As noted above, the text of the second stanza suggests a ‘close up’ on
the narrator’s immediate surroundings (water, boat), thus contrasting with the
gaze at the distant town in the first stanza. This far–near contrast, however, is
not expressed through loudness contrast, as a pianissimo indication prevails
throughout this stanza (in actual performance loudness even tends to decrease,
as Figure 3.2 indicates). Rather, emphasis of the upper partials of the piano
sound may convey physical proximity, as implied by the score and brought out
in performance. The registration and doubling of the repeated diminished-sev-
enth chord, underlying the entire stanza, coincide with the higher overtones
from the bass: c2 (the 8th partial), f♯2 (the 11th partial) and a2 (the 13th partial)
and emphasize proximity rather than distance of the bass. The shorter IOIs
(demisemiquavers) used in the piano arpeggiation of the second stanza further
articulate a shorter distance and ensuing smaller physical dimension (e.g. water
ripples) associated with the text.
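The correspondence between chord tones and overtones can be checked with a short calculation. The sketch assumes that the bass fundamental is C (inferred from the identification of c2 as the 8th partial) and measures, in cents, how far each partial lies from the named pitch class in equal temperament; the names used are hypothetical.

```python
import math

# Semitone offsets above C for the pitch classes named in the text.
PITCH_CLASS_SEMITONES = {"c": 0, "f#": 6, "a": 9}


def cents_from_named_pitch(partial, pitch_class):
    """Deviation (cents) of a partial of C from the nearest octave
    transposition of the named equal-tempered pitch class."""
    cents_above_fundamental = 1200.0 * math.log2(partial)
    target = 100.0 * PITCH_CLASS_SEMITONES[pitch_class]
    # Fold into the range [-600, 600) so that octave placement is ignored.
    return (cents_above_fundamental - target + 600.0) % 1200.0 - 600.0


for partial, name in [(8, "c"), (11, "f#"), (13, "a")]:
    print(partial, name, round(cents_from_named_pitch(partial, name), 1))
```

The 8th partial coincides with c exactly, whereas the 11th and 13th partials lie roughly 49 and 59 cents below equal-tempered f♯ and a respectively, so the match for those two tones is approximate rather than exact.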
Importantly, it is the piano accompaniment, rather than the voice, which
activates the far–near contrasts between the first and second stanzas. The
voice remains remote—soft and low, as well as decreasing in pitch and loud-
ness throughout the stanza (Figures 3.1 and 3.2). The change of distance from
the first stanza to the second involves the external, physical image. It is only
in the third stanza (where distance-related features, particularly loudness and
pitch contour, are also associated with the voice part) that emotional distance
changes too: the approach to the town and to the emotional content associated
with it now becomes personal.
MOTION
Textually, the three stanzas are clearly distinguished from each other in the
qualities of motion they suggest. The first stanza does not allude to motion
in any direct way. The second, in contrast, is full of motion, and suggests two
simultaneous types of movement: the erratic wind, creating ruffles in the water,
and the measured, ‘mournful’ oar strokes (Takte—also musical beats). The
third stanza is underlined by a single majestic motion: the rise of the sun ‘from
the earth’.
Schubert applies two types of mapping to suggest these motion qualities.
One is the direct analogy between temporal aspects of physical motion (e.g.
pace, regularity) and aspects of rhythm (IOI, metric accent). The other analogy
maps pitch space onto physical space, and thus pitch change (e.g. rise or fall,
steps or leaps) onto physical motion (for reviews of relevant cognitive research
see Eitan and Granot 2006; Eitan 2013). In the second stanza (bars 16–26),
both mappings are applied. The arc of rapid arpeggiation on first beats sug-
gests, through both rhythm and pitch contour, the wind and the water ripples
that the wind generates. The boatman’s Takte (the repeated, steady lowering of
the oars into the water) are alluded to by accented As, repetitively descending
two octaves (second and third beats). In the third stanza it is the pitch/space
analogy, applied in the vocal line, which suggests motion (the rising sun): the
rising vocal contour and the concentration of upward leaps (fifth, fourth) on
‘vom Boden empor’ (bar 30).
CROSS-MODAL AND EMOTIONAL CORRESPONDENCES
As noted above, cross-modal and emotional mappings are often highly cor-
related, in music and elsewhere. Low pitch, for instance, is associated with
darker (low lightness) or dimmer (low brightness) visual stimuli, as well as with
negative, low-intensity emotion (e.g. sadness). Correspondingly, negative emo-
tional states—‘dark’ emotions—are themselves associated with darker or dim-
mer visual stimuli. Similarly, high or ascending visual stimuli, high or rising
auditory pitch, and positive, ‘uplifting’ moods are also cognitively associated.
A comparable relationship (as discussed above) may be discerned for slower
tempo (or longer IOI), darker colour and sad emotion.
In the third stanza of ‘Die Stadt’, Heine’s text itself challenges these estab-
lished analogies, relating brightness and spatial rise—both associated with pos-
itive emotion—with the poem’s painful conclusion. Schubert’s setting includes
some robust auditory correlates of increased visual brightness: higher intensity
(and the ensuing spectral ‘brightness’) and pitch register, as well as spatial rise
(overall vocal contour, large accented rising intervals). These, however, are com-
bined with the hallmarks of negative emotional valence in music: minor mode,
and slow tempo and pace. Both features are particularly underscored here: the
minor modality by the use of the lower second degree (bar 32), the slow pace
due to the contrast with the faster motion in the preceding and following piano
arpeggiations. Furthermore, at the vocal line climaxes (bars 29–31, 33–34) the
voice doubles the bass line, rather than the upper line of the piano texture as it
did in the first stanza. Hence, the rise to the song’s highpoint is now paradoxi-
cally associated with the deepest, lowest register.
The result of this combination of features—a loud, brighter sound joined
with intensified minor modality and slow pace—is not a ‘compromise’ between
the apparently conflicting features of increasing brightness and deep sorrow.
Rather, it provides an analogue of the state of tragic revelation suggested by
Heine’s text, a state of extreme ‘darkness visible’ where pain veiled (and muted)
by night, fog and distance is now revealed by the cruel sunlight, revealing to the
protagonist those ‘regions of sorrow … where peace and rest can never dwell,’
and where ‘hope never comes’ (Milton, Paradise Lost, Book 1).
This painful clarity of negative emotion in the third stanza is differentiated
from a fuzzier negativity in the second stanza, with its uncertain dissonant har-
mony. We have argued above that the diminished-seventh chords may associate
with lack of visual clarity. Additionally, uncertainty and dissonance emphasize
negative valence, as does the descent in melodic contour in the vocal line of this
stanza (Timmers and Philippou 2010; Collier and Hubbard 2001). This leaves
the first stanza relatively unaffected emotionally (although within the general
mood conveyed by the minor mode and slow tempo). Indeed, it is the stanza
with the stablest melodic and rhythmic characteristics, containing relatively
small pitch changes and stable rhythmic patterning.
Cross-domain mappings in three performances of ‘Die Stadt’
We now examine in more detail three performances of the score and how these
modify or add to the observed cross-modal mappings and affective associa-
tions. We focus here on performers’ local variations in intensity and tempo.
Both types of variations map in various ways onto cross-modal and affective
dimensions. As discussed above, intensity is closely associated with the distance
of a sound source. Physically, it is also associated with energy, quantity and
size of a sound source: a larger number of sound sources, or sources which are
physically larger, generally create a louder sound. Possibly related to this physi-
cal mapping between intensity and energy is the contribution of variations in
intensity to perceived changes in emotional arousal in music (Schubert 2004;
Coutinho and Cangelosi 2011): intensity, spectral centroid and tempo are all
strong predictors of emotional arousal (Coutinho and Cangelosi 2009, 2011;
Gingras, Marin and Fitch 2014). Not surprisingly, growth in intensity and
energy are associated with increases in physical motion, possibly because larger
and faster motion produces greater impact in sound (e.g. Dahl, Grossbach
and Altenmüller 2011). Moreover, as discussed earlier, intensity tends to be
mapped onto luminosity, with louder sounds perceived as brighter (Lewkowicz
and Turkewitz 1980; Marks 1989). Since visual brightness is itself associated
with emotional valence (Meier et al. 2004; see above), this mapping implies a
contribution of sound intensity to the perception of emotional valence. A sys-
tematic relationship between timbral brightness (sharpness) and valence has
indeed been found (Coutinho and Cangelosi 2009, 2011), while a relationship
between intensity and valence may also be present depending on the context
(Timmers 2007).
Variations in tempo may map onto many of the same dimensions as inten-
sity. Duration is closely associated with physical distance or length (Merritt
et al. 2010). A faster tempo is associated with faster motion
and higher energy. These associations may play a role in the contribution
of tempo to the perception of emotional arousal in music (Coutinho and
Cangelosi 2009, 2011). According to Walker, Walker and Francis (2012), tempo
is one of the dimensions that may be linked through ‘crosstalk’ with various
other dimensions of connotative meaning. Crosstalk refers to mappings that
are activated through other mappings that regularly occur. For example, if
higher pitch is associated with brightness and smaller size, then smaller size
and brighter objects may also be associated, which is indeed what Walker et al.
(2012) demonstrated. In a similar way, faster tempo is linked to smaller, lighter
and sharper objects, which are brighter and spatially higher. Notably, these
analogies also imply an association between faster tempo and positive valence,
assuming other parameters remain equal. Some evidence for the association
of tempo and valence has been found (e.g. Lake, LaBar and Meck 2014), but
further empirical verification is needed within and outside the context of music.
Table 3.4 shows the details of the three recordings used for the quantita-
tive analysis. These performances, by three well-known singers, were deemed
substantially different from each other (based on our subjective listening),
representing possible interpretations of the song and its diverse cross-domain
mappings.
TABLE 3.4 Recorded performances of ‘Die Stadt’ by Fischer-Dieskau, Bostridge and Quasthoff
Singer                     Pianist           Index   Record details (release on CD)
Dietrich Fischer-Dieskau   Gerald Moore      DFD     Deutsche Grammophon (2005)
Ian Bostridge              Antonio Pappano   IB      EMI Classics (2009)
Thomas Quasthoff           Justus Zeyen      TQ      Deutsche Grammophon (2001)
The measurements of intensity and tempo were done in PRAAT, freely
available audio analysis software which includes facilities to annotate audio
files and automatically analyse audio characteristics, including intensity
measures in dB. First, the onsets of two-bar phrases were manually indicated, which
provided a segmentation of the audio file. Onsets were defined to coincide with
the sung initiation of the phrase. Appendix 3.1 indicates the location of
phrase onsets using numerals (phrase 1, phrase 2, etc.). This was done only for
those phrases that included a vocal line, thus excluding the piano introduction,
interludes and coda. For phrases at the end of a stanza, phrase endings were
also determined, coinciding with the onset of the piano interlude. The phrase
onset and offset data were used to segment the music and to calculate phrase
durations. Using this segmentation of the audio file, we extracted the average
and maximum intensities of each two-bar phrase.
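The segmentation step described above can be illustrated with a minimal Python sketch. It assumes the intensity contour has already been exported from the analysis software as two parallel lists (frame times in seconds and levels in dB). The function name `phrase_intensities`, the toy values and the choice of a plain arithmetic mean of the dB values are illustrative assumptions, not details given in the text.

```python
from statistics import mean

def phrase_intensities(times, levels_db, phrases):
    """Return (mean_dB, max_dB) for each (onset, offset) pair in `phrases`.

    times      -- frame times in seconds (hypothetical export format)
    levels_db  -- intensity in dB at each frame
    phrases    -- list of (onset, offset) tuples in seconds
    """
    results = []
    for onset, offset in phrases:
        # Keep only the frames that fall inside the phrase window.
        window = [db for t, db in zip(times, levels_db) if onset <= t < offset]
        if not window:
            raise ValueError(f"no intensity frames between {onset} and {offset}")
        # Assumption: a plain arithmetic mean of the dB values.
        results.append((mean(window), max(window)))
    return results

# Invented toy contour: one frame every 0.5 s.
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
levels = [60.0, 62.0, 64.0, 62.0, 70.0, 74.0, 72.0, 68.0]
phrases = [(0.0, 2.0), (2.0, 4.0)]  # two manually marked phrases

for i, (avg, peak) in enumerate(phrase_intensities(times, levels, phrases), start=1):
    print(f"phrase {i}: mean {avg:.1f} dB, max {peak:.1f} dB")
```

Phrase durations follow from the same onset and offset pairs as `offset - onset`.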
DYNAMIC SHAPING
Figure 3.2 shows the intensity values per two-bar phrase and their variation
across the three performances. Comparison of the two panels shows a strong
overlap between the profiles in mean and maximum intensity (left and right
panels, respectively), except that the maximum intensity is on average 5 dB
higher than the average intensity (see differences in scale). The intensity profiles
are also very highly correlated for the three singers and clearly separate the
third stanza from the first two, following the forte indication of the score and
the crescendo towards fortissimo in the final phrase, as discussed above.
Focusing on changes in maximum intensity per two-bar phrase within each
stanza, we find that intensity seems to correlate in particular with pitch con-
tour (compare Figures 3.1 and 3.2). Using partial correlations, we can corre-
late measured intensity with the weighted pitch per phrase, after correction for
correlations with a forte indication in the score. This means that the jump in
intensity is accounted for by the forte indication, and the remaining variation is
correlated with pitch height. Table 3.5 shows the resulting partial correlations,
which are strong for IB in particular, followed by TQ and DFD with lower but
still significant partial correlations.
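A first-order partial correlation of this kind can be sketched in a few lines of Python. The sketch uses the standard formula for the correlation of two variables after removing the linear contribution of a third; here the third variable is a 0/1 indicator for phrases marked forte. All data values are invented for illustration, and the helper names `pearson` and `partial_corr` are hypothetical rather than taken from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after controlling for z (first-order partial)."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Invented values for 12 phrases (not the measurements reported in the chapter).
pitch = [62, 64, 65, 67, 67, 65, 64, 62, 65, 67, 69, 70]      # weighted pitch
forte = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]                  # forte indication
intensity = [61, 63, 63, 66, 65, 64, 62, 61, 72, 74, 75, 77]  # max dB per phrase

print(f"raw r     = {pearson(intensity, pitch):.3f}")
print(f"partial r = {partial_corr(intensity, pitch, forte):.3f}")
```

Controlling for the forte indicator removes the jump in level between stanzas, so the partial coefficient reflects only the variation that remains within the soft and loud groups of phrases.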
TABLE 3.5 Partial correlations between pitch and intensity after correction for
correlations with dynamic indications in the score (N = 12)
                     DFD    IB      TQ
Mean weighted pitch  .591*  .813**  .649*
* p < .05; ** p < .01.
Intensity and pitch height are both associated with visual brightness, a
prominent feature of the text. Intensity reinforces associations related to the pitch
of the vocal line. In the first stanza, intensity and pitch rise, which may relate
to the appearance (erscheint) of the city at dusk. In the second stanza, intensity
and pitch descend and dissolve into the rowing motion of the accompaniment,
which itself comes to a standstill in the fermata before the third stanza. The softness
of the low pitch disambiguates the low voice as being depleted of energy
rather than ‘full’ or ‘big’, which low voices can also be (Eitan and Timmers
2010). Additionally, it emphasizes the emotional distance of the protagonist.
While the second stanza involves a physical close-up, the protagonist is psycho-
logically distanced and isolated. In the third stanza, intensity is (as discussed)
a main parameter in the change in psychological distance: from remote and
passive to emotionally involved, and from a darkened, veiled mood to pain-
ful clarity. The increase in intensity and pitch within the stanza sustains this
process until the final tones of the singer, in which the source of the emotional
experience is revealed.
Within this general trend of matching intensity to pitch contour, the singers
deviate to varying degrees from a perfect correlation, which may be a way to
communicate the ambiguity of the pitch ‘rises’. DFD deviates most strongly
from the correlation with pitch height. In the first stanza, this is the case for the
last phrase, where he decreases in intensity rather than increasing, which can be
seen as a depiction of the dusk (Abenddämmrung) and the limbering darkness.
In the third stanza, intensity is high from the start. It builds up to a degree but
diminishes within the final phrase, where the loss of the beloved is acknowledged,
and the return to the dark and subdued mood that follows is anticipated.
Both of these deviations from a matching of intensity with pitch contour seem
to qualify the pitch rises as dark in mood and depressed in emotion. The peaks
in pitch are moderated, emphasizing a distance and darkness of mood.
TEMPORAL SHAPING
As explained before, the duration of sung two-bar phrases was calculated from
the measured phrase onsets and offsets. All two-bar phrases in the first two
stanzas consisted of six crotchets. However, in the final stanza, phrases vary
in score duration: alternating phrases are slightly shorter or longer than six
beats, the longer phrases containing relatively long notes (minims for Boden
and Liebste). To make the phrase durations comparable, the measured dura-
tions were normalized by dividing performed duration by score duration and
multiplying this by 6, to show the duration of all phrases as if they consisted of
six crotchets. The normalized phrase durations can consequently be used as an
indication of relative tempo of the performed music: short (normalized) phrase
durations indicate a relatively fast tempo.
FIGURE 3.5 Normalized phrase duration of successive two-bar phrases in the performance by DFD, IB
and TQ. Interruptions in lines indicate bars that are separated by piano accompaniment intermezzi.
[Line plot: horizontal axis, two-bar phrase; vertical axis, normalized phrase duration; one line each for DFD, IB and TQ.]
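The normalization just described amounts to a single rescaling, sketched here in Python. The function name `normalized_duration` and the example values are hypothetical; only the rule itself (performed duration divided by score duration, multiplied by 6) comes from the text.

```python
REFERENCE_BEATS = 6  # crotchets in a regular two-bar phrase, as stated in the text

def normalized_duration(performed_seconds, score_beats, reference_beats=REFERENCE_BEATS):
    """Rescale a performed phrase duration to a phrase of `reference_beats` crotchets."""
    if score_beats <= 0:
        raise ValueError("score_beats must be positive")
    return performed_seconds / score_beats * reference_beats

# Invented example values (seconds, crotchets); not measurements from the study.
examples = [(7.2, 6), (6.5, 5), (9.1, 7)]
for seconds, beats in examples:
    norm = normalized_duration(seconds, beats)
    print(f"{seconds:.1f} s over {beats} crotchets -> {norm:.2f} s per 6 crotchets")
```

Because every phrase is expressed on the same six-crotchet scale, a smaller normalized value corresponds directly to a faster local tempo.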
Figure 3.5 shows the variations in normalized durations of the phrases in
the three performances. DFD and TQ show very similar profiles of duration
variations (r = .886) shared to a degree by IB as well (r = .797 and r = .715).
All three performances speed up in tempo across the three stanzas. Within this
general trend, the performance of DFD is relatively slow and the performance
of IB relatively fast. The performance of IB is different in its local profile in the
second and third stanzas: IB does not slow down toward the end of the second
stanza, and he does not speed up in the second and third phrases of the third
stanza.
Pairwise correlations indicate that there are no consistent associations between
the variations in normalized phrase duration and any of the measured score fea-
tures (rhythmic duration, rhythmic variability, pitch height) with all correlations
being nonsignificant (|r| < .375, p > .05). The variations in normalized phrase
duration negatively correlate with variations in intensity for DFD and TQ. This
was not the case for IB (Table 3.6). This negative association between dynamics
and duration for DFD and TQ is related to a parallel in global trend rather than
in local trends: in particular in the final stanza, intensity and tempo are high,
compared to the other two stanzas. If this global trend is corrected for (using
partial correlations), the correlation between intensity and duration becomes
nonsignificant for all three performances (|r| < .309, p > .05). Correlations between
the forte indication and phrase durations are reliable for DFD and TQ, however,
and close to significant for IB (Table 3.6). This highlights that tempo is used to
support the increase in proximity, activation and emotional arousal in the third
stanza (particularly in comparison to the first) and possibly (through crosstalk)
also the increase in brightness.
TABLE 3.6 Correlations of duration with phrase intensity and forte indication (N = 12)
Singer            DFD     IB      TQ
Phrase intensity  -.572*  -.304   -.702**
Forte indication  -.589*  -.534x  -.691*
x p = .074; * p < .05; ** p < .01.
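To see why a correlation of about -.53 falls just short of significance with only twelve phrases, the coefficient can be converted into a t statistic. The Python sketch below does this for the forte-indication values in Table 3.6. It assumes a two-tailed test with N - 2 = 10 degrees of freedom, which the chapter does not state explicitly; the function name `t_statistic` is hypothetical.

```python
from math import sqrt

# Two-tailed 5% critical value of Student's t for 10 degrees of freedom.
T_CRIT_DF10 = 2.228

def t_statistic(r, n):
    """t value for testing a Pearson correlation r from n paired observations."""
    return r * sqrt((n - 2) / (1 - r ** 2))

# Forte-indication row of Table 3.6 (N = 12).
forte_row = {"DFD": -0.589, "IB": -0.534, "TQ": -0.691}
for singer, r in forte_row.items():
    t = t_statistic(r, 12)
    verdict = "significant" if abs(t) > T_CRIT_DF10 else "not significant"
    print(f"{singer}: r = {r:.3f}, t(10) = {t:.2f}, {verdict} at the .05 level")
```

Under these assumptions the value for IB lies slightly below the threshold, in line with its description as close to significant.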
If we look at the local variations in tempo within stanzas, it seems that
tempo is used at times to intensify the effects of other elements of the music
and at times to moderate them. In the first stanza, for example, all three perfor-
mances gradually slow down towards the end of the stanza (Abenddämmrung).
In this stanza, the upward motion in pitch is accompanied by a slowing down
rather than a speeding up of tempo, perhaps associating with the stillness of
the evening and the static quality of the scene.
In the second stanza, tempo again decreases gradually for DFD and TQ,
although less strongly than in the first stanza. This time the slowing down
accompanies a decrescendo and a fall in pitch. The decrease in tempo intensi-
fies the associations related to the descent in pitch and intensity of the vocal
line, increasing a sense of the isolation and depression of the protagonist. In
contrast, IB does not slow down in the second stanza but keeps a steady tempo.
This choice may instead emphasize the steady motion of the oars, highlighting
external scenery rather than psychological process.
In the third stanza, the tempo is faster from the start of the stanza for IB,
while DFD and TQ speed up after a slower first phrase. This growth in motion
coincides with an increase in intensity and underlines the emotional intensity
and turmoil of the stanza. The changes in speed, increasing rhythmic irregu-
larity, further contribute to the emotional intensity of this stanza. IB extends
the phrases containing the longer notes (minims for Boden and Liebste), while
DFD and TQ lengthen in particular the final phrase and thus dramatically
accentuate the memory of the beloved (Liebste verlor).
SUMMARY OF COMPARISON OF PERFORMANCES
Of the three performances, IB’s can be seen as providing the most literal read-
ing of the score, being relatively steady in tempo at a local level, and showing a
very strong correlation between intensity and pitch. He emphasizes the global
changes in the poem and the music: there is stronger motion in the second
stanza than in the first, and intense emotional arousal in the third. The dra-
matic climaxes in the third stanza are underlined through intensity and rubato,
extending the moments of intense emotion. His performance is seemingly most
bright and active, employing a relatively fast tempo and rising intensity with
rising pitch.
DFD shows the strongest modification of affective and cross-modal map-
pings. His performance matches the global intensification of the poem and the
music with a global rise in the tempo across stanzas and contrasting dynamics,
in particular in the third stanza. However, each overall increase in tempo is
counterbalanced by a considerable decrease in local tempo. His performance
of the first two stanzas can be heard as the darkest, most mournful of the
three: any surges in brightness through rising pitch are darkened by using softer
dynamics and slower tempos. The emotional instability and intensity of the
third stanza is emphasized through sudden and strong tempo changes, while
the temporary nature of the vivid memory of the beloved is highlighted by an
early return to the subdued character of the start.
In TQ’s performance, the difference in character between the first two stan-
zas is less apparent than in the other performances. The second stanza is not
faster than the first, although there is a contrast in speed between the end of
the first and the start of the second. The two stanzas are also very similar in
overall intensity. At a local level, TQ shows similar uses of tempo and dynamics
to DFD and (in some cases) IB. TQ tapers off the increase in pitch in the first
stanza by limiting the growth in intensity. He also limits the decrease in intensity
in the pitch fall of the second stanza, and slows down only slightly in this stanza,
providing a relatively constant affective character. In contrast, emotional inten-
sity is very strong in his performance of the third stanza, with a sudden rise in
dynamics, spectral brightness (indicated by spectral centroid) and fluctuation in
tempo. The emotional climax is sustained until the final notes of the vocal line.
Discussion and conclusion

CROSS-MODAL MAPPINGS AND EMOTION
Our ostensible purpose in this chapter has been to demonstrate through an
analysis of a specific piece—Schubert’s ‘Die Stadt’—how basic cross-modal
mappings, often investigated by cognitive science in simple experimental set-
tings, may serve to elucidate complex musical meanings, particularly in the
context of musical text-setting. These meanings are defined and modified over
time through a complex interaction of dimensions that come together in the
form of a constantly changing shape, as further discussed below. Specifically,
we explored the relationships between cross-modal, affective and sonic char-
acteristics in Schubert’s song, and the role of music performance in expressing
(and sometimes modifying) these relationships.
Underlying our analysis is an attempt to demonstrate how two central pillars
of musical meaning—cross-modal and emotional mappings of musical
features—interrelate. In musical discourse these two approaches to musical
connotation have had fundamental roles, sometimes complementarily, some-
times competitively. Throughout the history of western music, conventional-
ized musical figures have been used to allude both to aspects of the natural
world and to human passions (particularly, though not solely, in the context
of setting words to music). However, arguments against the propriety of ‘tone
painting’ of the natural world (as contrasted with the depiction of emotion)
have recurred in musical and aesthetic discourse, particularly in the eighteenth
and nineteenth centuries (see Wilson, Buelow and Hoyt 2001 for a historical
survey, and Walden 2013 for discussions of representation in nineteenth- and
twentieth-century music). We have tried to demonstrate how a close musical
reading, informed by empirical psychophysical and cognitive research, may
suggest ways in which emotional and ‘pictorial’ cross-modal connotations of
music can interact.
While reinterpreting and reassociating traditions of musical representation
through the lens of empirical psychological research, the analysis also high-
lights the need for empirical studies of music and emotion to investigate fur-
ther the contribution of cross-modal mappings, including shape. It is striking
how closely and directly many cross-modal mappings of sound also map onto
the valence and arousal dimensions of emotion. Such ‘triadic’ connections of
sound, nonauditory dimensions and emotion include dimensions discussed in
this chapter, such as brightness, clarity, activation, power or motion, as well
as other perceptual dimensions, such as tactile roughness, sharpness (sharp–
blunt) or heat (Eitan and Timmers 2010; Spence 2011). Here we have demon-
strated how mappings of light, distance and motion onto sound may reveal
the interaction of descriptive (cross-modal) and affective meanings. Thus, for
instance, the role of sound intensity in modulating a sense of distance also has
strong affective implications, alluding to a psychological or affective ‘distance’
between subject and object of desire. Slow tempo can similarly induce the idea
of distance, which adds to its expressive power to modulate emotional arousal
and valence.
METHODOLOGICAL CONSIDERATIONS
Methodologically, our analysis attempts to demonstrate how quantitative
analysis (such as measuring intensity and tempo in performance) may be
integrated with ‘softer’ analysis of text, musical features and their relation-
ships. Measurements of pitch, tempo, intensity and duration allowed the
quantification of the relationships among these dimensions, which itself sup-
ported the interpretation of the multidimensional expression of concepts cen-
tral to Heine’s text. However, in the context of an exploratory (rather than
experimental) study, qualitative interpretation of the quantitative analysis is
essential. It is through consideration of the wider musical context that the variations
in measurement dimensions become meaningful. In other words (and
rather obviously), quantitative analysis quantifies the strength of relationships,
while qualitative analysis provides meaning.
There is a strong empirical base for most of the mappings discussed here.
Nevertheless, interpretations of acoustic and compositional characteristics in
terms of affective or cross-modal associations may not necessarily align with
listeners’ conscious experience of the music. Some such mappings may be
latently present: they may become apparent when prompted, as through contextual
reading tasks (Dolscheid et al. 2013). On the other hand, mappings
may influence listeners’ perceptions whether or not listeners intend it. For example,
participants can be instructed to focus attention on one dimension, and believe
they are doing so, yet their responses are nevertheless influenced by the secondary
dimension (ibid.).
While we have focused on mappings of low-level psychoacoustical param-
eters such as pitch height, tempo and loudness, it is not our intention to reduce
music to such basic properties. Rather, we aim to show the richness of connotations
implied even by basic sonic aspects. Nor is it our intention to suggest
that mappings are always linear and simple. Even apparently straightforward
mappings may heavily depend on context. For instance, slow tempo can suggest
relaxation, compared to a faster tempo that precedes it, and thus correspond
with positive valence; yet it can also be perceived as depressed in energy, thus
suggesting a negative connotation (as found, for example, in Timmers 2007).
PERFORMANCE AND THE MULTIPLICITY OF MAPPINGS
The multiplicity of possible mappings (both cross-modal and emotional) for
each musical feature, and the multiplicity of musical features that may associate
with each nonauditory dimension, particularly emphasize the role of musical
and textual context in interpreting such cross-domain mappings. Yet, as we
have shown, context may sometimes leave the contest of mappings unresolved
and even stress an embedded contradiction (as in the case of the painful bright-
ness concluding ‘Die Stadt’).
The power of the performer to contribute to the meaning of music is evident
from the modulation of two dimensions (intensity and tempo) that
have strong affective and cross-modal implications. Moreover, the performer
plays a crucial role in interpreting, clarifying and choosing among possible
mappings: for example, the choice of particular tempo and intensity levels may
modify associations with pitch (Eitan and Granot 2006). The faster tempo and
the correlation between pitch and intensity in IB’s performance may indeed
make this particular performance brighter and more positive in connota-
tion, while we have argued that the decreases in tempo and intensity inserted
in DFD’s performance at the end of stanzas emphasize the music’s darkness
and the protagonist’s isolation. This interpretation of performance extends our
understanding of the ways in which performers contribute to the expressiveness
of music (for an overview, see Fabian, Timmers and Schubert 2014), highlight-
ing how correlations with musical structure and deviations from such correla-
tions can be meaningful, but also emphasizing that performance aspects have
meaning irrespective of correlations with musical structure.
Importantly, modulation of meaning and emotion occurs in real time with
musicians making split-second choices and decisions throughout performance.
The multiplicity of the options for modulating the various dimensions of
sound to produce a variety of possible mappings, interacting in complex ways,
requires an efficient means of managing the overall dynamic profile, and its
affective associations, from moment to moment through a performance of a
score. What is required is a concept that maps easily between domains, on any
hierarchical level, and that can apply equally to scores, performances and expe-
riences, and within them to such aspects as narrative structure, form, loudness,
brightness, tempo, speed, density, register, intensity, harmonic or interval pat-
terning, pitch direction, sound spectrum, distance and timbre—all the dimen-
sions of score, sound and performance that we have discussed here. This is
what shape achieves. In that sense, shape can be seen as a synthesizing tool
that allows musicians to manage the otherwise bewilderingly complex field of
possibilities for action and meaning which cross-domain mapping presents to
performers and listeners from one moment to the next.
CONCLUSION
This analysis of ‘Die Stadt’ suggested ways through which cross-modal map-
pings contribute to the emotional and interpretative meaning of the song. We
emphasized how basic features of the music, including pitch, intensity and
tempo, are carefully employed to provide a multisensory experience of Heine’s
text. The textual context brought forward metaphors related to light, distance
and motion. Our analysis highlighted musical parallels to these metaphors and
affective connotations that come into play through particular treatment in the
composition and its performances, suggesting that the deceptively simple cross-
modal correspondences examined by experimental psychology may combine
to generate a highly complex, multivalenced web of musico-poetic meanings.
Finally, we argued that the process of managing such a complex array of possibilities
can be handled via the notion of musical shape.
Our analysis suggests that cross-modal mappings should be more centrally
included in models of the perception of emotion in music. We see such map-
pings as closely connected to processes captured under the mechanism of ‘emo-
tion contagion’ (Egermann and McAdams 2013) and attributed to relationships
with vocal expression of emotion. Including cross-domain mappings as a
source for affective associations helps to explain a wider variety of emotional
expression beyond the delimited conceptual framework of basic emotions.
References
Adler, M., 2014: ‘Cross-modal interactions and musical representation’, in Hebrew (PhD
dissertation, Tel Aviv University).
Blauert, J., 1997: Spatial Hearing: The Psychophysics of Human Sound Localization (Cambridge,
MA: MIT Press).
Boltz, M. G., 2004: ‘The cognitive processing of film and musical soundtracks’, Memory
and Cognition 32: 1194–205.
Boltz, M., 2013: ‘Music videos and visual influences on music perception and appreciation:
should you want your MTV?’, in S.-L. Tan, A. Cohen, S. Lipscomb and R. Kendall, eds.,
The Psychology of Music in Multimedia (Oxford: Oxford University Press), pp. 217–34.
Bresin, R., 2005: ‘What is the color of that music performance?’, in Proceedings of the
International Computer Music Conference 2005 (Barcelona: ICMC), pp. 367–70.
Casasanto, D. and K. Dijkstra, 2010: ‘Motor action and emotional memory’, Cognition
115: 179–85.
Clark, S., 2002: ‘Schubert, theory and analysis’, Music Analysis 21: 209–43.
Cohen, A. J., 2001: ‘Music as a source of emotion in film’, in P. N. Juslin and J. A. Sloboda, eds.,
Music and Emotion: Theory and Research (Oxford: Oxford University Press), pp. 249–72.
Collier, W. G. and T. L. Hubbard, 2001: ‘Judgments of happiness, brightness, speed and tempo
change of auditory stimuli varying in pitch and tempo’, Psychomusicology 17: 36–55.
Collier, W. G. and T. L. Hubbard, 2004: ‘Musical scales and brightness evaluations: effects
of pitch, direction and scale mode’, Musicae Scientiae 8: 151–73.
Coutinho, E. and A. Cangelosi, 2009: ‘The use of spatio-temporal connectionist models in
psychological studies of musical emotions’, Music Perception 27: 1–15.
Coutinho, E. and A. Cangelosi, 2011: ‘Musical emotions: predicting second-by-second
subjective feelings of emotion from low-level psychoacoustic features and physiological
measurements’, Emotion 11: 921–37.
Dahl, S., M. Grossbach and E. Altenmüller, 2011: ‘Effect of dynamic level in drumming:
measurement of striking velocity, force, and sound level’, in Proceedings of Forum
Acusticum, June 27–July 1, 2011 (Aalborg, Denmark: Danish Acoustical Society), CD-
ROM, pp. 621–24.
Dolscheid, S., S. Shayan, A. Majid and D. Casasanto, 2013: ‘The thickness of musical
pitch: psychophysical evidence for linguistic relativity’, Psychological Science 24: 613–21.
Egermann, H. and S. McAdams, 2013: ‘Empathy and emotional contagion as a link
between recognized and felt emotions in music listening’, Music Perception 31: 139–56.
Eitan, Z., 2013: ‘How pitch and loudness shape musical space and motion’, in S.-L. Tan,
A. Cohen, S. Lipscomb and R. Kendall, eds., The Psychology of Music in Multimedia
(Oxford: Oxford University Press), pp. 165–91.
Eitan, Z. and R. Y. Granot, 2006: ‘How music moves: musical parameters and images of
motion’, Music Perception 23: 221–47.
Eitan, Z. and R. Timmers, 2010: ‘Beethoven’s last piano sonata and those who follow
crocodiles: cross-domain mappings of auditory pitch in a musical context’, Cognition 114:
405–22.
Fabian, D., R. Timmers and E. Schubert, eds., 2014: Expressiveness in Music Performance:
Empirical Approaches Across Styles and Cultures (Oxford: Oxford University Press).
Freddi, S., J. Cretenet and V. Dru, 2013: ‘Vertical metaphor with motion and judgment: a
valenced congruency effect with fluency’, Psychological Research 78/5: 736–48. Available
at doi: 10.1007/s00426-013-0516-6 (accessed 9 April 2017).
Ghazanfar, A. A. and J. X. Maier, 2009: ‘Monkeys hear rising frequency sounds as looming’,
Behavioral Neuroscience 123: 822–7.
Ghazanfar, A. A., J. G. Neuhoff and N. K. Logothetis, 2009: ‘Auditory looming perception
in rhesus monkeys’, Proceedings of the National Academy of Sciences 99: 15755–7.
Gingras, B., M. M. Marin and W. T. Fitch, 2014: ‘Beyond intensity: spectral features effec-
tively predict music-induced subjective arousal’, The Quarterly Journal of Experimental
Psychology 67: 1428–46.
Griscom, W. S. and S. E. Palmer, 2012: ‘The color of musical sounds: color associates of
harmony and timbre in non-synesthetes’, Journal of Vision 12: abstract 74.
Hascher, X., 2008: ‘ “In dunklen Träumen”: Schubert’s Heine-Lieder through the psycho-
analytical prism’, Nineteenth-Century Music Review 5: 43–70.
Hevner, K., 1937: ‘The affective value of pitch and tempo in music’, The American Journal
of Psychology 49: 621–30.
Hubbard, T. L., 1996: ‘Synaesthesia-like mappings of lightness, pitch, and melodic inter-
val’, The American Journal of Psychology 109: 219–38.
Kerman, J., 1962: ‘A romantic detail in Schubert’s Schwanengesang’, The Musical Quarterly
48: 36–49.
Kramer, L., 2003: Franz Schubert: Sexuality, Subjectivity, Song (Cambridge: Cambridge
University Press).
Kramer, L., 2004: ‘Odradek analysis: reflections on musical ontology’, Music Analysis
23: 287–309.
Kramer, R., 1994: Distant Cycles: Schubert and the Conceiving of Song (Chicago: University
of Chicago Press).
Lake, J. I., K. S. LaBar and W. H. Meck, 2014: ‘Hear it playing low and slow: how pitch
level differentially influences time perception’, Acta Psychologica 149: 169–77.
Lewkowicz, D. J. and N. J. Minar, 2014: ‘Infants are not sensitive to synesthetic cross-
modality correspondences: a comment on Walker et al. (2010)’, Psychological Science
25: 832–4.
Lewkowicz, D. J. and G. Turkewitz, 1980: ‘Cross-modal equivalence in early
infancy: auditory-visual intensity matching’, Developmental Psychology 16: 597–607.
Litterick, L., 1996: ‘Recycling Schubert: on reading Richard Kramer’s Distant Cycles:
Schubert and the Conceiving of Song’, Nineteenth-Century Music 20: 77–95.
Ludwig, V. U., I. Adachi and T. Matsuzawa, 2011: ‘Visuoauditory mappings between high
luminance and high pitch are shared by chimpanzees (Pan troglodytes) and humans’,
Proceedings of the National Academy of Sciences 108: 20661–5.
Marks, L. E., 1989: ‘On cross-modal similarity: the perceptual structure of pitch, loud-
ness, and brightness’, Journal of Experimental Psychology: Human Perception and
Performance 15: 583–602.
Marks, L. E., 2000: ‘Synesthesia’, in E. Cardeña, S. J. Lynn and S. C. Krippner, eds.,
Varieties of Anomalous Experience: Examining the Scientific Evidence (Washington,
DC: American Psychological Association), pp. 121–49.
Marks, L. E., 2004: ‘Cross-modal interactions in speeded classification’, in G. A. Calvert,
C. Spence and B. E. Stein, eds., The Handbook of Multisensory Processes (Cambridge,
MA: MIT Press), pp. 85–105.
Meier, B. P. and M. D. Robinson, 2004: ‘Why the sunny side is up: associations between
affect and vertical position’, Psychological Science 15: 243–7.
Meier, B. P. and M. D. Robinson, 2005: ‘The metaphorical representation of affect’,
Metaphor and Symbol 20: 239–57.
Meier, B. P., M. D. Robinson and G. L. Clore, 2004: ‘Why good guys wear white: automatic
inferences about stimulus valence based on brightness’, Psychological Science 15: 82–7.
Meier, B. P., M. D. Robinson, L. E. Crawford and W. J. Ahlvers, 2007: ‘When “light” and
“dark” thoughts become light and dark responses: affect biases brightness judgments’,
Emotion 7: 366–76.
Melara, R. D., 1989: ‘Similarity relations among synesthetic stimuli and their attributes’,
Journal of Experimental Psychology: Human Perception and Performance 15: 212–31.
Melara, R. D. and L. E. Marks, 1990: ‘Interaction among auditory dimensions: timbre,
pitch, and loudness’, Perception and Psychophysics 48: 169–78.
Merritt, D. J., D. Casasanto and E. M. Brannon, 2010: ‘Do monkeys think in metaphors?
Representations of space and time in monkeys and humans’, Cognition 117: 191–202.
Morgan, R. P., 1976: ‘Dissonant prolongation: theoretical and compositional precedents’,
Journal of Music Theory 20: 49–91.
Morton, E., 1994: ‘Sound symbolism and its role in non-human vertebrate communication’,
in L. Hinton, J. Nichols and J. Ohala, eds., Sound Symbolism (Cambridge: Cambridge
University Press).
Neuhoff, J. G., M. K. McBeath and W. C. Wanzie, 1999: ‘Dynamic frequency change
influences loudness perception: a central, analytic process’, Journal of Experimental
Psychology: Human Perception and Performance 25: 1050–9.
Palmer, S. E., K. B. Schloss, Z. Xu and L. R. Prado-León, 2013: ‘Music–color associations
are mediated by emotion’, Proceedings of the National Academy of Sciences 110: 8836–41.
Reed, J., 1997: The Schubert Song Companion (Manchester: Manchester University Press).
Schubert, E., 2004: ‘Modeling perceived emotion with continuous musical features’, Music
Perception 21: 561–85.
Schwarz, D., 1986: ‘The ascent and arpeggiation in “Die Stadt,” “Der Doppelgaenger,” and
“Der Atlas” by Franz Schubert’, Indiana Theory Review 7: 39–45.
Spence, C., 2011: ‘Crossmodal correspondences: a tutorial review’, Attention, Perception
and Psychophysics 73: 971–95.
Sundberg, J., 1999: ‘The perception of singing’, in D. Deutsch, ed., The Psychology of
Music (New York: Academic Press), pp. 171–214.
Timmers, R., 2007: ‘Vocal expression in recorded performances of Schubert songs’,
Musicae Scientiae 11: 237–68.
Timmers, R. and H. Crook, 2014: ‘Affective priming in music listening: emotions as a
source of musical expectation’, Music Perception 31: 470–84.
Timmers, R. and M. Philippou, 2010: ‘Influences of musical certainty on perceived emo-
tions and, vice versa, influences of musical emotions on certainty in decision-making’,
in S. M. Demorest, S. J. Morrison and P. S. Campbell, eds., Proceedings of the 11th
International Conference for Music Perception and Cognition (ICMPC11) (Seattle:
University of Washington), pp. 812–3.
Tsur, R., 2006: ‘Size–sound symbolism revisited’, Journal of Pragmatics 38: 905–24.
Wagner, S., E. Winner, D. Cicchetti and H. Gardner, 1981: ‘ “Metaphorical” mapping in
human infants’, Child Development 52: 728–31.
Walden, J. S., ed., 2013: Representation in Western Music (Cambridge: Cambridge
University Press).
Walker, P., J. G. Bremner, U. Mason, J. Spring, K. Mattock, A. Slater and S. P. Johnson,
2010: ‘Preverbal infants’ sensitivity to synaesthetic cross-modality correspondences’,
Psychological Science 21: 21–5.
Walker, L., P. Walker and B. Francis, 2012: ‘A common scheme for cross-sensory correspon-
dences across stimulus domains’, Perception 41: 1186–92.
Walsh, V., 2003: ‘A theory of magnitude: common cortical metrics of time, space and quan-
tity’, Trends in Cognitive Sciences 7: 483–8.
Weger, U. W., B. P. Meier, M. D. Robinson and A. W. Inhoff, 2007: ‘Things are sounding
up: affective influences on auditory tone perception’, Psychonomic Bulletin and Review
14: 517–21.
Wilson, B., G. J. Buelow and P. A. Hoyt, 2001: ‘Rhetoric and music’, Grove Music Online,
http://www.oxfordmusiconline.com/subscriber/article/grove/music/43166 (accessed
9 April 2017).
Youens, S., 2007: Heinrich Heine and the Lied (Cambridge: Cambridge University Press).
PART 2
Shapes composed
Reflection
George Benjamin, composer and conductor
Musical shape on the largest scale
A few works in the repertoire have a formal contour so simple that it can be
recalled in toto after a single hearing. Some, like Borodin’s Steppes of Central
Asia, Debussy’s Sunken Cathedral or the first of Berg’s Three Orchestral Pieces,
approach from the distance, reach an apogee and then recede. Others merely
build inexorably from virtual silence to a cataclysm—Grieg’s In the Hall of the
Mountain King, the passacaglia interlude from Ligeti’s Le grand macabre or,
most famously, Ravel’s Bolero.
In most of these, all surface resources of music—register, instrumental
density, velocity and above all volume—are exploited to create the most rudi-
mentary formal outline: an arch or a wedge. Some—Debussy and particularly
Berg—are marked by a much larger degree of internal diversity and intricacy,
though the fundamental structural mould still holds.
Other basic shapes rely less on incremental sonic display and instead employ
different, and more versatile, essential tools in structural definition, namely sym-
metry and repetition. Using these typically involves subtler resources, involving
above all thematic material and—at least until the early twentieth century—
key. Da capo arias and classical minuets are obviously moulded along these
lines, as are, to a lesser or greater extent, variation and rondo forms. Add the
arts of expanded contrast, transition and evolutionary development, and the
same forces also underpin sonata form. When this architectonic blueprint was
combined with the narrative thrust of the novel—or of opera—more dynamic
and unpredictable forms were the result, from Beethoven to Berlioz, Wagner,
Mahler, Debussy, Berg, Carter and beyond.
A complex musical work has many diverse—and simultaneous—shapes.
On the largest imaginable scale, the placing of grand orchestral perorations in
Wagner’s Götterdämmerung has a specific and precisely judged contour, as does
the placing of silences, across the three acts. Similarly the large-scale rhythm of
thematic recall, the alternations between varying types of texture, the shadings
and pacing of the highly diverse harmonic palette, the labyrinthine tonal design
and the flirtation with cadence, all of these were supremely interlaced by this
master of dramatic architecture and proportion. Even the contrasts in tempo,
metre and phrase structure; the use of restricted registers and specific timbres
(piccolo, stopped horns, multiple harps, tam-tam and suspended cymbal); the
varying types of word-setting—all of these have a macro form, many of them
intersecting from time to time, some at the very surface of the music, others
more deeply buried within its construction.
The perception of large-scale form requires much guess-work from the
attentive ear during performance; particularly in modern music, the full shape
can be comprehended only as the very last note falls into silence. At first it may
be almost impossible to discern the type of formal play involved in a work,
or its manner of unfolding and its scale. This is one of the challenges—and
delights—for the listener, and the ear searches for exceptions as much as sym-
metries in order to orientate itself along the path of an unfamiliar work.
The arrival of the chorus in the fifth movement of Mahler’s Second, the
first notes of the harp at the very end of the first act of Tristan—these pianis-
simo entries, an hour into the structure, are decisive structural incidents, just as
potent and memorable as the most energetic prestissimo or extreme tutti climax.
The trombones in the Pastoral Symphony, the large bells in the Symphonie fan-
tastique or the gongs in Le marteau sans maître—these timbral signals, mark-
ing the later stages of each work, all share a similar function. In an opera of
such syntactical volatility and compression as Verdi’s Falstaff, the arrival of
symmetrical phrase patterns and continuous set forms in the third act has a
decisive influence on the recalled form, as does the lapping lullaby motion and
unambiguous tonal security in the recognition scene in Elektra.
Of all the factors involved in creating structures on this scale, perhaps the
least easy to grasp and to recall is the harmonic thread; but such works will
live or die according to the success in handling this most intangible of musical
phenomena. This is all the more challenging for a composer working outside
the predefined pathways and goalposts that the tonal system provides.
No other composer has exceeded Berg as a master of large-scale design, and
the plotting of structure across Wozzeck—on and below the surface—makes
most other music seem like child’s play. In particular, the third act has an irre-
sistible momentum and dramatic impulse, yet it also seems underpinned by
the deepest architectural foundations. The first two acts exploit a sequence of
older musical forms as scaffolding to support the frequently jagged, expression-
istic surface of the music—ranging from sonata form to passacaglia, scherzo
to fugue. However, beyond the opening scene—a highly idiosyncratic set of
variations—Act 3 is virtually free of conventional formal background, each
successive scene inventing a sui generis prototype of astounding originality.
But beneath its tumultuous moment-to-moment flow, there is a simple, large-scale
pattern—in effect, a sequence of widely spaced harmonic knots—which,
though not easy to discern, might have been of great use to Berg while compos-
ing this final act at a red-hot intensity in 1922.
The sequence of imitative instrumental lines at the beginning of the first
scene gradually assembles a bitonal hexachord (combining the triads of E♭
major and D major), the act’s first harmonic sonority (Figure R.2).
The complex and dense parallel harmonies that conclude the interlude at the
end of this first scene coalesce onto the identical combination of major triads,
superimposed over an ominous low B (Figure R.3).
The note B instigates and dominates the whole of the ensuing murder scene,
which concludes, famously, on two gigantic crescendos on this pitch, the first of
these culminating in a sfffz hexachord (Figure R.4).
After the tavern scene, the fourth scene commences with exactly the same
six-note collection, in a different registration (Figure R.5).
Remarkably, almost every note in the fourth scene is derived from this sin-
gle hexachord and its multiple inversions and transpositions, and it concludes
with a return to this ‘mother’ chord at its original pitch level, sustained for
almost twenty bars (though in a more sombre tessitura than at the opening).
Eventually this chord cadences onto a D minor triad with added supertonic,
from which the final interlude ensues (Figure R.6).
After an enormous, cathartic climax this interlude concludes on the same D
minor sonority, spread pp across most of the orchestra (Figure R.7).
The opera ends with a luminous, floating carillon—four flutes and celesta
oscillating between two consonant tetrachords—underpinned by a hollow per-
fect fifth in the strings and harp (Figure R.8).
FIGURE R.2 Berg, Wozzeck, Act 3, bars 3–7

FIGURE R.4 Berg, Wozzeck, Act 3, bar 114


This same oscillation over a perfect fifth is used at the end of the first act,
though in a darker and fuller register (Figure R.9).
The second act also concludes with the same harmonies, presented however
in a fragmented and gruff way, so deep in register that they are barely percep-
tible (Figure R.10).
Nevertheless, the conclusion of all three acts of Wozzeck ‘rhyme’
harmonically—as is the case, incidentally, in Berg’s other operatic masterpiece,
Lulu—and the complete third act is straddled by a daisy-chain of harmonic
connections below the surface, each recapitulating harmonic sonorities as a
means of closure, and each giving Berg a firm telos at which to aim his extraor-
dinarily diverse and subtle invention (Figure R.11).
FIGURE R.11 Berg, Wozzeck, Act 3: harmonic connections
A great question facing a modern composer at the start of a work is, Where
is the harmony going? Many allow the pitch material to expand without appar-
ent destination or predetermined direction, forging the form by fantasy and
focused improvisation. Others create a goal—going full circle by returning to
an opening harmony or by inventing a pseudo-tonic sonority at which to aim.
Equally it’s possible to envisage a complete path in advance—particularly if
the harmonic vocabulary isn’t too complex—which might be modified radically
as the composition evolves. Some harmonic plans preclude change, with the
resultant piece remaining locked into the identical, static blueprint from begin-
ning to end. Yet others allow the music to pursue a self-generating—or even
arbitrary—harmonic mechanism below the surface, abruptly cutting the music
off when there is no more to say.
While conceiving a new work, a composer today may well envisage a circle,
a series of blocks, a sequence of loops, a perpetually descending spiral—or a
combination of some or all of these—as a metaphor for its harmonic trajec-
tory. These simple, imagined shapes are integral to the compositional process,
not mere pretence or poetic analogy. Such a path may be planned in advance,
though usually, I suspect, it evolves during the act of composition. Regardless
of its provenance, a large-scale work—whatever the era or idiom—needs a firm
and decisive sense of closure at its end. The shape Berg proposes for the conclu-
sion of Wozzeck maintains more than a degree of relevance today.
4
Affective shapes and shapings of affect
in Bach’s Sonata for Unaccompanied Violin
No. 1 in G minor (BWV 1001)
Michael Spitzer
Emotions have shapes, and musical emotions mirror those shapes. This is a sim-
ple enough claim. But the multitude of assumptions packed into this statement
could fill a library. Indeed, it drives the industry of emotion studies, which has
overtaken music psychology and aesthetics (but not yet musicology) since the
humanities’ affective turn a decade or so ago.1 Do emotions have shapes, or is it
the behaviours, intonations and intentions associated with them? Why and how
is music emotional, and is the emotion expressed, induced or perceived? Does
emotion even exist? What is a ‘shape’, and how can a musical shape be captured
analytically? Is it the preserve of the composer or the performer? And so on.
In walking through this jungle, I put together six arguments, all prefabri-
cated: the theory lies in the assemblage rather than in the constituent ideas.
1. I speak not of ‘emotion’ in the round but of ‘emotions’ in the plural,
many comprising discrete basic categories such as sadness, happiness,
fear, tenderness and anger.
2. These categories are ethological, originating in adaptive animal
behaviour.
3. Emotional behaviour is expressed through goals, or ‘action tendencies’.
4. In music, emotional categories are associated with acoustic features
which are readily identifiable.
5. Goal-directed emotional behaviour in music is conceivable when we
think of music as a virtual person, according to ‘persona theory’.
6. I discriminate between emotional expression and induction, on the
basis that a listener discerns a musical process as being ‘expressive
of emotion’ (rather than transitively expressing a composer or
performer’s affective intentionality).
This assemblage may have seemed counter-intuitive to previous writers. For
instance, although the notion of ‘action tendency’ (e.g. the tendency to flee,
to fight, to love) has existed for a long time (Frijda 1986), music psychologists
have hitherto limited its application to behaviours that musical emotion is liable
to provoke: sad music may make us weep, happy music cause us to dance, and
so on (Sloboda and Juslin 2001: 87–9). By contrast, music analysis, with its
expert conception of musical form as a landscape of goal-directed tonal forces
(Nussbaum 2007), makes available fresh applications: ‘action tendency’ can
now suggest the actions, gestures and aspirations of a musical subject navigat-
ing tonal forces within a work.
Another assemblage is historical, bridging the epochal distances from the
evolutionary origin of emotion, through modern social habits, and then across
the aesthetic realm, finally reaching the peculiar domain of musical emotion.
We may question the relevance of a primitive adaptive emotion such as fear,
evolved millions of years ago, to our present-day aesthetic experience of Bach,
for example. This chapter is about affective shapes, and the shaping of affect,
exemplified in Bach’s Sonata for Unaccompanied Violin No. 1 in G minor
(BWV 1001). Before this analysis can get under way, we need a little more
stage-setting.
Emotion theorists since Darwin’s The Expression of the Emotions in Man and
Animals like to model the evolutionary development of the human brain on the
archaeological metaphor of the city, with the oldest layers being the deepest.
According to Freud,
... suppose that Rome is not a human habitation but a psychical entity
with a similarly long and copious past—an entity, that is to say, in which
nothing that has once come into existence will have passed away and all
the earlier phases of development continue to exist alongside the latest
one. (Freud, Civilization and its Discontents, quoted from Oatley 2004: 63)
Emotions proper are held to have emerged not in the very oldest and innermost
layer of the brain—the ‘corpus striatum’—which controls basic animal rou-
tines such as walking, patrolling, foraging and mating, but in the central lim-
bic system. Often called the ‘emotion brain’, the limbic system arose with the
peculiarly social world of mammals unavailable to reptiles. It involves sociable
behaviours such as mother/infant care-giving, vocal signalling and play, and
is the site of the basic emotions: happiness, anger, fear, desire and sadness.
A crucial feature highlighted by the ‘city metaphor’ is that the more intellec-
tual neocortex—the third and newest layer, which developed over the six mil-
lion years in our evolution from the apes—does not supersede these two older
layers. On the contrary, ‘as with the organization of cities, earlier forms and
developments have continued, and provided for subsequent developments and
elaborations’ (ibid.: 67). In particular, the neocortex elaborates the sociality of
the limbic emotions. This notion of earlier layers persisting through later ones
helps us understand the permeability of music’s emotional spaces. The notorious
bang in the slow movement of Haydn’s ‘Surprise’ Symphony is a construct
of abstract syntactic patterning, and hence of neocortical sophistication. At
the same time, we flinch because of our ancient brainstem reflex, a primitive
reaction towards sudden noises. The primeval shock is not superseded or cov-
ered over but co-opted as a template on which to hear the modern surprise.
Oatley charts the shift from ancient to modern emotions essentially as a
change from emotion as goal-driven (shared by animals) to emotion as social
(evinced by some animals, but quintessentially human). Thus happiness was
originally associated with goal fulfilment, and later became a symptom of
social cooperation. Sadness, once an emotion of goal loss, is now linked to loss
of relationship. Likewise, the primitive fear of danger was transformed into a
fear of social rejection. Importantly, the social emotions are structured accord-
ing to the template of the primitive emotions. Using the terms of Lakoff and
Johnson’s (1980) theory of metaphor, which traces a trajectory from the physi-
cal and embodied to the abstract and conceptual, I propose that there may be
a ‘metaphorical mapping’ from the primitive to the social in our experience of
emotion in musical structure (Spitzer 2004).
A pinnacle of socially refined emotion is the Trio from Mozart’s Symphony
No. 40 in G minor. Its loving tenderness is sonically expressed by a cluster of
acoustic features: diatonic and triadic sweetness, smoothness of line and rhythm,
soft dynamics and gentle tempo. It projects loving ‘shapes’ at the phrase level in
terms of tender-sounding gestures which put one in mind of a human dialogue:
dialogues of periodicity within instrumental groups, and between strings and
wind. The overlay of horns in thirds at the moment of recapitulation—instru-
ments conspicuously withheld until then—clinches the Trio’s analogy with an
operatic love duet. The full, sociable manifestation of the emotional category,
love, is consummated at an architectonic level, the reprise.
Mozart’s Trio was the subject of Leonard Meyer’s longest analytical study,
a text where he floated his intriguing, and never fulfilled, concept of emotional
‘ethos’ (Meyer [1976] 2000; Spitzer 2009). This was an important and future-
facing departure for Meyer, because his seminal work on emotion actually
avoided talk of specific or ‘discrete’ emotions in the plural. The Trio is also sug-
gestive because it epitomizes the extreme conventionalization of musical style,
opening up an important space between the historical and cultural relativity
of stylistic codes, and the evolutionary pedigree of emotions themselves. Even
within modernity, the category of tenderness is recognizable across an aston-
ishing variety of historical styles, from Schubert lullabies to the opening of
Mahler’s Ninth Symphony; or, going backwards, from the Siciliana of Bach’s
G minor violin sonata to the medieval topos of sweetness (Wegman 2003).
Affective shapes interact with ‘display rules’ (Ekman 1984: 320–1), enabling us
to recognize a common emotion in Mahler and Bach.
My portmanteau argument and analysis is disposed in two parts, and it
explores various notions of musical shape, emotional shape and the process of
‘shaping’. In the first part of this chapter, which focuses on Bach’s first move-
ment, the Adagio, I propose a three-tier model of musical shape: emotion is
projected, respectively, through the relatively instantaneous level of acoustic
cues, midlevel phrasing and large-scale form. In short, I discover the same
three-tier model of Mozart’s Trio in Bach’s violin sonata. I show that emotions
themselves have shapes, which can be analysed through the interaction of three-
tiered musical shape with the display rules of the baroque language, including
specific formal models such as ritornello patterns. Furthermore, I understand
‘shaping’ as the process of expressing emotions by inflecting or transform-
ing structural models. Emotional shaping is thereby kindred with the shaping
enacted through performance, and I bring my analysis into dialogue with three
recordings of Bach’s first movement, including data captured in tempo and
dynamic maps. The second part of this study looks at the other movements,
first individually and then at the overall shape of the entire cycle. I argue that
emotional shape is also borne out through the interactions of the four move-
ments with each other. My starting point, however, is Naomi Cumming’s phe-
nomenological study of Bach’s Adagio, based on Peirce’s semiotic categories.
While Cumming’s analysis is rich, it is a useful measure of how much more can
be achieved through more recent work in emotion theory.
Part I: The Adagio
CUMMING’S ANALYSIS
Naomi Cumming’s The Sonic Self (2000) is a magisterial essay on musical
meaning. Conversant with analytic philosophy and music psychology, at its
heart the book is a theory of musical semiotics from a stringently Peircean per-
spective. Whereas readers of Jean-Jacques Nattiez and Raymond Monelle will
have come to the book with a reasonable familiarity with Peirce’s concepts (pri-
marily, his sign typology of icon, index and symbol; his phenomenological cat-
egories of firstness, secondness and thirdness), Cumming presented arguably
the first systematically persuasive application of Peircean semiotics to music.
Her argument is an elegant dance between the poles of music’s subjectivity
and signification, the former bespeaking music’s immediacy and uniqueness,
the latter reckoning with how musical expressivity is mediated through form
and convention. The book climaxes with a case study of Bach’s Adagio from
his Unaccompanied Violin Sonata in G minor, illuminated with Cumming’s
insight as a practising violinist. Cumming’s analysis suggests that the issue for
her was not ‘shape’ so much as ‘shaping’, as is implicit in her understanding of
musical ‘gesture’.
As an example of gesture, Cumming identifies the series of three descending
thirds (B♭–G, G–E♭, E♭–C♯) at bar 5 (2000: 225; see Figure 4.1). Part of
what makes these figures so ‘gestural’ is that they interrupt the directed tonal
motion to the cadence, a ‘wilful’ goal-orientation that Cumming identifies with
thirdness in music. Yet there is no need to be so specific, because by Cumming’s
lights gestures are pervasive as the common units of musical currency, being
coterminous with any midrange musical event. This is because of the ‘propen-
sity of listeners to hear’ in terms of ‘short, directed motions’ (165). The pri-
macy of gestures is striking, given that Cumming identifies them not with the
semiotic primacy of firstness and icons, but with the secondness of indexes.
Musical listening, then, seems to begin ‘in the middle’: not with sound (= first-
ness; iconicity) per se, but with the shaping of sound into indexical musical gestures.
Cumming’s view is that with gesture, general sound qualities are embodied in a
specific musical reality (the ‘secondness’ of Peirce’s ‘indexicality’), and in such a
way that ‘the synthesis of its structural elements, when they are heard as embodying
aspects of [human] movement (in directionality, force, etc.), … suggest expressive
agency’ (149). In short, a gesture is a ‘melodic shaping’ (230), the embodiment
of a sonic quality as a particular musical event. Cumming’s notion of shaping is
enriched by the ontological differences of sign types; I part company with her, however,
when she tries to extend the Peircean method to theories of emotion.
FIGURE 4.1 Bach, Sonata for Unaccompanied Violin No. 1 in G minor (BWV 1001), Adagio, bars 1–13.
Following Peirce, Cumming associates the categories of firstness and icon
with vocality, and those of secondness and index with embodied physical
motion. In turn, thirdness and symbol bring with them more elusive qualities of
‘will’, ‘desire’ and the goal-orientation of long-range musical argument, as with
the cadential drive momentarily arrested by the descending-third ‘gestures’ at
bar 5. If the musical ‘voice’ is characterized by immediacy, and musical gestures
by their singularity, then wilfulness in music is ‘rule-bound’, the logic of its rules
emerging only through the gradual unfolding of the formal process. There is
a suggestive alignment in Cumming’s system of the spectrum of voice–
gesture–wilfulness with that of timbre–melodic figure–formal process, as well as
with the rising proportions from local, through medium-range, to global. In practice,
these dimensions are all imbricated within each other: tonal and gestural
properties are implicit within the musical detail, just as it is counter-intuitive to
ignore the role of texture and gesture in the shaping of large-scale form.
Equally suggestive is Cumming’s notion of the musical work as a ‘complex
synthesis’ of sound, gesture and process, each involving a different ontology
(of voice, body and will). Given that a musical gesture is relatively ‘blinkered’
(230), the ‘bringing together of gestural events into the aural perspective of a
tonal purpose is an act of “synthesis” between different kinds of signs’ (232).
For Cumming, this large-scale synthesis constitutes ‘shape’ at the highest level.
Complex synthesis, then, is as rich and vital as subjectivity itself, which is why
it leads Cumming into a theory of musical persona. Just as the human subject
uses voice, gesture and will to express emotion, so does the virtual persona pro-
jected by the musical work. The problem, however, is that Cumming’s theory
of emotion is much less supple than her Peircean underpinnings, looking rather
outdated from the vantage point of contemporary emotion theory. She articu-
lated her thoughts on musical emotion in debates with Langer, Levinson, Kivy
and Davies, but completed The Sonic Self before the publication of seminal
works by Juslin, Robinson and Nussbaum.
Cumming’s idea of emotion seems to be locked into rigid, and linguisti-
cally defined, epithets—i.e. words. She hears bars 1–2 as a blend of ‘pathos and
reflectiveness, spontaneity and containment’ (221), which makes it easier for
her to set up the more cognitive approach of Karl and Robinson (1995) and
others as straw-man arguments to be easily knocked down. By contrast, emo-
tion theory’s recent psychological turn—particularly the ‘appraisal theories’ of
Juslin and Robinson—makes such binaries insupportable. From the perspec-
tive of more recent theories, it is possible to construct a scenario of ‘sadness’
which is both complex and unitary, while also corresponding to everyday-life
expressions of this emotion.
SADNESS IN THE ADAGIO
Oatley’s ethological model of sadness as loss of goal or attachment has many
entailments. At a designative level, where the facial, vocal, gestural or attitudinal
expression of grief is evolved to elicit emotional support from others, the musi-
cal persona mimics the dejected face, drooping posture and plaints of a sad
person. These are the familiar ‘acoustic features’ of sadness codified by many
psychologists (Juslin 1997; Huron 2008; Gabrielsson and Lindström 2010): slow
tempo, minor-mode key, narrow intervals, legato articulation, variability of tex-
ture, preponderance of descending melodic contours and a high level of dis-
sonance, especially involving the semitone appoggiaturas of the pianto topic
(Monelle 2000). Descending lines suggest loss of physical and mental energy;
narrow intervals and legato articulation imitate low-energy mumbling.
A fresher perspective is afforded by Huron’s connection between sadness
in music and the ‘detail-oriented thinking’ of ‘depressive realism’ (Huron
2011: 48), following the work of Alloy and Abramson (1979), who consider the
impact of emotion on cognition and perception. From this angle, reflection and
self-reflection are seen as behavioural aspects of sadness, an emotion which is
an adaptive opportunity for a wounded organism to recover by taking stock of
the situation. How would this be illustrated analytically, given that psycholo-
gists of emotion have shied away from looking at the ‘structural features’ of
emotion beyond the parametric level? I suggest that ‘detail-oriented thinking’ is
borne out by thematic atomism and formal fragmentation, the way the Adagio
lurches rhapsodically from one contrast to another. Its lurching vicissitudes are
indeed another side of sadness’s lack of goal, just as its aimlessness is worked
out by spurts of spontaneous melismas and maggiore episodes. (For more on
such signalling see Spitzer 2009.) Huron identifies such major-key interludes
in minor-key works with ‘nostalgia’, which he thinks is a flavour of sadness.
We find such episodes at bars 2–3 (a lurch from G minor to B♭ major and
back again) and, more dramatically, at bars 11–13 (shifting from C minor to E♭
major and back to C). The pathos of these nostalgic moments is heightened by
their very interruption, or ‘containment’, to invoke Cumming’s term—part of
sadness’s relentless denial of goal-orientation.
Sharpening the focus on the opening phrase, ‘atomism’ is evinced in the
sheer density of the Adagio’s texture, a ‘thickness’ which demands ‘detail-
oriented’ reflection from the listener. Hence we see the reciprocal relationship
between sadness as a disposition of musical material (dense and fragmentary),
and sadness as a mode of hearing (acute and detail-oriented). Otherwise put,
we don’t just hear sadness, we also hear in a sad way. Density is heard in the
ways the opening phrase both invokes and resists formal and contrapuntal
schemata. Gjerdingen hears it as instantiating a 1–7 … 4–3 ‘Meyer schema’,2
even though this breaks his own rule (2007: 112) that 4–3s shouldn’t overlap
1–7s. At the very least, the schema is deformed. Better to hear it, I suggest, as
a mutual interference of two schemata: a 1–7 … 7–1 (complementary pianti
weeping gestures, with the second F♯ displaced up an octave), and a 5–4 …
4–3, a descending line which will emerge in the fugue subject, but introduced
here with the opening D elided. The tritone leap from C to F♯ leaves the C high
and dry, seeming to foreshadow the subdominant bias of the movement (as in
the C minor ritornello of bar 13). There is a similar tension between the pro-
pensity of analysts to read the Adagio ‘top-down’ as a Schenkerian 8-descent
(Cumming 2000: 233), or ‘bottom-up’ as a descending, rule-of-the-octave bass
pattern (Lester 1999: 34). This either-or binary detracts from the messy, poly-
phonic richness of the Adagio’s texture, for instance the quasi-canonic counter-
point in bars 2–3, where the melody’s E♭–D step is mirrored a little later in the
‘bass’. This quasi-canon—a mensurally distorted canon at the octave (recall-
ing the G major/minor canons in the Goldberg Variations)—is missed by both
Lester and Cumming, but is suggestive of the Adagio’s very self-reflection.
Thus Cumming’s four affective epithets—pathos, reflection, spontaneity
and containment—really hang together as a package of entailments of a single
emotional category, sadness, considered as a type of adaptive behaviour. Pathos
and reflection are, respectively, outward- and inward-facing behaviours: gesturing
to observers, reflecting on loss. Spontaneity and containment are complementary
symptoms of goal loss: energy breaks out, breaks down or is blocked.
SHAPE AND SHAPING
A behavioural approach to musical emotion raises the question of agency: What
is it that ‘behaves’? The philosophical theory of musical persona, developed by
Levinson (1990), Davies (1994), Cumming (2000), Nussbaum (2007) and oth-
ers, proposes the answer at the highest level. What ‘behaves’ (read: speaks, ges-
tures, moves, wills, weeps, fights, flees, dances, etc.) is a virtual subject projected
through the interplay of tonal forces across an imaginary musical landscape.
Analytically, it is easiest to show this by considering expressivity in music as
transformation, or inflection, of a model, akin to Arthur Danto’s account of
individual (stylistic) ‘manner’ as ‘adverbial’ (see Ross 2003: 234), focusing on
the ‘how’ (the inflection of pattern) rather than the ‘what’ (the origin and status
of the pattern itself). Much, if not all, western music is composed by elabo-
rating a stylistic, formal or contrapuntal model. A plausible model for Bach’s
Adagio—indeed, for a great deal of his music—is Wilhelm Fischer’s conceptu-
alization of a three-part ritornello scheme of Vordersatz–Fortspinnung–Epilog,
as recuperated and developed by Laurence Dreyfus (1998: 61). A possible source
for the Adagio is the Vivaldian staple exemplified by the Largo of his Violin
Concerto Op. 3 No. 6 (see Figure 4.2). An opening I–V–I gambit (Vordersatz)
leads to a more dynamic (Fortspinnung) central module typified by a circle-of-
fifths progression underpinned by a descending bass (akin to rule-of-the-octave
bass descents). The model is rounded out by a V–I closing gambit (Epilog).
Gauged against Vivaldi’s prototype, the relative complexity and density of the
Adagio’s opening ritornello, bars 1–4, snaps much more vividly into view. See
in particular the central Fortspinnung module (end of bar 2 to middle of bar 4).
A fifth cycle is discernible, but heavily disguised. The B♭, the first note of the
cycle, is buried as an inner voice beneath the top G on the third beat of bar 2.
FIGURE 4.2 Vivaldi, Violin Concerto Op. 3 No. 6, Largo, bars 1–6
B♭ is formally dislocated from the next note of the cycle because it is projected as
an ending of the first phrase (Vordersatz): the note is relatively long (a quaver),
resolves the preceding tonal tension (dominant-seventh harmony) with a tonic
and is articulated by the following demisemiquaver rest. If the B♭ is an ending,
then the lurch up to the E♭ sounds like a new beginning, metrically stronger than
the first beat of the next bar. The A natural, the third note of the cycle, is metrically
weakened by being displaced by a quaver in bar 3; and it is disconnected
from the E♭ because that note had fallen back down to a B♭. D, the fourth note of
the cycle, is reached only two quavers after A: each of the four steps of the cycle is
differentiated by a distinct textural shape and metrical placement. This makes the
‘skeleton’ of the phrase, its grammatical deep structure, quite challenging to hear.
Moreover, this sense of discontinuity is compounded by the abrupt tonal shifts
across bars 2–3 from G minor to B♭ major and back to G minor.
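The four structural notes named in this paragraph (B♭, E♭, A, D) can be checked against a simple diatonic model. The sketch below is illustrative only: it assumes the G natural-minor collection and treats a fifth cycle as successive moves down a diatonic fifth (equivalently, up a diatonic fourth, or three scale steps). The function name and the ASCII pitch spelling are choices made here, not part of the analysis.

```python
# Illustrative sketch: a diatonic cycle of fifths within one scale.
# Assumption: the G natural-minor collection; all names are hypothetical.
G_MINOR = ["G", "A", "Bb", "C", "D", "Eb", "F"]

def fifth_cycle(scale, start, length):
    """Return `length` notes, each a diatonic fifth below (a fourth above) the last."""
    i = scale.index(start)
    notes = []
    for _ in range(length):
        notes.append(scale[i])
        i = (i + 3) % len(scale)  # three scale steps up = a diatonic fourth up
    return notes

cycle = fifth_cycle(G_MINOR, "Bb", 4)
print(cycle)  # ['Bb', 'Eb', 'A', 'D']
```

Note that the step from E♭ to A is a diminished rather than a perfect fifth, as happens once in any diatonic fifth cycle.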
Hence pinpointing bars 2–3 shows how the Adagio’s fifth cycle is highly
deformed, effectively into a series of isolated structural notes, suggesting a
‘detail-oriented’ listening in line with the ‘depressive realism’ of sadness.
FIGURE 4.3 Inflections of the fifth cycle
There are two diametrically opposite accounts of ‘shape’ that can be drawn
from this example. In the first, the Adagio’s sadness, with its detail-oriented
depressive realism, is ‘adverbial’, being a transformation of a formal model. All
four movements of Bach’s sonata begin with the same Vordersatz–Fortspinnung–
Epilog ritornello model (see Figure 4.3). Focusing on the central cycle-of-fifths
module spotlights the successive transformations, each one of which produces
a different emotion (more on this in Part II). The key point here is that eliciting
contrasting expressive character by transforming a framework is the essence of
variation form. See also, in the history of theory, Heinichen’s (1711) or Niedt’s
([1706] 1721) lessons to budding composers on how to adapt expressive figura-
tions to libretti in order to project differing emotions. This adverbial account
highlights ‘shaping’ rather than ‘shape’.
The opposite account discovers ‘shape’ in the pattern rather than in its
inflection. By ‘pattern’, I mean the shape of the music’s ‘behaviour’. Elsewhere,
I termed such dynamic emotional shapes ‘affective trajectories’ (Spitzer 2013).
Although Oatley and others characterize sad behaviour with loss of goal, the
emotion is not without directionality. Sadness is a strongly aversive emotion;
for an emotion whose essence is loss of goal, the only goal for sadness, para-
doxically, is to stop being sad. This affordance is strangely ignored by psy-
chologists of musical expectancy. Margulis (2005), for instance, theorizes (after
Huron) three classes of expectation: surprise, denial and expectation proper.
But the sadness of the Adagio’s opening phrase, as an aversive emotion, is
surely implicative of an escape from this sadness. The lurch into the tender/
happy B♭ major episode in bars 2–3 is admittedly a ‘surprise’ in its abruptness.
Yet isn’t this flight to the major, to a positive valence, implied by the aversive
quality of the opening? (Conversely, in the many works where such flight to the
major is denied, isn’t this minor-mode standstill registered as a form of ‘con-
tainment’ or even repression?)
It is important to pin this trajectory to fundamentals in order to undergird
more complex, Lacan-tinged, explanations (Spitzer 2013). For instance, if
sad music takes separation anxiety as an axiom, then its trajectory seeks to
recreate (recuperate, memorialize, return to) the severed social bonds, typically
in the form of a maggiore ‘dream image’. The dream image at the centre of
the ritornello’s central module, bars 2–3, for all its brevity, is more animated
(the scalar uprush to E♭) and intervallically more expansive (fourths, tritones,
sixths), and it momentarily even trips into a dance lilt. And then this episode
is just as suddenly snuffed out by the F♯, returning the music to G minor. The
‘shape’, then, is an implicative drive away from materials associated with sad-
ness, towards those expressive of tenderness and happiness, accompanied by a
sudden ‘opening out’ or expansiveness, suggestive of feelings discharged from
within, or liberated from a constraint; and then a sudden return to the ini-
tial state. Importantly, the middle tender/happy state is not separable from this
process, but part of sadness’s trajectory. It is helpful that Huron characterizes
tender/happy music contextualized within sad music as ‘nostalgia’ (although
maggiore episodes are surely not all backwards-facing: see the discourse gen-
erated by Levinson’s identification of the second group of Mendelssohn’s
Hebrides Overture with ‘hope’; Levinson 1990, Karl and Robinson 1995).
The value of such a broad conception of ‘shape’ is that it doesn’t commit
us either to a formal model (such as Fischer’s and Dreyfus’s ritornello model)
or to specific pitches, rhythms or even contours. This is an important consideration
to bear in mind, given the complex politics of the score–performance
relationship: from this standpoint, what is being performed is not just a score,
but also a performance shape inscribed within the score. Since performances,
qua performance activities, rehearse emotional shapes in their own right, their
relationship to the shapes in the score is thus more akin to ‘mirroring’ than to
mechanical reproduction. I say more about this later.
Thus I hear the B♭ major episode at bars 2–3 as having the same shape
as the turn from C minor to E♭ major at bars 11–13. Contextually, they are
analogous: the middle module of the opening ritornello, the central climax of
the piece (bar 11 is exactly midway). Tonally, the patterns are similar: g–B♭–g,
c–E♭–c. Motivically, thematically and formally, however, their materials are
completely different. The ‘scalar uprush’ at bar 2 is possibly discernible in the
seventh ascent, B♭–A♭, at bar 11; but this ascent actually begins in C minor
at the start of the bar, and on a B♮, with the B♭–A♭ uprush really elaborat-
ing a middleground voice-leading progression from D to E♭. The commonal-
ity of shape, rather, is heard at the level of shared affective trajectory. The
key difference is one of scale: the affective trajectory is massively amplified.
Everything now is bigger and more clearly pronounced. Its sadness is sadder:
the interlocking suspensions and chains of major sevenths at bar 11 consti-
tute the Adagio’s most excruciating moment. Its dream image is more ecstatic
and extended: the hint of dance at bars 2–3 is now really confirmed; the
remarkable opening up of its register to two octaves, climaxing with the bold
leaps between A♭s, suggests an uprush of emotion, feeling erupting from the
depths of the music. These leaps elaborate perhaps the emblematic gesture of
Bach’s violin music—the rising arpeggiation across multiple-stopped strings.
This rise, together with the straining resistance of the strings, lends itself par-
ticularly well to a feeling of emotional discharge. Finally, the collapse back
to the minor is far more dramatic than earlier at bar 3: after a build-up to
a cadence in E♭ major across bars 11–12, the cadence is dramatically inter-
rupted by a diminished-seventh chord at bar 13, which returns the music to a
minor key. The interrupted cadence at bar 13, underscored by a pause, is the
Adagio’s salient event, and it ushers in the subdominant reprise (the ritor-
nello in C minor), a structural deformation constituting a dissonance at an
architectonic level.
The Adagio’s emotional shape, then, is rendered at successively higher struc-
tural levels: first, ‘vocally’ implicit in the acoustic features of the opening into-
nations; second, ‘gesturally’ explicit at the level of the phrase (bars 2–3); third,
formally fulfilled at the level of architecture (bars 11–13). Mozart’s Trio also
does that, and it is plausible that many works in the western repertoire project
emotional shape at rising levels. Cumming’s vocality–gesture–will progression
points in this direction, although her Peircean lens arguably occludes more
than it illuminates.
PERFORMANCE SHAPES
Before we turn to the rest of Bach’s sonata to explore ‘shape’ and ‘shaping’ at
the level of the cycle, we need to complete—even ‘consummate’—this dialectic
in the reality of musical performance. Isn’t ‘shaping’ what a violinist does with
Bach’s materials? On the other hand, can one speak of performance ‘shapes’
across an interpretation? A common experimental protocol in emotion psy-
chology research is to get a performer to interpret the same phrase in different
ways so as to project varying affective states. Is it thereby legitimate to view
‘adverbial’ compositional processes, such as variations, as ‘performative’ in this
respect, the composer shaping a musical model into a distinct affect just as a
performer shapes the music? If so, then a notion of emotional shape/shaping
may shed new light on the interaction of scores and performances.
In a market saturated with recordings of Bach’s music for unaccompanied
violin, I have selected distinguished versions by Itzhak Perlman, Sergiu Luca
and Gidon Kremer. Although Cumming doesn’t engage with specific perfor-
mances of the Adagio, her Peircean triad voice–gesture–will suggests generic
differences between these three violinists’ approaches. Perlman’s classic 1988
recording epitomizes mainstream late twentieth-century interpretative practice,
playing the piece with large-scale, often symmetrical phrasing. Perlman brings
out the broad formal unfolding of the Adagio, the ‘will’ of the tones. Luca’s
1992 ‘historically informed’ (HIP) recording is focused much more sharply
on the intricate gestures of the Adagio’s rhetorical delivery. It is tempting to
style HIP ‘gestural’, after Cumming, although its rhetorical quality reminds
us that it is difficult to conceive of musical gesture apart from vocality. That
said, the portamento ‘sobs’ prevalent in early twentieth-century practice, as in
Fritz Kreisler’s 1926 recording, may sound even more vocal than HIP. My third
example, Kremer’s 1981 version, is interesting for combining modern tech-
niques with intricate phrasing, yet the latter expressing not HIP sensibilities
so much as rhapsodic waywardness. Taking the Kremer last, I begin with a
point-by-point comparison of the Perlman and Luca versions, concentrating
on the ‘emotional shape’ of the opening ritornello and its ‘architectural’ expan-
sion across bars 11–13, in the light of tempo and dynamic maps of the perfor-
mances (Figures 4.4 and 4.5).3
FIGURE 4.4 Tempo and dynamic map of Luca, bars 1–13 (tempo in BPM and energy in dB, plotted against bar.beat)
FIGURE 4.5 Tempo and dynamic map of Perlman, bars 1–13 (tempo in BPM and energy in dB, plotted against bar.beat)
Luca’s rendering of Bach’s opening projects the wave-like spectral and
dynamic shapes highly characteristic of the period bow (Fabian 2005: 95). The
short baroque bow is conducive to the ‘“period” stroke’: soft onset and rapid
decay. A spectrogram easily reveals that the higher frequencies crest and fall
across Luca’s bow strokes on the strong beats of bars 1–2, and that the steep
oscillations are matched by the dynamic swells and ebbs. Conversely, spectrograms
of Perlman’s performance, on a modern bow and instrument, show
his solidly sustained tone and dynamics. Luca’s spectral/dynamic wave shape
is also mirrored in the oscillations of the tempo maps, but not in synchrony
with the note swells, and differently between the two players. It is interesting
that both Luca and Perlman begin at similar tempos (21 bpm), and accelerate
to a peak at beat 3 of the first bar (Luca 25.2 bpm; Perlman 23.9 bpm), before
slowing down. Both players also decelerate towards the end of bar 2 (Luca
23.5 bpm; Perlman 16.7 bpm), against the grain of an older performance tradi-
tion (perhaps beginning with Joachim’s 1903 recording) of taking the ‘uprush’
scale at beats 3–4 somewhat faster. In both recordings, then, the ritornello’s
Vordersatz is shaped by a nearly identical tempo wave (Perlman: 21–23.9–16.7
bpm; Luca: 21–25.2–23.5 bpm), helping to project it as a self-contained unit, a
sort of sonic pillar.
Luca and Perlman drift further apart in how they treat the remainder of the
ritornello and the music immediately after it. A lot of my analysis pivots on
the boundary between the Fortspinnung and Epilog of Bach’s ritornello, marked
by the B♭–F♯ gesture at bar 3 and the wrench it effects back from major to
minor. The three performances interpret this boundary in different ways.
Luca articulates the four semiquavers at the beginning of bar 3 very care-
fully, with a hint of dotted rhythm on the first of each pair, and a diminuendo
towards the quaver D (from -26 to -34 dB), thereby rendering the louder B♭–F♯
tonal interruption more dramatic (from -34 to -21 dB). On the one hand, this
cuts off the fifth-cycle Fortspinnung module from the Epilog. On the other, there
is surprisingly little deceleration into the long cadenza-like melisma at bar 3
(from 31.4 bpm at bar 3.2 to 26.6 bpm at bar 3.4), despite the performance tradition
in non-HIP recordings of taking the melisma considerably slower. Yet
both aspects bespeak the same tendency of HIP readings to focus on small-unit
articulation and play down broader contrasts. In this respect, Perlman’s read-
ing is markedly different, epitomizing the mainstream tradition’s preference for
seamless legato, uniformity of tone, long-range or block-like contrasts, and the
projection of large-scale structure.
Where Luca separates modules 2 and 3, Perlman’s powerful sense of line
drives through them in a fine art of transition. He maintains a high dynamic
level across the four semiquavers (-25 dB), all articulated evenly and with equal
intensity, swelling successively through the B♭–F♯ gesture to the G resolution
at bar 3.3 (-27 dB), climaxing with the melisma (-31 to -35 dB). The arrival
of this gesture, then, is smoothly mediated and subsumed into the swell into
the melisma: it is part of a wave, rather than a brusque shock. Perlman’s art
of transition is underscored by tempo changes: the B♭ major dream image at
the start of bar 3 is fastest (accelerating from 20.2 bpm at bar 3.1 to 24.1 bpm
at bar 3.2), and his Adagio subsequently decelerates through the B♭–F♯ ges-
ture to the luxuriously paced melisma (21 bpm at bar 3.3 to 15.1 at bar 3.4).
Compared to Luca, Perlman widens the tempo differential between dream
image and melisma: a slight difference of 4.8 bpm with Luca (from 31.4 to 26.6
bpm), nearly double that with Perlman (from 24.1 to 15.1 bpm). Otherwise put,
in Perlman’s recording, the duration of quaver beats from the D through the
B♭–F♯ gesture to the climactic G lengthens in increments of 0.4 seconds, from
1.3" to 1.7" to 2.1", bespeaking an extraordinarily precise grasp of rhythmic
modulation. Moreover, the constituent notes of Perlman’s melisma are pro-
jected as individual entities (similarly to the melismas of bar 1), rather than
passed through quickly as subordinate diminutions, as they are in Luca’s
performance. By contrast, the post-cadential material, from bar 4 beat 3, is
remarkable for its lack of rhythmic flexibility: Perlman now plays with absolute
regularity of tempo, regaining and sustaining a fast 21 bpm. The faster tempo
and rhythmic regularity serve to place the preceding melisma into deeper profile
(although Perlman eases out of it gently, not on the V6/4 of bar 4 but on
the tonic chord two beats later). This broad contrast between bars 3 and 4—a
contrast carefully mediated in wave-like transitions—reveals Perlman as the
longer-range formal thinker.
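The arithmetic behind this comparison can be retraced in a few lines. The sketch below is only a check on the figures quoted in the paragraph above (the bpm values at bars 3.2 and 3.4, and Perlman's three beat durations); the variable and function names are introduced here for illustration.

```python
# Tempo figures (bpm) as quoted in the text for bar 3.2 -> bar 3.4.
luca = (31.4, 26.6)
perlman = (24.1, 15.1)

def differential(pair):
    """Drop in tempo between the dream image and the melisma, in bpm."""
    start, end = pair
    return round(start - end, 1)

luca_drop = differential(luca)        # 4.8
perlman_drop = differential(perlman)  # 9.0
ratio = perlman_drop / luca_drop      # a little under 2, i.e. "nearly double"

# Perlman's quaver-beat durations in seconds, as quoted in the text.
durations = [1.3, 1.7, 2.1]
increments = [round(b - a, 1) for a, b in zip(durations, durations[1:])]

print(luca_drop, perlman_drop, increments)  # 4.8 9.0 [0.4, 0.4]
```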
Another instance of Perlman’s projection of large-scale patterns is borne
out in a striking relationship between the ritornello and the ‘architectural’ cli-
max at bars 11–13. Now, Perlman and Luca concur in reserving the clearest
instantiation of ‘wave tempo’ to this point: in both recordings, the sudden turn
to E♭ major on the third beat of bar 11 is the point where up/down tempo flux
becomes synchronized to the beat. From this point, both Perlman and Luca
slow down and speed up from one beat to the next, climaxing at the third beat
of bar 12 with a slope down to the interrupted cadence and fermata. This regu-
lar tempo wave never appears elsewhere in Luca’s performance, but it does in
Perlman’s: in the Fortspinnung and Epilog of the ritornello at bars 2–3, at twice
the period, oscillating every two beats rather than every single beat. The
climax at bars 11–13, then, performs the tempo wave twice as fast as at bars
2–3, the model for its shape—on the crotchet rather than on the minim. The
original performance shape is thereby accelerated and intensified, in elegant
‘contrary motion’ to the material’s greater expansiveness at bars 11–13.
The differential between the peak and trough of Perlman’s tempo through
the Fortspinnung and Epilog is 9 bpm across two beats (24 bpm at bar 3.2; 15
bpm at bar 3.4), from the climax of the ‘dream image’ to the depressive nadir
of the melisma. The tempo shape mirrors the emotional shape, as might seem
natural (Luca does not do this). The differential at the architectural climax,
between the peak of bar 11.3 (23.1 bpm) and the trough of bar 13.2 (14.9 bpm),
is almost exactly the same, 8.2 bpm, but now spread out much more expan-
sively across seven beats. The vertiginous beat-to-beat differential (averaging
4.4 bpm) within this two-bar stretch further heightens the excitement, but does
not muddle the impression that the passage, in Perlman’s performance, has the
same tempo/affect shape as in the ritornello.
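The two differentials, and the spans over which they unfold, can be recomputed from the bar.beat labels. This is a minimal sketch under one assumption, namely four beats per bar, as the bar.beat references in the text imply; the helper names are hypothetical.

```python
def beats_between(start, end, beats_per_bar=4):
    """Number of beats from one 'bar.beat' label to another."""
    b0, t0 = (int(x) for x in start.split("."))
    b1, t1 = (int(x) for x in end.split("."))
    return (b1 - b0) * beats_per_bar + (t1 - t0)

def slope(peak_bpm, trough_bpm, start, end):
    """Differential (bpm), span (beats) and average change per beat."""
    drop = round(peak_bpm - trough_bpm, 1)
    span = beats_between(start, end)
    return drop, span, round(drop / span, 2)

ritornello = slope(24.0, 15.0, "3.2", "3.4")
climax = slope(23.1, 14.9, "11.3", "13.2")
print(ritornello)  # (9.0, 2, 4.5)
print(climax)      # (8.2, 7, 1.17)
```

The average slope captures only the overall descent. The beat-to-beat oscillation reported for the climax (averaging 4.4 bpm) would need the full tempo series, which is not reproduced here.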
FIGURE 4.6 Tempo and dynamic map of Kremer, bars 1–13 (tempo in BPM and energy in dB, plotted against bar.beat)
Kremer, a player noted for idiosyncrasy, combines aspects of modern and
HIP idioms: a somewhat fractured rendering of individual detail with contemporary
bowing practice (Figure 4.6). Compared to Luca and Perlman,
Kremer is expressively ‘deviant’, insofar as deviation from a norm is a standard
technique of creating an expressive effect, involving, in Eric Clarke’s words,
‘deliberate departures from the indications of the written score’ (2003). Despite
his modern bow, Kremer plays the Vordersatz with dramatic hairpin diminuen-
dos (unlike Perlman’s solidly sustained tone), recalling Luca’s shapes but with
the contrast vertiginously amplified. Kremer begins at 18.6 bpm, slower than
Perlman and Luca’s 21 bpm. Where the others’ tempos rise and fall through bar 1,
Kremer gets even slower, to 15.7 bpm at beat 4, lurching to a faster tempo at
bar 2, whose four beats then slide down successively towards the uprush (20
bpm, 18.9 bpm, 16.7 bpm, 14.6 bpm). Kremer, like the others, begins bar 3
at a faster tempo (21.7 bpm): where Luca decelerates and Perlman gets even
faster, Kremer actually keeps a steady pace before suddenly accelerating at the
climactic G of bar 3 (from 21.3 at the B♭–F♯ gesture to 25.6 bpm), thus taking
the boundary half a beat later than Perlman, not at the B♭–F♯ gesture but at
its note of resolution. Regarding the architectural climax at bar 11, although
Kremer, like the others, performs a turning point on the third beat (fast, fol-
lowed by a slope down to the interruption and fermata), he doesn’t project
Perlman’s or Luca’s ‘synchronized tempo wave’. In fact, Kremer’s entire read-
ing conspicuously disdains any regular tempo waves, a marker, perhaps, of his
mannerist irregularity. This is also manifest in the lack of synchrony between
his tempo and dynamic curves. In both Luca’s and Perlman’s performances,
tempo and dynamics generally shadow each other, especially at the opening
(i.e. they get faster and louder, slower and softer, at the same time). In Kremer’s
performance, tempo and dynamics are more independent from each other.4
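One rough way to quantify such 'shadowing' is to count how often tempo and dynamics move in the same direction from one beat to the next. The sketch below is an assumption about how this could be measured, not the method behind the tempo and dynamic maps, and the series in it are invented purely for illustration; they are not measurements of any of the three recordings.

```python
def same_direction_share(tempo, dynamics):
    """Share of beat-to-beat steps where tempo and dynamics move the same way."""
    if len(tempo) != len(dynamics) or len(tempo) < 2:
        raise ValueError("need two equal-length series with at least two points")
    steps = list(zip(
        (b - a for a, b in zip(tempo, tempo[1:])),
        (b - a for a, b in zip(dynamics, dynamics[1:])),
    ))
    agree = sum(1 for dt, dd in steps if dt * dd > 0)
    return agree / len(steps)

# Made-up series for illustration only (NOT measurements of any recording).
toy_tempo = [21.0, 23.0, 25.0, 22.0, 20.0]             # bpm
toy_shadowing = [-30.0, -27.0, -24.0, -28.0, -33.0]    # dB, moves with tempo
toy_independent = [-30.0, -33.0, -24.0, -22.0, -35.0]  # dB, partly against tempo

print(same_direction_share(toy_tempo, toy_shadowing))    # 1.0
print(same_direction_share(toy_tempo, toy_independent))  # 0.5
```

A share close to 1 would correspond to curves that get faster and louder, slower and softer, together; a share near 0.5 would suggest the two parameters are largely independent.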
Lest one dismiss Kremer’s reading as wantonly obscure, there are aspects to
his ritornello that are strikingly revealing. Like Luca, Kremer articulates the
initial four semiquavers at bar 3 irregularly, yet in reverse: not dotted (Luca) but
iambic, like Scotch snaps. He thereby brings out the notes of the fifth cycle
(A and D) from under their appoggiaturas (B♭ and E♭): from the standpoint
of projecting the skeleton of the Fortspinnung, Kremer is thus clearer than
Perlman or Luca. Another telling detail is that Kremer, unlike practically every
other front-rank exponent of this Adagio, plays the F♯ at bar 4 without the trill.
Yet waiving the trill pays huge expressive dividends for Kremer: it throws the
emphasis on the G♮ on the strong beat of bar 4, and highlights the gesture as an
expansion (semiquavers into quavers) of the B♭–A and E♭–D pianti at the start
of the bar. Indeed, this is where the semiquaver pianti’s iambic shaping becomes
strategic: setting up the ‘deviant’ pianti to be straightened out and resolved, as
a climactic G–F♯ trochee. The tensions of Kremer’s phrasing discharge into the
bar 4 G–F♯ climax as in no other performance. This is how Kremer conspires to
combine intricate attention to local detail with broad formal thinking. (Taking
an opposite path to a similar end, Perlman had downplayed the semiquavers at
bar 4 not via rhythmic displacement but by powering through them.)
Perlman, Luca and Kremer’s performance styles are all expressive in
their own ways. Following Schubert and Fabian’s appeal for a typology of
‘expressiveness’ in musical performance, Perlman and Luca’s idioms are characterized,
respectively, as ‘mainstream expressive’ (‘long-range fluctuations of
dynamics, tempo rubato and shaping of singing melodic lines’; Schubert and
Fabian: 575) and ‘baroque-appropriate stylish’ (ibid.: 581). According to Schubert
and Fabian, listeners evaluate the ‘stylishness’ of the latter by its perceived fit
within an historical (baroque) grammar of expressiveness. I would argue that
Kremer’s performance is expressive in a third way, as ‘deviant’: Clarke’s ‘deliber-
ate departure’ from a set of norms. Indeed, what heightens Kremer’s deviance
is that he seems to play the first two performance options—HIP intricacy and
‘mainstream’ cantabile—against each other into a sort of interference pattern.
How, then, do these three distinct styles of ‘expressiveness’ relate to the emo-
tional shapes of sadness? A bland reply would be that performance styles sit
next to compositional styles as just another variety of ‘display rules’, elaborat-
ing emotional categories in terms of their various grammars. A more interest-
ing solution, as hinted earlier, is to see them as ‘mirroring’ emotional shapes
in performative terms. HIP, mainstream and deviant styles each generalize a
particular aspect of the package of entailments that constitutes sadness. HIP
fits with sadness’s orientation to detail; mainstream performance with the
legato smoothness associated with sadness’s ‘mumbled articulation’ (Huron
2011: 149), in contrast to HIP’s drier articulation; deviant performance with
sadness’s goal-evasion and sudden contrast. No performance style can monopolize
an emotional shape. The three styles we have looked at elaborate particular
aspects of that shape. The situation is quite complex; from a different
standpoint, for instance, Perlman and Kremer’s interpretations could actu-
ally be said to be more detail-oriented than Luca’s, because they project the
ornaments—especially the melismas at bars 1 and 3—thematically, rather than
subsuming them hierarchically. Making a meal of these little notes is unhistori-
cal, therefore expressively deviant, and thus more pathetic. Much depends on
one’s point of view.
Part II: The Cycle
The remaining three movements of Bach’s sonata (Fuga, Siciliana and
Presto) project contrasting emotional categories. Even if we recognize that
the emotions are distinct from each other, identifying what these emotions are
and ascribing linguistic emotion terms to them are very different matters. The
tenderness of the Siciliana is nearly as patent as the sadness of the Adagio
(ostensibly, as I recount below, because we learn tenderness from the cradle,
and it is foundational for later relationships; conversely, its loss—inducing sep-
aration anxiety and depression—is equally prevalent). By contrast, whether the
Fuga and Presto are fearful or angry (or deliberative or impassioned) is very
much open to question, and indeed subject to performance interpretation. The
crux, however, is the relativity of this openness: the emotions of the fugue and
finale are more ambiguous than those of the first and third movements. In a
parallel study, based on the audibility of Bach’s emotions for two sets of listen-
ers of differing expertise (Spitzer and Coutinho 2014), one of the ‘take-home
messages’ was that nearly everybody identified sadness and tenderness, respec-
tively, in the Adagio and Siciliana, whereas opinion was much more divided for
the other two movements.5
Faced with the ineffability of some musical emotions—a consequence, per-
haps, of a loss of historical sensibility—one approach would be to hypothesize
these emotions on a purely theoretical basis, audacious, even outlandish, as
that might sound. In short, one could speculate that these emotions really are
intrinsic to the musical shapes inscribed within the compositional trace, even
if nobody today has the historical ears to hear them. That is the approach
I have attempted with the Fuga and Presto. As with the Adagio, my theoretical
analysis spotlights the ritornello’s central Fortspinnung module for clarity of
comparison.
OF ANGER, TENDERNESS AND FEAR
Like sadness, the other basic emotions can be defined in terms of goals and
social relationship. Anger is typically triggered by the frustration of an ‘active
goal’, leading to aggressive behaviour such as fighting. Tenderness, or love, is
associated with ‘physical and mental closeness’ and with nurturing behaviour.
Fear is stimulated by an appraisal of ‘danger or goal conflict’, the subject react-
ing with withdrawal (e.g. fleeing) or freezing (e.g. trembling) behaviours (Oatley
2004: 81–2).
A cursory overview of the sonata’s remaining three movements does suggest
that their material unfolds these respective emotional behaviours. If the second
movement’s fugal opening expresses a kind of tetchy, repressed or even ‘cold’
anger, then the music ‘lashes out’ in the successive eruptions of semiquaver pas-
sagework. These ‘eruptions’ recall James Russell’s ‘script’ for anger (1991: 39),
a more elaborated version of Oatley’s schema: after an offence, a person glares
and scowls, will feel internal tension and agitation and a desire for retribu-
tion, and finally will lose control and strike out. The Siciliana is generically and
topically a lullaby, its tenderness mirroring the intimate and nurturing social
closeness of a dialogue between a mother and child (described by Colwyn
Trevarthen as the ‘primary intersubjectivity’ enacted in their rhythmic turns
of cross-modal dialogue; 1999–2000: 177). The literature on lullabies is exten-
sive, often referring to their cross-cultural features of simplicity, smoothness,
descending contours, relative slowness and short phrasing (Unyk et al. 1992;
Trainor and Hannon 2013). Daniel Leech-Wilkinson has linked the preponder-
ance of falling pitch contours in art-music lullabies to the descending motions
of Infant-Directed Speech (IDS) or ‘motherese’ (Leech-Wilkinson 2006).
Importantly, the yearning, increasingly chromatic, quality of the Siciliana is
equally expressive of erotic adult love, in line with the finding of psychologists
that infant-directed love is a template for older experiences. Despite mature
lovers’ increasing ability to integrate closeness and independence as they grow
older, people’s love schemata ‘are shaped by children’s early experiences and are
thus relatively permanent’ (Hatfield and Rapson 2004: 656). Finally, the relent-
less semiquaver runs of the Presto finale, cashing in the metaphor of musical
‘motion’ as physical motion across a landscape, suggest panic-stricken flight
in response to threat. Fear in music is one of the most variegated of emotions,
since it can be associated with several aspects of threat: the threat itself; its
foreboding qualities (musical ‘danger signals’ typically being soft, low sounds);
a trembling before this threat (typified by tremolando); freezing on the spot
(musical stasis, pedals, hiatus); or, as in the present case, physical flight, often
with no clear direction. The seemingly aimless vicissitudes of the Presto suggest
fleeing in the face of an unknown threat—perhaps flight from the preceding
three movements themselves. As with the Adagio, the broad formal ‘behav-
iours’ of the Fuga, Siciliana and Presto unfold patterns encapsulated in the
opening ritornello, specifically within its central Fortspinnung module.
FIGURE 4.7 Bach, Sonata for Unaccompanied Violin No. 1 in G minor (BWV 1001), Fuga, bars 1–4
Of the four movements, the Fuga presents the Fortspinnung’s fifth cycle
most transparently (Figure 4.7). Its four notes are articulated plainly (undecorated)
and with rhythmic and metrical regularity (equally spaced quavers).
Kremer brings out these pitches by performing them staccato, which is not marked
in Bach’s score but is arguably implicit in historical performance style (Luca’s
historical interpretation does the same). If staccato articulation may be expressive
of fear as much as anger, then the latter emotion comes to the fore with the
dense triple-stopping at bar 3, which Kremer plays particularly aggressively.
The Amazon review commends Kremer for his aggression: ‘Kremer accenting
the repeated notes in the fugue’s subject harshly and fiercely. … [The fugue]
explodes with a palpable fury from the instrument’. However, such fury is an
outlier in Fuga interpretations: Luca and Perlman are much more subdued (perhaps
they choose to emphasize the first half of Russell’s anger schema, glaring
and internal tension, rather than the aggressive second half). Nevertheless,
Kremer’s outlying performance is arguably in line with the aggression implicit
in Bach’s deformed contrapuntal treatment, epitomized by his rare use of a
subdominant answer. A normative tonal answer in bar 2 would have remained
on the G, instead of descending to F, and would have resolved to F a little later
as part of a D minor (dominant) harmony at bar 3. Yet Bach supplies an anom-
alous subdominant answer instead, so as to cadence on C minor at bar 3 (in the
baroque repertoire, the other great exception to this rule is the subdominant
answer of the fugue in the Toccata and Fugue in D minor attributed to Bach).
The subdominant answer creates a powerful clash at bar 3 between the E♭ and
the D, so that the contrapuntal voices ‘fight’ with each other aggressively.
In the Siciliana, the first three notes of the cycle are clear and are rendered
with the lullaby’s typical long–short rhythmic lilt (Figure 4.8). However, F,
the fourth note, is displaced by one quaver (the listener expects it two qua-
vers after the C), stretching the length of the C, thereby stretching the lilt
like elastic. This rhythmic flexibility is congruent with the smoothness and
avoidance of sharp contrast noted of lullabies, and is markedly distinct from
the fugue subject’s (rhythmic) regularity. But there are other lullaby aspects
implicit in the voicing of the Siciliana’s fifths cycle. The pitches of the cycle
are distributed between two contrapuntal voices, whereas in the Adagio and
Fuga the cycle kept to a single voice. The impression of there being two
voices is heightened by the registral gap between the lower and upper notes,
which is much greater than in the previous movements; for instance, the
D and G are an octave-and-a-half apart. This registral separation encour-
ages the listener to ‘stream’ the pitches as distinct voices, metaphorically
suggesting a dialogue between two musical personas: one could even inter-
pret the voice-crossing of the two parts on the final F of the cycle (taken
by the lower voice, rather than, as expected, by the upper) as symbolic of
their harmonious interaction. In the Fuga, the cycle at bar 2 ‘fights’ against
the fugal answer beneath it. In the Siciliana, the dialogue is not conflictual
but harmonious, because the pitches of the cycle are shared between the two
voices, and indeed cross over.
FIGURE 4.8 Bach, Sonata for Unaccompanied Violin No. 1 in G minor (BWV 1001), Siciliana,
bars 1–6
FIGURE 4.9 Bach, Sonata for Unaccompanied Violin No. 1 in G minor (BWV 1001), Presto, bars 1–11
The rhythmic and textural uniformity of the Presto—its continuous succes-
sion of semiquavers—makes it initially difficult to pick out the fifth cycle from the
background figuration (Figure 4.9). Interestingly, the cycle is slightly extended
by a further fifth progression: A–D is followed by G–C at bar 11. This is the only
movement where this happens. It is as if the forward-moving harmonic drive of
the music is so great that the cycle’s seemingly endless implicative potential to
rotate around the circle of fifths (B♭–E♭–A–D–G–C–F–B♭ etc.) can hardly be
contained. This harmonic drive compounds the Presto’s rhythmic speed. As well
as panic, the movement expresses another corollary of fear: shock. The Presto
unfolds a series of shocks by subverting its metrical pattern; indeed, this pat-
tern is constantly shifting in unpredictable ways, cognitively ‘wrong-footing’ the
listener (as it symbolically wrong-foots the fleeing subject, as it were). For
instance, the very start of the cycle, the B♭ at bar 6, subverts a pattern of two-bar
phrases established at the opening (Figure 4.10a and b). That is, the fast music
suggests a slower metrical grouping, whereby bars 1–2 constitute one ‘beat’ of a
‘hyper-bar’, bars 3–4 a second beat, and bar 5 the onset of a third beat. It is this
implicit three-beat hyper-bar that is interrupted by the B♭; it introduces a ‘hyper-
metrical’ disruption. Moreover, a ‘metrical reduction’ of the cycle at bars 6–8
(leaving out the semiquavers between its notes) reveals that, by accenting the sec-
ond beat of each group (the crotchets E♭, D and C), it encapsulates the preced-
ing hypermetrical disruption in miniature (Figure 4.10b). Hence not only does
the cycle arrive as a metrical shock to bars 1–5, but it is itself a series of metrical
shocks. And there is another, broader, level at which the Presto expresses fear:
the sheer speed of the music makes it difficult to follow. This literally overwhelm-
ing quality evokes the classic formula of the sublime, which is fear at its most
philosophically elevated level. The Presto evokes sublime fear both as cognitive
overload and as the behavioural reaction to fear, which is to flee.
FIGURE 4.10 (a) Hypermetrical reduction of Bach, Sonata for Unaccompanied Violin No. 1 in G minor
(BWV 1001), Presto, bars 1–6; (b) metrical reduction of bars 6–8, revealing syncopation
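The way the cycle rotates around the circle of fifths and returns to its starting point can be illustrated with a short sketch. It is offered only as an illustration of the pitch sequence quoted above, not as part of the analysis itself: the scale list, the function name and the decision to model a falling diatonic fifth as a move of three scale steps are all assumptions made for the example.

```python
# Pitch names of the B-flat major scale, which shares its key signature
# with G minor (an assumption made for this illustration).
SCALE = ["B♭", "C", "D", "E♭", "F", "G", "A"]


def diatonic_fifths_cycle(start, steps, scale=SCALE):
    """Return steps + 1 pitch names, each a diatonic fifth below the last.

    Within a seven-note scale, falling a fifth lands on the same pitch name
    as rising a fourth, so the scale index advances by 3 (modulo 7).
    """
    i = scale.index(start)
    return [scale[(i + 3 * k) % len(scale)] for k in range(steps + 1)]


# Seven steps bring the cycle back to its starting pitch.
print("–".join(diatonic_fifths_cycle("B♭", 7)))  # B♭–E♭–A–D–G–C–F–B♭

# The Presto's extension: A–D followed by G–C.
print("–".join(diatonic_fifths_cycle("A", 3)))  # A–D–G–C
```

Because the cycle stays within the key, the step from E♭ to A is a diminished rather than a perfect fifth, which is what allows the sequence to close after seven steps.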
Bach’s cycle, then, projects four distinct emotional behaviours. The central
module of the ritornello stereotype serves as a bellwether for each behaviour.
Its atomization in the Adagio suggests the lack of goal—the lethargy—
connected with sadness or depression. Its conflictual disposition in the Fuga
enacts the aggressive conflict often linked to anger, when goals are blocked.
Its fluid and flexible disposition in the Siciliana suggests the tender dialogue
between mother and child in a lullaby, mirroring social closeness. And its
animation—and overflow into an extended fifth cycle—in the Presto evokes
a subject’s physical flight, in extreme fear or panic. All of these emotional
behaviours constitute ‘shapes’, as previously defined, shaping the ritornello
stereotype.
VECTORS
A third dimension—in addition to ‘shape’ and ‘shaping’—is the relational one,
through which the various behaviours define and articulate themselves against
each other. This dimension commutes shape/shaping into a kind of transforma-
tional ‘vector’, nudging it from the domain of emotion proper to that of ‘affect’.
Although the terms ‘emotion’ and ‘affect’ tend to be used interchangeably, in this
instance I follow thinkers such as Brian Massumi (2002), who represents a con-
stellation of ideas drawn chiefly from Deleuze, Bergson and Spinoza. Massumi
theorizes affect as an energetics of indeterminate bodily intensity, casting feeling
as a fluid process anterior to emotion proper. Emotional signification—as in the
meaning of discrete emotional categories, such as sadness, tenderness, anger,
etc.—marks a stage where the fluid vectors of affect are stalled, frozen and ren-
dered determinate. But suggestive as the affect–emotion distinction may be, I
don’t believe it fits cleanly either with the ontology of music (which is intrinsi-
cally fluid and vectorial anyway) or with the new paradigm in emotion research
(which postdates Massumi’s seminal work). For instance, the concept of emo-
tion as behaviour is a dynamic one, particularly in music. Even so, the notion
of affect can usefully tilt the discussion towards a more processual standpoint.
Emotion dissolves into time: at the broadest level, it becomes apparent that the
very discrimination and experience of these emotions is not a fixed absolute, but
something which develops with age and expertise over a lifetime’s immersion in
these works (Spitzer and Coutinho 2014).
Emotional characteristics are not absolute but relational: with sadness,
what is at issue is not slow tempo per se, but slower or slowest; with anger, it is
not conflicted texture, but more and most conflicted. As David Huron (2011)
reminds us, low pitch does not in itself connote sadness; if so, men would
always sound sadder than women. It is low pitch relative to a corpus or group.
For Bach, that corpus is the baroque style; for Mozart’s Trio, it is the classical
style. When listeners discern tenderness in both the Siciliana and the Trio, they
refer each movement to its respective stylistic context, and read off the emotion
by translating different display rules or ‘languages’: the disjunctions of Bach’s
Fuga are tame next to, say, those in Le sacre du printemps, but extreme in rela-
tion to the sonata at hand and baroque music in general.
Compared to the circle of style, Bach’s sonata as a whole constitutes a more
focused and tightly circumscribed cycle. I tend to hear the sonata like a Calder
mobile, with ‘vectorial’ transformations happening not just between contigu-
ous movements but combinatorially across all movements. The contiguous vec-
tors are the most direct because they unfold in time. The shift from Adagio to
Fuga takes us from stasis to cadential action; from ambiguity to conflict; from
self-reflection to orientation towards another subject. Fuga to Siciliana moves
from interpersonal conflict to harmonious dialogue. Siciliana to Presto takes us
from lyrical freezing (the lullaby is a lyric standstill) to a flight response. And
were we to return full circle to the beginning, the difference between Presto and
Adagio would be that between the individual in the world and individual self-
absorption: in fear, the E♭ blocks the Presto’s drive; in sadness, the E♭ affords
energy to the Adagio’s lethargy.
Connections also cut across the movements noncontiguously, so that the
sonata—like style in general—is as much a free mobile as a cycle moving in
one direction. The Adagio’s sadness is an aversive emotion, pushing us away
from dissonance; the yearning of the Siciliana pulls us towards harmony. The
Presto’s fear is passive: we are in the grip of a stereotype; the Fuga’s anger is
deliberative, deploying the stereotype with intent.
Viewing Bach’s sonata as a combinatoire of warring passions jibes with how
emotions functioned in Shakespeare and Homer, according to the emotion his-
torian Philip Fisher. In Fisher’s words, the passions, or ‘vehement states’, define
each other by fighting each other:
When used to define and express the substance of the self, the vehement
states—anger, wonder, ambition, jealousy, shame, pity, or fear—draw
on an essentially Greek and especially Homeric theory of substance and
struggle, or, as the Greeks called it, agon. Substances mutually make
each other known, not only because of their differences but because of
moments of conflict. It is at the meeting point where combat takes place
and mutual destruction is possible that each becomes for the first time
visible as what, in itself, it is. A large rock is one substance, the water
of the sea another. At the shoreline where the sea pounds against the
rock, the rock registers in its shape nothing but the consequences of thou-
sands of years of waves cutting into it, even as each individual wave was,
in turn, stopped and broken by the rock’s resistance… The shattering
wave, the pounded rock make visible on each side the nature of sea and
rock, but they do so at the very moment that each of the two is situation-
ally flooded from without by the differences that occur as each limits the
other. (2003: 51)
The mutual definition of the passions is enshrined in their narrative pairings.
Anger is a common outcome of sadness; the rage of Achilles is consequent on
Achilles’ long depression, his grief on the death of Patroclus. Fisher describes
how the turn from one passion to the other involves a redirection of the will from
inward-facing mourning to outward-facing vengeance (ibid.: 64). Modern psy-
chologists are also aware that ‘sadness occurs in several dynamically significant
patterns’, chief of which is the sadness–anger pattern that characterizes some
low moods, such as depression (Izard and Ackerman 2004: 259). One may reasonably
speculate, then, that Bach’s Siciliana lullaby mourns the mother of his
children, his first wife, Maria Barbara. Formally, its lyrical standstill separates
the rage of the Fuga from the panic of the Presto. Ending a cycle with panicked
flight may seem odd—especially given that so many minor-mode finales in
western music end this way (one thinks, for instance, of Chopin’s second piano
sonata, just after its funeral march)—until one recalls Aristotle on drama. Tragic
catharsis is compounded from pity and fear. It is not that the musical persona/lis-
tening subject is fleeing from anything in particular. Rather, the emotion of terror
is expressed through its behavioural correlative, which is flight. Fear—the most
vectored and temporal of the emotions—endows Bach’s cycle with the shape of
things to come.
References
Alloy, L. and L. Abramson, 1979: ‘Judgment of contingency in depressed and nondepressed
students: sadder but wiser?’, Journal of Experimental Psychology: General 108: 441–85.
Clarke, E., 2003: ‘Introduction’ to Psychology of Music. §IV: Performance, Section 1,
Grove Music Online, http://www.oxfordmusiconline.com/subscriber/article/grove/music/
42574pg4 (accessed 9 April 2017).
Cumming, N., 2000: The Sonic Self (Bloomington: Indiana University Press).
Davies, S., 1994: Musical Meaning and Expression (Ithaca, NY: Cornell University Press).
Dreyfus, L., 1998: Bach and the Patterns of Invention (Cambridge, MA: Harvard University
Press).
Ekman, P., 1984: ‘Expression and the nature of emotion’, in K. Scherer and P. Ekman, eds.,
Approaches to Emotion (Hillsdale, NJ: Erlbaum), pp. 319–44.
Fabian, D., 2005: ‘Towards a performance history of Bach’s Sonatas and Partitas for
Solo Violin: preliminary investigations’, in L. Vikarius, ed., Essays in Honor of László
Somfai: Studies in the Sources and the Interpretation of Music (Lanham, MD: Scarecrow
Press), pp. 87–108.
Fisher, P., 2003: Wonder, the Rainbow, and the Aesthetics of Rare Experiences (Cambridge,
MA: Harvard University Press).
Frijda, N., 1986: The Emotions (Cambridge: Cambridge University Press).
Gabrielsson, A. and E. Lindström, 2010: ‘The role of structure in the musical expression of
emotions’, in P. Juslin and J. Sloboda, eds., Handbook of Music and Emotion: Theory,
Research, Applications (New York: Oxford University Press), pp. 367–400.
Gjerdingen, R. O., 2007: Music in the Galant Style (New York: Oxford University Press).
Hatfield, E. and R. L. Rapson, 2004: ‘Love and attachment processes’, in M. Lewis and
J. M. Haviland-Jones, eds., Handbook of Emotions, 2nd edn (London: Guilford Press),
pp. 663–76.
Heinichen, J. D., 1711: Neu erfundene und gründliche Anweisung des General-Bass (Hamburg:
Benjamin Schillern).
Huron, D., 2006: Sweet Anticipation: Music and the Psychology of Expectation (Cambridge,
MA: MIT Press).
Huron, D., 2008: ‘A comparison of average pitch height and interval size in major- and
minor-key themes: evidence consistent with affect-related pitch prosody’, Empirical
Musicology Review 3/2: 59–63.
Huron, D., 2011: ‘Why is sad music pleasurable? A possible role for prolactin’, Musicae
Scientiae 15/2: 146–58.
Izard, C. and B. Ackerman, 2004: ‘Motivational, organizational, and regulatory functions
of discrete emotions’, in M. Lewis and J. M. Haviland-Jones, eds., Handbook of
Emotions, 2nd edn (London: Guilford Press), pp. 253–64.
Juslin, P., 1997: ‘Emotional communication in music performance: a functionalist perspec-
tive and some data’, Music Perception 14/4: 383–418.
Juslin, P. and J. Sloboda, eds., 2010: Handbook of Music and Emotion: Theory, Research,
Applications (New York: Oxford University Press).
Karl, G. and J. Robinson, 1995: ‘Levinson on hope in the Hebrides’, Journal of Aesthetics
and Art Criticism 53: 195–259.
Kivy, P., 1989: Sound Sentiment: An Essay on the Musical Emotions (Philadelphia: Temple
University Press).
Lakoff, G. and M. Johnson, 1980: Metaphors We Live By (Chicago: University of Chicago Press).
Leech-Wilkinson, D., 2006: ‘Portamento and musical meaning’, Journal of Musicological
Research 25: 233–61.
Leech-Wilkinson, D., 2013: ‘The emotional power of musical performance’, in T. Cochrane
and B. Fantini, eds., The Emotional Power of Music (New York: Oxford University
Press), pp. 41–54.
Lester, J., 1999: Bach’s Works for Solo Violin: Style, Structure, Performance (New York:
Oxford University Press).
Levinson, J., 1990: ‘Hope in the Hebrides’, in idem, Music, Art, and Metaphysics (Ithaca,
NY: Cornell University Press), pp. 336–75.
Margulis, E., 2005: ‘A model of melodic expectation’, Music Perception 22/4: 663–714.
Massumi, B., 2002: Parables for the Virtual: Movement, Affect, Sensation (London:
Duke University Press).
Meyer, L. B., [1976] 2000: ‘Grammatical simplicity and relational richness: the trio of Mozart’s
G-minor symphony’, in The Spheres of Music: A Gathering of Essays (Chicago:
University of Chicago Press), pp. 55–125.
Monelle, R., 2000: The Sense of Music: Semiotic Essays (Princeton: Princeton University
Press).
Niedt, F. E., [1706] 1721: Musicalische Handleitung, Part 2, 2nd edn (Hamburg: Benjamin
Schillers Wittwe & Joh. Christoph Kißner).
Nussbaum, C., 2007: The Musical Representation: Meaning, Ontology, and Emotion (Cambridge,
MA: MIT Press).
Oatley, K., 2004: Emotions: A Brief History (Oxford: Blackwell).
Robinson, J., 2005: Deeper than Reason: Emotion and Its Role in Literature, Music, and Art
(Oxford: Clarendon Press).
Ross, S., 2003: ‘Style in art’, in Jerrold Levinson, ed., The Oxford Handbook of Aesthetics
(New York: Oxford University Press), pp. 228–44.
Russell, J. A., 1991: ‘In defense of a prototype approach to emotion concepts’, Journal of
Personality and Social Psychology 60/1: 37–47.
Schubert, E. and D. Fabian, 2006: ‘The dimensions of baroque music performance: a
semantic differential study’, Psychology of Music 34/4: 573–87.
Sloboda, J. and P. Juslin, 2001: ‘Psychological perspectives on music and emotion’, in
P. Juslin and J. Sloboda, eds., Music and Emotion: Theory and Research (New York:
Oxford University Press), pp. 71–104.
Spitzer, M., 2004: Metaphor and Musical Thought (Chicago: University of Chicago Press).
Spitzer, M., 2009: ‘Emotions and meaning in music’, Musica Humana 1/2: 153–94.
Spitzer, M., 2013: ‘Sad flowers: affective trajectory in Schubert’s Trockne Blumen’, in T.
Cochrane and B. Fantini, eds., The Emotional Power of Music (New York: Oxford
University Press).
Spitzer, M. and E. Coutinho, 2014: ‘The effects of expert musical training on the percep-
tion of emotions in Bach’s Sonata for Unaccompanied Violin No. 1 in G minor (BWV
1001)’, Psychomusicology: Music, Mind, and Brain 24/1: 35–57.
Trainor, L. and E. Hannon, 2013: ‘Musical development’, in D. Deutsch, ed., The Psychology
of Music, 3rd edn (New York: Academic Press), pp. 423–98.
Trevarthen, C., 1999–2000: ‘Musicality and the intrinsic motive pulse: evidence from
human psychology and infant communication’, in special issue on ‘Rhythm, narrative,
and origins of human communication’, Musicae Scientiae 3, supplement: 155–99.
Unyk, A., S. Trehub, L. Trainor and G. Schellenberg, 1992: ‘Lullabies and simplicity: a
cross-cultural perspective’, Psychology of Music 20: 15–28.
Wegman, R., 2003: ‘Johannes Tinctoris and the “New Art” ’, Music and Letters
84/2: 171–88.
Discography
Joachim, J., [1903] 2003: The Great Violinists: Recordings from 1900–1913. (Testament
SBT2 1323).
Kremer, G., 1981: Johann Sebastian Bach: Sonatas and Partitas for Unaccompanied Violin
(Philips 6769 053; CD reissue: ECM New Series 1926–27).
Luca, S., 1992: Johann Sebastian Bach: Sonatas and Partitas for Unaccompanied Violin
(Nonesuch HC-73030; CD reissue: 73030).
Perlman, I., 1988: Johann Sebastian Bach: Sonatas and Partitas for Unaccompanied Violin
(EMI Classical CDS 7 49483 2; reissue: 0 85281 2).
Reflection
Steven Isserlis, cellist
To perform a piece of music is essentially to tell a story. The task of an interpreter
is that of narrator and actor; he or she must relate the tale woven by
the composer, not merely portraying, but fully identifying with the characters
and their fates. Music, like fiction, needs form and shape in order to be believ-
able or moving. Needless to say, musical forms can be infinitely varied—and
perhaps the word ‘story’ is confining it too closely, when so much music might
as easily be perceived as a poem, a fantasy, a reverie; but whatever its nature, a
composition needs the discipline of a preordained structure in order to attain
the inevitability of satisfying art.
As a performer, there is no way that I can take the listener on a musical jour-
ney unless I understand (or at least attempt to understand!) the various aspects
of the work being performed. This involves several levels of comprehension.
To begin, an overall knowledge of the score is essential: just as an actor cannot
give a convincing account of a role without knowing what happens to all the
characters in the play (not just to the actor’s own character), so a musician must
be familiar with all the voices in a piece of music. This should go without say-
ing; but, strangely, it doesn’t. I have even heard of teachers discouraging their
students from delving too deeply into the score, lest it make their interpretation
less individual! Ahem.
Secondly, one needs a strong overview of the general shape of the work.
All musical forms present their own challenges. If a composer has chosen to
write in sonata form, for instance, there is no way that an interpreter can give a
proper account of the work without understanding the contrasts and similarities
between the (usually two) main subjects—any more than one could understand
a novel without knowing who the main characters are. But of course the
demands on a performer go far beyond that basic grasp of the facts. Not only
does one have to delve into the inner fabric of those main subjects, with their infi-
nite variety of light and shade, of strength and gentleness, of rhythmic alertness
and languor, and so on: one has also to be familiar with their fates, with the
interplay between them, with their transformation over the course of the work.
This knowledge informs every aspect of a performance—tone colours, tempo
relationships, dynamic contrasts, etc. Again, this may sound obvious, but all
too often musicians fail to come to terms with these basic elements. The result is
boring performances—and alas, there are far too many of those! I would liken
these haphazard musicians to travellers walking through a forest, lurching from
tree to tree, appreciating the beauty of each tree, perhaps, but with no idea how
to get to the other side of the forest. Conversely, a performer who understands
the structure of a work will be blessed with the freedom of a bird flying above
that forest, perceiving each detail in all its exquisite clarity, but able at all times
to make out the overall direction of the path. Foreknowledge of the form—the
story—must inform the interpretation from the outset. The actor analogy again
seems apt: at the end of a great performance of Hamlet, the audience should
somehow feel that, despite the many unpredictable twists and turns of the play,
there has been an inevitable trajectory to the hero’s fate.
Within this overall view of the work, one must also, of course, grasp the
microstructure of each phrase. In music, as in speech, every clause, every part
of the phrase, has a centre, with notes leading to and away from it. Just as in
speaking one highlights the most important or unexpected word in a sentence,
so an equivalent event in music will require some sort of emphasis. This can
be achieved with dynamic stress, of course; it can also be done with time—the
so-called agogic accent, in which one lingers on one note, making up the time
on less important notes (‘rubato in tempo’); with colour; or in countless other
ways. If the performance is to sound truly alive, no two consecutive notes should
have exactly the same weight. And then, again as in speech, there is the question
of punctuation: music is full of commas, full stops, semicolons, full colons and
so on. For string players, the bow must be ready to leave the string at all times;
for pianists, hands and feet must similarly be employed in order to allow the
phrase to breathe; and so on through all instruments and voices. If this very basic
aspect of phrasing is overlooked, the music becomes as comprehensible as the
soliloquy of the unfortunate actor—very far from the great Hamlet described
above—who, in a panic, stumbles to the front of the stage and gabbles out:
‘Tobeornottobethatisthequestionwhethertisnobler’, etc. Not a consummation
devoutly to be wished.
In short, the musician’s challenge is to convey the sentences, the paragraphs
and the overall narrative, the personalities, utterances and destinies of the
musical characters, with as much clarity as possible. In even shorter: the aim
is to communicate the meaning of the music—what a surprise! But it requires
thought as well as feeling, study as well as spontaneity. It also requires the
technical mastery that enables the performer to shape each phrase as the music
requires. It’s not that straightforward, in fact—not as easy as it should sound.
5
Shape in music notation
EXPLORING THE CROSS-MODAL REPRESENTATION OF
SOUND IN THE VISUAL DOMAIN USING ZYGONIC THEORY
Adam Ockelford
This chapter uses an extension of ‘zygonic’ theory to investigate how structure
is perceived in the auditory and visual domains, and how incoming data from
the two sensory modalities may be connected in the mind through different
forms of cross-modal mapping. I argue that such mappings enable patterns in
sound to be depicted coherently and consistently as two-dimensional images.
Four types of relationship that potentially exist between sound and shape are
identified, namely ‘regular’, ‘indirect’, ‘arbitrary’ and ‘synaesthetic’. The first
three of these are shown to be analogous in varying degree to the tripartite
typology of signs—icon, index and symbol—developed by the American phi-
losopher Charles Peirce, a founder of semiotics. The new, zygonic model is then
used to interrogate the nature and function of picture scores created by young
children, tactile representations of pitch created by their blind peers, western
notation based on the staff, music transcribed into braille, guitar chord symbols
and a synaesthete’s visualization of a track from Jean-Michel Jarre’s Oxygène.
Zygonic theory
Zygonic theory seeks to answer the question of how it is that music makes
sense: how, in the absence of semantic content, it is structured and forms
abstract narratives in sound that convey meaning over time. The theory is
‘psychomusicological’ in nature, in that it advances a musicological hypothesis
underpinned by psychological principles. Hence it is an epistemological hybrid,
in which the idiographic intuitions characteristic of music theory and analysis
are informed by the nomothetic findings proper to cognitive psychology (Cross
1998; Gjerdingen 1999; Ockelford 2009).
Zygonic theory takes our apprehension and understanding of music to be
built up from cognitive data acquired from a number of perceptual domains
pertaining to sound, such as pitch, loudness and timbre. Each domain can
exist in different states, thereby functioning as a perceived sonic variable. In
philosophical terms such states are conceptualized as ‘qualia’, that is, units of
experience (see, for example, Chalmers 1996; Kanai and Tsuchiya 2012). Some
domains, such as duration, have a single axis of variability, while others, like
timbre, are multidimensional in nature; some gauge qualities such as loudness,
while others detail a sound’s perceived location in time or space; and some,
like pitch, pertain to individual notes, while others, including tonality, are
characteristic of a group (Ockelford 1991b). The perceived state of a domain
at any given point in time is termed, in zygonic theory, a ‘perspective value’¹
(Ockelford 2005: 10–12). In western music theory, such values are assigned
labels, which serve to define them more or less specifically, permitting iden-
tification and facilitating their replication over time. The degree of specificity
with which perspective values are defined is contingent both on the fidelity with
which they can be perceived, remembered and reproduced, and on the accuracy
demanded by a particular musical context. For example, values of pitch, which
play a prominent role in most musical design and can be discerned with some
precision (Stevens 1975), are typically labelled exactly (using designations such
as ‘a4’ and ‘f♯3’), whereas values of loudness, which generally fulfil a second-
ary structural function (Boulez [1963] 1971: 37; Ockelford 1999: 4), and whose
perception is contextually bound, are classified more broadly (using terms such
as p and f).
Perspective values can be compared in the mind, and the resulting mental
constructs—which, in the American cognitive linguist George Lakoff’s terms,
constitute forms of ‘link schema’ (1987: 283) that inhabit the mental space per-
taining to music processing (Fauconnier [1985] 1994)—can, again, be conceptu-
alized in music-theoretical terms. For example, differences in pitch are referred
to as ‘intervals’. In the nomenclature of zygonic theory, link schemata such as
these are termed ‘interperspective relationships’ (Ockelford 1991b: 133). Like
the perspects to which they pertain, these are variable in nature, potentially
existing across a range of ‘interperspective values’.
Interperspective values reflect the nature of the perspects to which they per-
tain and are of different types. For example, the interval between the onsets of
two notes gauges the perceived difference in time that separates them. Without
(necessarily) being aware of it, musicians invoke interperspective values of
this type whenever they use a phrase such as ‘the cor anglais comes in five
beats after the oboe’ (given that they have a sense of how long each beat is);
see Figure 5.1. The perspect ‘duration’ (or note-length) incurs interperspec-
tive relationships that may be heard and understood as differences or ratios
by musicians adopting a consciously conceptual mode of listening (forms
of comparison that are implicit in standard western notation). For instance,
the oboist performing the passage from Vaughan Williams’ Fifth Symphony
shown in Figure 5.1 may mentally calculate that the first note is a semiquaver
longer than a crotchet would have been (a difference) or that the dotted
crotchet in bar 2 of the excerpt is three times as long as the quaver that follows
(a ratio). Other perspects yet, such as timbre, bear values that are irreducible
to solitary coefficients (see Risset and Wessel 1999), whose interperspective
relationships are therefore typically complex too (though cf. Slawson 1985).
For example, the performers of the passage in Figure 5.1 may regard the cor
anglais as having a ‘darker’ sound than the oboe. Given this diversity, it is
perhaps inevitable that, while for a given sound the relationships between
perspective values are bound together in a common phenomenological experience
(see Roskies 1999), all perspects have come to serve distinct functions in music,
as we shall see.
FIGURE 5.1 Oboe and cor anglais duet from the third movement of Vaughan Williams’ Fifth
Symphony
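The two forms of comparison described for duration, difference and ratio, can be made concrete in a brief sketch. It assumes that note-lengths are measured in crotchet beats and takes the oboe's first note to be a crotchet tied to a semiquaver, as the description above implies; the constant and function names are illustrative and do not belong to zygonic theory's own notation.

```python
from fractions import Fraction

# Note-lengths in crotchet beats (the unit is an assumption of this sketch).
CROTCHET = Fraction(1)
QUAVER = Fraction(1, 2)
SEMIQUAVER = Fraction(1, 4)
DOTTED_CROTCHET = CROTCHET + QUAVER


def duration_difference(a, b):
    """Interperspective value heard as a difference: how much longer a is than b."""
    return a - b


def duration_ratio(a, b):
    """Interperspective value heard as a ratio: how many times as long a is as b."""
    return a / b


# The oboe's first note, taken here to be a crotchet tied to a semiquaver.
first_note = CROTCHET + SEMIQUAVER

# A semiquaver longer than a crotchet would have been (a difference).
print(duration_difference(first_note, CROTCHET) == SEMIQUAVER)  # True

# The dotted crotchet is three times as long as the quaver (a ratio).
print(duration_ratio(DOTTED_CROTCHET, QUAVER))  # 3
```

Exact fractions are used rather than floating-point numbers so that tied and dotted values compare equal without rounding error.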
Interperspective relationships between perspective values—metaphorically,
those that are closest to the perceptual ‘surface’—are termed ‘primary’.
Relationships between primary relationships operate at a deeper level and are
said to be ‘secondary’. In some musical contexts, ‘tertiary’ relationships fulfil
an important cognitive function too (Ockelford 2002). In most circumstances
this represents the maximum level of cognitive abstraction that the perceived
relationships between musical objects attain (irrespective of style).
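The layering of primary, secondary and tertiary relationships can be sketched as repeated differencing. This is a deliberate simplification: it relates only successive values, whereas the theory allows relationships between any values, and it treats pitch as a count of semitones. The pitch list is hypothetical and is not taken from the Vaughan Williams excerpt.

```python
def relate(values):
    """Return the differences between successive values (one level of relationship)."""
    return [b - a for a, b in zip(values, values[1:])]


# Hypothetical descending line, as semitone numbers (an assumption of this sketch).
pitches = [74, 72, 70, 69]

primary = relate(pitches)     # relationships between perspective values
secondary = relate(primary)   # relationships between primary relationships
tertiary = relate(secondary)  # relationships between secondary relationships

print(primary)    # [-2, -2, -1]
print(secondary)  # [0, 1]
print(tertiary)   # [1]
```

A primary value of -2 corresponds to a descending major second. A secondary value of 0 records that two successive intervals are the same, and each level has one fewer relationship than the one beneath it.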
Interperspective relationships may be illustrated graphically using the letter I
with an arrow superimposed (see Figure 5.2). Where such relationships connect
perspective values that are extended in time—as pitch, loudness and timbre usu-
ally are—arrowheads are filled; relationships linking singular features such as
duration and onset time are open (see Ockelford 1999). To avoid ambiguity, the
perspect concerned is indicated through a superscript—a single letter such as
P for pitch, O for onset and T for timbre may suffice—and the level of the rela-
tionship is shown through an appropriate subscript. For example, in Figure 5.2,
the primary interperspective relationship of pitch indicates the descending
major second between the d2 and c2 in the oboe part; the relationship of timbre
between the oboe and the cor anglais conveys the ‘darker’ sound of the latter;
and the secondary relationship of onset reflects the fact that the cor anglais’s
opening motive is a beat shorter than the oboe’s.
FIGURE 5.2 Representation of primary interperspective relationships
Zygonic theory holds that the cognition of musical structure is a function of
a particular class of interperspective relationships, through which one value is
heard as deriving from (or, conversely, generating) another. This occurs when
one value is thought to imitate another or to be the model that another is heard
to replicate. This is because structure equates to organization, or control, and if
one value is deemed to imitate another, then, given that a range of values would
otherwise have been possible, the second will be constrained metaphorically by
the first. The interperspective relationships through which imitative order is
perceived are of a special type that I term ‘zygonic’2 (Ockelford 1991b: 140ff.).
A ‘primary zygonic relationship’ or ‘zygon’ may be represented as shown in
Figure 5.3, which illustrates values of pitch, duration and loudness derived
through imitation from the opening of the oboe’s phrase. Note the use of com-
plete arrowheads to indicate a relationship between values that are perceived
to be the same. ‘Imperfect’ zygonic relationships, in which the value generated
differs slightly from the one it imitates, are indicated using half arrowheads,
which, as we have seen in the case of non-zygonic interperspective relationships,
are indicative of change (Ockelford 2006: 87). Like half arrowheads,
complete arrowheads may be filled (in the case of values extended in time) or
open (in the case of singularities). An example in the domain of duration is
shown in Figure 5.3, where a crotchet tied to a semiquaver in the oboe part
is imitated a beat later by a crotchet tied to a triplet quaver in the cor anglais.
FIGURE 5.3 Primary and secondary zygonic relationships
Secondary zygons may be deemed to link primary interperspective relation-
ships, where one is thought to imitate another. In Figure 5.3, examples pertain-
ing to pitch and onset are shown. Once more, superscripts and subscripts are
used to indicate the perspective domain in which the relationship exists and its
level. Finally, tertiary zygons may connect secondary relationships in the mind
of the listener. These occur in the domain of perceived time, for instance, where
there is a regular accelerando or ritardando; see also Figure 5.8.
It is believed that zygonic relationships such as those depicted in Figure 5.3
offer a highly simplified representation of certain cognitive events that we may
reasonably suppose take place (typically nonconsciously) during meaningful
participation in musical activity—whether listening, performing or creating
music anew through improvisation or composition. Moreover, the single con-
cept of a zygon bequeaths a vast perceptual legacy, with many possible mani-
festations: potentially involving any perceived aspect of sound, existing over
different periods of perceived time, and operating within the same and between
different pieces, performances and hearings. Zygons may function in a num-
ber of ways: reactively, for example, in assessing the relationship between two
extant values, or proactively, in ideating a value as an orderly continuation from
one presented. They may operate between anticipated or remembered values,
or even those that are wholly imagined, only ever existing in the mind. (There
is, of course, no suggestion that the one concept is cognitively equivalent in all
these manifestations, only that it is logically so.) Even a short passage of music
comprises a large number of perspective values, potentially linked through a
vast network of relationships, whose effect would be perceptually overwhelm-
ing were it not for the fact that the mind seeks (and is able to find) groups of
relationships that give the impression of acting together in coordinated fash-
ion. This issue is considered at length elsewhere (for example, Ockelford 2005).
Extending zygonic theory to art and related forms of visual representation
In one of the first main expositions of zygonic theory (Ockelford 1999), I hinted
that the notion of the creation and cognition of structure through imitation
need not be limited to music (and so to the perceptual domains pertaining to
sound)—potentially having application to painting, sculpture and ballet, for
example. In relation to art, I observed that ‘pictures normally display an inner
coherence that ultimately derives from the repetition of one or more of its per-
ceived aspects, such as colour, size, shape or texture. An abstract drawing that
lacked duplication of any feature—that showed no evidence of symmetry, uniformity
or regular change—would be the visual equivalent of random bursts
of white noise’ (ibid.: 114).
Some of the wider implications of this idea are currently being worked
through as part of the ‘Sounds of Intent’ project, an investigation led by the
Applied Music Research Centre at the University of Roehampton into the
musical and wider artistic development of children with learning difficulties
(see, for example, Ockelford 2012a). In this chapter, in the context of diverse
forms of music notation, we explore how zygonic theory may have relevance to
the creation and cognition of shapes, perceived visually. As most music scores
are monotone—black images on a white, two-dimensional background—it is
on this scenario that we focus our attention.
Consider first the simplest possible image: the smallest apprehensible black
dot on an otherwise blank page (Pomerantz and Portillo 2011: 1336). This can
be completely defined in terms of its (relative) location (see Figure 5.4). Now
imagine a second dot, placed a short distance from the first (Figure 5.5). We
can model the perceived relationship between the two as an interperspective
relationship of location. Like intervals in the domain of pitch, this relationship
can be defined as a vector, as it has the two components ‘magnitude’ and ‘direc-
tion’, although here the latter is infinitely variable (and not just ‘up’ or ‘down’,
as is the case with pitch); see Figure 5.6.
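The two components of such a relationship can be made concrete with a minimal sketch. It assumes, purely for illustration, that the first dot lies at the origin of a Cartesian grid and the second at (3, 1), so that their difference matches the value shown in Figure 5.6; the function name and the convention of measuring direction anticlockwise from the horizontal are likewise assumptions rather than part of zygonic theory.

```python
import math

def location_relationship(dot_a, dot_b):
    """Return the vector from dot_a to dot_b with its magnitude and direction."""
    dx = dot_b[0] - dot_a[0]
    dy = dot_b[1] - dot_a[1]
    magnitude = math.hypot(dx, dy)
    # Direction in degrees, measured anticlockwise from the positive x-axis
    # (an assumed convention; any consistent reference direction would do).
    direction = math.degrees(math.atan2(dy, dx))
    return (dx, dy), magnitude, direction

# Hypothetical coordinates chosen so that the difference is (3, 1).
vector, magnitude, direction = location_relationship((0, 0), (3, 1))
```

Unlike a pitch interval, whose direction can only be ‘up’ or ‘down’, the direction here can take any angle, which is why two numbers are needed to specify it.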
FIGURE 5.4 The image of a small black dot
FIGURE 5.5 Two small black dots
FIGURE 5.6 A primary interperspective relationship of location, whose value is shown
using Cartesian coordinates
[Diagram: a primary (level 1) relationship of location, Loc, with the value (3,1)]
In music, the directionality of interperspective relationships tends to be
determined by the temporal order of the events to which they pertain: that is,
the value heard second is usually compared to that apprehended first, although
in retrospect, the relative salience of the values in the musical narrative may
reverse things (Cone 1987: 249ff.; Ockelford 1999: 80). But with static visual
images, how is directionality likely to be determined, given that the eye is free
to glance from either dot to the other at will? With images that convey semantic
information in sequential form (such as texts and scores, in which the continu-
ous temporal unfolding of a verbal or musical narrative is represented a chunk
at a time on the page), the direction in which time is represented and in which,
therefore, a viewer’s gaze typically shifts will be culturally determined (the com-
plex saccadic pattern of eye movements that underpin reading notwithstanding;
see, for example, Kowler 2011). In the West, this means that interperspective
relationships typically function principally from left to right, and, other things
being equal in the horizontal dimension, from top to bottom of the page.
As we saw above (Ockelford 1999: 114), it is my contention that the under-
lying principle of zygonic theory—that the imitation of a variable in sound
equates to musical structure (since one perspective or interperspective value is
perceived as being constrained by another)—can be extended to the visual arts.
In the case of monotone dots on a page, at least three are required, since two
dots with the same location are indistinguishable from each other.3
Exact imitation of magnitude and direction of an interperspective value of
location yields three dots in a straight line (see Figure 5.7). The second difference
in location can be considered to emulate the first through a secondary zygonic relationship.
Partial imitation is also possible if magnitude or direction alone is implicated.
Constant change in a series of interperspective differences of location may be
detected and regarded as imitation at the tertiary zygonic level (Figure 5.8).
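The distinction between these levels can be sketched numerically. The example below is an illustration under stated assumptions, not a formal statement of the theory: the dot coordinates are hypothetical, chosen so that the primary differences reproduce the values (3,1), (4.5,1.5) and (6.75,2.25) shown in Figures 5.7 and 5.8, the function names are invented, and all dots are assumed to be distinct.

```python
def primary(points):
    """Primary relationships: vector differences between successive dots."""
    return [(bx - ax, by - ay) for (ax, ay), (bx, by) in zip(points, points[1:])]

def secondary_ratios(vectors, tol=1e-9):
    """Ratio k between successive primary vectors w = k * v.

    Returns None for a pair whose direction changes. Assumes no vector is
    (0, 0), that is, that all dots are distinct.
    """
    ratios = []
    for (vx, vy), (wx, wy) in zip(vectors, vectors[1:]):
        if abs(vx * wy - vy * wx) > tol:  # the two vectors are not parallel
            ratios.append(None)
            continue
        # Divide by the larger component to avoid dividing by zero.
        ratios.append(wx / vx if abs(vx) >= abs(vy) else wy / vy)
    return ratios

# Exact imitation of magnitude and direction: three dots in a straight line.
line = [(0, 0), (3, 1), (6, 2)]
line_ratios = secondary_ratios(primary(line))            # [1.0]

# Constant change: each difference is 1.5 times the one before it.
expanding = [(0, 0), (3, 1), (7.5, 2.5), (14.25, 4.75)]
expanding_ratios = secondary_ratios(primary(expanding))  # [1.5, 1.5]
```

In the first case the two primary differences are identical, which corresponds to imitation at the secondary level; in the second it is the ratio between successive differences that repeats, which corresponds to imitation at the tertiary level.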
FIGURE 5.7 A secondary zygonic relationship of location reflects the fact that the difference in
location between dots B and C is deemed to exist in imitation of the difference between A and B.
[Diagram: two primary relationships of location, each with the value (3,1), linked by a secondary zygonic relationship]
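FIGURE 5.8 Imitation of location at the tertiary zygonic level
[Diagram: primary relationships of location with the values (3,1), (4.5,1.5) and (6.75,2.25); secondary relationships of ×1.5 between successive pairs; a tertiary zygonic relationship linking the two secondary relationships]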
FIGURE 5.9 The perceived orderliness inherent in a straight line modelled in zygonic terms
[Diagram: primary relationships of location of +3x (horizontal) and +y (vertical), each imitated through secondary zygonic relationships of location]
All images can ultimately be considered as being made up of dots like those
shown in Figures 5.4–5.8. One way of modelling the orderliness of a straight
line is to regard it as describing consistent change in the horizontal and vertical
dimensions, as shown in Figure 5.9. The filled arrowheads symbolize a theoreti-
cally infinite number of relationships that are the same. The connecting (dashed)
lines indicate relationships that work in parallel. A cluster of dots perceived as
forming a single gestalt may be considered to be related to others that are the same
or similar through the compound perspect ‘shape’. Where imitation is thought
to be present, the interperspective relationships between shapes may be deemed
to be zygonic (Figure 5.10). Finally, consider that humanly created images may
be regarded as imitating the visual qualities of objects in ‘real life’ (just as music
may incorporate environmental sounds such as bird-song; see Ockelford 2012b).
FIGURE 5.10 One shape deemed to exist in imitation of another
Analysing shape in music notation using zygonic theory
Having considered how, according to zygonic theory, structure may be created
and cognized in two-dimensional visual images, we now turn our attention to
the potential application of this thinking to music notation.
The principal function of notation is to serve as a link between composer
and performer (who may be the same person):
• To communicate musical ideas more or less precisely in visual form,
thereby enabling sight-reading, for example, and permitting varying
degrees of interpretation and improvisation, so the performer has a
role beyond mere replication;
• As an aide memoire in performance; and
• To enable instructions of how to perform a piece of music to be
stored permanently.
Observe that there is no logical demarcation between what can reasonably be
regarded as ‘scores’, comprising musical notation, and other visual stimuli,
created by composers such as Cornelius Cardew, which are intended to serve
as a stimulus for performers to determine more or less for themselves which
musical sounds to play. Accepting that there is a certain conceptual fuzziness
in its definition, our focus here is on the former category (scores), with the
acknowledgement that, within a given community of musicians who use notation,
there are conventions of performance that are not written down, but
to which players and singers are nonetheless expected to conform: practices
that are passed on, consciously or nonconsciously, from one generation of
performers to the next, including the use of rubato, dynamics and vibrato.
And there are other spheres of musical activity where the nature of the score
gives performers the licence to improvise within predetermined constraints
(as in the figured basses used in continuo playing, for example, or lead sheets
in jazz).
How is it, then, that sounds can be represented through visual images, and
what forms do such analogues take? Scores work on the principle that system-
atic relationships are possible between shape and sound: a shape is taken to have
a meaning such that, in the mind of the performer, the visual image dictates, to
a greater or lesser extent, which sound is to be made, and when. The zygonic
conjecture provides a theoretical framework for modelling how this process
works, and evinces three possible mechanisms through which the visual repre-
sentation of musical sounds can occur. As we shall see, these have a somewhat
convoluted relationship with the threefold Peircean typology of signs: icon,
index and symbol (Peirce [1867–71] 1984: 57; [1893–1913] 1998: 461).
The first form of representation is equivalent to Peirce’s notion of ‘icon’,
whereby a sign denotes its object by virtue of a shared quality. Zygonically
speaking, this is a ‘regular’ mapping, in which the relationship between per-
spective values in different domains involves a function that is not specific to
the perceptual domains concerned—rather, pertaining to a feature or quality
that is abstracted from the perceptual surface and is common to both. To see
this principle in action, consider that, as differences are domain-specific, so
single values of difference cannot be mapped systematically between domains
(see Figure 5.11).4 In contrast, ratios, being abstract, are not bound by the con-
text in which they occur. Hence they permit regular mapping between domains.
See, for example, Figure 5.12. Ratios may occur between differences that are
expressed through secondary relationships. In Figure 5.13 the regular inter-
domain relationships are at the tertiary level. In Peircean terms, the forms of
representation set out in Figures 5.12 and 5.13 are forms of icon that he called
‘diagrams’, ‘which represent the relations, mainly dyadic, or so regarded, of
the parts of one thing by analogous relations in their own parts’ (Peirce [1893–
1913] 1998: 274).
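The arithmetic behind this kind of regular mapping can be sketched using the values that appear in Figure 5.13. The sketch rests on two assumptions that should be read as illustrative only: that the size of a pitch interval is measured in equal-tempered semitones (so that a major second is roughly twice a minor second), and that vertical distance is measured in an arbitrary unit y. The variable and function names are hypothetical.

```python
import math

def semitones(f_low, f_high):
    """Size of the interval between two frequencies, in equal-tempered semitones."""
    return 12 * math.log2(f_high / f_low)

# Frequencies as labelled in Figure 5.13.
pitches_hz = [440, 494, 523]
intervals = [semitones(a, b) for a, b in zip(pitches_hz, pitches_hz[1:])]
pitch_ratio = intervals[1] / intervals[0]   # minor second relative to major second

# Vertical steps between three dots: +2y followed by +y (y is an arbitrary unit).
y = 1.0
steps = [2 * y, 1 * y]
location_ratio = steps[1] / steps[0]

# The differences themselves (semitones, units of y) are not comparable,
# but the two ratios are plain numbers and can be set side by side.
ratios_agree = abs(pitch_ratio - location_ratio) < 0.02
```

The point of the sketch is that a single difference in either domain carries its own units, whereas a ratio between two differences is dimensionless and can therefore be matched across domains.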
FIGURE 5.11 Single interperspective values of difference cannot be imitated between domains;
therefore, systematic mapping and iconic representation in Peircean terms are not possible.
[Diagram: two sound sources at 440 Hz and 494 Hz, perceived as a primary relationship of pitch of +M2; annotation: ‘What would imitation of pitch by location mean in these circumstances?’]
FIGURE 5.12 Domains whose perspective values are capable of conveying a sense of size can bear
cross-modal imitation of ratios at the secondary level and therefore have the capacity for iconic
representation.
[Diagram: two sound sources perceived as durations in the ratio ×2; two lengths in the ratio ×2; annotation: imitation of the ratio between two durations by the ratio between two lengths is possible]
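FIGURE 5.13 Iconic representation of pitch in terms of location through tertiary-level imitation
[Diagram: pitches of 440 Hz, 494 Hz and 523 Hz related by +M2 and +m2, with a ratio of ×0.5 between the two intervals; locations related by +2y and +y, with a ratio of ×0.5 between the two differences; tertiary zygonic imitation of ratio]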
In psychophysical terms, the notion that regular mapping between domains
requires cognitive abstraction at the level of secondary or tertiary relationships
(implying the appearance of at least two values in each domain) may
initially appear to run counter to Stanley Smith Stevens’ assertion that any
two stimuli can be equated through the procedure of cross-modality match-
ing, provided they have at least one aspect or attribute that varies in degree
(1975: 132). This would suggest that a single pitch, for example, could be
mapped consistently onto a given location (within a given two-dimensional
spatial universe).
But how could this be? We will explore Stevens’ contention by analysing
the work of psychologists Samuel Mudd (1963) and Michael Thorpe (2015)
which, at first blush, seems to support the notion that a value in one domain
may be felt to have some equivalence in its own right to a value in another.
Mudd investigated what he called the ‘natural potential’ of certain aspects of
sonic stimuli to evoke perceptual–motor responses by having participants place
a peg in a square matrix of holes in the position that they felt best represented
the frequency, intensity, duration or direction of a sound to which they were
exposed. In Thorpe’s procedure, which examined the connection between only
pitch (perceived frequency) and location, the peg was replaced with a dot on a
computer screen that could be manipulated using a mouse. Both experiments
produced similar results, suggesting that there is indeed a tendency to map pitch
systematically onto vertical and, to a lesser extent, horizontal axes (although
there was considerable variation between participants).
However, in both experiments, a reference pitch was linked to a predeter-
mined location that was presented before each judgement of distance was made.
In Mudd’s case, one of six different pitches then followed, three of which were
above the reference pitch and three below. In total, Mudd’s participants under-
took eighteen trials (three for each pitch). In Thorpe’s test, one of ten different
pitches was heard following the reference tone, with a total of ninety attempts
(nine for each pitch). Hence the visual mapping was to pitches that were heard
in a relative (intervallic) sense rather than just as absolute values. Even so,
this would imply that participants were attempting to equate cross-modal dif-
ferences rather than ratios, which zygonic theory suggests could not be done
consistently. Consider, though, that the experiments involved repeated presen-
tations of the reference tone in relation to a number of different pitch stimuli.
Therefore, following the first trial, listeners would be able to compare intervals
(and their visual correlates) through secondary interperspective relationships of
ratio. If this were the case, then one would expect participants’ initial attempts
to map a pitch onto a particular location to be far more variable than their
subsequent efforts, which could be gauged against previously heard intervals.
And, indeed, analysis of Thorpe’s (2015) ‘practice’ data (preliminary trials in
which participants familiarized themselves with the task) shows that this is pre-
cisely what occurred. In every case, participants’ first trials differed significantly
from the mean of subsequent mappings of the same pitch [F(9,333) = 10.30,
p < .001], between which no significant differences were found.
Moreover, it is conceivable that listeners were additionally gauging intervals
in relation to putative extremes of pitch (and, rather more precisely, of the
pegboard or computer screen). As Thorpe says in relation to his experiment:
the highest and lowest frequencies of presented comparison tones … were
octaves of the reference tone. An octave is the most recognisable of pitch
intervals and it is reasonable to assume that all participants recognised
these as the extreme values of the variable each time they heard them.
The remaining comparison tones could then be represented by objects
positioned in order of their perceived pitch height in between these ‘book
ends’ with equal spacing. (2015: 132–3)
This potentially explains Stevens’ assertion concerning the cross-modal mapping
of individual values. Any perspective value from a familiar and (perceptually)
finite domain can be gauged in relation to an imagined and limited
continuum of other values. Hence an isolated pitch can be deemed to be ‘high’,
for example, and mentally equated with a dot towards the top of a sheet of
paper: neither the pitch nor the dot can escape the perceptual legacy of relativity
born of previous experience that exists in the mind of every listener.
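The ‘book ends’ strategy that Thorpe describes can be expressed as a minimal sketch. It is not a reconstruction of his procedure: the frequencies are hypothetical (chosen only so that the lowest and highest lie an octave either side of a 440 Hz reference), the vertical scale from 0 to 1 is an assumption, and the function name is invented for the purpose.

```python
def bookend_positions(freqs_hz, bottom=0.0, top=1.0):
    """Place tones at equal spacing between two 'book ends', in order of pitch height."""
    ordered = sorted(freqs_hz)
    step = (top - bottom) / (len(ordered) - 1)
    return {f: bottom + i * step for i, f in enumerate(ordered)}

# Hypothetical comparison tones; 220 Hz and 880 Hz are octaves of a 440 Hz reference.
positions = bookend_positions([440, 880, 330, 220, 660])
```

Here each tone is located purely by its rank between the two extremes, so even a pitch heard on its own acquires a position only by reference to an imagined continuum of other values.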
An important consideration in terms of the zygonic theory is whether or not
cross-modal relationships such as these can be deemed to be imitative: that is,
whether, in a given set of circumstances, a sense of derivation can be deemed
to cross the modal divide (in the theoretical situations shown in Figures 5.12
and 5.13 it was assumed that imitation was present). A key factor in determin-
ing the presence and, potentially, the directionality of zygonic relationships is
the context in which the notation is used, and two contrasting scenarios are
explored here.
First, let us take the case of a composer intent on using a form of nota-
tion comparable to the form of representation of pitch shown in Figure 5.13
or, indeed, Mudd’s and Thorpe’s participants as they strive to complete the
experimental assignment they have been set. In either case, the task is to create
a visual analogue of a series of pitches heard or imagined. The pitches them-
selves could not be imitated, of course, nor the differences between them, but,
as we have seen, the ratios between these could be said to generate the propor-
tions between the distances separating the dots on the page (or their coun-
terparts on the pegboard or computer screen). Nonetheless, the pitches and
the marks on paper (or their equivalents) are necessary to reify the cognitively
more abstract ratios, and since the qualia are perceptually more salient than the
relationships between them—both to the eye and to the ear—it may seem to the
composer or research participant that the sense of derivation is vested in the
stimuli themselves; that is, each pitch seems to generate a dot.
Second, we consider the position of performers, reading a score such as
Stockhausen’s Sternklang (1971), in which representations of twenty-eight con-
stellations provide visual imagery for them to interpret (see Figure 5.14). Here,
the sense of derivation works the other way round, in that the relative positions of
dots on the page are used to determine the relationships between intervals and so,
ultimately, pitches. Two horizontal guide lines, each deemed to represent a pitch
from a given set based on the overtone series, provide the necessary points of reference
to enable every dot to stand for a further pitch in its own right (Stockhausen
1977: 28), since in the vertical dimension two interperspective differences exist
between each dot and the lines, between which a ratio can be gauged and transferred
cross-modally through a tertiary relationship. To the extent that this relationship
can be regarded as imitative, so it can be classed as zygonic.
FIGURE 5.14 Example of the derivation of pitch through imitation of a ratio between differences in
location from a constellation in Stockhausen’s Sternklang (1971) (image of der Adler © Stockhausen-Stiftung
für Musik, Kürten, Germany, 1971, http://www.karlheinzstockhausen.org; used with
permission)
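The transfer of a ratio from the vertical dimension to pitch can be illustrated with a simple sketch. It should be treated with caution: it assumes that pitch is measured in semitones on a numerical scale, that the two guide lines stand for pitches an octave apart, and that the ratio is transferred by straightforward linear interpolation. None of these details is taken from Stockhausen's score, and the function and parameter names are hypothetical.

```python
def pitch_from_dot(y_dot, y_lower, y_upper, pitch_lower, pitch_upper):
    """Transfer the ratio of a dot's distances from two guide lines to pitch.

    The dot's height is expressed as a fraction of the distance between the
    lines, and the same fraction is then applied to the pitch interval.
    """
    fraction = (y_dot - y_lower) / (y_upper - y_lower)
    return pitch_lower + fraction * (pitch_upper - pitch_lower)

# Hypothetical values: guide lines at heights 0 and 4, standing for pitches
# 60 and 72 on an assumed semitone scale; the dot sits at height 1.
pitch = pitch_from_dot(1.0, 0.0, 4.0, 60, 72)
```

A dot that lies a quarter of the way between the lines is thus assigned a pitch a quarter of the way through the interval; a performer working from a fixed pitch set would presumably choose the nearest available member.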
Consider that the constellations provide temporal performance information
too: according to Stockhausen ‘The points should be played/sung in a short
terse manner, corresponding rhythmically to their graphic layout’ (ibid.: 28).
Although the tempo is not prescribed, this implies tertiary imitation of onsets
as shown in Figure 5.14. And, finally, note that the size of dots in Stockhausen’s
score is linked to intensity: ‘There are points of 5 different thicknesses to which
5 degrees of loudness should correspond. The thickest point is played/sung at ff.
The smallest point corresponds to the respective tutti volume, and all other
points are relatively louder’ (ibid.: 29; emphasis in original). Because both size
and intensity can be compared through primary interperspective ratios, so
cross-modal relationships are possible at the secondary level. Again, these may
be deemed to be zygonic if the changes in musical dynamics are taken to exist
through imitation of their visual representations.
So much for ‘regular’ cross-modal relationships, which offer a coherent
way of linking visual images to musical sounds, and which may convey a
sense of derivation. There are other, ‘irregular’, possibilities too, which may
be ‘indirect’ or ‘arbitrary’. Examples of the former are to be found in the
charts that were part of some scores in the second half of the twentieth
century and beyond, which show wind players the required disposition of
their fingers in order to produce certain combinations of pitches simultaneously.
For instance, an image such as that shown in Figure 5.15 may appear
in a piece of oboe music as an instruction to the player to produce f2 and c3
together (Stone 1980: 194).
FIGURE 5.15 Indirect connection between graphic and sound (Photographic image © Felicity
Ockelford, 2014)
The relationship between shape and sound is indirect since it has two dis-
crete components. First, there is the connection between the chart and the
physical positioning of the player’s fingers on the instrument. In terms of
Peircean semiotics, the relationship between the two is iconic, since the loca-
tion of the fingers is represented by analogy. But there is a second, perhaps
less obvious, stage, in which the physicality of a player’s fingering serves as
an index for the sound that is subsequently (and consequently) made (Figure
5.15). Peirce describes an index as denoting something by virtue of an actual
connection between them, physically or causally, such as ‘the hand of a clock,
and the veering of a weathercock’ that can be observed or inferred ([1893–
1913] 1998: 274). Here there is a physical link between the position of the
fingers and the musical sound that is produced, and to the observer the former
can denote the latter.
With regard to zygonic theory, indirect relationships between notation and
musical sound like this one are only partly imitative—in the iconic element,
where, as far as the performer is concerned, fingering is copied from a graphical
representation, and vice versa in the case of the composer. Indices, in contrast,
are contingent or causal, rather than being generative through imitation.
‘Arbitrary’ cross-modal relationships in music notation work on the prin-
ciple of the Peircean ‘symbol’ (ibid.: 274), in which the connection between
a sign and the object it represents (here, a shape and a sound) is determined
purely by convention. That is to say, any sound can be symbolized by any shape
(Figure 5.16).
How are such relationships formed? According to Peirce (ibid.: 274), a sym-
bol only becomes so ‘in the fact that a habit, or acquired law, will cause replicas
of it to be interpreted as meaning [x, y or z]’. But whence does the habit or law
derive? And what is the mechanism for its replication? (Is it imitative?)
FIGURE 5.16 An arbitrary shape is given meaning by convention.

It seems reasonable to assume that all arbitrary musical symbols were
originally conceived by an individual or individuals wishing to find a way
of conveying instructions to performers effectively yet efficiently. The mean-
ings of the symbols would then have been replicated through written or oral
instruction—‘meta-symbols’. In practice, teachers quite often supplement
explanation with demonstration; indeed, in teaching children for whom spo-
ken language is a challenge, such as some of those on the autism spectrum,
demonstration may even replace explanation (Ockelford 2013). A pupil may
then practise realizing the association between shape and sound by replicat-
ing it in various musical contexts. Hence, in zygonic terms, imitation may
play a part in arbitrary musical symbols becoming known and embedded in
cognition (see Figure 5.17).
FIGURE 5.17 The meaning of an arbitrary shape learned through imitation
For some people, a fourth type of connection is possible between visual
images and sound, occurring through ‘synaesthesia’: a neurological phenomenon
in which the stimulation of one sense leads to involuntary experiences in
another (Baron-Cohen and Harrison 1996; Harrison 2001; Cytowic, Eagleman
and Nabokov 2011; see also Ward, Chapter 9 below). A number of well-known
musicians working in a variety of cultures, eras and genres, including Duke
Ellington, Billy Joel, Nikolai Rimsky-Korsakov and Olivier Messiaen, have
reported seeing certain colours in response to particular pitches, harmonies,
tonalities or timbres. However, synaesthesia is not confined to elite composers
and performers, with prevalence estimates among the general population
varying from around 1 in 20 to 1 in 2,000 (see, for example, Baron-Cohen et al.
1996; Sagiv and Ward 2006). Nor is the response to sound confined to colours.
Ockelford and Matawa (2009: 52), for example, report the case of a nine-year-
old boy, Joshua, who has retinopathy of prematurity and, as a consequence,
is registered blind, with no sight in his right eye and only a small amount of
peripheral vision in his left. Joshua, who has ‘absolute pitch’ (the rare capac-
ity to identify and reproduce pitches in the absence of a reference tone; see, for
example, Deutsch 2012), describes how minor keys produce the sensation of a
‘bluey-grey tunnel’, which he is ‘rushing down’, while major keys are perceived
as an ‘orangey-red room’ in which there are ‘darker, shallow holes on the floor’.
Individual notes conjure up powerful associations too. The note B♭ elicits the
image of a light blue room with large windows in the distance, for example,
whereas B♮ is simply green. This implies a cross-modal relationship of the type
in Figure 5.18.
FIGURE 5.18 Cross-modal relationship engendered by pitch-colour synaesthesia
[Diagram: an auditory correlate (494 Hz) and a visual correlate, linked by neural connections between the visual and auditory processing centres in the brain; the relationship is labelled Visual image/Auditory image]
Relationships like this are not imitative in nature, since they stem directly
from the idiosyncratic wiring of an individual’s neural circuitry; nor do they
readily fit within the threefold Peircean typology of signs. Once externalized,
however, imitation would of course be possible, and the colour, shape or other
image could function as a symbol in Peircean terms.
In anticipation of our discussion below concerning forms of music
notation that are accessible to touch readers, we now consider the extent
to which the principles of perceived structure in two-dimensional visual
images may transfer to the tactile figures used by blind people from the sec-
ond half of the twentieth century, when two main ways were developed to
convert lines and shapes into tactile form. The first, known as ‘thermoform-
ing’, constitutes a vacuum-moulding process using thin sheets of plastic,
which permit the shape and texture of small objects to be copied and stored
in a relatively easily manageable form. The second employs ‘swell-paper’,
which, when heated, produces raised lines and shapes in response to black
images (Edman 1992; Ockelford 1996a). Unfortunately, both approaches
are labour-intensive and time-consuming. However, refreshable haptic dis-
plays, akin to video screens for touch, are increasingly making analogues
of visual materials more readily accessible to blind people (Rastogi and
Pawluk 2013).
To the extent that the salient data from visual images can be detected when
they are reproduced in tactile form, so the kinds of relationships described and
illustrated above pertaining to dots, lines and shapes may be perceived in the
domain of touch (Révész 1950; Lechelt, Eliuk and Tanne 1976; Heller 1991;
Lederman and Klatzky 2009). It is my contention that these may be imitative
and therefore function zygonically. The perceptual and cognitive challenges of
assimilating information by touch alone should not be underestimated, however:
readers of tactile scores can perceive only what lies beneath their moving fingertips
at any given point in time, which makes distances and angles hard
to judge. Moreover, two-dimensional images larger than a square centimetre or
so have to be mentally reconstructed from series of sensations gleaned through
a painstaking process of digital scanning, making considerable demands on
memory. Nonetheless, complex musical information can be encoded and
decoded in tactile form, as we shall see.
To summarize: in this section, using zygonic theory, we have identified
four ways in which qualities of musical sounds and shapes (or their tactile
equivalents) may be related systematically in cognition. Such relationships
inform the design of musical scores and have enabled them to function in
various ways to represent sounds and to instruct performers how and when
to produce them. The fourfold taxonomy of sound–shape relationships and
its connections with Peirce’s tripartite classification of signs are shown in
Figure 5.19.
FIGURE 5.19 Taxonomy of the possible types of relationship between musical sounds and visual
images. [Diagram: sound–shape relationships divide into regular (iconic), irregular and
synaesthetic; irregular divides into indirect (indexical and iconic) and arbitrary (symbolic).]
From theory to analysis: six examples of sound–shape relationships in musical scores
In this section I explore how sound–shape relationships work in several types
of musical score using the theoretical assumptions set out above. I begin by
analysing a child’s ‘picture score’, produced in response to a rhythmic stimulus,
and intended to represent it. This is followed by a discussion of the tactile rep-
resentation of changes in pitch created by a young person who has had no sight
from birth. The semiotic properties of western staff-based notation are con-
sidered next, using a short passage of music that is subsequently transcribed
into braille and guitar chord symbols, for the purposes of comparative analy-
sis. Finally, a synaesthete’s visualization of a track from Jean-Michel Jarre’s
Oxygène is subjected to scrutiny using zygonic theory.
CHILDREN’S ‘PICTURE SCORES’
Figure 5.20 shows seven-year-old Jessica’s visual representation of a rhythm
played by one of her classmates, Henry, as observed by Jeanne Bamberger
(1995). Jessica and her friends had been set the task of putting down on paper
whatever they thought would help them remember the rhythm the following
day, or to help someone else to play it (Bamberger 2013: 10). And that is indeed
what happened: the next day, the children were able to reproduce the rhythm
assisted, at least in part, by their invented notation.
What cognitive processes can we assume are in play here? Bamberger herself
describes Jessica’s approach to rhythmic representation as ‘formal’, in that she
attempts to show the relative distances in time between the clapped events by
matching them with circles of two sizes (ibid.: 12). This interpretation implies
a consistent and coherent connection between inter-onset interval and diameter,
which, in terms of zygonic theory, equates to regular cross-modal mapping
(Figure 5.21). In Peirce’s nomenclature, the circles function iconically.
FIGURE 5.20 A child’s transcription and performance of a rhythm
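The regular mapping between inter-onset interval and circle diameter described above can be illustrated with a minimal sketch. The rhythm, the scaling factor and the function names are hypothetical choices made for this example only; they are not taken from Bamberger's data. The point is simply that a mapping counts as 'regular' when every interval is converted to a diameter by the same ratio.

```python
def inter_onset_intervals(onsets, final_interval):
    """Return the time gap after each event; the last gap is supplied by the caller."""
    gaps = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    return gaps + [final_interval]


def circle_diameters(intervals, mm_per_second=20.0):
    """Regular mapping: every interval is scaled by the same (assumed) factor."""
    return [interval * mm_per_second for interval in intervals]


def is_regular(intervals, diameters, tolerance=1e-9):
    """True if diameter / interval gives the same ratio for every event."""
    ratios = [d / i for d, i in zip(diameters, intervals)]
    return all(abs(r - ratios[0]) <= tolerance for r in ratios)


# Hypothetical clapped rhythm (onset times in seconds):
# long, long, short, short, long, long.
onsets = [0.0, 0.5, 1.0, 1.25, 1.5, 2.0]
intervals = inter_onset_intervals(onsets, final_interval=0.5)
diameters = circle_diameters(intervals)

print(diameters)                         # one diameter (in mm) per clapped event
print(is_regular(intervals, diameters))  # checks the consistency of the mapping
```

A score that drew one of the short events with a large circle would fail the `is_regular` check, which is one way of expressing the difference between regular and irregular mapping in the taxonomy.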
FIGURE 5.21 Regular cross-modal mapping between sound and score, and score and sound
BLIND CHILDREN’S TACTILE REPRESENTATION OF MUSICAL SOUNDS
Welch (1991), following Walker (1981, 1985, 1987), investigated the mental
images that congenitally blind children produced in response to auditory
stimuli by having them depict the variation in pitch of quasi-musical sounds
on a thin plastic membrane known as ‘German film’, on which it is possible
to produce raised lines using a stylus. As Walker had done before him, Welch
found that blind children systematically associated changes in pitch with the
vertical position on the page (Welch 1991: 220), in just the same way as their
sighted peers do. Whether this striking similarity of mental imagery,
irrespective of vision, resulted from a central cognitive processing mechanism that
works across a number of perceptual modalities, or was merely a consequence
of a common musical metalanguage used in the education of both blind and
sighted children (in which notes are said to be ‘high’ or ‘low’, for example),
remained a moot point. Evidence for the former view is to be found in a range
of psychological work: for example, that of Pratt (1930) and of Roffler and
Butler (1968), whose research (including, in the latter case, participants who
were blind) led them to conclude that every tone has an intrinsic spatial char-
acter, a finding supported by the investigation of Rusconi et al. (2006), which
also showed that the internal representation of pitch is spatial in nature; the
experiments of Mudd (1963) and Thorpe (2015), discussed above, in which a
common pattern of pitch-position mapping was found in most participants;
and the empirical enquiry by Küssner and Leech-Wilkinson (2014), who
found that the majority of their research participants (particularly trained
musicians) represented pitch with height when using a real-time drawing
paradigm. Other research in the field of ethnomusicology, however, points to
the fact that, in certain cultures, pitch is not conceived of as high and low,
but as ‘small’ and ‘large’ in Bali and Java, for example (Brinner 2008), ‘young’
and ‘old’ in the Amazonian basin (Seeger 2004), and ‘thick’ and ‘thin’ among
Farsi, Turkish and Zapotec speakers (Shayan, Ozturk and Sicoli 2011). This
provides support for the notion that pitch/space metaphors are not universal, but
language-based (see Zbikowski 2002: 66–76). However, the position is not
clear-cut: Walker’s comparative study (1987) involving children from indig-
enous Canadian ethnic groups including the Inuit, Haida, Secwepemc and
Tsimshian found that, overall, participants displayed a proclivity for asso-
ciating pitch with vertical placement (rather than pattern, shape or horizon-
tal length). This apparent contradiction may be explained by the finding of
Eitan and Timmers (2010), that diverse cross-domain mappings for pitch
exist latently in western participants in addition to the verticality meta-
phor—a conclusion supported by the work of Antovic (2009) with Serbian
and Romani children, and Dolscheid et al. (2013). Dolscheid’s research
team, based in Nijmegen, found that Dutch speakers’ tendency to describe
pitches as ‘hoog’ (high) and ‘laag’ (low) could be overridden through training,
whereby they could learn to conceive of pitch in the way that, as we noted
above, Farsi speakers do—as ‘naazok’ (thin) and ‘koloft’ (thick). Dolscheid
et al. took this to support the Whorfian hypothesis (Whorf [1956] 2012) that
language affects, in a fundamental way, the nature of perception and cogni-
tion. The impact of culture on perceptual salience is important too, as the
work of Athanasopoulos and Moran (2013) shows: here, a nonliterate Papua
New Guinean tribe, the BenaBena, produced iconic responses to short musi-
cal stimuli, which focused on hue and loudness rather than the variation in
pitch that proved to be most significant for western and Japanese musicians.
Clearly, the visual metaphors we intuitively use in conceptualizing music con-
stitute an area ripe for further cross-cultural psychological studies.
A typical example of the responses given by Welch’s (western-encultur-
ated) participants is shown in Figure 5.22. Here the stimulus, which had been
produced by Walker for his experiments (see 1987: 495), comprised two pitch
glides, produced by a linear sweep in frequency from 571 Hz to 800 Hz over
a period of two seconds and its reversal, separated by a second’s silence. The
sounds had been generated using a Roland CS15 synthesizer and were as pure
as it was practicable to make them, with 95 per cent of the total spectral energy
lying at the fundamental frequency. Lines produced on German film are inevi-
tably somewhat jagged due to the way in which the stylus presses into the
plastic, though inspection of the children’s efforts as a whole suggests that the
lines in Figure 5.22 were intended to be completely straight. Zygonic analysis
indicates that there are potentially two forms of cross-modal imitation
functioning here at the tertiary level, whereby horizontal distance on the page
equates to perceived time, and height corresponds to pitch (see Figure 5.23).
Semiotically speaking, the child’s depiction of pitch is iconic.
FIGURE 5.22 A congenitally blind child’s representation of pitch glides on German film.
[Stimulus: pitch glides from 571 Hz to 800 Hz and back to 571 Hz, plotted against time in
seconds (0 to 5). Response (after Welch 1991): raised lines produced with a stylus on German film.]
FIGURE 5.23 Cross-modal imitation at the tertiary level assumed to underlie the representation of a
pitch glide as a straight diagonal line
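The stimulus and the two mappings just described can be sketched in a few lines of code. The frequencies and timings follow the description above (a linear sweep from 571 Hz to 800 Hz over two seconds, a second of silence, then the reversal). The page dimensions, the function names and the assumption that height on the page is proportional to frequency in hertz are illustrative choices for this sketch, not details reported by Welch or Walker.

```python
START_HZ = 571.0
END_HZ = 800.0
GLIDE_S = 2.0     # duration of each glide in seconds
SILENCE_S = 1.0   # gap between the rising and falling glide
TOTAL_S = 2 * GLIDE_S + SILENCE_S


def stimulus_frequency(t):
    """Frequency in Hz at time t (seconds), or None when nothing sounds."""
    if 0.0 <= t <= GLIDE_S:
        # Rising glide: frequency changes linearly with time.
        return START_HZ + (END_HZ - START_HZ) * t / GLIDE_S
    fall_start = GLIDE_S + SILENCE_S
    if fall_start <= t <= fall_start + GLIDE_S:
        # Falling glide: the same sweep reversed.
        return END_HZ - (END_HZ - START_HZ) * (t - fall_start) / GLIDE_S
    return None


def to_page(t, hz, width_mm=200.0, height_mm=100.0):
    """Map time to horizontal distance and pitch to height (assumed linear in Hz)."""
    x = width_mm * t / TOTAL_S
    y = height_mm * (hz - START_HZ) / (END_HZ - START_HZ)
    return x, y


# Sample the rising glide every half second and place each point on the page.
for step in range(5):
    t = step * 0.5
    print(t, stimulus_frequency(t), to_page(t, stimulus_frequency(t)))
```

Because both mappings are linear under these assumptions, equal steps in time and frequency produce equal steps across and up the page, which is why the idealized response is a straight diagonal line.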
STAFF NOTATION
Conventional western music notation, using staves of five lines, clefs, time sig-
natures and a range of arbitrary signs to indicate the duration of notes, rests,
articulation, phrasing, dynamics and other aspects of performance, presents,
in Peircean terms, an intriguing mix of iconic and symbolic representation.
Pitch height is mapped approximately in the vertical dimension and time in
the horizontal, forming a visual framework in which the symbols for notes and
instructions pertaining to them are placed. The use of this scheme over
several centuries and its near universal take-up in musically literate communities
across the world in modern times is testament to its effectiveness (the potential
impact of western cultural hegemony notwithstanding), though becoming a
fluent reader demands a commitment of hundreds, if not thousands, of hours.
See Figure 5.24.
FIGURE 5.24 Western staff notation embeds arbitrary symbols within a semi-regular framework of
pitch and time
BRAILLE MUSIC NOTATION
Until the twentieth century, blind musicians learned new pieces by ear or by
having someone read out the information contained in a score—approaches
that continue to this day in many cultures and in aural traditions. However,
having to rely on others to render musical information that is freely available to
their sighted peers represents an unwelcome loss of autonomy for blind peo-
ple wishing to access music in notated form. It was Louis Braille himself—the
inventor of the literary braille code—who devised the first workable system
of reading and writing music in tactile form. His initial ideas were published
in 1829 (Lorimer 1996), though it took almost a century for braille music to
become established as the main method of making staff notation available for
blind people (Kersten 1997). A braille transcription of the excerpt from Figure
5.24 is shown in Figure 5.25.
Braille comprises cells of six potential dots, yielding 2^6, or sixty-four,
possible combinations (including the blank cell). These are arranged from left
to right in horizontal lines in the same way as print. Cells are read, one at a
time, as the index finger traces over them. Reading music through braille is not
entirely equivalent to using a print score, however, for a number of reasons.
First, the way that cells are laid out on the page means that they have to be read
in series: the two-dimensional nature of printed music scores, over which the
eye scans horizontally and vertically, is necessarily compressed into single lines.
Second, the limit of sixty-four dot-and-blank combinations makes context-
dependent meanings inevitable, with all the information required to define a
single note often necessarily being conveyed in more than one cell. Together,
these characteristics mean that the iconic elements of print notation—the
portrayal of pitch and time in the vertical and horizontal dimensions of the
page through imperfect tertiary zygonic imitation (see again Figure 5.24)—are
absent, and the visual representation of the ‘gist’ of the music is missing. All
that remains are arbitrary signs that function symbolically in Peircean terms;
this is one reason music in braille is more difficult to learn and use than its print
equivalent (Ockelford 1991a, 1996b).
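The figure of sixty-four combinations can be checked by enumerating every six-dot cell. The sketch below does so and, for display, converts each cell to the corresponding character in the Unicode Braille Patterns block, where dots 1 to 6 occupy the six lowest bits above U+2800. The function names are invented for this example, and nothing here encodes the meanings of braille music signs.

```python
from itertools import product


def all_cells():
    """Every six-dot cell as a tuple of six booleans (dot 1 first, dot 6 last)."""
    return list(product((False, True), repeat=6))


def to_unicode(cell):
    """Return the Unicode braille character for a six-dot cell."""
    bits = sum(1 << index for index, raised in enumerate(cell) if raised)
    return chr(0x2800 + bits)


cells = all_cells()
print(len(cells))                            # number of distinct cells
print("".join(to_unicode(c) for c in cells))  # all cells, blank cell included
```

With so small an inventory, the same cell has to carry different meanings in different contexts, which is the constraint the paragraph above identifies.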
FIGURE 5.25 Music Time in braille music notation (represented in print form), with explanations of the signs
GUITAR CHORD SYMBOLS
Matrices of potential dots, ostensibly similar to those used in braille, also char-
acterize guitar chord symbols. The way that these function as signs is very dif-
ferent, however. The four chords used in the excerpt shown in Figure 5.24 may
be visually communicated to guitarists as indicated in Figure 5.26. There are
three semiotic processes at work here. First, there is an iconic link between each
graphic and the positioning of the guitarist’s fingers (Figure 5.15). Second,
each hand position on the strings functions as an index for the chord that is
produced. Third, the distance between the ‘nut’ (represented by the thick black
horizontal line at the top of each graphic) and each of the points where the
fingers press on the strings (shown by the black ellipses) is analogous with the
imaginary interval between the pitch to which the string is tuned and the note
that actually sounds (see Figure 5.27).
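The third process, in which distance from the nut stands for the interval above the open string, can be made concrete in a short sketch. It assumes standard guitar tuning (E2, A2, D3, G3, B3, E4, given as MIDI note numbers) and the usual arrangement in which each fret raises the pitch by one semitone. The C major shape used here is a hypothetical illustration rather than one of the chords in Figure 5.26, and the function names are invented for the example.

```python
# Open-string pitches in standard tuning, low to high, as MIDI note numbers.
OPEN_STRINGS_MIDI = [40, 45, 50, 55, 59, 64]  # E2 A2 D3 G3 B3 E4
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def sounding_notes(frets):
    """Pitches produced by a chord shape.

    `frets` lists one entry per string, low to high: None means the string
    is not played, 0 means open, and n means stopped at fret n, which sounds
    n semitones above the open string.
    """
    return [
        open_pitch + fret
        for open_pitch, fret in zip(OPEN_STRINGS_MIDI, frets)
        if fret is not None
    ]


def note_name(midi_number):
    """Pitch-class name of a MIDI note number."""
    return NOTE_NAMES[midi_number % 12]


# Hypothetical example: a common open C major shape.
c_major_shape = [None, 3, 2, 0, 1, 0]
pitches = sounding_notes(c_major_shape)
print(pitches)
print([note_name(p) for p in pitches])
```

The fret number plays the role of the distance shown in the graphic, and the addition `open_pitch + fret` is the interval that the distance stands for.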
SCORE PRODUCED THROUGH SYNAESTHETIC RESPONSE TO MUSIC
My final example is of a score produced by a synaesthete, Jamie Roberts, in
response to Jarre’s Oxygène, track 4. The representation of the musical segment
32‒40 seconds from the beginning is shown in Figure 5.28. Originally, each of
the two motives shown largely filled an A4 sheet of paper (297 mm × 210 mm).
At the time of creating his depiction of the piece, Jamie was seventeen and
had had chronic fatigue syndrome for more than four years, which had greatly
hindered his education. He had never had formal tuition on an instrument,
though he had enjoyed his weekly class music sessions at the selective second-
ary school he had attended, and he had started to teach himself the keyboard.
Jamie reports that when he listens to music he visualizes it as a graph, seeing
lines in two-dimensional space that he conceives as converting the energy of
sound into the energy of sight. He finds it hard to imagine that other people do
not see the images that he does. As Figure 5.28 shows, Jamie’s representation
picks out the two most salient strands in the texture, the motoric ostinato and
the main melodic motive. His visualization captures the rhythmic pulses of the
music with jagged lines, whose contour appears not to be related to pitch. The
contrasts in timbre are shown in colours (see Figure 5.28). Because of
the repetition in the music, which is matched visually in Jamie’s rendition,
elements could be interpreted symbolically by listeners following the score; see
Figure 5.29. Hence, although we can presume that the source of the images was
different in neurological terms from that of the circles in the score produced by
the child in Bamberger’s study (see again Figure 5.20), their semiotic status is
the same.
FIGURE 5.26 The fingering for the opening four chords of Music Time presented using guitar chord
symbols
FIGURE 5.27 The three semiotic processes at work as a guitarist performs from a chord symbol
(photographic image © Felicity Ockelford, 2014)
FIGURE 5.28 Fragment of Jamie Roberts’ synaesthetically derived score of Jean-Michel Jarre’s
Oxygène, track 4
FIGURE 5.29 Types of semiosis functioning in a fragment of Jamie Roberts’ synaesthetic score of
Oxygène
Conclusion
This chapter explored the function of shape in music notation and set out a
model, using the principles of zygonic theory, that aimed to show how forms of
cross-domain mapping between musical sounds and visual images may logically
occur in cognition. Four types of relationship between the perceptual domains
pertaining to hearing and vision were identified: regular, irregular (which may
be indirect or arbitrary) and synaesthetic. The connections between this
thinking and Peirce’s threefold typology of signs (icon, index and symbol) were
investigated in the context of children’s picture scores, tactile representations
of pitch, staff notation, braille music, guitar chord symbols and a synaesthete’s
representation of track 4 of Oxygène.
In summary, the findings were as follows. First, the evidence from chil-
dren’s untutored representations of music suggests that sophisticated cross-
modal mapping between sound and shape, using tertiary interperspective
relationships (which incur three steps of abstraction from the perceptual
surface), occurs early and intuitively. This view is reinforced by blind chil-
dren’s representations of changing pitch, which are equally sophisticated and
exist in the absence of any visual model to guide them, although it seems
likely that, their absence of vision notwithstanding, such children may have
been influenced by the common musical metalanguage they share with their
sighted peers, with its implied conceptualization of qualities of pitch such
as ‘high’ and ‘low’. Despite these general similarities, though, psychological
studies have shown that the precise nature of cross-modal mapping may vary
from individual to individual and according to cultural convention. However,
the individual perceptual differences that exist do not appear to interfere with
the capacity of musicians to adopt one of the forms of standard notation
that have evolved which use (albeit approximately) the horizontal dimension
to represent time and vertical position on the page as an analogue of pitch.
While most of the symbols used in music notation are arbitrary products of
custom and practice, for relatively few people, certain cross-modal
mappings may be hard-wired through synaesthesia. These vary from one person
to another, although synaesthetically derived representations can be learned
and appreciated by others.
Beyond the thinking set out in this chapter, potential next steps include a
comprehensive exploration of different forms of symbolic visual representa-
tion of sound to verify the broad applicability of the model set out here or to
suggest certain modifications to it. More research is required into how cross-
modal mappings are learned and into auditory-visual synaesthesia. Finally, it
may be possible to use the model to generate new forms of audiovisual instal-
lation that have an intuitive cross-modal appeal.
References
Antovic, M., 2009: ‘Musical metaphors in Serbian and Romani children: an empirical
study’, Metaphor and Symbol 24/3: 184–202.
Bamberger, J., 1995: The Mind behind the Musical Ear: How Children Develop Musical
Intelligence (Cambridge, MA: Harvard University Press).
Bamberger, J., 2013: Discovering the Musical Mind: A View of Creativity as Learning
(Oxford: Oxford University Press).
Baron-Cohen, S. and J. Harrison, eds., 1996: Synaesthesia: Classic and Contemporary
Readings (Hoboken, NJ: Wiley-Blackwell).
Baron-Cohen, S., L. Burt, F. Smith-Laittan, J. Harrison and P. Bolton, 1996: ‘Synaesthesia:
prevalence and familiarity’, Perception 25/9: 1073–9.
Boulez, P., [1963] 1971: Boulez on Music Today, trans. S. Bradshaw and R. R. Bennett
(London: Faber and Faber).
Brinner, B., 2008: Music in Central Java: Experiencing Music, Expressing Culture (New York:
Chalmers, D., 1996: The Conscious Mind: In Search of a Fundamental Theory (New York:
Cone, E., 1987: ‘On derivation: syntax and rhetoric’, Music Analysis 6/3: 237–56.
Cross, I., 1998: ‘Music analysis and music perception’, Music Analysis 17/1: 3–20.
Cytowic, R., D. Eagleman and D. Nabokov, 2011: Wednesday Is Indigo Blue: Discovering
the Brain of Synesthesia (Cambridge, MA: MIT Press).
Deutsch, D., 2012: The Psychology of Music, 3rd edn (Waltham, MA: Academic Press).
Edman, P., 1992: Tactile Graphics (New York: AFB Press).
odiles: cross-domain mappings of auditory pitch in a musical context’, Cognition 114/
3: 405–22.
Fauconnier, G., [1985] 1994: Mental Spaces: Aspects of Meaning Construction in Natural
Language (Cambridge: Cambridge University Press).
Gjerdingen, R., 1999: ‘An experimental music theory?’, in N. Cook and M. Everist, eds.,
Rethinking Music (Oxford: Oxford University Press), pp. 161–70.
Harrison, J., 2001: Synaesthesia: The Strangest Thing (Oxford: Oxford University
Press).
Heller, M., 1991: ‘Haptic perception in blind people’, in M. Heller and W. Schiff, eds., The
Psychology of Touch (Hillsdale, NJ: Erlbaum), pp. 129–60.
Kanai, R. and N. Tsuchiya, 2012: ‘Qualia’, Current Biology 22/10: R392–6.
Kersten, F., 1997: ‘The history and development of braille music methodology’, The
Bulletin of Historical Research in Music Education 18/2: 106–25.
Kowler, E., 2011: ‘Eye movements: the past 25 years’, Vision Research 51/13: 1457–83.
Küssner, M. and D. Leech-Wilkinson, 2014: ‘Investigating the influence of musical training
on cross-modal correspondences and sensorimotor skills in a real-time drawing para-
digm’, Psychology of Music 42/3: 448–69.
Lakoff, G., 1987: Women, Fire, and Dangerous Things: What Categories Reveal about the
Mind (Chicago: University of Chicago Press).
Lechelt, E., J. Eliuk and G. Tanne, 1976: ‘Perceptual orientational asymmetries: a
comparison of visual and haptic space’, Perception and Psychophysics 20/6: 463–9.
Lederman, S. and R. Klatzky, 2009: ‘Haptic perception: a tutorial’, Attention, Perception,
& Psychophysics 71/7: 1439–59.
Lorimer, P., 1996: ‘A critical evaluation of the historical development of the tactile modes
of reading and an analysis and evaluation of researches carried out in endeavours to
make the Braille code easier to read and to write’ (PhD dissertation, University of
Birmingham).
Mudd, S., 1963: ‘Spatial stereotypes of four dimensions of pure tone’, Journal of
Experimental Psychology 66/4: 347–52.
Ockelford, A., 1991a: Music and Visually Impaired Children: Some Notes for the Guidance
of Teachers (London: Royal National Institute for the Blind).
Ockelford, A., 1991b: ‘The role of repetition in perceived musical structures’, in P. Howell,
R. West and I. Cross, eds., Representing Musical Structure (London: Academic Press),
pp. 129–60.
Ockelford, A., 1996a: Music Matters: Factors in the Music Education of Children and Young
People Who Are Visually Impaired (London: Royal National Institute of the Blind).
Ockelford, A., 1996b: Points of Contact: A Braille Approach to Alphabetic Music Notation
(London: Braille Authority of the United Kingdom).
Ockelford, A., 1999: The Cognition of Order in Music: A Metacognitive Study (London:
Roehampton Institute).
Ockelford, A., 2002: ‘The magical number two, plus or minus one: some limits on our
capacity for processing musical information’, Musicae Scientiae 6/2: 185–219.
Ockelford, A., 2005: Repetition in Music: Theoretical and Metatheoretical Perspectives
(Aldershot: Ashgate).
Ockelford, A., 2006: ‘Implication and expectation in music: a zygonic model’, Psychology
of Music 34/1: 81–142.
Ockelford, A., 2009: ‘Zygonic theory: introduction, scope, prospects’, Zeitschrift der
Gesellschaft für Musiktheorie 6/1: 91–172.
Ockelford, A., 2012a: Applied Musicology: Using Zygonic Theory to Inform Music
Psychology, Education and Therapy Research (New York: Oxford University Press).
Ockelford, A., 2012b: ‘What makes music “music”? Theoretical explanations using zygonic
theory’, in J.-L. Leroy, ed., Actualités des universaux musicaux (Topicality of Musical
Universals) (Paris: Editions des Archives Contemporaines), pp. 123–48.
Ockelford, A., 2013: Music, Language and Autism: Exceptional Strategies for Exceptional
Minds (London: Jessica Kingsley).
Ockelford, A. and C. Matawa, 2009: Focus on Music 2: Exploring the Musical Interests
and Abilities of Blind and Partially-Sighted Children with Retinopathy of Prematurity
(London: Institute of Education).
Peirce, C., [1867–71] 1984: Writings of Charles S. Peirce: A Chronological Edition, vol. 2
(Bloomington: Indiana University Press).
Peirce, C., [1893–1913] 1998: The Essential Peirce: Selected Philosophical Writings, vol. 2
(Bloomington: Indiana University Press).
Pomerantz, J. and M. Portillo, 2011: ‘Grouping and emergent features in vision: toward a
theory of basic Gestalts’, Journal of Experimental Psychology: Human Perception and
Performance 37/5: 1331–49.
Pratt, C., 1930: ‘The spatial character of high and low tones’, Journal of Experimental
Psychology 13/3: 278–85.
Rastogi, R. and D. Pawluk, 2013: ‘Dynamic tactile diagram simplification on refreshable
displays’, Assistive Technology 25/1: 31–8.
Révész, G., 1950: Psychology and Art of the Blind, trans. H. Wolff (London: Longmans,
Green).
Risset, J.-C. and D. Wessel, 1999: ‘Exploration of timbre by analysis and synthesis’, in
D. Deutsch, ed., The Psychology of Music, 2nd edn (New York: Academic Press),
pp. 113–69.
Roffler, S. and R. Butler, 1968: ‘Localization of tonal stimuli in the vertical plane’, The
Journal of the Acoustical Society of America 43/6: 1260–6.
Roskies, A., 1999: ‘The binding problem’, Neuron 24/1: 7–9.
Rusconi, E., B. Kwan, B. Giordano, C. Umilta and B. Butterworth, 2006: ‘Spatial represen-
tation of pitch height: the SMARC effect’, Cognition 99/2: 113–29.
Sagiv, N. and J. Ward, 2006: ‘Cross-modal interactions: lessons from synesthesia’, in S.
Martinez-Conde, S. Macknik, L. Martinez, J.-M. Alonso and P. Tse, eds., ‘Visual per-
ception—fundamentals of awareness: multi-sensory integration and high-order percep-
tion’, Progress in Brain Research 155: 263–75.
Seeger, A., 2004: Why Suyá Sing: A Musical Anthropology of the Amazonian People
(Champaign, IL: University of Illinois Press).
Shayan, S., O. Ozturk and M. Sicoli, 2011: ‘The thickness of pitch: crossmodal metaphors
in Farsi, Turkish, and Zapotec’, The Senses and Society 6/1: 96–105.
Slawson, W., 1985: Sound Color (Berkeley: University of California Press).
Stevens, S. S., 1975: Psychophysics: Introduction to Its Perceptual, Neural, and Social
Prospects (New Brunswick, NJ: Transaction).
Stockhausen, K., 1977: Sternklang: Park-Music für 5 Gruppen (Kürten: Stockhausen-Verlag).
Stone, K., 1980: Music Notation in the Twentieth Century: A Practical Guidebook
(New York: Norton).
Thorpe, M., 2015: ‘The cognition of pitch patterns and cross-modal spatial structure’ (PhD
dissertation, University of Roehampton).
Walker, R., 1981: ‘The presence of internalised images of musical sounds and their rel-
evance to music education’, Bulletin of the Council for Research in Music Education
66/67: 107–12.
Walker, R., 1985: ‘Mental imagery and musical concepts: some evidence from the congeni-
tally blind’, Bulletin of the Council for Research in Music Education 85: 229–38.
Walker, R., 1987: ‘The effects of culture, environment, age, and musical training on choices
of visual metaphors for sound’, Perception and Psychophysics 42/5: 491–502.
Welch, G., 1991: ‘Visual metaphors for sound: a study of mental imagery, language and
pitch perception in the congenitally blind’, Canadian Journal of Research in Music
Education 33 (Special ISME Research Edition): 215–22.
Whorf, B., [1956] 2012: Language, Thought, and Reality, 2nd edn, ed. J. Carroll, S. Levinson
and P. Lee (Cambridge, MA: MIT Press).
Zbikowski, L., 2002: Conceptualizing Music: Cognitive Structure, Theory, and Analysis
Reflection
Alice Eldridge, cellist and coder
Inside and outside shape
I see music as a very human means of creating, exploring and communicating
abstract ideas and emotions. I believe this is made possible through the capacity
of organized sound to recruit and coordinate dynamic patterns of interaction
across a network of diverse objects and processes distributed across the brains,
bodies and worldly objects of musicians and listeners. Reflecting my personal
practice as an improvising cellist and my academic interest in digital music, I
offer a particular account of some of the roles shape plays in framing and sup-
porting these processes in both acoustic and digital music-making. My own
experiences are accompanied by those of other improvisers1 to illustrate the
idea that shape provides a lingua franca to conceptualize and talk about rela-
tions between the otherwise divergent array of objects and activities tied up
in musical creation, performance and listening. In particular, I consider how
the role of shape differs during what I am calling ‘offline’ (learning, practis-
ing, composing) versus ‘online’ musicking (performance and improvisation in
particular).
Acoustic practice: integrating shapes in ears, fingers and eyes
Musicians’ daily practice is structured by patterns. On the one hand, we drill
scales into our fingers until they become automatic; on the other, we dream
up awkward bowing, breathing, fingering, timing or interval patterns to retain
focus and prevent ourselves going onto automatic pilot. By seeking alternative
representations which create more easily memorizable shapes, the deliberate
choice of patterns can support memory as well as help focus attention.
When practising and learning music I definitely think of shape. In some
situations it can make the learning easier. For example practising scales
for improvisation—practising them with some sort of rhythmical pattern.
That also makes it more interesting and in that way keeps the concentra-
tion better. (Julie Kjær, saxophonist)
These patterns we concoct for practice can be seen as mental images that allow
us to integrate representations in motor, visual and sonic processes—shapes
in our bodies, eyes and ears. Initially, integration of these mappings requires
conscious effort. As a child I learned to read music by first learning Curwen’s
solfège handsigns (see Beach 1914) and songs about colourful insect characters:
‘C is for Clarence Caterpillar, D is for Dora Dragon Fly’, etc., forging links
between note names and their position in physical, pitch and visual space. As
you learn an instrument, another set of mappings is established, from the notes
on the staff to the fingerings necessary to produce the designated pitches. These
visual–motor mappings rapidly take precedence, a phenomenon neatly illus-
trated by scordatura notation. Bach’s Cello Suite No. 5 is written for scordatura
tuning as shown in Figure R.12 (top right). When I see the top interval of the
chord in bar 2 of the Prelude, I ‘know’ and hear a minor third but quite happily
read and finger a perfect fourth, suggesting that I am not reading ‘B♭’ at all, but
‘first finger in half position on the A string’.
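The arithmetic behind this scordatura example can be sketched briefly. The sketch assumes that the cello's top string is tuned down a whole tone (from A to G) and that notes on that string are written as fingered, so that they sound a major second lower than notated. Pitches are given as MIDI note numbers; the lower note of the interval (F3) is a hypothetical choice made to produce a written perfect fourth, and the function names are invented for the example.

```python
SEMITONES_TOP_STRING_LOWERED = 2  # assumed retuning of the top string: A down to G
INTERVAL_NAMES = {3: "minor third", 5: "perfect fourth"}


def sounding_pitch(written_midi, on_top_string):
    """Convert a written pitch to the pitch heard under the assumed scordatura."""
    if on_top_string:
        return written_midi - SEMITONES_TOP_STRING_LOWERED
    return written_midi


# Written notes: B flat 3 on the retuned top string, F3 (hypothetical) on a lower string.
written_upper, written_lower = 58, 53
heard_upper = sounding_pitch(written_upper, on_top_string=True)
heard_lower = sounding_pitch(written_lower, on_top_string=False)

print(INTERVAL_NAMES[written_upper - written_lower])  # interval as read and fingered
print(INTERVAL_NAMES[heard_upper - heard_lower])      # interval as heard
```

The notation therefore specifies the finger position rather than the sounding pitch, which is the visual–motor mapping the paragraph describes.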
The integral role of muscle memory in musicianship is nothing new: research
into musical implications of motor theories of perception abounds (e.g.
Godøy 2010). Anecdotes from expert musicians deftly illustrate the point. Improviser
Steve Beresford, speaking before a recent concert of the London Improvisers’
Orchestra, remarked that despite having moved from trumpet to piano decades
ago, he still automatically thinks about melody lines in terms of trumpet fin-
gerings. The sight of musicians (saxophonists and keyboard players espe-
cially) air-fingering in gigs to work out the thrust of a solo they’re hearing is
not uncommon. These are not just ‘air instrument’ performances, mimicking
sound-producing gestures (Godøy, Haga and Jensenius 2006): these habits of
expert musicians suggest that actual or imagined activation of motor schemata
is an integral part of conscious musical comprehension. In expert
instrumentalists, it seems, sensory-motor contingencies are so developed that fingering
patterns and sound-producing gestures are not only automatic, but an integral
part of conscious musical cognition.
FIGURE R.12 Opening of the Prelude of Bach’s Cello Suite No. 5 in scordatura notation
Representations of pattern in music software languages
If acoustic instrumentalists offload music cognition onto shapes in their fingers
and instruments, a growing community of musician–programmers is develop-
ing software languages which provide a similar cognitive scaffold. Live Coders
embrace the unique potential of software as a dynamic instrument that can be
rewritten in real time, improvising with software algorithms on the fly. Writ
large in their manifesto is a commitment to work directly with algorithms as
thoughts and the performer’s mind as instrument.2
Some live coding languages explore 2D visual environments such as David
Griffiths’ Scheme bricks (2008), in which pieces of code, representing musical
methods and parameters, can be plugged into each other much like nested build-
ing blocks. Others combine text and graphic elements, such as Alex McLean’s
Texture (McLean and Wiggins 2011), where spatial locations of the elements
have syntactic relevance, affording the live composition of musical ideas in 2D
space, or Thor Magnusson’s ixi lang (2011a), which combines graphic and tex-
tual elements, giving space and style functional roles in improvisation. More
recently, McLean’s Tidal is presented as a ‘mini-language embedded in Haskell,
for the live coding of pattern’ (McLean 2014), moving away from explicit spa-
tial metaphor and offering terse and powerful expressions for the creation and
manipulation of generative music.
In these situations, the software language itself acts to externalize and sup-
port musical cognition, much like a musical score (Magnusson 2011b): each
in its own way actively foregrounds the representation and manipulation of
patterns. Just as acoustic instruments are recognizable by their timbral charac-
teristics, so music-software languages are evolving to afford particular musical
structures: ‘Connoisseurs report that they can identify certain musical environ-
ments, not only by how they sound, but also by which musical patterning or
form they afford’ (Magnusson 2011c: 1).
Inside shape in improvisation
Guitarist John Russell talks about free improvisation as the closest he gets
to ‘what music actually is’.3 I might go further and suggest that improvising
with others distils many of the joys of being human, capturing the best bits of
conversation, cooking, dancing and dressing up, meeting people and testing
and exchanging new ideas: instantaneously making something. At such times,
the deliberate planning, monitoring and manipulation of activities give way to
more intuitive processes.
In one improviser’s comments, the lack of conscious engagement is almost
the hallmark of ‘good’ improvising, active consideration of shape coming into
play only when musical spontaneity ebbs:
With some musicians I have a really good connection, the music flows
and I don’t tend to think a lot. I don’t think much about shape (and
sometimes not at all) but it feels like I am—and the other musicians are—
working with it on a more unconscious level. I close my eyes, using my
ears, feeling of the soundwaves in my body, the colours I get when my
eyes are closed and the mood it all puts me in to play and reflect on the
music. With other musicians it can be harder to get this flow and I tend to
start thinking more and also be more conscious about the shape and how
to do it. (Julie Kjær, saxophonist)
A similar distinction is seen through fMRI studies. Scans of professional pianists
showed a significant deactivation of the areas of the brain associated with
self-conscious monitoring while freely improvising compared to playing learned
music (Limb and Braun 2008). Other improvisers stress a fundamentally bodily
engagement in performance:
My involvement in an improvisation is kind of visceral. But in that sense
it has shape. I’m very much aware of gesture, but felt as a kind of move-
ment of energy, or line. A kind of fluidity that morphs according to what’s
going on around it. That’s probably as well as I can explain it. It’s a bit
like the feeling I get when I watch British sign language (which I under-
stand a little) or dance, or even things moving around me—that move-
ment translates into a body feeling that feels like music. I even find myself
making internal (or external if I’m not careful!) sounds when watching
things move. (Rachel Musson, saxophonist)
If shaped representations support the conscious planning, monitoring and
manipulation of musical processes in offline musicking, in improvisation
such distinct, deliberate musical strategies are less often in play. Metaphors
of shape become a means to articulate an otherwise quite ineffably integrated
experience:
When I listen I’m outside the shape looking at it. When I’m playing I’m
inside it, traveling, with no overall sense of its size or layout. I’ve worked
with African musicians who, when we’ve been working out arrangements,
use the phrase ‘you can come inside’ when it’s your turn to play. (Stephen
Hiscock, percussionist and composer)
References
Beach, C. B., ed., 1914: ‘Curwen, John’, The New Student’s Reference Work (Chicago:
Compton).
Godøy, R. I., 2010: ‘Gestural affordances of musical sound’, in R. I. Godøy and M.
Leman, eds., Musical Gestures: Sound, Movement, and Meaning (London: Routledge),
pp. 103–25.
Godøy, R. I., E. Haga and A. Jensenius, 2006: ‘Playing “air instruments”: mimicry of
sound-producing gestures by novices and experts’, in S. Gibet, N. Courty and J.-F.
Kamp, eds., Gesture in Human-Computer Interaction and Simulation: 6th International
Gesture Workshop, Lecture Notes in Artificial Intelligence 3881 (Berlin: Springer), pp.
256–67.
Griffiths, D., 2008: ‘Scheme bricks’. Software available at https://fo.am/scheme-bricks/
(accessed 9 April 2017).
Limb, C. J. and A. R. Braun, 2008: ‘Neural substrates of spontaneous musical perfor-
mance: an fMRI study of jazz improvisation’, PLoS One 3/2: e1679.
Magnusson, T., 2011a: ‘ixi lang: a SuperCollider parasite for live coding’, in Proceedings
of the International Computer Music Conference, 31 July–5 August 2011, University of
Huddersfield (conference document), pp. 503–6. Software available at
https://github.com/thormagnusson/ixilang (accessed 9 April 2017).
Magnusson, T., 2011b: ‘Algorithms as scores: coding live music’, Leonardo Music Journal
21: 19–23.
Magnusson, T., 2011c: ‘Confessions of a live coder’, in Proceedings of the International
Computer Music Conference, 31 July–5 August 2011, University of Huddersfield (confer-
ence document), pp. 609–16.
McLean, A., 2014: ‘Making programming languages to dance to: live coding with Tidal’, forth-
coming in Proceedings of the 2nd ACM SIGPLAN International Workshop on Functional
Art, Music, Modelling and Design, Gothenburg, Sweden, 1–3 September 2014. Software
available at https://github.com/tidalcycles/Tidal (accessed 9 April 2017).
McLean, A. and G. Wiggins, 2011: ‘Texture: visual notation for the live coding of pat-
tern’, in Proceedings of the International Computer Music Conference, 31 July–5 August
2011, University of Huddersfield (conference document), pp. 612–28. Software available
at https://github.com/yaxu/texture (accessed 9 April 2017).
6
The shape of musical improvisation

Milton Mermikides and Eugene Feygelson
Contemporary enquiries into the art of musical improvisation cross a range of dis-
ciplines from cultural studies, pedagogy, psychology, neuroscience, mathematical
and computer modelling, and quantitative and qualitative analyses to ethnogra-
phy. Even a succinct survey of the state of current research would fill a volume,
as is meticulously demonstrated in Berkowitz’s The Improvising Mind (2010) and,
in the context of jazz, Berliner’s seminal Thinking in Jazz (1994). This chapter
addresses the concept of shape in musical improvisation, and offers insights into
how it might be described, identified or created. Jazz pedagogy is used to pres-
ent key improvisational mechanisms and as a foundation on which a model of
improvisation is constructed. The concept of shape in improvisation is addressed
within this model. Taking examples from jazz repertoire, we offer an approach
to categorizing improvisational shape. Finally, these concepts are adopted in the
detailed analysis of a classical cadenza, demonstrating the wide applicability (and
stylistic neutrality) of this approach to improvisational analysis.
One must acknowledge the complex continua that exist between composi-
tion, performance and improvisation (Benson 2003), as well as the role of intu-
ition, spontaneity and ‘polished improvisation’ in composition. Nonetheless,
certain musical practices rely on—indeed, stylistically require—a significant
degree of spontaneity, or at any rate a purposeful avoidance of premeditation
or prescription. Performance-without-complete-blueprint—as might be found
in a jazz solo, a Hindustani raga, a classical cadenza or ‘free’ improvisation—
may itself involve significant preparation and hard-earned skill, but there
is a relatively higher number of musical choices made ‘in the moment’ than
are found in the typical performance of a score or memorized piece. In these
contexts, the audience and (until the last moment) the performer(s) cannot
be fully aware of what music will emerge. With this definition it might be
argued that there are no completely improvised musics and—if performed by
humans—no music that avoids spontaneity, or at least the possibility of small
variation, in performance. A fully notated score is never just that, and, even if
notes and relative pitch durations are prescribed meticulously, in the moment
of performance spontaneous expressive (or unavoidable) alterations of tempo,
dynamic, timbre and inflection will emerge. There is a spectrum of allowable
spontaneity in performances of all kinds, from subtle nuance to significant
invention (although never without some type of constraint). This chapter deals
with musical shape that might emerge at the inventive edge of the spectrum,
taking examples from the conventional jazz improvisation repertoire (solo-
ing or melodic interpretation within fixed structural and harmonic outlines)
and the improvised classical cadenza (a stylistically constrained exploration of
thematic materials). Furthermore the structural components that are addressed
are distinct from (or additional to) the superficial structural contexts in which
the improvisation might take place (such as the jazz standard lead sheet with
its given form and harmonic framework, or the rhythmic cycle, tala, of a
Hindustani raga). We are in essence considering the shape that might emerge
from spontaneously selected musical elements.
The formal shaping of an improvisation is addressed widely in improvi-
sational pedagogical literature, for example in Hal Crook’s How to Improvise
(1991) or Berkowitz’s survey of nineteenth-century improvisational treatises
(2010: 15–80). Conversely, research into (usually jazz) improvisation has aimed
to reveal musical strategies and structures—however implicit or intuitive—that
occur in improvisational repertoire. These include—citing just three of many
possible examples—Markov chains in Coltrane’s improvisations on Giant Steps
(Franz 1998), phrasing schemata in Charlie Parker’s blues soloing (Love 2012)
and patterns in the transformation of those musical features less amenable to
standard notational analysis (see for example Benadon’s microtiming analysis
of early jazz improvisation in Time Warps in Early Jazz, 2009).
Despite a foundation of jazz practice, this chapter takes an abstracted model
of improvisation (proposed by Mermikides 2010, which shares important fea-
tures of Pressing’s 1988 model) through which improvisational shapes might
be identified. In the model presented here, a musical object is seen as exist-
ing at a point in multidimensional musical space (M-Space) and possessing an
array of properties available for modification. Improvisation is represented as
the artful motion through this space, and characteristics of this motion may
form larger-scale musical structures. This view of improvisation offers practical
applications for performance (and composition) in a range of styles, as well as a
framework within which to analyse and appreciate the repertoire and practice.
Chains of thought
One conceptualization of, and approach to, improvisation is as a musical
object (such as a phrase, a melody or an idea) undergoing appreciable
transformations over time (see for example Damian and Feist 2001: 12–20;
Crook 1995: 8–31; and Berliner 1994: 146–69). More specifically, this object
might be perceived as containing a set of musical properties, with each property
open to modification in future phrases. New material is thus created by alter-
ing a selected parameter, or set of parameters, from previous objects. In other
words, improvisation involves the construction of a musical train of thought
where every subsequent object relates to a preceding one in terms of a changing
set of variable parameters. This concept is best fleshed out with a real-world
example. In a section of John Coltrane’s solo in Acknowledgement (Coltrane
1965: 2:06–2:32), to pick one of countless examples from the repertoire, a
simple motive (C, E♭ and F typically in a quaver, quaver, crotchet—or more
generally short, short, long—rhythm) is manipulated in terms of chromatic
transposition, metric placement and rhythmic subdivision (more specifically,
the duration between note onsets, also known as the inter-onset interval or IOI).
Even a first listen to the extract will allow a clear identification of the central
role of this musical ‘seed’ in the ensuing improvisation, with these three pri-
mary degrees of transformation. Occasionally, objects merge—for example, the
last note of one phrase becomes the first of the next—but on the whole this
passage provides an idealized, clear (and readily notated) example of this con-
cept of improvisation: the expressive variation of an established musical object.
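The three transformations named above can be sketched in code. This is a minimal illustration, not the authors' method: the `Phrase` class, the MIDI pitch encoding, the 4/4 bar length and the function names are all assumptions introduced here, and the chain shown is invented rather than transcribed from the recording.

```python
from dataclasses import dataclass, replace
from fractions import Fraction

BAR = Fraction(4)  # assumed 4/4 bar, measured in crotchet beats


@dataclass(frozen=True)
class Phrase:
    pitches: tuple   # MIDI note numbers (assumed encoding)
    onset: Fraction  # metric placement of the first note within the bar
    iois: tuple      # inter-onset intervals between successive notes, in beats


def transpose(phrase, semitones):
    """Chromatic transposition: shift every pitch by the same interval."""
    return replace(phrase, pitches=tuple(p + semitones for p in phrase.pitches))


def displace(phrase, beats):
    """Metric placement: move the starting point, wrapping around the bar."""
    return replace(phrase, onset=(phrase.onset + beats) % BAR)


def rescale(phrase, factor):
    """Rhythmic subdivision: stretch or compress every inter-onset interval."""
    return replace(phrase, iois=tuple(i * factor for i in phrase.iois))


# Seed motive C, E flat, F with a quaver between successive onsets.
seed = Phrase((60, 63, 65), Fraction(0), (Fraction(1, 2), Fraction(1, 2)))

# A short, invented chain: each object is derived from the one before it.
chain = [seed]
chain.append(transpose(chain[-1], 2))
chain.append(displace(chain[-1], Fraction(1, 2)))
chain.append(rescale(chain[-1], 2))

for phrase in chain:
    print(phrase)
```

Because each function returns a new object and leaves its input untouched, the whole chain stays available for inspection, which mirrors the idea that every phrase relates back to a preceding one.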
Many improvisational passages contain patterns that are less immediately
recognizable than the linear chains-of-thought model that Coltrane provides in
the last example. Figure 6.1 shows how an opening phrase might be developed
through a number of complex transformational pathways.
In this analysis, phrases are generally identified as being related to a previ-
ously occurring phrase, and may themselves combine into larger phrases or
break off into smaller ones. They are labelled accordingly. The types of rela-
tionships between phrases are described by sets of transformational processes
in boxed text.
This improvisational methodology thereby involves the selecting of a par-
ticular subset of musical properties of a phrase, which is then either fixed or
modified by varying amounts in the subsequent phrase, to form a series of
interlinking chains. Even in this short extract, the themes of variation, transposition
and recombination (as identified by Berkowitz in the context of classical
improvisational pedagogy; 2010: 39–80), the creation of logical expectations
and their potential subversion (‘rational deception’; Bach [1753, 1762] 1949:
434) and analogies with syntax (Patel 2003) are revealed.
Some further points for consideration:
• There may be many valid analyses of an improvisation, and the
performer’s conception and the listener’s interpretation of the solo
may differ.
• A single phrase may also form the impetus for any number of
subsequent phrases, along any number of transformational processes.
FIGURE 6.1 An illustration of a complex chains-of-thought improvisation methodology (Mermikides 2010)
• Phrases may be hierarchical (e.g. B.1 contains phrases A.1.2, A.1.3
and A.1.4) and one of several reasonable analyses is presented here.
Small- or larger-scale phrase structures are all open to modification.
• Sufficient transformation may result in the formation of a new phrase
unit for further modification.
• This analysis purposefully omits the complex interactions between
performers in an ensemble, whereby musical material created by
one player may influence the improvisations of any number of
other players. The interaction between players follows a similar
methodology whereby musical material is shared within a common
ideas pool and modified between members of the ensemble (see
Sawyer 1992; Monson 1991, 1996; and Berliner 1994: 647–51
for theoretical and transcription analyses of ensemble motivic
interactions).
• A particular solo, performer’s identity or musical style may be
described not just by its melodic, harmonic and rhythmic vocabulary,
but also by the types of transformational processes and extent of
variations employed.
• Phrases G.1 and G.2 in Figure 6.1 provide a glimpse of how technology
may be employed as part of the improvisational process, and how it
may offer otherwise unavailable transformational dimensions.
• The precise demarcation of musical material into units for
transformation, which are called ‘phrases’ here, is a subjective exercise.
Furthermore, the definition of a phrase unit may change in relation
to the transformational process employed. For example, a set of
five notes might be considered a complete phrase for a sequencing
process, but a timbral modulation may also be applied through those
five notes, implying a smaller conceptual subdivision. It is tempting
(and sometimes useful) to describe musical fragments as cells, which
are combined into hierarchical phrases. But it soon becomes clear in
analysis that a cell’s autonomy is temporary, and that there are no
uniformly indivisible musical units; even a single note can be subject
to all manner of transformation and recombination, and parameters
such as timbre do not allow for such a convenient atomistic
perspective. As an illustration of the difficulty (perhaps futility) of
declaring a lower limit to the object, Curtis Roads’s Microsound (2004)
offers an exploration of ‘sound particles’—lasting less than 100 ms—
and their role in the creation of line, pulse and texture.
• The example in Figure 6.1 mainly presents a traditional improvisation
where phrases occur in a strict series. Contrapuntal mechanisms,
ensemble interactions and the smearing of a phrase (through
electronics) allow phrases to coexist and their relationships to be
parallel, rather than strictly linear.
• There is a differing amount of variation from one phrase to the next,
so a solo may be characterized by the extent of relatedness between
phrases. This concept of musical proximity underpins the idea of
improvisational shape discussed later in the chapter.
The chains-of-thought model of improvisation is employed here as a form
of retrospective analysis, but it may also be used in the context of real-time
improvisational choices. For example, Figure 6.2 shows how a phrase offers the
performer a set of options for the continuation of an improvisation based on
various transformational processes.
Since there is a nontrivial number of precise transformational processes (let
alone combinations thereof), Figure 6.2 shows but the briefest glimpse into the
refracting pathway of an improvisation, which hints at the wealth of musical pos-
sibilities available. Phrase-based improvisation may be seen as the artful carving
of a pathway through this mesh of musical possibilities. Indeed, a particular solo
or style of playing may be described in terms of which transformational pro-
cesses are selected. Acknowledgement favours chromatic transposition, rhythmic
displacement, and augmentation or diminution. Other improvisational passages
may use the manipulation of other musical parameters (such as diatonic transpo-
sition, inversion, retrograde, dynamics, timbre, phrase length, etc.) as expressive
mechanisms. Whatever the parameters employed, the premise of the chains-of-
thought improvisational model is that of a series of musical objects which gener-
ally hold some appreciable relationship to a previously established object or set
of objects. In the examples presented, these prior objects have emerged in the
course of the unfolding improvisation, but they might also include objects from
other ensemble members, melodic material (say, from the ‘head’ of a jazz stan-
dard), or elements from a stylistically or personally established, wider vocabulary
or set of schemata (as in Pressing’s concepts of ‘referents’; 1984: 346).
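The branching of possibilities described above can be made concrete with a small enumeration. The sketch below is an illustration only: it assumes a phrase can be reduced to a tuple of MIDI pitches and picks three of the processes the text names (chromatic transposition, inversion, retrograde); the `PROCESSES` table and the `pathways` helper are hypothetical names introduced here.

```python
from itertools import product


def transpose(phrase, semitones):
    """Shift every pitch by the same number of semitones."""
    return tuple(p + semitones for p in phrase)


def invert(phrase):
    """Mirror each pitch around the first note of the phrase."""
    return tuple(2 * phrase[0] - p for p in phrase)


def retrograde(phrase):
    """Play the pitches in reverse order."""
    return tuple(reversed(phrase))


# A deliberately tiny, assumed palette of transformational processes.
PROCESSES = {
    "chromatic +1": lambda p: transpose(p, 1),
    "inversion": invert,
    "retrograde": retrograde,
}


def pathways(seed, depth):
    """Yield every chain of `depth` transformations starting from `seed`."""
    for names in product(PROCESSES, repeat=depth):
        chain = [seed]
        for name in names:
            chain.append(PROCESSES[name](chain[-1]))
        yield names, chain


seed = (60, 63, 65)  # C, E flat, F
for names, chain in pathways(seed, 2):
    print(" -> ".join(names), chain[-1])
```

With only three processes the number of pathways is already 3 to the power of the depth, so a realistic palette grows far too quickly to enumerate. An actual solo traces a single path through this mesh.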
This chains-of-thought perspective runs the danger of narrowing the con-
cept of improvisation into exclusively linear and causal relationships between
objects, with little acknowledgement of the important role of spontaneous,
novel inspiration. However, as this model is developed through this chapter,
it is shown that these ‘pristine’ moments may in fact be accommodated into a
theoretical model with the introduction of the concepts of multidimensional
musical space, ensemble interaction, proximity, surprise and shape.
FIGURE 6.2 An illustration of musical refractions. In the course of an improvisation, a phrase is manipulated by the selection of one of
many transformational processes (1–8 present a few of countless possibilities). The resulting phrase is in turn open to further modifications.
Improvisation is seen as the realization of a pathway through the multitude of refracting musical possibilities (Mermikides 2010).
Limitation and variation of musical topics
Here the discussion turns to jazz improvisational pedagogy and how it might
relate to and inform the concept of shape in improvisation generally. The jazz
pedagogical material that started to emerge around the late 1980s from such
educator/practitioners as Hal Crook, Jerry Bergonzi and Mick Goodrick
encouraged a more direct engagement with improvisation beyond the typical
‘learn your scales, transcribe, and good luck’ approach. For example,
Goodrick’s and Crook’s material (Goodrick 1987; Crook 1991) set particular
challenges for the student, which might be described as ‘guided improvisation’
or ‘improvisation within limits’. Typical exercises include:
• Improvise through the tune using only chord-tones of the harmonic
progression
• Improvise a short phrase. Rest. Improvise a short phrase. Rest.
Improvise a long phrase. Rest and repeat.
• Improvise a phrase that starts with the concluding material of the
previous phrase. Rest. Repeat.
• Improvise a solo with a prescribed dynamic, registral or ‘excitement’
contour.
• Improvise a series of phrases with a particular intervallic structure,
adjusting accidentals to negotiate the harmony.
• Improvise using only a prescribed range or area of the guitar
fretboard, or the saxophone register, etc.
The point of such ‘limiting’ exercises is to force new ideas and avenues of
exploration, liberating the improviser from the overwhelming number of
possibilities of the ‘blank canvas’, focusing on specific musical parameters
that might otherwise be overlooked, and avoiding the habitualized patterns
that arise in the absence of guidelines. In Crook’s words, ‘There is no freedom
without structure’ (1991: 55). One might think of this type of approach as the
training of a particular type of skill: the independent and artful modification,
or maintenance, of coexisting musical parameters. This type of pedagogical
literature aims to develop proficiency (conscious and intuitive) in this area to
create authentically chosen material rather than pat phrases at the moment of
improvisation or clichéd responses to a sequence of chords. Improvisational
skill and strategy are thereby developed through the fixing—or variation—of
specific musical parameters.
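Two of the exercises listed above (chord tones only, and a prescribed register) can be read as filters on the pool of available notes. The following sketch illustrates that reading and is not drawn from the cited method books: the chord table, the register bounds and the `improvise` function are assumptions, and random choice stands in crudely for the player's actual decisions.

```python
import random

# Assumed chord vocabulary, given as pitch classes (0 = C).
CHORD_TONES = {
    "Cm7": {0, 3, 7, 10},
    "F7": {5, 9, 0, 3},
    "Bbmaj7": {10, 2, 5, 9},
}


def improvise(progression, notes_per_chord=4, low=55, high=79, rng=None):
    """Pick notes for each chord using only chord tones inside a fixed register."""
    rng = rng or random.Random()
    line = []
    for chord in progression:
        # The limitation: everything outside the chord and register is removed.
        allowed = [p for p in range(low, high + 1) if p % 12 in CHORD_TONES[chord]]
        # The variation: any choice within the remaining pool is permitted.
        line.append([rng.choice(allowed) for _ in range(notes_per_chord)])
    return line


solo = improvise(["Cm7", "F7", "Bbmaj7"], rng=random.Random(0))
for chord, notes in zip(["Cm7", "F7", "Bbmaj7"], solo):
    print(chord, notes)
```

The constraint is applied before any choice is made, so the parameters left open (here, order and octave) are the only place where invention can happen.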
Alongside jazz pedagogical material, further support for this ‘limit and
vary’ approach, this time from an academic theoretical standpoint, is offered
by Pressing’s (1988) ‘Improvisation: methods and models’, which offers a
model of improvisation as variegated attention paid to selected parameters and
transformational processes. Values of various musical parameters and types
of transformation of a phrase are defined. The amount of attention paid to
each is described with the currency of cognitive strength. This is unlikely to be
more than conjecture, and it is difficult to imagine how it could be measured.
Personal and anecdotal accounts and neuroscientific reports on improvising,
tentative as they may be, seem to suggest that at any particular moment the
creative improviser is thinking actively about one or two musical goals at most
(Werner 1996; Nachmanovitch 1990; Solstad 1991; Limb and Braun 2008).
Pressing’s model taken alone is not immediately stylistically relevant to jazz
improvisation research, nor authoritative for it. Nonetheless, by defining this
multilevel environment within which the improviser may navigate, Pressing pro-
vides a very powerful conceptual vocabulary. Despite the difficulty with the
proposed distribution of cognitive strengths, the independent defining of atten-
tion to, and values of, particular musical parameters and processes is illuminat-
ing. The staggering developments in music technology now allow the adoption
within practice of similar models for composition and real-time performance,
as well as computer-based generative improvisation systems (see for example
Bäckman and Dahlstedt 2008).
As a practical demonstration of these theoretical concepts we offer an exam-
ple of the process. Here, a simple three-note phrase (to the left of Figure 6.3)
is shown to afford various simple interpretations, which may then be used as
‘topics’ from which an improvisation might develop. Despite its simplicity, the
phrase in Figure 6.3 could be described in innumerable ways and levels of detail
including: (1) a phrase starting on beat ‘4 and’, (2) a three-note rhythmic pat-
tern with no particular rhythmic placement, (3) a melodic gesture, (4) a broken
chord implying part of a Cmin9 or E♭maj7 chord, (5) a phrase with a particular
harmonic altitude relative to the harmonic context or (6) a particular pattern
of timbral characteristics and envelope represented as amplitude over time.
FIGURE 6.3 Coexisting interpretations of Phrase α (Mermikides 2010)
In this way, the phrase can be conceived as possessing many sets and subsets
of properties, variably simple or complex, or, to adopt Pressing’s language, an
object existing in a particular point in multidimensional conceptual space. It
becomes clear how an improvisation might develop with this concept in mind.
Any number of the subsets of musical characteristics may be used as a reference
point for ensuing phrases. For example, the starting beat may be fixed for a new
phrase, the rhythmic pattern preserved with new notes or the melodic contour
maintained but transposed, and so on. Not only can the concept of isoryth-
mos (fixed rhythmic structure) be explored but also isomelos (fixed sequence
of melodic pitches; Persichetti 1961). In fact, one might coin a number of
words prefixed iso- and dis- to refer to the pertinent fixing or varying of various
musical parameters respectively, such as isotimbre (fixed timbre), isopaesi
(fixed intensity), isomodos (fixed scale implication), isokinetos (fixed gesture)
and isologos (a fixed concept or pattern applicable across multiple parameters).
From a jazz practitioner’s perspective, by limiting certain parameters one is
more able to explore ‘otherwhere’ (Pate, cited in Berliner 1994: 385). A language
naturally evolves from here to describe a musical relationship defined by
the significant variation of a particular parameter (displacement, distimbre,
etc.). Figure 6.4 offers some possible improvisations emerging from phrase α,
based on this ‘limit and vary’ approach.
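The iso- and dis- vocabulary amounts to comparing two phrases property by property. The sketch below shows one hedged way to do that. The pitches given for phrase α are hypothetical (the notated example is not reproduced here), and the property names, the dictionary layout and the `relate` function are assumptions made for illustration.

```python
def contour(pitches):
    """Direction of each melodic step: +1 up, -1 down, 0 repeated."""
    return tuple((b > a) - (b < a) for a, b in zip(pitches, pitches[1:]))


def intervals(pitches):
    """Exact size of each melodic step in semitones."""
    return tuple(b - a for a, b in zip(pitches, pitches[1:]))


def describe(phrase):
    """Collect several coexisting descriptions of the same phrase."""
    return {
        "rhythm": phrase["durations"],
        "pitches": phrase["pitches"],
        "contour": contour(phrase["pitches"]),
        "intervals": intervals(phrase["pitches"]),
        "placement": phrase["onset"],
    }


def relate(a, b):
    """Return (fixed, varied) property names between two phrases."""
    da, db = describe(a), describe(b)
    fixed = sorted(k for k in da if da[k] == db[k])
    varied = sorted(k for k in da if da[k] != db[k])
    return fixed, varied


# Hypothetical stand-in for phrase alpha: three notes starting on beat '4 and'.
alpha = {"pitches": (62, 67, 70), "durations": (0.5, 0.5, 1.0), "onset": 3.5}
# A continuation keeping interval structure and rhythm, moved in pitch and time.
reply = {"pitches": (63, 68, 71), "durations": (0.5, 0.5, 1.0), "onset": 1.0}

fixed, varied = relate(alpha, reply)
print("iso-:", fixed)
print("dis-:", varied)
```

A continuation that keeps the contour but alters the exact intervals would report `contour` as fixed and `intervals` as varied, which is the distinction between a general melodic shape and a strict intervallic structure.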
Each of these continuations can be seen as the limiting and varying of some
inherent sets of properties in phrase α. For example, phrase 1 takes the rhyth-
mic structure and general melodic shape of α as a constant, and uses diatonic
transposition, rhythmic displacement and microtiming as transformational
processes, whereas phrase 2 uses the strict intervallic structure of α and employs
chromatic transposition and rhythmic displacement to create an angular, dis-
sonant line. Phrases 3–6 employ the fixing and variation of sets of properties
including melodic structure, articulation, metric placement, dynamics and
scale implication. Phrase 7 might be considered an interruption where many
of the parameters are varied or abandoned, effectively restarting the continuation
process. Figure 6.5 illustrates how the limitation and variation of musical
parameters may forge new material in an improvisation.
The importance of considering a host of available topics for modifica-
tion is reflected in the ever-growing jazz pedagogical material of the last two
decades. Some pedagogical texts give an overview of many transformational
topics within one book (see Crook 1991, 1995; and Damian and Feist 2001 for
examples of improvisational meta-views), while other writers choose to cre-
ate a series of volumes addressing each topic separately, of which Bergonzi’s
Inside Improvisation series is a clear example. To date, Bergonzi has written
seven volumes, each focusing on a topic (Bergonzi 1992, 1994, 1996, 1998,
2000, 2002 and 2004). In fact, these can be seen as studies in the fixing of these
featured topics (e.g. a particular melodic cell) and thereby exploring deeply
other variables (permutation, harmonic altitude, segmentation, etc.).
FIGURE 6.4 Improvised continuations of Phrase α. Instances of Phrase α, and its close relations, are labelled with solid and dashed outlines respectively
(Mermikides 2010).
FIGURE 6.5 An illustration of how the fixing and variation of musical topics may forge
improvisational continuations from Phrase α (Mermikides 2010)
[Figure 6.5 panels pair fixed with varied topics, e.g. Fix: melodic shape, rhythmic structure / Vary: diatonic transposition, rhythmic placement, time-feel; Fix: intervallic structure / Vary: chromatic transposition, rhythmic placement.]
A contemporary bibliography of jazz pedagogical material has started to resemble
a library of chess books with stacks of general-principle texts alongside titles
dedicated to every conceivable opening, variation of opening and style of end
game. The study of these differentiated skills, in both chess and jazz improvi-
sation, aims to offer the player informed options and intuition at the moment
of performance.
This section has introduced, and given some examples of, the concept of
improvisation as a mutation of preceding phrases, through the variable fixing
and variation of various musical parameters. In this way, a phrase is modi-
fied according to various parameters, and wanders from its starting origin
while maintaining a comprehensible narrative for the listener. The next sec-
tion will look more deeply into this idea of trajectories through this wealth
of possibilities, and how this might enhance an understanding of shape in
improvisation.
Modelling and navigating musical space
The improviser, given a starting phrase, is presented with a range of choices
for continuation depending on the differential manipulation of a host of
musical subsets. Beyond the theoretical interest, this concept may be applied
pedagogically and in performance to guide improvisational practice. To return
to the extract from Coltrane’s Acknowledgement, a simple melodic fragment
is transformed in terms of three parameters: chromatic transposition, metric
placement and note separation. Although it might be possible to appreciate
these three values for each object in the context of standard notation, we can
observe them more directly in an alternative representation. Figure 6.6 shows a
three-dimensional space that represents all the possible variations of the phrase
in terms of the three specific parameters of chromatic transposition, metric
placement and rhythmic subdivision. Twenty numbered phrases in the solo are
plotted in reference to example phrases in the space, but the space represents
all possible permutations of values within those ranges. With the caveat that
important articulation, dynamic and timbral elements are set aside for now,
Coltrane’s solo may be seen as constituting a small subsection of this clearly
demarcated musical space.
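The idea of a bounded musical space with a solo occupying a small part of it can be sketched numerically. Everything specific below is an assumption made for illustration: the axis ranges, the quaver grid, the three subdivision values and above all the `observed` points, which are invented and are not a transcription of Coltrane's solo.

```python
from itertools import product

# Assumed axes of a three-dimensional musical space.
TRANSPOSITIONS = range(12)   # semitones above the seed motive
PLACEMENTS = range(8)        # quaver positions within a 4/4 bar
SUBDIVISIONS = (1, 2, 4)     # inter-onset interval in semiquavers

# Every possible variant of the phrase within those ranges.
space = set(product(TRANSPOSITIONS, PLACEMENTS, SUBDIVISIONS))

# Invented sequence of phrases, each given as a coordinate in the space.
observed = [(0, 0, 2), (0, 0, 2), (3, 0, 2), (3, 4, 2), (5, 4, 1)]

# How much of the space the sequence actually visits.
coverage = len(set(observed)) / len(space)

# How many parameters change between one phrase and the next.
changed = [sum(a != b for a, b in zip(p, q)) for p, q in zip(observed, observed[1:])]

print(f"{len(space)} possible phrases, {len(set(observed))} visited ({coverage:.1%})")
print("parameters changed per step:", changed)
```

Counting changed parameters treats every step along an axis as equally large, which is a deliberate simplification of how near or far two phrases actually sound.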
Incidentally, a parallel may be drawn here with the language of evolutionary
biology. Raup’s cube (cited in Dawkins 1997: 192) illustrates an analogous
three-dimensional space of possible shell types, where three genes contribut-
ing to the shape of a shell (spire, flare and verm) are laid out in three dimen-
sions. Every possible expression of these genes is laid out in multidimensional
space, and a subset of these that have been found in nature as a product of
natural selection can be indicated. Similarly, the phrases in Coltrane’s solo rep-
resent the ‘naturally occurring’ subset of all possible phrases within a defined
musical space.
Figure 6.6 uses a simply arranged set of transformed phrases: however, the
exact layout of phrases within that musical space is debatable. One could make
a good case that potential phrases existing along a particular dimension may
not always have an easily described continuum of proximity. A semiquaver dis-
placement may actually be a more radical mutation than a minim displacement;
an octave transposition is perhaps less extreme than a semitone, and a semitone
less extreme than a quarter-tone for that matter. This nonlinear nature of musi-
cal proximity is noted in Wishart’s On Sonic Art with reference to the relation-
ship between frequency and consonance (1996: 71–3).
This problem of defining proximity may be approached tentatively. For
example, phrase α may be imagined as existing at a particular point on an axis
of rhythmic placement within the concept of a cyclical bar. Because of the pat-
tern of strong and weak beats, a minim displacement is to be considered the
‘nearest’ displacement (despite it being the furthest in terms of beat placement).
FIGURE 6.6 Coltrane’s cube: some possible phrases of Coltrane’s Acknowledgement plotted in the three-dimensional musical space of metric placement, rhythmic
separation and chromatic transposition, with a few coordinates illustrated with standard notation (Mermikides 2010)
A crotchet displacement is more distant; the D for example would now fall on
beats 2 or 4, rather than 1 or 3, a more significant change in character. Any qua-
ver displacement alters the phrases yet more extremely, removing the upbeat
and interfering with any swinging of quavers that may be going on. Semiquaver
shifts, and yet finer rational subdivisions (if appreciable), alter the phrase still
more radically. To complicate matters further, the layout of these phrases is not
static: once a rhythmic displacement has been made, phrases are reordered in
terms of proximity. For example, if phrase α is displaced by a crotchet, its ‘near-
est’ neighbour is now a minim away. Incidentally, this representation can more
readily adopt micro-timing features which ‘fall between the cracks’ of standard
notation. With the concept of musical proximity in mind, more dimensions
may be added and a new musical space may be constructed for exploration
(Figure 6.7).
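The ordering of metric displacements described here (minim nearest, then crotchet, quaver and semiquaver) can be expressed as a small ranking function. The sketch below is one possible reading, under stated assumptions: a 4/4 bar, positions counted in whole semiquavers, and integer ranks 0 to 4 that are purely ordinal. The name `metric_distance` is hypothetical, and micro-timing between grid points is not modelled.

```python
BAR = 16  # semiquavers in one 4/4 bar (assumed metre)


def metric_distance(pos_a: int, pos_b: int) -> int:
    """Rank how 'far' two placements are, given positions in semiquavers.

    0 = identical placement, 1 = minim apart, 2 = crotchet,
    3 = quaver, 4 = semiquaver. The bar is treated as cyclical.
    """
    shift = (pos_b - pos_a) % BAR
    if shift == 0:
        return 0
    if shift % 8 == 0:
        return 1  # minim displacement: strong beats stay strong
    if shift % 4 == 0:
        return 2  # crotchet displacement: beats 1/3 swap with 2/4
    if shift % 2 == 0:
        return 3  # quaver displacement: on-beats become off-beats
    return 4      # semiquaver displacement


# Neighbours of the undisplaced phrase, nearest first.
print(sorted(range(BAR), key=lambda p: metric_distance(0, p)))
```

Because the rank depends only on the difference between two positions, the reordering noted in the text falls out automatically: a phrase already displaced by a crotchet (position 4) has its nearest neighbour a minim away (position 12).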
Orthogonal to this rhythmic placement axis, a note separation axis may
also be postulated, representing the progressive elongation and contraction of
phrase α, with wider note separation in one direction and shorter in the other.
This axis might be arranged with the emphatic top D used as a rhythmic anchor
about which the outer two notes are stretched or compressed. The individual
notes may compress until they form a chord and then extend beyond that point
to form a retrograde transformation of phrase α. In addition to rhythmic place-
ment and note separation, another axis may be added that represents all possi-
ble diatonic transpositions of a phrase (diatonic to the key of C Dorian), higher
in one direction and lower in the other. Chromatic transposition within a tonal
harmony creates a nonlinear pattern of musical distance. However, within a
modal setting, from which phrase α is derived, the hierarchical nature of scale
degrees is less clear. The subjective decision has therefore been made to arrange
the proximity in terms of diatonic transposition very simply, so that proxim-
ity in this dimension is equivalent to similarity of melodic register. Given the
definition of these parameters, variations of phrase α exist in three dimensions,
with potential mutations of the phrase existing side by side in conceptual space.
A sense of musical proximity within these constraints may also be perceived.
Now that the concept of proximity has been established, one might also
imagine additional transformational dimensions emerging from phrase α, such
as a chromatic transposition dimension, axes of various timbral characteristics
(including those achievable only through electronic manipulations), points of
symmetry, intonation, segmentations and so on. An impression of how a musi-
cal phrase exists in multiple simultaneous dimensions of transformation, here
termed M-Space, is shown in Figure 6.8.
The coexistence of these multiple dimensions is possible to conceive, but
difficult to illustrate precisely in one diagram. A conceptual model, whose
precise demarcation can be delegated to a computer, might serve better than
two-dimensional illustrations. Regardless, a logical visualization of a phrase
existing within a radiated sphere of closely related musical material may readily
FIGURE 6.7 Phrase α existing at the centre of a three-dimensional musical space with variously
proximate neighbouring phrases. Phrase α is indicated at the origin of the axes and the musical
distance between it and various close neighbours is shown. The boundary of the grey sphere describes
a boundary of equal proximity and contains phrases within this musical distance. The lower part
of the diagram shows an impression of Phrase α existing at a point within this musical space
(Mermikides 2010).
FIGURE 6.8 An impression of M-Space: phrase α sits at the centre of many simultaneous dimensions of musical transformation. Twelve of these are represented in four
three-dimensional subsets (some of which are continuous rather than discrete values) with some proximate phrases indicated. A phrase may move along any number
of such transformational axes during the course of improvisation. In the top right of the diagram, a phrase shows the result of a small move in all of these subsets
simultaneously (the modification is marked as a grey disc in each transformational subset) (Mermikides 2010).
be adopted and, as will be demonstrated, a concept of relative musical proxim-
ity is intuitively accessible, despite the mathematical complexity on which it
is constructed. In fact, many musicological terms such as repetition, motivic
development, sequencing, antiphony, call and response, ternary form, exposi-
tion, etc. rely on a shared intuition of musical similarity or difference (i.e. musi-
cal proximity).
As more axes of transformation are added, an idea emerges of a particular
musical object living at a particular point in a conceptual space of all its pos-
sible variations, from which the improviser may explore in any direction, or to
use saxophonist Evan Parker’s description of improvisation, ‘take a note for a
walk’ (cited in Borgo 2005: 36).
Thus, proximity is simply distance in M-Space. These fields, grouping
related musical objects together, are shown later as cloud-like structures, but
the actual shape is harder to grasp. Since we are conceiving this boundary as a
multidimensional object of a particular radius, its shape cannot be conceived
as a simple three-dimensional sphere. Multidimensional musical proximity
would imply that the exact repetition of a phrase, with a major alteration in
a single dimension, such as a substantial timbral modification, may be equally
proximate as a repetition of the phrase with many slight alterations. Electronic
dance music could be said to illustrate this idea clearly; a continuously repeated
phrase may be subjected to extraordinary timbral manipulations while retain-
ing an intelligible relationship to the original phrase. (For one of numerous
examples of this, listen to the extreme, and tolerated if not relished, timbral
manipulations of the bass ostinato in Phuture’s Acid Tracks from 1987.)
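The claim that one large alteration in a single dimension may be as proximate as many slight alterations can be checked numerically. The sketch below assumes a Euclidean measure of distance, which the text does not specify, and uses nine hypothetical axes with arbitrary units.

```python
import math

DIMENSIONS = 9  # hypothetical number of transformational axes

original = [0.0] * DIMENSIONS

# One large change on a single axis (say, timbre), all else repeated exactly.
one_big_change = [3.0] + [0.0] * (DIMENSIONS - 1)

# A slight change of one unit on every axis.
many_small_changes = [1.0] * DIMENSIONS

d_big = math.dist(original, one_big_change)
d_small = math.dist(original, many_small_changes)

# Both variants lie on the same 'shell' of equal proximity.
assert math.isclose(d_big, d_small)
print(d_big, d_small)
```

Under a different metric (for instance, summing the absolute changes) the two variants would no longer be equidistant, so the choice of measure is itself an analytical decision.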
If an improvisation is considered a series of jumps or flights in musical space
from one object to the next, this raises a question of the existence of an opti-
mal ‘musical’ distance or flight path. It seems that the skill-set of the proficient
improviser includes the ability to control musical proximity for expressive effect,
balancing predictable, similar objects with novelty. In The Sound of Surprise
(1959), Balliett’s posited ‘aural elixir’ is the result of perfectly selected surprises;
and, as Borgo notes, effective improvisation is not a random stumble through
musical space, nor is it always a dainty, careful and predictable movement
through it: ‘Randomness does not produce a sense of surprise, but rather confu-
sion, dismay, or disinterest. And small departures from an orderly progression,
if insufficiently interesting or dramatic, will pass without much notice. Surprises
are by definition unexpected, and yet those that most capture our interest or
delight have a feeling of sureness about them once experienced’ (Borgo 2005: 1).
Some approaches to finding this ideal middle ground between predictabil-
ity and randomness (a ‘rational deception’) are addressed later in this section.
Regardless, the concept of proximity allows an awareness of a rarely identified
expressive medium. Irrespective of the specific vocabulary, a series of phrases
with only subtle changes creates a different musical effect than does a series of
wildly disparate phrases. In other words, M-Space distance and velocity are, in
themselves, avenues of expression, as is acceleration, the rate at which prox-
imity between phrases alters.
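Distance, velocity and acceleration in this sense can be computed from any sequence of phrase coordinates. The following minimal sketch again assumes Euclidean distance and one 'time step' per phrase; the five points in `path` are invented for illustration.

```python
import math

# Invented coordinates for five successive phrases in a 3-axis M-Space.
path = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (4, 5, 0), (4, 5, 12)]

# 'Velocity': the distance covered by each move from phrase to phrase.
velocity = [math.dist(a, b) for a, b in zip(path, path[1:])]

# 'Acceleration': how much the size of the move changes between steps.
acceleration = [v2 - v1 for v1, v2 in zip(velocity, velocity[1:])]

print(velocity)
print(acceleration)
```

In this toy path the first two moves are small and equal, after which each jump is larger than the last: a series of subtle changes followed by increasingly disparate phrases.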
Rather than improvisation as a meandering drunken walk through this M-
Space, large-scale improvisational strategies are possible and occur often in the
hands of skilled practitioners. Listening to Jimmy Smith’s solo on The Sermon
(1958: 1:58–2:50) with the M-Space model in mind, it is easy to hear the
separation of phrase fields (Figure 6.9).
Even a first listen sorts these phrases into five main fields (A–E) with two
to five phrases in each (A1–3, B1–4, etc.). There are common features within each
group, which form a strong gravitational force between the phrases. This prox-
imity means that they can tolerate and indeed draw attention to any subtle
transformations, including editing of notes and detailed variations of
micro-timing and inflection. The creation of proximate phrases fixes groups of
musical dimensions and thereby frees up other musical dimensions for effec-
tive expression. This multilevel hierarchical structure of phrases and fields in
[Figure 6.9 labels: Field A (A1–A3), Field B (B1–B4), Field C (C1–C4), Field D (D1–D3), Field E (E1, E2)]
FIGURE 6.9 A multi-level depiction of Smith’s solo on The Sermon. Improvisation is seen as a
configuration of fields at varying distances and trajectories in M-Space, with each field containing a
constellation of phrases. Phrases, in turn, may be broken down into a nexus of smaller phrase units as
is shown in reference to E2. Phrase E2 has been placed closer to Fields A and C than B and D, to reflect
features of E2.1. Fields themselves are linked together in terms of timbral, registral and temporal
components and may coexist in a yet greater nexus of relationships with other performers or musical
objects (Mermikides 2010).
M-Space is illustrated in Figure 6.9. Fields A–E co-exist as part of the same
solo, but their relative proximity is also due to registral, timbral and temporal
considerations. Not shown in Figure 6.9 is the yet more complex interaction of
fields and phrases between other performers and musical objects.
In essence, this solo extract might be described as a ‘field series’ in that a
motive is explored in a number of objects (phrases) before jumping to another
area of M-Space for expressive exploration. Although Figure 6.9 is illustrated
in two dimensions, one must be reminded that the relative distances between
fields, and between their constituent phrases, are the cumulative result of their
relative positions in multidimensional space (through variations of many
coexisting musical parameters). Once the concept of M-Space structures and
their relative positions has been grasped, the listening and analytical process
becomes far clearer. From the straight-ahead to the most avant-garde contexts,
it becomes possible to classify improvisations in terms of which parameters are
fixed and varied, and what types of trajectories through M-Space occur.
Other improvisational structures (or strategies) might also exist. One can
hear, for example, in Wes Montgomery’s solo on No Blues (1965: 1:32–2:02),
one very narrow phrase field being used repeatedly as a pivot to other fields in a
‘call-and-response’ manner. A sharply defined F (in octaves) is used as a motive
central to various other phrases (which might belong to several identifiable
fields). This interjected figure is so clearly defined that the other ensemble mem-
bers are compelled to mark it with their own musical material. In other words,
the soloist’s M-Space structures have infiltrated those of the accompanists, as
should occur in any responsive ensemble environment. One might name this
type of improvisational structure a ‘pivot’, where one narrow area of M-Space
is used as a frequent and significant launching point for other satellite objects.
Alternatively, Coltrane’s Acknowledgement (discussed earlier) displays the
strategy of identifying a narrow field and then furtively exploring that space for
extended periods: one might refer to this improvisational shape as ‘nuclear’. Pat
Metheny’s approach on Unquity Road (1976) is less clearly delineated: phrase
fields exist, but the transitions between them are often blurred, and motion
through M-Space relatively slow, so that the result is a ‘merged’ improvisational
structure. Finally, an improvisation may be characterized by repeated distant
jumps in musical space, an ‘unbounded’ improvisational shape, so that there
is little appreciable relationship between successive musical objects (Sheffield
Phantoms (Bailey 1975) may—to many listeners—provide an example).
As analyses of improvised solos are revealed, it becomes possible to sort con-
stituent passages into these kinds of broad category. Note that these are grouped
in terms of the relationships between the phrases rather than according to the
vocabulary. A pictorial comparison of five improvisational strategies is pre-
sented in Figure 6.10—nuclear, field series, pivot, merged and unbounded—of
which Acknowledgement (Coltrane 1965), The Sermon (Smith 1958), No Blues
(Montgomery 1965), Unquity Road (Metheny 1976) and Sheffield Phantoms
FIGURE 6.10 Five improvisational structures: 1) ‘Nuclear’: phrases, with only occasional small anomalies, fall within one close field with only minor variances. 2) ‘Field Series’:
close phrases are played a few times with variances before repeating the process at a different point in M-Space. 3) ‘Pivot’: one particular narrow field is played often, acting as
a springboard to various satellite fields. 4) ‘Merged’: fields are merged by the use of a transitional phrase of otherwise distinct phrase fields. 5) ‘Unbounded’: a series of phrases
with little proximity of one phrase to any other (Mermikides 2010).
(Bailey 1975) are respective examples. The categorization of these strategies
involves some subjectivity (one listener’s nuclear may be another listener’s
unbounded improvisation) and there may be borderline cases, but the model
nonetheless offers a clear terminology and framework in which to analyse,
compare and contrast a range of improvisations regardless of style.
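As a rough illustration of how such a categorization might be operationalized, the sketch below guesses a structure from nothing more than the field label an analyst has assigned to each successive phrase. It is a toy heuristic, not a method proposed in the chapter: the function name `classify` and its rules are assumptions, it ignores the 'occasional small anomalies' that a nuclear structure tolerates, and it cannot detect a 'merged' structure, which depends on blurred transitions that labels alone do not capture.

```python
from itertools import groupby


def classify(fields: list[str]) -> str:
    """Guess a structure from the field label assigned to each phrase."""
    if not fields:
        raise ValueError("need at least one phrase")
    distinct = set(fields)
    if len(distinct) == 1:
        return "nuclear"        # every phrase stays in one field
    if len(distinct) == len(fields):
        return "unbounded"      # no field is ever revisited or dwelt in
    # Collapse consecutive repeats: AAABB -> A, B
    runs = [label for label, _ in groupby(fields)]
    if len(runs) == len(set(runs)):
        return "field series"   # each field explored once, then left
    for start in (0, 1):
        if len(set(runs[start::2])) == 1:
            return "pivot"      # one field recurs between all the others
    return "unclassified"       # e.g. 'merged', which labels cannot show


# Field labels following the phrase counts shown for The Sermon (A1-3 ... E1-2).
print(classify(list("AAABBBBCCCCDDDEE")))
```

Since the labelling itself is a listening judgement, the subjectivity noted above enters before any such rule is applied.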
Identifying improvisational strategies such as these informs a practical
approach to improvisational performance and ‘guided’ score instructions, as
well as creating a broad analytical foundation. Furthermore, an appreciation
of M-Space structures may act readily as a supporting mechanism to com-
positional practice and employment of electronics. The analyses so far have
focused on the transformation of material over linear time, one object moving
to another particular location in space. However, jazz (and classical practice, as
is demonstrated in the next section) also involves a transformation of synchro-
nous (albeit unheard) material. That is to say, the (often adventurous) interpre-
tation of a melody in a jazz ‘head’ involves the spontaneous transformation of
the (usually quite skeletal) melody. Rhythmic adjustments, interpolated phrases
and playful transpositions of the written (but rarely exactly rendered) melody
are all potential avenues of expression for the skilled performer—and informed
listener—and this can be seen as the playful dancing around the composed
musical object in musical space.
Through a pedagogical exploration of jazz improvisational method, this sec-
tion has crossed paths with an elaboration of Pressing’s event-cluster model
(1988). This meta-view of improvisation is made more powerful with the sup-
port of a practical stylistic understanding of the jazz idiom, which provides
an appreciation of time-feel, harmonic altitude and the extrapolation of the
concepts of proximity and velocity.
The view of improvisation as transformations in multidimensional musical
space is so broad that it connects the mechanics and pedagogy of jazz practice
with a diverse range of compositional and analytical research. These include
(among others):
• Xenakis and Kanach’s formal modelling (Xenakis and Kanach
2001) which, with its consideration of stochastic functions over
multiple—albeit discretely valued—musical parameters, has parallels
with the concept of proximity and improvisational strategies in
M-Space;
• Wishart’s ‘gestures’ in electronic music (1996)—a taxonomy of
continuous sonic modifications—which may be considered analogous
to expressive contours with respect to timbre;
• The multidimensionality of Pressing’s improvisation model (1998)
which is, as discussed earlier, related directly to M-Space;
• Moles and Schaeffer’s prescient graphical representations of l’objet
sonore (Holmes 2008: 45–48), conceived readily in respect to three
synchronous expressive contours, or three dimensions in M-Space;
• Methods within Schillinger’s compositional systems (1978) which may
be described as employing isokinetos, isorhythmos and isologos; and
• Dreyfus’ detailed studies (1996) of motivic transformation in the
music of J. S. Bach, readily adopted in terms of a chains-of-thought
methodology.
A cadenza in musical space
These same concepts of transformation and proximity in a musical space—
defined by formal expectations set up by Beethoven’s ordering of his mate-
rial—may be used to examine a written-out classical cadenza from the later
nineteenth century. Although for the most part making cadenzas was the pre-
rogative of performers, there are some exceptions. For example (focusing on
violin music), Mozart includes original cadenzas for his Sinfonia Concertante
for violin and viola K364/320d and the Concertone for two violins K190/
186e. Even though a practice of publishing ornamentation was established
well before (for example, in the northern European editions of Arcangelo
Corelli’s Op. 5 Sonatas for Violin and Basso Continuo, circa 1710), composing
and publishing cadenzas was uncommon until well into the romantic period.
From then on, cadenzas of highly regarded performers were printed (prob-
ably to sell to the growing market for amateur music-making), and these may
or may not have reflected the stylistic interests of the concerto’s composer.
At the time this was no more surprising than in modern jazz practice, where
improvisation on a standard such as Ain’t Misbehavin’ (1929) is neither predi-
cated on nor held back by the specific practice of its composers, Fats Waller
and Harry Brooks. While a sense of what distinguishes that particular era of
jazz from bebop or the avant-garde is clear to most jazz professionals and expe-
rienced listeners, when the standard is reworked in an improvisation performers
might, but need not, incorporate ‘appropriate’ or ‘historically accurate’ perfor-
mance practices.
A written cadenza is in and of itself problematic, even oxymoronic: experi-
enced improvisers in various genres will relate that over time their retention of
material within improvisation increases. Some report that they can eventually
write down all or most of their improvisation immediately after performance
(Nooshin 2003: 246). Some, contrastingly, report a type of short-term memory
loss, including the loss of the entirety of their improvisation just after perfor-
mance (Berkowitz 2010: 121–4). In either case, the modern idea that a writ-
ten text based on an improvisation is categorically different from a performed
improvisation is likely based on a dichotomization of improvisation and com-
position that makes little sense before modern times.
With these provisos, and using a closely related approach, we can now turn
to an extant cadenza for Beethoven’s Violin Concerto Op. 61 by the Belgian
violinist Hubert Léonard (1819–90) first published in the 1880s (Léonard
c. 1883). Léonard was an established violinist who had studied with Henri
Vieuxtemps and François Habeneck at the Paris Conservatoire. Beginning
his tours of Europe in 1844, he succeeded Charles Auguste de Bériot as prin-
cipal professor of violin at the Bruxelles Conservatoire in 1847, becoming the
tutor of celebrated violinists of the next generation including Martin Pierre
Marsick, Henry Schradieck and Henri Marteau (Stowell 1992: 65). Léonard
is possibly best known today as an inspiration for Gabriel Fauré’s Violin
Sonata. In 1875, on a long visit to Sainte-Adresse, near Le Havre, Léonard
advised the young Fauré on how to make his composition ‘more playable
and effective’ (Nectoux 2004: 23). Léonard, then, was well respected in his
field, not on a par with extraordinary or esoteric violinists such as Eugène
Ysaÿe (in the following generation) or Niccolò Paganini (of the previous
generation), and better remembered, in the long run, as a pedagogue. It was
in this capacity that he provided an unusual second-violin accompaniment
for the Beethoven concerto, exclusively for teaching purposes, reissued in a
1909 arrangement for violin and piano edited by his student Henri Marteau
(Beethoven 1909).
To fill out some of the context in which Léonard’s cadenza was fashioned,
we should note briefly the types of values embodied in classical-era cadenza
improvisation/composition as exemplified by Daniel Gottlob Türk ([1789]
1982: 301). Like the ‘limit and vary’ exercises illustrated above, Türk describes
some basic starting-points for cadenza creation, although his presentation is
rather more prescriptive than the examples offered from Goodrick (1987) and
Crook (1991).
• [T]he cadenza … should particularly reinforce the impression the
composition has made in a most lively way and present the most
important parts of the whole composition in the form of a brief
summary or in an extremely concise arrangement…
• The cadenza … must consist not so much of intentionally added
difficulties as of such thoughts which are scrupulously suited to the
main character of the composition…
• Cadenzas should not be too long…
• [M]odulations into other keys … either do not take place at all …
or they must be used with much insight … only in passing…
[O]riginally the harmony of the six-four chord and in any case the
triad that follows it were the basis of the cadenza, but in our time
these harmonic confines are probably too narrow. One can modulate;
only one should not remain in neighbouring keys so long that the
feeling for the main key is extinguished.
• Just as unity is required for a well-ordered whole, so also is variety
necessary…
• No thought should be often repeated in the same key or in another…
• Every dissonance which has been included … must be properly
resolved…
• A cadenza does not have to be erudite, but novelty, wit, an
abundance of ideas and the like are so much more its indispensable
requirements…
• The same tempo and metre should not be maintained throughout the
cadenza; its individual fragments … must be skillfully joined to one
another…
• A cadenza which perhaps has been learned by memory with great
effort or has been written out before should be performed as if it were
merely invented on the spur of the moment.
Keeping Türk’s guidelines in mind, together with the basic principles outlined
in our discussion of jazz improvisation, we can now look at the relationship
of Léonard’s cadenza to the text of Beethoven’s Op. 61 following a chains-of-
thought model. The cadenza is rich in the standard techniques of cadenzas of
the period, principally motivic and melodic development and transposition.
Yet, looking at his traversal of musical space, we see Léonard playing with the
structural relationships expected from Beethoven’s composition, creating musi-
cal leaps in terms of formal proximity, as illustrated below.
An analogy is to be found in the concept of transformational grammar, as
drawn on already above. As Chomsky (1988) theorized, the ‘transformational’
properties of language arise from the language user accessing a body of
knowledge (‘lexicon’, which is the linguistic analogue to Pressing’s 1984 ‘knowledge
base’) to generate novel sentences, or sequences of words or phrases. The cadenza,
similarly, is a kind of real-time interaction between the source material and the
improviser/composer’s musical abilities. To use terms from Pressing (1984), the
‘referent’ is clear: the thematic, motivic, rhythmic shapes or any musical element
from the written text of the concerto, with a clear emphasis on the main thematic
material. The ‘knowledge base’ is derived from the virtuosity of the performer/
composer and thus dictates the execution—or ‘generation’ in Chomskian terms—
of the cadenza, a novel set of phrases and sequences based on the referent.
Thus, Léonard created a novel reconception of Beethoven’s given material,
passing elements of the original text through the filter of his individual musi-
cal personality and creative interests. He uses generative methods of musical
creation common to the period—variation, transposition, recombination and
‘rational deception’—to reenvision Beethoven’s material (Berkowitz 2010). He
also inserts shapes—conceived here as motives or characteristic passages of
material in the process of change—that seem to come directly from his
knowledge base, perhaps inherited from his composition/improvisation education,
perhaps quoted from his own composed/improvised work, perhaps newly syn-
thesized with material from Beethoven, although it may be hard to discern an
exact origin. In some ways, perhaps this latter category is the most significant
outcome of the cadenza creation process: the fusion of the composer’s and
the performer’s musical identities through which shapes survive although their
obvious identifying features have changed.
Figure 6.11 represents the first section of Léonard’s cadenza, divided into
L1 and L2 according to their relationship to material from Beethoven’s original
text (the corresponding passages are B1 and B2). L1 illustrates the opening of
Léonard’s cadenza, which extends the first seventeen bars of Beethoven’s text
(B1) by one bar (circled). Structurally Léonard keeps the identical thematic
order, employing isologos (to continue with our earlier terminology). The types
of transformation in this section are harmonic, rhythmic and ornamental: the
first statement of the main theme (bars 1–9 of B1) is harmonized with two-,
three- and four-note chords, where the first half of the second section (bars
10–13 of B1) is extended by one bar (bar 14 of L1) and embellished by arpeggiated
flourishes in A major and Amaj7 (bars 11 and 13). The statement is thus intensi-
fied (e.g. the opening forte dynamic marks an example of dispaesi), departing
from the sweet character of the original melody but keeping the original line
(isomelos). The descending melodic figure in bars 14–15 of B1 is broken into
chords and rhythmically varied while the final two bars, 16–17, are repeated
nearly identically to Beethoven’s text with the minor addition of a ‘g’ pick-up
for supporting harmony (bars 17–18 of L1).
L2 shows us a slightly more adventurous departure from Beethoven’s
motives. A reinvention of bars 32–34 of B2 starts the passage, keeping the
rhythmic gestures of the original section (isorhythmos). However, melodically
the phrase peaks, not troughs (bar 22 of L2), as Léonard reconfigures expecta-
tions (examples of dismelos and diskinetos). Further, he borrows the rhythmic
detail of what follows in B2 (bars 35–41) creating a transitional harmonic sec-
tion that leads into a very unexpected revision of B3, departing from the related
material in B1 and B2.
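The iso/dis labels used in this analysis lend themselves to a simple mechanical comparison of two phrases. The sketch below is an illustration under stated assumptions: the readings of melos (exact pitch line), rhythmos (duration pattern) and kinetos (melodic contour) are inferred from how the terms are used in this passage, the function names are hypothetical, and the two three-note figures are invented rather than taken from the concerto or the cadenza.

```python
def contour(pitches):
    """Direction of each melodic step: +1 up, -1 down, 0 repeated."""
    return [(b > a) - (b < a) for a, b in zip(pitches, pitches[1:])]


def compare(source, variant):
    """Label a variant against its source, one label per property.

    Each phrase is a list of (midi_pitch, duration_in_quavers) pairs.
    """
    src_p, src_d = zip(*source)
    var_p, var_d = zip(*variant)
    return {
        "melos": "iso" if src_p == var_p else "dis",
        "rhythmos": "iso" if src_d == var_d else "dis",
        "kinetos": "iso" if contour(src_p) == contour(var_p) else "dis",
    }


# Invented three-note figures, not taken from the concerto: the source
# dips to a trough, the variant keeps the rhythm but rises to a peak.
source = [(74, 2), (71, 1), (74, 1)]
variant = [(74, 2), (78, 1), (74, 1)]
print(compare(source, variant))
```

A variant that keeps the rhythm while turning a trough into a peak is thus tagged iso in rhythmos and dis in both melos and kinetos, the combination described for L2.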
Illustrated in Figure 6.12, L3 begins with a rather unexpected harmonic and
ornamental reshaping of the melodic and harmonic material of B3, one of
the most ‘expressive’ moments of the concerto as reported widely by theorists
and embodied in performance across many recordings (Stowell 1998; Fabian
2006). This section acquires greater significance later in the cadenza. Léonard
reuses the highly recognizable opening suspensions, g2 to f♯2 and c3 to b♭2, in the
original harmonic progression (B3, bars 332–35), but transforms the texture
and setting entirely with supporting d2 ostinato trills. Here Léonard is extract-
ing part of a musical shape (isokinetos) but shifting the conceptual outcome
dramatically and in terms of original affect (dislogos). The trills are then used
to modulate further towards a section that is so highly generic as transitional
material (bars 37–42) that it is hard to call it Beethovenian, or even Léonardian.
FIGURE 6.11 Opening section of Léonard’s cadenza (L1/L2) and corresponding sections from Beethoven’s Violin Concerto Op. 61 (B1/B2). L1 is subdivided into two
phrases, with their respective transformations connected by lines in the figure. The motivic elements of B2 are also then connected to their reshaped versions in L2.
FIGURE 6.12 Second section of Léonard’s cadenza (L3/L4A/L4B) and corresponding sections from Beethoven’s Violin Concerto Op. 61 (B3/B4). A harmonic relationship is
indicated by showing the use of the C–B♭ and G–F♯ suspensions in B3 as a motivic shape in L3. The modulating sequence of L3 is then separated by a different box, lower
in the figure. The B4 melody is boxed in the middle of the figure, with corresponding melodic embellishments indicated above and below in L4A and L4B.
L4A brings a virtuosic reworking of the secondary theme (B4) that is highly
evocative of chromatic contrapuntal late eighteenth- or early nineteenth-cen-
tury virtuoso violin technique (embodied variously in the études of Pierre
Gavinies or the caprices of Niccolò Paganini). Fluctuating, like the original
section, from major to minor, this passage offers nothing inherently unusual.
What follows, however, is quite unexpected. As illustrated in Figure 6.13, L5
is an arpeggiated section that denies the expectation that the melodic material
in B4, bars 57–60, will be continued (dismelos). Instead, we hear a harmonic
progression with a pronounced bass melody line, indicated by accent marks in
the score. The melodic line seems familiar yet is difficult to recognize, especially
given the denial of the expected continuation of B4’s melodic material. This
is, in our opinion, the most profound moment of musical surprise or ‘rational
deception’, as theorized by C. P. E. Bach ([1753, 1762] 1949).
Possible sources for this passage can be found among Beethoven’s harmonic
suspensions in the orchestral parts: the inner string writing of bars 51–63
(Beethoven 1968, not depicted here); the bassoon and horn parts in relation to
the principal violin part, bars 304–29 (not depicted here); and, most convinc-
ingly, bars 351–57 (shown in Figure 6.13: B5), where we touch on the A♭–B♭
and C tonalities evident in Léonard’s cadenza. Note particularly the melodic
suspensions in the violin part and the almost Khachaturian-like semitone har-
monic modulations in bars 348–69.¹
Yet even this suggestion of derivation is dubious. More likely, Léonard’s
material is the generative outcome of synthesized information, the improviser/
composer mixing the referent and his own knowledge base so thoroughly that
the listener can no longer be sure about the origin of the material. Therefore,
instead of an etymological pursuit through analysis, it seems wiser to propose
that our auditory perception is rationally deceived, both through the harmonic
language and in supposing a specific source for the material: it is impossible
to tell whether we are listening to Beethoven or Léonard, and perhaps that is
the point. What matters is not the amount of influence that is present, nor the
success of Léonard’s transition—which might even be seen as sloppy, depend-
ing on one’s sense of compositional values. The important point is that we
have reached something mystical, inexplicable, the joining of two minds who
lacked the opportunity to meet in person but who meet—in a metaphysical
sense—through sound and transfer of text. The synthesis of these two streams
of consciousness, which crystallizes throughout the cadenza, is made particu-
larly evident in this transitional passage.
Returning to Figure 6.12, let us note that the section succeeding this transi-
tion, shown as L4B, revisits the melodic content of B4 in an even more robust
virtuosic texture, now in the original key areas of D major and D minor.
Léonard provides a resolute confirmation of our expectations in the continuation
of the melody previously denied (isomelos), embellishing the melodic content
in a homogeneous fashion through to the melodic material of bars 57–60
of B4, subsequently compressing it into a generic harmonic transition to the
next section of the cadenza.
FIGURE 6.13 Final section of Léonard’s cadenza (L5/L6A/L6B) and corresponding sections from Beethoven’s Violin Concerto Op. 61 (B5/B6). Both L5 and B5 indicate progressions that
embody pronounced melodic properties, using secondary dominant and common-tone modulations, semitone movement in the bass, and consistent fluctuations between major and
minor harmonies. The relationship is considered tenuous. The shapes material from B6 is indicated, the opening section corresponds to L6A, while the last sequence of notes in B6,
circling around ‘a’, corresponds to an extended retransition sequence in L6B.
Illustrated in Figure 6.13, L6A and L6B are extensions of the original opening
material of the principal violin’s entrance into the concerto, B6, which follows
the orchestral exposition. Rhythmically compressing Beethoven’s material,
Léonard quickens the material (disrhythmos), heightening the virtuosity further
and provoking performers to rhythmically play the descent and rise in L6A
as quickly or flexibly as they dare. The melodic, gestural and dynamic con-
tent remains the same (isomelos, isokinetos and isopaesi). Gradually, Léonard
brings us closer to the expected Amaj7 leading us to the D major orchestral
playout of the first movement, extending the dominant harmony with the con-
trapuntal texture of L6B.
Figure 6.14 provides a graphic representation of several relationships of
Léonard’s cadenza to Beethoven’s concerto. The x axis represents bars in the
Beethoven, illuminating the fact that Léonard borrows materials from different
sections of the concerto, not always in the original order. The y axis represents
a sense of musical proximity to the original material, where, for example, L1 is
closer and L3 further away in its transformation of Beethoven’s text. Rounded
boxes connecting respective ‘L’ and ‘B’ sections indicate specific types of transformations
employed by Léonard in each corresponding cadenza section.
Relative sectional length is indicated by the width of each square box. The
relationship of L5 and B5 is tenuous, as mentioned above: a vague relationship
(indicated by dotted lines in the diagram) at the greatest distance in terms of
musical proximity between cadenza and concerto material. Finally, if we follow
the arrows connecting the ‘L’ sections at the top of Figure 6.14, we can see
a pattern indicating one of the five improvisational strategies shown earlier in
Figure 6.10, a field series model, underlining the utility of the M-Space model
in musical analysis regardless of genre (Mermikides 2010).
[Figure 6.14 graphic. Legend: L1–6, sections of Léonard’s cadenza; B1–6, corresponding sections of Beethoven’s composition. Vertical axis: musical proximity to referenced material. Horizontal axis: section bar length and relative position of referenced Beethoven material, with sections ordered B1, B2, B4, B6, B3, B5 against bar numbers 1, 17, 28, 43, 64, 89, 96, 331, 357.]
FIGURE 6.14 Graphic representation of Léonard’s cadenza illustrating the relationship of musical
proximity to Beethoven’s original score. The closer the vertical distance between each corresponding
B and L section, the greater the musical proximity of corresponding material. Bar lengths for each
section correspond to horizontal width. A ‘field series’ organization is indicated by the arrows
connecting each L section, while transformations are briefly indicated in the rounded boxes
connecting each B and L section.
Conclusion
This chapter has approached the concept of shape in improvisation by way of
four conceptual stepping-stones, each developed in its own section.
1. Using jazz improvisation (and its pedagogy) as a reference point,
an improvisation might be seen as a chain (or series of chains)
whereby newly created musical objects hold an appreciable musical
relationship (a general motivic similarity) to a previously established
musical object (or set of objects). The use of the word ‘chain’
suggests a strict linear set of relationships, but this model allows an
intricate nonlinear set of connections, whereby relationships might be
apparent among a wide set of parameters, and relationships between
objects might ‘leap-frog’ over interpolated objects.
2. The links in this chain were more keenly examined, and it was
suggested (with a basis in jazz improvisational pedagogy) that they
can be usefully described in terms of which musical parameters are
pertinently fixed and which are varied. Terms like distimbral, dismelos
and isoplacement emerge quite naturally alongside well-established
concepts such as isorhythm and displacement. This ‘limit and vary’
concept of improvisation allows a way to identify improvisational
mechanisms beyond the ‘surface’ vocabulary employed, as well as
having direct practical applications.
3. Since a series of musical objects in an improvisation might be
identified as having varying degrees of similarity, one might
imagine a multidimensional space of musical proximity (M-Space)
surrounding any musical object with closer objects being more
recognizably similar and further objects being more musically distant
until they have little or no recognizable similarity. Since this musical
space occupies many musical dimensions, it follows that musical
distance may be manifested asymmetrically over various parameters,
and a moderate change over several parameters may feel as distant as
extreme changes over a few parameters. With the concept established
of a musical improvisation as a journey through musical space,
the concepts of velocity, acceleration and shape through musical
space can be applied. Five general (and subjective) improvisational
shapes were suggested (with real-world examples provided) which
might describe an improvisation or section of an improvisation.
These include (a) ‘nuclear’, an exploration of proximal musical
space; (b) ‘field series’, the serial exploration of several M-Space
areas (‘fields’); (c) ‘pivot’, an improvisation where one narrowly
defined area of M-Space is used as a frequent landing point
between other satellite areas; (d) ‘merged’, where there is a slow drift
through M-Space making clear delineation of fields difficult; and
(e) ‘unbounded’, an improvisation where there is little appreciable
relationship between objects.
Although these categories of shape relationships have emerged from improvisational
research, a similar approach might be used to describe musical structure
generally, including a range of compositional forms.
4. The concepts presented above (alongside historical, lexicological and
analytical frameworks) were employed in the detailed analysis of a
classical cadenza, where chains-of-thought, ‘limit and vary’ and
M-Space structure were all found at work in an example stylistically
distant from jazz. Finally, the cadenza discussion led
to an additional concept of shape as encapsulating and transmitting
common characteristics through a process of transformation in
which nothing easily identifiable survives intact and yet resultant
material remains perceptually related to its starting point.
These concepts of improvisational shape, while born of jazz pedagogy and
abstract mathematical modelling, may be useful across diverse styles, not only
as analytical tools and ways to appreciate the craft, but also, potentially, as
a mechanism to develop improvisational (and indeed compositional) practice.
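The claim in point 3 above, that a moderate change over several parameters may feel as distant as an extreme change over a few, can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the chapter does not specify a metric, so a weighted Euclidean distance is used here, and the parameter names, the 0-to-1 scaling and the function name m_space_distance are hypothetical rather than part of the M-Space model as published.

```python
import math

# Hypothetical parameter names; the chapter does not fix a list of dimensions.
PARAMETERS = ("melody", "rhythm", "dynamics", "timbre")


def m_space_distance(a, b, weights=None):
    """Weighted Euclidean distance between two musical objects.

    Each object is a dict mapping a parameter name to a value in [0, 1].
    The Euclidean metric is an illustrative assumption, not the author's.
    """
    weights = weights or {p: 1.0 for p in PARAMETERS}
    return math.sqrt(sum(weights[p] * (a[p] - b[p]) ** 2 for p in PARAMETERS))


# A reference object placed at the origin of this toy space.
origin = {p: 0.0 for p in PARAMETERS}

# A moderate change spread over all four parameters.
moderate = {p: 0.5 for p in PARAMETERS}

# An extreme change confined to a single parameter.
extreme = {"melody": 1.0, "rhythm": 0.0, "dynamics": 0.0, "timbre": 0.0}

d_moderate = m_space_distance(origin, moderate)
d_extreme = m_space_distance(origin, extreme)

# Under these assumptions the two objects lie equally far from the origin.
print(d_moderate, d_extreme)
```

Under this metric the two varied objects sit at the same distance from the reference, although they differ from it in quite different ways. Unequal weights would model the asymmetry across parameters that the text mentions.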
References
Bach, C. P. E., [1753, 1762] 1949: Essay on the True Art of Playing Keyboard Instruments,
trans. W. Mitchell (New York: Norton).
Bäckman, K. and P. Dahlstedt, 2008: ‘A generative representation for the evolution of jazz
solos’, Lecture Notes in Computer Science 4974: 371–80.
Balliett, W., 1959: The Sound of Surprise (New York: Dutton).
Beethoven, L. v, 1909: Violin Concerto in D major Op. 61 (Leipzig: Steingräber).
Beethoven, L. v, 1968: Violin Concerto in D major Op. 61 (New York: Kalmus).
Benadon, F., 2009: ‘Time warps in early jazz’, Music Theory Spectrum 31/1: 1–25.
Benson, B. E., 2003: The Improvisation of Musical Dialogue: A Phenomenology of Music
Bergonzi, J., 1992: Inside Improvisation, Vol. 1: Melodic Structures ([Rottenburg:] Advance
Music).
Bergonzi, J., 1994: Inside Improvisation, Vol. 2: Pentatonics (Rottenburg: Advance Music).
Bergonzi, J., 1996: Inside Improvisation, Vol. 3: Jazz Line (Rottenburg: Advance Music).
Bergonzi, J., 1998: Inside Improvisation, Vol. 4: Melodic Rhythms (Rottenburg: Advance
Music).
Bergonzi, J., 2000: Inside Improvisation, Vol. 5: Thesaurus of Intervallic Melodies
(Rottenburg: Advance Music).
Bergonzi, J., 2002: Inside Improvisation, Vol. 6: Developing a Jazz Language (Rottenburg:
Advance Music).
Bergonzi, J., 2004: Inside Improvisation, Vol. 7: Hexatonics (Rottenburg: Advance Music).
Berkowitz, A., 2010: The Improvising Mind: Cognition and Creativity in the Musical Moment
Berliner, P. F., 1994: Thinking in Jazz (Chicago: University of Chicago Press).
Borgo, D., 2005: Sync or Swarm: Improvising Music in a Complex Age (London and
New York: Continuum).
Chomsky, N., 1988: Aspects of the Theory of Syntax (Cambridge, MA: MIT Press).
Crook, H., 1991: How to Improvise: An Approach to Practicing Improvisation (Rottenburg:
Advance Music).
Crook, H., 1995: How to Comp: A Study in Jazz Accompaniment (Rottenburg: Advance
Music).
Damian, J. and J. Feist, eds., 2001: The Guitarist’s Guide to Composing and Improvising
(Boston: Berklee Press).
Dawkins, R., 1997: Climbing Mount Improbable (London: Penguin).
Fabian, D., 2006: ‘The recordings of Joachim, Ysaÿe and Sarasate in light of their recep-
tion by nineteenth-century British critics’, International Review of the Aesthetics and
Sociology of Music 37/2: 189–211.
Franz, D., 1998: ‘Markov chains as tools for jazz improvisation analysis’ (MSc thesis,
Virginia Polytechnic Institute and State University).
Goodrick, M., 1987: The Advancing Guitarist: Applying Guitar Concepts & Techniques
(Milwaukee, WI: Hal Leonard).
Holmes, T., 2008: Electronic and Experimental Music: Technology, Music, and Culture, 3rd
edn (New York: Routledge).
Léonard, H., c. 1883: Cadenza pour le Concerto de Violon de Beethoven (Mainz: Schott).
Limb, C. J. and A. R. Braun, 2008: ‘Neural substrates of spontaneous musical perfor-
mance: an fMRI study of jazz improvisation’, PLoS ONE 3/2: e1679.
Love, S. C., 2012: ‘ “Possible paths”: schemata of phrasing and melody in Charlie
Parker’s Blues’, Music Theory Online 18/3, http://mtosmt.org/issues/mto.12.18.3/
mto.12.18.3.love.html (accessed 9 April 2017).
Mermikides, M., 2010: Changes Over Time: Theory, http://miltonline.com/phd (accessed 9
April 2017).
Monson, I. T., 1991: ‘Musical interaction in modern jazz: an ethnomusicological perspective’
(PhD dissertation, New York University).
Monson, I. T., 1996: Saying Something: Jazz Improvisation and Interaction (Chicago:
University of Chicago Press).
Nachmanovitch, S., 1990: Free Play: Improvisation in Life and Art (New York: Tarcher/
Penguin).
Nectoux, J., 2004: Gabriel Fauré: A Musical Life (Cambridge: Cambridge University Press).
Nooshin, L., 2003: ‘Improvisation as other: creativity, knowledge and power—the case of
Iranian classical music’, Journal of the Royal Musical Association 128/2: 242–96.
Patel, A. D., 2003: ‘Language, music, syntax and the brain’, Nature Neuroscience 6: 674–81.
Persichetti, V., 1961: Twentieth-Century Harmony: Creative Aspects and Practice (New
York and London: W. W. Norton).
Pressing, J., 1984: ‘Cognitive processes in improvisation’, in W. R. Crozier and A. J. Chapman,
eds., Cognitive Processes in the Perception of Art (Amsterdam: Elsevier), pp. 345–67.
Pressing, J., 1988: ‘Improvisation: methods and models’, in J. A. Sloboda, ed., Generative
Processes in Music: The Psychology of Performance, Improvisation, and Composition
(Oxford: Oxford University Press), pp. 129–78.
Roads, C., 2004: Microsound (Cambridge, MA: MIT Press).
Sawyer, K., 1992: ‘Improvisational creativity: an analysis of jazz performance’, Creativity
Research Journal 5/3: 253–63.
Schillinger, J., 1978: The Schillinger System of Musical Composition (New York: Da Capo Press).
Solstad, S. H., 1991: ‘Jazz improvisation as information processing’ (MPhil thesis,
University of Trondheim).
Stowell, R., 1992: The Cambridge Companion to the Violin (Cambridge: Cambridge University
Press).
Stowell, R., 1998: Beethoven: Violin Concerto (Cambridge: Cambridge University Press).
Türk, D. G., [1789] 1982: School of Clavier Playing or Instructions in Playing the Clavier
for Teachers and Students, trans. R. H. Haggh (Lincoln: University of Nebraska Press).
Werner, K., 1996: Effortless Mastery: Liberating the Master Musician Within (New Albany,
IN: Jamey Aebersold Jazz, Inc.).
Wishart, T., 1996: On Sonic Art, ed. S. Emmerson (Amsterdam: Harwood Academic).
Xenakis, I. and S. Kanach, 2001: Formalized Music: Thought and Mathematics in
Composition (New York: Pendragon).
Discography
Bailey, D., 1975: The Advocate (Album, Tzadik TZ 7618).
Coltrane, J., 1965: A Love Supreme (Album, Impulse! A-77).
Metheny, P., 1976: Bright Size Life (Album, ECM L1073).
Montgomery, W., 1965: Smokin’ At The Half-Note (Album, Universal V6-8633).
Phuture, 1987: Acid Tracks (Single, Trax records TX-142).
Smith, J., 1958: The Sermon! (Album, Blue Note BLP 4011).
PART 3
Shapes performed
Reflection
Max Baillie, violinist
3D Bach . . . and the harmonic comet
I tend to translate the abstract stuff of imaginatively conceived sound into
other media: pictures, shapes or structures. As long as these translations are
also imagined, they remain in a sense abstract, but they give me a bearing, a
way to spatially conceive the real-time experience of music and even the mem-
ory of it. Whether I am in it, or travelling with it, or through it, is determined by
the qualities of a given piece or genre, and the sense of my surroundings in this
imagined space varies as much as the soundworlds themselves and the contexts
in which I hear them. Whichever the journey, there’s no doubt that my role as a
performer influences my sense of music and shape.
I want to deal here with one specific relationship, the crucial role of musician
as translator of shapes on the written page into the abstract realm of sound. It
seems one of the unfortunate realities of the western tradition in music that a
visual medium usually intervenes between the physical act of playing music and
its sounding. The page provides a distraction for the senses, robbing the ears
and body of more complete dedication to sound and the physicality of making
it. Of course, we are grateful to ink and paper as our only inheritance of what
went on in, for example, Beethoven’s mind. But unlike a sculptor or a painter
whose work is visible and exists with a physical constancy we can rely on, he left
us with but a tantalizing potential for something that is spun into existence only
at the ever-elusive present. As performers we are responsible for the spinning,
and must find a way to connect the imagination of the creator, captured in its
two-dimensional code, back into its full sonic glory!
But doing this successfully relies on more than the ability to play our instru-
ments both accurately and expressively. Starting from the page, we need to make
sense of what is in one way a whole world of information and in another a quite
minimal set of instructions as to how notes should be sounded according to the
original inspiration that birthed them in a particular combination. Although
what we look at is abundant with shapes, it is in many ways an ungenerous and
often misleading code. Here I want to lead the reader through some of the basic
steps I take towards translating it convincingly and imaginatively. I’ll use Bach’s
solo violin music as an example because I have a strongly spatial sense of it. It
strikes me as essentially harmonic music realized in a (mostly) melodic form, and
it is the combination and balance of these two elements which gives it its shape
and provides the basis for the journey on which the ear travels while hearing it.
To explain what I mean, I first describe my visualization of harmony as a
traveller. I then briefly go through some of the steps I take to translate what’s
on the page into a form which plays out the spaces in this travel and the way
the melodic content interacts with it. In this way I trace the creation of a three-
dimensional imaginative journey from its beginnings on the page.
The harmonic comet
Picture an anchor embedded into the ground. A comet ignites and takes off
from the anchor up into the air and in its wake leaves an expansive arc which it
completes into a circle when the comet returns to its starting point. In continu-
ous motion, it travels up and into a second revolution. On a third rise from the
anchor the comet seamlessly curves out into a new trajectory creating a second
circle suspended in the air. As you watch, the comet builds an entire structure of
suspended rings in the sky, exploring the unknown space and also returning to
familiar orbits. Eventually, its travels take it back to the anchor where it began.
This image is a metaphor for the journey the ear might travel while hearing
a piece of music—specifically, music that both is tonal and modulates from
one key to another. The comet is its real-time flow, the anchor is the home key,
and the circle that emanates from it describes a short harmonic journey from
stasis towards tension and back. The points at which the comet breaks into a
new orbit are the pivot chords, and if a feeling of movement comes from the
changes between chords within one key then it is the pivots that create the
sense of travel: rather than occupying the same space through a modulation,
I imagine the music as travelling from one space to another. The character of
this travel (which may be anything from a sublime cruise to a frantic search or
even a joyous ramble depending on the piece) is down to everything else in the
score: the metre, tempo, voice-leading, bowing, and so on, but the movement
itself comes from travel between harmonic orbits.
But where do we start in our flat forest of notes (Figure R.13a)? Unlike, for
example, a classical portrait whose structure is clearly assumed, the overwhelm-
ing visual impression in the figures here is of something uniform. At a glance
the image flipped over looks more or less the same (Figure R.13b). Where are
the structural shapes?
FIGURE R.13 The opening of the Allemande from J. S. Bach’s Partita in D minor for solo violin: (R.13a)
as usual and (R.13b) upside down
Harmonic cartography
For both performer and teacher, playing Bach requires some detective work.
I sometimes find it helpful to think of the solo repertoire as a distillation (as
distinct from a reduction). Clearly, it’s not that there’s anything missing, but
there is an invisible hierarchy; the notes don’t all occupy the same function in
the harmonic scaffold. If we dig this structure out (Figure R.14), the shape of
the harmonic journey begins to emerge from a more or less uniform trail of
notes on the staff.
FIGURE R.14 Allemande, bars 1–8, with a harmonic analysis of tonal centres and harmonic rhythm
FIGURE R.15 The passage in R.14 represented as a physical journey through space between related
tonal orbits
This kind of simple analysis can be made intuitively or more technically.
(I tend towards the intuitive but occasionally sit down with my violin and
strum chords while singing the written line to inform what my inner ear tells
me.) The crucial thing that emerges is a rhythm independent of the note values
themselves, and this rhythm dictates the spaces in our imaginative journey, the
distances covered, the passing places, the explorations and the return home.
Figure R.15 presents the same passage as the depiction of a harmonic traveller.
As performers, we are bound to respect certain constraints of discipline
delineated by the score, but I doubt Bach or any other great composer past or
present believes that the code we read is anything more than a practical aid (at
best) or a distraction (at worst) to real music-making. Think, for example, of
that most severe and unmusical—and yet necessary—of features, the barline,
with its ruthless and rigid carving-up of the phrase; or the militaristic group-
ings of notes held together by beams, often completely at odds with the way a
phrase groups notes together. What the harmonic map allows us to do is to gain
a sense of the underlying structure of the music, of its inherent shape indepen-
dent of its presentation. We can then phrase it accordingly, which along with
knowing where to give emphasis, informs us also where not to give it.
Figure R.16 offers an example where awareness of the harmonic rhythm
results in a shape at odds with the figuration and the general visual impression
of the music. In this opening, bars 5 and 6 are visually distinct from each other
not only in figuration but in Bach’s original phrase markings, as are bars 7 and 8.
From the beginning the harmonic rhythm swings boisterously from tonic to
dominant in each bar, and yet despite the visible differentiation between bars 5
and 6 we are going up a gear on our rustic C major ride: the harmonic rhythm
halves. Bars 5 and 6 belong very much together by virtue of sailing across one
harmony, and the same is true of bars 7 and 8.
FIGURE R.16 Allegro assai, bars 1–8, from J. S. Bach’s Sonata in C major for unaccompanied violin
The shape the listener receives, if the performer emphasizes this harmonic
rhythm (Figure R.17), is totally different than if either the figurations or the
visual impression of each bar as a separate entity guides the performer. It’s
unquestionably more convincing to my ears: it’s as though the music has gone
up a gear; the trajectory is the same but the arc is bigger. The listener hears
the lower timescale expand while the upper, the flow of semiquavers, remains
constant, and it’s as though two timescales of music are bound together: magic!
But there are other layers too. To use another metaphor, if the harmony is
the skeleton, what of the flesh and blood? The next stage in building up our 3D
image is to look at how the harmonic layer interacts with the melodic rhythm,
and I’ve found this equally illuminating.
Multilayered shaping: harmonic versus melodic rhythm
The passagework in Figure R.18 suggests a three-to-a-bar feel, and as such
seems on first hearing quite natural: the notes that pop out are the ones that
don’t belong to the middle register’s noodling accompaniment, the first of
each group of four. But the harmonic rhythm dictates just two beats, the first
and the third, which is infinitely more groovy in giving momentum towards
the harmonic exploration that follows this excerpt. Thus the harmonic rhythm
𝅗𝅥 𝅘𝅥 𝅗𝅥 𝅘𝅥 underpins an upper melodic rhythm 𝅘𝅥 𝅘𝅥 𝅘𝅥 𝅘𝅥 𝅘𝅥 𝅘𝅥. Having dug down into the har-
monic groundwork, the performer who plays with a sense of the underlying
harmonic rhythm allows the listener to experience a dialogue between comple-
mentary layers which interact.
FIGURE R.17 The passage in R.16 showing the harmonic rhythm
FIGURE R.18 The Allegro assai, bars 13–16, showing melodic rhythm
Summary
These are examples of what is in the music but not spelled out by the
score: shapes embedded in the text but not immediately visible. To my ears they
play an essential role in making sense of what Bach intended: we must remember
that, although an accomplished violinist, he wrote from the keyboard, with
all its richness of harmony, counterpoint and voice-leading expressed through
the medium of an essentially melodic instrument in these works for solo violin.
As a listener I feel I want to be led on the deeper path, the harmonic journey,
while enjoying these layers in dialogue; and that is also the way I aspire to bring
them to life as a violinist. The result is often that the phrasing becomes clearer
and also simpler: the performer makes longer lines where passages belong
together harmonically, and the music gains its multilayered quality where
melodic and harmonic rhythms interplay. Then there is also the whole world of
melodic contrapuntal writing (as opposed to counterpoint between harmonic
and melodic layers) embedded in Bach’s solo violin music; here the challenge
(and joy) as a performer is in the sense of spinning these as dialogue while also
playing one line of music with a single coherence.
Being sensitive to, and inquisitive of, the depth of the text should be a nat-
ural ingredient in a loving realization of this music. Playing with compelling
sound and presence is not enough: if we don’t dig below the surface, and if
we don’t detach ourselves imaginatively from the staff, our performance will
ultimately be one-dimensional because it will miss out the embedded propor-
tions in the music. The idea of a harmonic comet is one possible metaphor that
enables a player to assign an imagined shape to these proportions and embraces
the idea of the ear as a harmonic traveller. It suggests music as an agent with
a will to explore and with a physical form independent of its undifferentiated
representation on the page. As a way spatially to conceive harmonic patterns
it is an imaginative tool, and when combined with the melodic and rhythmic
layers around this harmonic framework it allows us to bring Bach’s music to
life in 3D.
7
Shape as understood by performing musicians

Helen M. Prior
This chapter presents findings from a study of performing musicians and focuses
on some of their practices and beliefs related to musical shaping. Musical per-
formance has been studied in myriad ways, and with a wide range of aims
in mind (for a useful overview, see Gabrielsson 2003). Preparation for perfor-
mance (especially memorized performance) has been examined in considerable
detail, with researchers finding expert practice to be a highly structured activity
in which performers focus on three dimensions of a composition: the basic
dimension, which includes all aspects of the music requiring attention simply
to play the notes of the piece, and which therefore includes technical decisions;
the interpretative dimension, involving decisions about phrasing, dynamics
and tempo; and the performance dimension, which involves every aspect of
the piece that requires attention during performance, including basic, interpre-
tative and expressive performance cues (Chaffin, Imreh and Crawford 2002;
Chaffin et al. 2010). Experts often work on small sections of a piece of music,
determined by the musical structure, before joining these chunks together to
create larger sections as the piece becomes more familiar (Chaffin et al. 2002).
Decisions involved in musical performance preparation have also been exam-
ined, with three main types of performance decision being identified: intuitive,
deliberate and procedural, procedural being previously deliberate decisions
that have become intuitive over time (Bangert, Fabian et al. 2014). Bangert,
Schubert and Fabian (2014) propose a spiral model of musical decision-
making, in which a musician’s decisions switch from being intuitive to deliber-
ate and from there become procedural: the proportion of intuitive decisions
thus increases with expertise. As a performer focuses on new musical features,
the cycle is repeated.
Many of the decisions made by performers concern expressive performance,
the teaching and nature of which has been examined extensively (Brenner and
Strand 2013; Davis 2009; Fabian, Timmers and Schubert 2014; Juslin 2003;
Juslin, Friberg and Bresin 2002; Juslin, Friberg and Schoonderwaldt 2004;
Juslin and Madison 1999; Karlsson and Juslin 2008). Particularly useful is the
GERMS model (Juslin 2003), which identifies five essential components of
musical expression. These are: Generative rules, which serve to clarify the musi-
cal structure through timing, dynamics and articulation; Emotional expression,
in which a range of parameters is used by a performer to convey an intended
emotional expression; Random variations, which are unavoidable and essential
for a performance to sound as though it is produced by a human being; Motion
principles, which incorporate the representation of intended and non-intended
biological motion in sound; and Stylistic unexpectedness, which involves the
creation of tension through the violation of expectations. Though these com-
ponents may not all be considered consciously by performers in their decision-
making, this division does provide some understanding of what performers are
doing in order to create an expressive performance.
Some studies examine the use of particular types of language, such as
metaphors, in relation to music performance preparation (Barten 1998;
Woody 2002), but few studies examine the use and meaning of only one word.
Usually, such an exercise would be rather futile, as much of the terminology
employed by musicians has a reasonably well-established definition. Shape,
or shaping, however, appears to have resisted formal definition in relation to
music, and yet seems to be a useful term for performers, as well as for other
musicians. A recent questionnaire study (Prior 2012c) revealed that perform-
ers use the notion of shaping when practising, in rehearsals, when teaching
and when playing music from a wide range of genres. The term was used in
relation to several ideas, from musical structure to musical expression, emo-
tion and tension; and in relation to specific musical features such as phrasing,
melodic line and dynamics. Overall, shape was found to be highly versatile
and multifaceted. This was an interesting finding in itself, but there was no
way in which an in-depth understanding of shaping could be gained through
these data, gathered as they were in an online questionnaire. A subsequent
interview study allowed greater interaction with a small number of partici-
pants and allowed the development of a model of the ways in which musical
shaping may be used by performers and understood by those studying them.
This study, and the model arising from those data, are presented and refined
within this chapter.
Aim and method
The aim of the interview study was to understand how performing musicians
use the idea of musical shape or shaping.
PARTICIPANTS
Ten professional musicians were interviewed; five were violinists and the other
five harpsichordists. The choice of instruments was carefully considered, in
terms of both the researcher’s background knowledge and experience as a
musician and the potential this gave for insight into the techniques discussed by
the musicians, and also in terms of the instruments’ different capabilities, which
seemed likely to prompt interesting variations in the musicians’ conceptions
of musical shaping. Specifically, the differences between the instruments’ abil-
ity to sustain a sound, to produce sounds with a varied dynamic range and to
play chords were noted. A further difference between the instrumentalists was
the violinists’ close knowledge of their own instrument, in contrast with the
harpsichordists’ unfamiliarity with the harpsichord used in the study, a double
manual by Michael Johnson.
Details of the participants can be seen in Table 7.1. They ranged from eighteen
to fifty-four years of age, and their experience playing their instrument
ranged from less than ten years to forty years. They were all resident in the
UK, though some participants were originally from Australia, South America,
Ireland and Japan. Many of them had studied performance at universities and
conservatoires, often to postgraduate level. They were all established
professional performers, the majority of their earnings coming from performance,
though some of them also taught or had research interests.

TABLE 7.1 Participants in the interview study
Name | Instrument | Age Group | Years Playing | Birthplace | Place(s) of Study
Tina* | Violin | 25–34 | 11–20 | UK | Manchester University (undergraduate); RNCM (postgraduate); Sheffield University (PG)
Bridget* | Violin | 25–34 | 21–30 | UK | TCM (UG)
Elsie* | Violin | 25–34 | 21–30 | Australia | Sydney Conservatorium of Music (UG); RCM (PG)
Victor* | Violin | 35–44 | 31–40 | Uruguay | Privately (UG); RAM (PG)
Darragh Morgan | Violin | 35–44 | 21–30 | Ireland | GSMD (UG); Hong Kong Academy of Performing Arts (PG)
Yoshi* | Harpsichord | 25–34 | 11–20 | Japan | Queensland Conservatorium (UG, PG); RAM (PG), TCM (PG); University of York (PG)
Katharine May | Harpsichord | 45–54 | 21–30 | UK | RCM (UG, PG)
Jane Chapman | Harpsichord | 45–54 | 31–40 | UK | RCM; Sweelinck Conservatory, Amsterdam
Julian Perkins | Harpsichord | 25–34 | 21–30 | UK | University of Cambridge (UG); RAM (PG); Schola Cantorum, Basel (PG)
Nathaniel Mander | Harpsichord | 18–24 | 10 | UK | RAM
Note: Names with an asterisk are pseudonyms; other participants wished to be named.
THE INTERVIEWER
The personal experiences and attributes of the researcher are acknowledged to
have an influence on all stages of the research process. For the sake of
transparency, specific details are provided here, similar in nature to those provided
above for the participants. The interviewer was female and (in terms of the clas-
sifications used with the participants) in the twenty-five to thirty-four age group
and the twenty-one to thirty years’ experience group. She had studied music at
a university as an undergraduate before studying music psychology as a post-
graduate, but had maintained involvement in practical music-making in vari-
ous spheres throughout this time. The researcher did not disclose her musical
experiences explicitly to the participants unless they asked specific questions.
PROCEDURE
The participants were asked to attend an interview at King’s College London,
and to bring some music with them that they knew well or had been working
on recently. Violinists were asked to bring their instrument. All participants
were given a consent form and a brief demographic questionnaire to complete
before the main interview began. The interview schedule had been developed
using the findings of a previous questionnaire study (Prior 2012c) but was also
designed to incorporate practical music-making. At the beginning of the inter-
views the participants were asked to play a brief musical extract selected for
its potential for musical shaping and its probable unfamiliarity.1 Participants
were asked to play the extract as they would normally approach a new piece
of music, and then to describe what they were thinking about as they were
playing, as they might to a student. After this discussion, they were told that
the study was about musical shape or shaping, and they were asked to play the
extract again, while thinking about the shape, or their shaping, of the music.
They were then asked to describe their thoughts once more. Some participants
were also asked to play an extract without musical shaping and to describe
their thoughts again. Although this procedure could not be expected to allow
direct access to participants’ thought processes (Ericsson 2006; Ericsson and
Simon 1993), it did elicit helpful, descriptive responses that had some degree of
ecological validity.
This task was used as a prompt for further discussion. Participants were
asked how this compared to their usual experiences of shaping music, what they
meant when they referred to musical shape, and about shaping pieces they had
brought with them or knew well. The schedule contents and order were flexible
to ensure that the interviews felt natural and comfortable for the participants.
At the end of the interview, participants signed the consent form and were
compensated for their time. The interviews were recorded using a Panasonic
SD700 HD Camcorder and a Sony ICD-UX200 Digital Voice Recorder.
DATA ANALYSIS
The interviews generated verbal, musical and gestural data, all of which were
analysed to some extent (Prior 2012a). This chapter focuses mainly on the ver-
bal data, with reference to some of the musical data. The verbal data were ana-
lysed with Interpretative Phenomenological Analysis (IPA). This approach has
been widely used in health psychology, but has also been found to work well in
research in music psychology (McPherson, Davidson and Faulkner 2012: 92) as
it allows participants’ thoughts and experiences to be examined idiographi-
cally and in detail. In particular, IPA is appropriate for situations in which
researchers are conducting exploratory studies investigating how individuals
are making sense of their personal and social world and the processes within
that world (Smith and Osborn 2003). The use of IPA was particularly appro-
priate here because of the complex and potentially idiosyncratic ways in which
expert musicians perceive and understand their work, as well as the potential
for emotional involvement in their practices. What constitutes a ‘good’ musical
performance is, in part, socially constructed, determined not only by techni-
cal expertise but by the tastes of both the individual and the period in which
they are performing (Leech-Wilkinson 2009). It therefore seems appropriate to
examine the processes of musical shaping with a method such as IPA that was
developed within the framework of social constructionism.
Data analysis proceeded according to the guidelines for IPA provided by its
pioneers (Smith, Flowers and Larkin 2009; Smith and Osborn 2003). Following
each interview, the recording was listened to in its entirety and initial notes
were made. The data were then transcribed verbatim, but the recording was
used alongside the text throughout the coding process. Initial coding focused
on a phenomenological approach to the data, identifying the main concerns of
each participant and the meaning these concerns had for them. A second stage
of coding followed with an interpretative approach which attempted to iden-
tify how and why the participant had those concerns and to link the phenom-
enological codes to more abstract ideas. The coding was validated by another
member of the research team. Themes were generated from the coded data,
and a summary was written for each participant in relation to each theme. A
summary diagram was also created for each participant. Each interview was
analysed completely before moving on to the next participant’s data.
During the interviews, participants frequently demonstrated their thoughts
about musical shaping on their instrument or by singing. These data were seen
as of equal importance to their verbal descriptions; indeed, some musicians
seemed to feel that they could communicate more effectively through musical
demonstrations than through verbal description. Moreover, these musical
demonstrations can be seen to circumvent, to some extent, the limitations of
the representational validity of language (Willig 2001). Where these musical
examples were seen to shed light on a particular discussion, they were analysed
using Sonic Visualiser (Cannam, Landone and Sandler 2010). Musical demon-
strations were never considered in isolation; rather, multiple examples from one
or more participants were compared in conjunction with the verbal descrip-
tions participants provided. In this way, the musical demonstrations informed
the analysis of the verbal data and the verbal data provided explanations for
particular features of musical shaping that were demonstrated.
During their verbal explanations and musical demonstrations, participants
often used gestures. These gestures are currently the subject of further analysis
and are not considered fully here.
Results and a proposed model
Although the data gathered provided scope for the consideration of musi-
cal shaping in considerable detail, within this chapter a broad view is taken,
with the aim of creating a data-led model of the use of musical shaping by
musicians. The model (which also acts as a summary of the data) is shown
in Figure 7.1 and is also available on the companion website, complete
with tables showing examples of each component. On the far left of the model
is the concept or idea of a musical level that can be controlled (or for some
participants, ‘shaped’). Next to this is a column of musical triggers for shap-
ing: features of the music that participants identified as influencing their
shaping decisions. On the far right of the model is the change in sound that
results from the musical levels being controlled or shaped in performance.
These three columns are arranged in approximate size order, with the largest
features at the top and the smallest at the bottom. One of the remaining
two columns in the model outlines the technical modifications that are used
to create this changed or shaped sound on the two instruments studied. The
separation of the two instruments within this column allows for the fact that
each instrument has limitations that restrict a performer’s ability to control the
changes in sound represented in the final column. Although these technical
approaches could be the participants’ main focus of attention, many partici-
pants appeared to ‘skip over’ these detailed decision-making processes, using
more or less metaphorical ideas like shape heuristically to help them to create
a musically expressive performance, a notion that is represented by the central
column in the model. Because many of the heuristics seemed to be applicable
at multiple levels, they are arranged not in size order, but alphabetically. Each
column and component of the model may be active independently or in combination
with any of the other components at any point in time; components
are not tied to other components of the model, be they similar or different in
nature or scale. Each component of the model is explored here before quotations
are used to highlight the ways in which multiple dimensions of the model
may interact.

FIGURE 7.1 Model of musical shaping. In the online version, each component is numbered, and numbered examples of each component are presented in linked tables. See the companion website.
Columns of the model and their components:
Musical level: Concert; Whole piece; Movement; Section; Phrase; Note
Trigger: View of the score; Musical structure; Words on the score; Harmony; Polyphony; Melodic contour; Rhythm; Patterns; Dynamic markings; Articulation or phrase markings
Heuristic: Audience; Breathing; Composer; Direction; Emotions; Gesture; Imagery; Importance; Instrument; Line; Natural; Shape; Singing; Style
Technical modification (violin): Bow pressure, speed, angle, contact; LH contact, position, movement
Technical modification (harpsichord): Registration; Attack speed/weight; Attack/release synchrony (spreading/over-holding); Release speed
Change in sound: Instrumentation; Programme; Tempo; Ornamentation; Timbre; Timing; Dynamics; Vibrato; Articulation
MULTIPLE LEVELS OF SHAPING
The first column of the model focuses on the musical levels discussed by par-
ticipants in relation to musical shaping. It became apparent through the inter-
views that shape was a very flexible term in many ways, not least in the scale at
which it could be applied. Table 7.2 shows the participants who discussed using
shaping at each level. Examples from all participants may be found on the com-
panion website, and some of these are discussed later in the chapter; here, the
focus remains on a few specific quotations from participants discussing shape
at multiple scales.
TABLE 7.2 Participants who discussed each musical level (see Table 7.1 for their names, represented here by initials)
Musical Level | Violinists (B D E T V) | Total | Harpsichordists (Ja Ju K N Y) | Total | Grand Total
Concert | | 0 | ✓ ✓ ✓ | 3 | 3
Whole piece | ✓ ✓ ✓ | 3 | ✓ ✓ ✓ ✓ | 4 | 7
Movement | ✓ ✓ ✓ ✓ | 4 | ✓ ✓ ✓ ✓ | 4 | 8
Section | ✓ ✓ ✓ ✓ | 4 | ✓ ✓ ✓ ✓ | 4 | 8
Phrase | ✓ ✓ ✓ ✓ ✓ | 5 | ✓ ✓ ✓ ✓ ✓ | 5 | 10
Note | ✓ ✓ ✓ | 3 | ✓ ✓ ✓ ✓ ✓ | 5 | 8

Several participants discussed the use of the term ‘shape’ at more than one
level. Elsie commented that ‘Every note should have some kind of shape. And
every phrase needs to have a shape’,2 and she also explained that her understanding
of the large-scale shape of the music affected the ways in which she shaped at
smaller levels.3 Victor, too, saw ‘shaping’ as a flexible term that could apply to
several levels of the music. When asked to define shape, Victor used metaphors
of language and narrative:
RESEARCHER: So if I asked you to sum up ‘shaping’, what is it, in a
nutshell?
VICTOR: I think it’s making sense, saying a sentence that makes sense.
And it has a starting point, and a development and a climax, and a
resolution, and a stop.
RESEARCHER: OK. And is that on a large scale, or a small scale, or both?
VICTOR: I think both.4
In contrast, Julian used more technical language to describe the slight variation
in meaning that he felt occurred with the use of shape in different contexts:
JULIAN: I suppose if someone said to me . . . ‘What shape does the
music have to you?’ I’d think instinctively they were talking about the
structure. . . So structure and shape sort of overlap in that capacity. If
you’re talking about a phrase, and you said the shape, I’d be thinking
about, as a player, the sort of technical way you might play it, in
terms of grouping of notes, and the articulations . . . what degrees of
staccato or legato do we want . . . in a particular given phrase. But . . .
with baroque music, one note can have shape, a messa di voce, so you
can just, you know, if you were talking to a violinist or particularly
a singer, and you said ‘What shape does that note have?’ you might
immediately think of the swelling and diminuendo of one note.5
For Darragh, however, the term ‘shape’ applied specifically to the phrasing
level, with other words being more appropriate for larger or smaller levels of
shaping that were discussed by other participants:
RESEARCHER: Some people sometimes use shape and structure
interchangeably; would you agree with that, or do you think shape
is different?
DARRAGH: I think shape is to do with, again, this thing of tessitura, of
line, of actual line, whereas the structure, yeah of course, you know,
a bigger question, a bigger picture. . . Point A to point B, to point C,
whatever you want to call it . . . but it’s from there, to there, to there,
to there. And that’s the piece of music.
RESEARCHER: OK. So your shaping is on a relatively small scale?
DARRAGH: Exactly . . . it’s more under a micro-magnifying glass,
whereas the other is looking at the big picture, isn’t it, probably.
RESEARCHER: OK. . . Are you thinking about shape when you’re playing
a single note, on its own?
DARRAGH: N-no. You’re thinking possibly about colour, about quality
of sound, about length, because of the bow.6
These ideas could be seen to operate on a spectrum of specificity and scale, with
Elsie and Victor at one extreme, using the term ‘shaping’ flexibly at all levels,
Julian in a more central position acknowledging the slight variation of mean-
ing in the word between small and large scales, and Darragh at the opposite
extreme, reserving the idea for the phrasing level and using other terminology
for variations in sound at other levels. Specific examples of shaping at each level
are discussed later in the chapter.
TRIGGERS FOR MUSICAL SHAPING
The second column within the model shows the score-based triggers identified
by participants as influencing their shape-related decision-making. Table 7.3
shows the participants who reported using each idea. In the model and in the
table, the triggers are shown in the order that relates approximately to musi-
cal scale, with large-scale ideas at the top and small-scale ideas at the bottom.
Some of the titles of these ideas may seem self-explanatory; however, others
are more complicated, and therefore the categories are discussed briefly and in
order, with a few examples. Full examples are provided online.
View of the score
This particular musical trigger was usually an overarching philosophical stance
adopted by the musicians relating to how they felt they should use the informa-
tion provided on the score by the composer or the editor. Participants discussed
the idea of ‘shaping as being anything that you’re doing to get the music off the
page, and to the listener’,7 with the score providing clues as to how this might be
achieved.8 Other participants discussed the score as their only tangible connec-
tion with a composer and that composer’s intentions, with Elsie commenting,
‘it’s just you and the composer again’,9 and Victor describing the score as a code
that he has to interpret.10
TABLE 7.3 Participants who discussed each trigger
Musical Trigger | Violinists | Total | Harpsichordists | Total | Grand Total
View of the score | ✓ ✓ ✓ ✓ | 4 | ✓ ✓ | 2 | 6
Musical structure | ✓ ✓ ✓ ✓ | 4 | ✓ ✓ ✓ | 3 | 7
Words on the score | ✓ ✓ | 2 | ✓ | 1 | 3
Harmony | ✓ ✓ ✓ ✓ ✓ | 5 | ✓ ✓ ✓ ✓ ✓ | 5 | 10
Polyphony | ✓ ✓ ✓ | 3 | ✓ ✓ ✓ ✓ ✓ | 5 | 8
Melodic contour | ✓ ✓ ✓ ✓ ✓ | 5 | ✓ ✓ ✓ ✓ ✓ | 5 | 10
Rhythm | ✓ ✓ ✓ | 3 | ✓ ✓ ✓ ✓ ✓ | 5 | 8
Patterns | ✓ | 1 | ✓ ✓ | 2 | 3
Dynamic markings | ✓ ✓ ✓ | 3 | | 0 | 3
Articulation or phrase markings | ✓ ✓ ✓ ✓ | 4 | ✓ ✓ ✓ | 3 | 7
Musical structure
Some of the interviewees discussed musical structure as something that had an
influence on their musical shaping, with Elsie commenting that she is always
aware of her position within the musical structure as she plays11 and confirm-
ing that this influences her shaping on a smaller scale.12 Others described the
ways they would highlight structural boundaries13 or create a sense of structure
through their playing.14 Darragh discussed structure as something his fellow per-
formers frequently liked to be aware of before making interpretative decisions.15
Words on the score
Performance directions,16 words provided by the composer to convey a pro-
gramme or appropriate imagery for a piece,17 and the lyrics of a vocal piece18
were all reported to have a direct bearing on the musical shaping used by the
performers.
Harmony
Harmony was one of only two triggers to be discussed by all participants,
though not all of them felt comfortable in using this trigger themselves, with
Bridget suggesting that she found other methods more intuitive.19 All four of
the other violinists, however, described how harmony could influence their
shaping decisions, with Victor arguing that much of the expressiveness of a
performance can be lost if a performer is unaware of the harmony underlying
the melody they are playing.20 Elsie felt that harmony was central to her
shaping of the music,21 and she, Tina, Victor and Darragh provided numer-
ous examples of how this could occur. The harpsichordists, too, were focused
on harmony for many of their shaping and expressive decisions, with Julian
describing it as one of the first things he looked at in the score;22 and all five
harpsichordists discussed harmonic features of the music in relation to their
shaping decisions.23
Polyphony
Participants discussed musical parts played by others in ensembles influencing
their musical shaping,24 as well as their awareness of ‘voices’ within their own
parts, and of the shaping decisions they made to try to highlight those voices
for their listeners.25
Melodic contour
Like harmony, melodic contour was discussed by all participants. Many dis-
cussed mirroring melodic contours with dynamics26 or highlighting the top of a
phrase through timing.27 Others discussed descending melodic lines or tessitura
more generally.28
Rhythm
Participants discussed the ‘shape of the rhythm’,29 the hierarchical relationships
between beats in a bar,30 and the link between those relationships and bowing
patterns.31 Others discussed the appropriate grouping of particular rhythmic pat-
terns32 and how the shaping of a phrase related to its rhythmic (and other) constit-
uents.33 Tina discussed decisions relating to the musical shaping of syncopation.34
Patterns
Victor, Jane and Katharine all discussed patterns (such as harmonic sequences)
in the music that influenced their shaping decisions.35
Dynamic markings
None of the harpsichordists discussed dynamic markings, probably because
there were none present in their scores. Bridget, Tina and Victor discussed
dynamic markings as a trigger for their musical shaping, though Victor sug-
gested that he did not feel he needed to think consciously about applying them.
Rather, he suggested, ‘I think dynamics fall into place’.36
Articulation or phrase markings

Four violinists and two harpsichordists discussed articulation (or bowing)
markings and how these influenced their musical shaping decisions. Nathaniel
discussed the influence of phrase markings on his musical shaping in some
detail, specifically noting the phrasing indicated by the composer and what this
meant for him as a performer.37
Summary of musical triggers

Overall, a range of musical triggers seemed to prompt the participants in their
shaping decisions. These were rarely used in isolation; instead, they were used
in combination with others in specific ways that were appropriate for the piece
of music being discussed and the instrument on which it was to be performed.
Participants appeared to have preferences for particular triggers, with some
particularly favouring harmonic features and others focusing more on details
such as dynamic or articulation markings or rhythmic features. The next two
sections focus on the means by which participants used these triggers to modify
the sound they produced, namely through technical modifications and through
heuristics for musical expression. These are deliberately discussed in the oppo-
site order to that in which they are presented in the model, so that the most
tangible and concrete concepts are raised before less specific ideas that may, on
occasion, replace them in conscious thought.
TECHNICAL MODIFICATIONS
As shown in Table 7.4, participants discussed a range of instrument-specific
technical modifications that they could apply in relation to musical shaping.
The categories shown in the table reflect the comments made by partici-
pants in relation to the intertwined nature of the technical modifications they
were able to make. When a participant discussed a change in bow pressure,
for example, they frequently mentioned other changes, such as a modification
in bow speed. Although they also would couple these with changes in the left
hand, such as movements required for clean shifting or for vibrato, these were
sometimes considered separately from bowing considerations, and even if not,
can be separated conceptually simply because of the physical independence of
the two hands. All violinists offered features within these two categories.38

TABLE 7.4 Participants who discussed each technical modification
Violinists
Technical modification | B D E T V | Total
Bow pressure, speed, angle, contact | ✓ ✓ ✓ ✓ ✓ | 5
Left-hand contact, position, movement | ✓ ✓ ✓ ✓ ✓ | 5
Harpsichordists
Technical modification | Ja Ju K N Y | Total
Registration | ✓ ✓ ✓ | 3
Attack speed / weight | ✓ ✓ ✓ ✓ ✓ | 5
Attack/release synchrony (spreading / over-holding) | ✓ ✓ ✓ ✓ ✓ | 5
Release speed | ✓ ✓ | 2
The harpsichordists were slightly more varied in their discussions, with only
some of them mentioning registration. There is no doubt that this is something
carefully considered by all harpsichordists in relation to a performance, but in
the interviews with Julian, Katharine and Nathaniel, it was raised in relation
to musical shaping, with clear indications that registration affected both
the overall shaping of a suite or other set of pieces39 and the shaping of
phrases and notes.40 All the
harpsichordists talked about the ways in which they varied the attack speed and
weight of a note, and the synchrony of attack and release (i.e. the spreading
of chords).41 Only two participants discussed the speed at which they would
release the keys.42
Specific examples of technical modifications described and executed by par-
ticipants are discussed later in the chapter.
HEURISTICS FOR MUSICAL EXPRESSION
Although the technical approaches discussed above could be the participants’
main focus of attention, many participants appeared to ‘skip over’ these
detailed decision-making processes, using more or less metaphorical ideas,
like shape or shaping, to help them create a musically expressive performance.
Participants would sometimes find it hard to discuss some of the technical
approaches mentioned above,43 and would prefer to use terms that were less
specific but perhaps more useful. It seemed that participants were using words
like ‘shape’ or ‘direction’, or ideas about ‘where the music was going’ as heu-
ristics, or ‘short-cuts based on experience that solve problems too complex to
resolve quickly enough using analytical thought’ (Leech-Wilkinson and Prior
2014: 36). These heuristics were often metaphorical, and participants seemed
to use them to consider ways of playing the music expressively without having
to focus on specific technical aspects of their playing. Yoshi commented, ‘I
think there’s a lot of things I do, sort of, naturally, that I don’t . . . consciously
think about.’44 A particularly helpful example of this was provided by Victor:
RESEARCHER: I noticed as well, here, . . . because you were ‘heading for
the top’, you were moving up towards the heel of the bow, and
lengthening your bow stroke. Is that a conscious thing that you’re
thinking about, or is it more that you’re just thinking that you’re
heading to the top, and that’s . . . shorthand for all the technical
things that are going on?
VICTOR: Absolutely. No, I didn’t think about that at all. Um, it probably
just happened because it’s integrated. Now maybe subconsciously.45
Some of the metaphorical ideas concerning the music and the appropriate
musical shaping seemed to be expressed through gesture, exposing participants’
multimodal understanding of musical shaping. Participants often discussed
ideas of direction, movement and gesture when talking about their musical
shaping; and, while they did so, they often used gestures in conjunction with
their descriptions or demonstrations of the music. Participants used height to
represent pitch, and vertical gestures to indicate rhythmic features. Arch-shapes
were used to indicate the shape of a phrase, and larger arches, wave patterns or
circular gestures to indicate the shape of an overall piece (Prior 2012a, 2012b).
Further analysis is intended to investigate whether or not there are specific dif-
ferences between the gestures used by violinists and harpsichordists, as well
as correspondences between the gestures used by participants, their verbal
descriptions and their musical demonstrations.
Specific examples of heuristics used by participants are discussed in more
detail later in the chapter; however, Table 7.5 shows the use of a range of heuris-
tic terms in the interviews and their distribution among participants. Because of
their holistic and nonspecific nature, the terms are not listed in order of size, as
other components of the model have been. Instead, they are listed alphabetically.
TABLE 7.5 Participants who discussed each heuristic
Heuristic | Violinists | Total | Harpsichordists | Total | Grand Total
Audience | ✓ ✓ ✓ ✓ ✓ | 5 | ✓ ✓ ✓ | 3 | 8
Breathing | ✓ ✓ ✓ | 3 | ✓ ✓ ✓ | 3 | 6
Composer | ✓ ✓ | 2 | ✓ ✓ ✓ | 3 | 5
Direction | ✓ ✓ ✓ ✓ ✓ | 5 | ✓ ✓ ✓ ✓ ✓ | 5 | 10
Emotions | ✓ ✓ ✓ | 3 | ✓ ✓ ✓ ✓ | 4 | 7
Gesture | ✓ ✓ ✓ | 3 | ✓ ✓ ✓ ✓ | 4 | 7
Imagery | ✓ ✓ | 1 | ✓ ✓ ✓ | 3 | 4
Importance | ✓ ✓ ✓ | 3 | ✓ ✓ ✓ ✓ | 4 | 7
Instrument | ✓ | 1 | ✓ ✓ ✓ ✓ ✓ | 5 | 6
Line | ✓ ✓ ✓ | 3 | ✓ ✓ ✓ | 3 | 5
Natural | ✓ ✓ ✓ ✓ | 4 | ✓ ✓ ✓ ✓ | 4 | 8
Shape | ✓ ✓ ✓ ✓ ✓ | 5 | ✓ ✓ ✓ ✓ ✓ | 5 | 10
Singing | ✓ ✓ | 2 | ✓ ✓ ✓ | 3 | 5
Style | ✓ ✓ ✓ ✓ ✓ | 5 | ✓ ✓ ✓ | 3 | 8

CHANGES IN SOUND
Participants discussed a range of changes in sound that correspond to those
used in expressive performance, namely vibrato, dynamics, timing fluctuations,
timbre, ornamentation and tempo. In addition, participants also discussed the
programme of a concert and instrumentation. Examples of these are highlighted
later in the chapter. Before this, selected overall situational factors that
may influence the working of the model are briefly discussed.
OVERALL SITUATIONAL FACTORS
Each column of this model is affected by some overall situational factors, such
as whether the performance decisions are made in private practice, in rehearsal
or in performance; or whether the music involves other performers who influ-
ence the shaping decisions made. Several participants noted the value of per-
formance for generating ideas about musical shaping, an idea supported by
some existing research (Doğantan-Dack 2013). Yoshi suggested that she was
‘more alert’ during performance than when practising, which enabled her to
notice new features of the music and to have new ideas concerning the shaping
of those features. If Yoshi considers those ideas to be ‘too risky’ she ‘saves them
for later’, but there are times when she tries new ideas during a performance.46
Both Tina and Katharine valued spontaneity in their ensemble performances,
and noted how other performers would influence their own shaping during a
concert. Tina discussed the ‘communal’ and ‘spontaneous’ shaping of a Haydn
string quartet, stating that ‘the ideal is that at any point, really, one person
might help to guide it in a particular way, so that . . . the contour of the piece
changes’.47 Similarly, Katharine reported that she particularly enjoyed perform-
ing with ‘someone who takes a few risks, and does something spontaneously
that you can then react to: . . . that’s really nice music-making’.48 Hence, it is
anticipated that the situation in which musical shaping is considered will influ-
ence the extent to which each aspect of the model is used.
TRACING PATHWAYS THROUGH THE MODEL
The value of the model described above lies not only in its ability to outline
various aspects of shaping as discussed by the participants in the study, but also
in its potential for showing the combinations of factors used by participants in
specific situations. With this in mind, specific quotations and musical examples
from the interviews are discussed alongside presentations of the model that
highlight which components are active in each situation. Examples showing
shaping at each musical level are offered.
Shaping at the concert level

Katharine considered the varying of instrumentation and programme for an
audience at a concert:
KATHARINE: I’ve just been doing some concerts over the
weekend . . . basically accompanying, you know, a small chamber
group, but on Sunday I actually played the suite, as a break in
the programme from having violin sound or singer. Just so that
there’s . . . variety within the . . . listening experience, I suppose.
You’re not listening to string players the whole time, or whatever, a
little bit of time for something a bit different.49
This could be seen as shaping a programme and the instrumentation of that
programme to suit a listener. It can be represented by the model as shown in
Figure 7.2, available online. The shaping level is designated as the level of
the whole concert, and the programme and instrumentation change as a result
of the consideration of the audience. All these components of the model are
shown in black, whereas the rest of the model is shown in grey.
Shaping at the level of a whole piece or movement

Bridget commented on shaping at the levels of phrasing, movement and the
whole piece. She was using ideas related to the ‘direction’ heuristic (‘where’s this
movement going’) and also the ‘importance’ heuristic (‘where’s the . . . high-
light of this’). The comment does not encompass the effect of these ideas on
the musical sound produced, and therefore this uncertainty is represented by
a question mark over this area of the model in Figure 7.3, available online.
BRIDGET: the main thing that I come across, talking about shape,
whether it’s just on my own, or in rehearsals, in an ensemble, is about
phrasing, like, the small phrases, . . . one line, or the bigger shape of
the whole piece, or the whole movement, so, instead of where’s the,
where’s this phrase going, where’s this movement going, where’s the,
you know, the highlight of this, where’s the whole, all these phrases,
where are they going to?50
Shaping at the movement level

Elsie discussed shaping at the movement level in some detail, describing how
her awareness of the larger-scale structure of a piece of music would affect her
shaping of the piece:
ELSIE: For example, if this was a slow movement, in between two outer
movements, . . . I’d be very, very careful to create a mood in which the
music could just sing. And it would also give the audience a chance
to relax, in between the two outer allegro movements, or presto, or
whatever it is. You have to time pieces throughout the whole thing,
and you have to know when to back off, and when to really, you
know, go for it, I suppose.51
She continued, playing the music in two ways, and describing her thoughts
about what she was doing:
ELSIE: I’ll play it in two different ways. If . . . I was playing this, within
the context of a larger piece of music, and I played it sort of, um
[plays]. That might sound, sort of OK, but I’m still so involved with
it, do you know what I mean? Um, why not just let it go? [sighs] and
give the audience a chance to go, ‘Oh, that’s really nice’ you know, in
between having been gripped for the first thing, you know, so I could
just [plays] and just [plays]. It could give something, just completely
different. And it’s all to do with where the music lies within the whole
thing.52
These sound examples are available on the companion website, and it is pos-
sible to examine them for differences between the two versions. Using Sonic
Visualiser (Cannam et al. 2010), we identified the main beats of the excerpts
and exported the data for statistical analysis (Table 7.6). The two versions
differed in tempo: the ‘involved’ version had a shorter mean beat length
and was therefore faster than the ‘letting go’ version. In an interview situation,
the significance of this is difficult to assess; however, the variance of
the beat length also differed, with the ‘involved’ version having a significantly
larger variance than the ‘letting go’ version (Levene’s test of homogeneity of
variance: F (1, 26) = 13.9, p = 0.001). The two versions also differed in Elsie’s
use of dynamics. Although the mean power of each excerpt cannot be judged
reliably from this interview source, the variance of the power showed a con-
siderable difference between the two versions, with the ‘involved’ version hav-
ing significantly greater variance than the ‘letting go’ version (Levene’s test
of homogeneity of variance: F (1, 1739) = 15.9, p < 0.001). Some of this was
achieved by using less bow pressure, though Elsie did not specify any other
technical modifications. When listening, one can hear a slight difference in
the vibrato used in each version, with the ‘involved’ version seeming to have
a slightly faster vibrato that begins more promptly after the start of the note
than the ‘letting go’ version.
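As an illustration only, a comparison of this kind could be sketched in Python as follows. The beat onset times are hypothetical placeholders rather than Elsie’s data, the function names (`beat_lengths`, `levene_f`) are invented for this sketch, and it assumes onset times in seconds and the mean-centred form of Levene’s test; none of this is confirmed as the exact procedure used in the study.

```python
from statistics import mean, variance


def beat_lengths(onsets):
    """Return inter-beat intervals from a list of beat onset times."""
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]


def levene_f(*groups):
    """Levene's statistic (mean-centred) with its degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of each value from its own group mean.
    devs = [[abs(x - mean(g)) for x in g] for g in groups]
    dev_means = [mean(d) for d in devs]
    grand_mean = sum(sum(d) for d in devs) / n_total
    between = sum(len(d) * (m - grand_mean) ** 2 for d, m in zip(devs, dev_means))
    within = sum((z - m) ** 2 for d, m in zip(devs, dev_means) for z in d)
    df1, df2 = k - 1, n_total - k
    return (between / df1) / (within / df2), df1, df2


# Hypothetical beat onsets in seconds (NOT the study's data).
involved = [0.00, 0.45, 1.10, 1.52, 2.25, 2.70, 3.38]
letting_go = [0.00, 0.62, 1.25, 1.86, 2.50, 3.12, 3.75]

for label, onsets in (("involved", involved), ("letting go", letting_go)):
    lengths = beat_lengths(onsets)
    print(label, "mean:", round(mean(lengths), 3),
          "variance:", round(variance(lengths), 4),
          "tempo (bpm):", round(60 / mean(lengths), 1))

f_value, df1, df2 = levene_f(beat_lengths(involved), beat_lengths(letting_go))
print(f"Levene F({df1}, {df2}) = {f_value:.2f}")
```

With two groups the degrees of freedom are 1 and N − 2, where N is the total number of values compared, so the reported F (1, 26) corresponds to 28 beat lengths across the two excerpts. Converting an F value into a p value requires the F distribution, which standard statistical packages provide.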
TABLE 7.6 Differences between Elsie’s two versions of the extract [00:11:00]
                          Involved    Letting Go
Mean beat length          0.562       0.625
Variance of beat length   0.010       0.003
Mean power                −15.8       −15.5
Variance of power         48.5        31.6
When representing the whole of this quotation with the model, we can see
that Elsie is considering the shaping of a piece and movement as a whole, and
that she is considering the musical structure of the whole work as a trigger for
her musical shaping. She is using the heuristics of ‘audience’, ‘emotions’ and
‘singing’, and employing technical modifications using both hands to modify
the overall tempo, the timbre, the timing fluctuations, the dynamics and the
vibrato.
Interestingly, Julian discussed a similar idea to Elsie’s ‘letting go’:
JULIAN: sometimes if you make things, if you emote things too much,
it can become a bit wearisome to listen to. Sometimes just sort of,
stating simplicity is beautiful in itself. . . And I think that might
apply here. So I mean, I was sort of suggesting a lot of things, and
I think, if I was performing it in a concert, I might sort of throw a
lot of those, not throw them away, but, just make them a secondary
consideration, just for the, just going for a simple reading . . . So it’s
not too . . . convoluted.53
His performance preparation has involved ideas of shaping, but he sometimes
approaches a performance with the desire to give a simpler performance.
Perhaps, as Elsie’s data suggest, a ‘less shaped’ performance can sometimes be
desirable.
Shaping at the section level

Nathaniel discussed the shaping of phrases over longer sections of the music,
noting the harmony, melodic contour and patterns such as repetition, but also
using heuristics of the composer, suggested by his use of ‘he’ (the composer)
rather than ‘the music’ or ‘it’; direction, indicated by terms such as ‘goes’,
‘going’, ‘that way’ and ‘all the way to here’; emotions, suggested by the words
‘amazing’, ‘miraculous’, ‘incredible’ and ‘defeated’; and style, indicated by his
comment about the significance of repetition in baroque music. He does not
discuss the technical means by which he modifies the sound, but he mentions
changes in timing. This quotation is represented on the model in Figure 7.5,
available online.
NATHANIEL: So at the end of that, um let’s just see [plays] then you
start again [plays] and then, this time, he goes up [plays]—isn’t that
amazing?—and back down, before he goes that way. Then he goes
on, B♭ minor [plays] and on, C minor [plays] and then he keeps
going with an extended phrase, all the way to here and here, all the
way to the dominant, and then when he gets here, it’s miraculous,
[plays] and then we get [plays] that chord [plays] which is just
incredible, isn’t it? [plays]. So we get [plays] and away, and
there it is again [plays] which, I suppose, in the baroque, repetition,
it’s, it’s all about something more. So actually [plays] this time,
I think it’s even more defeated [plays] so I get slightly slower
[plays].54
Shaping at the phrasing level

Participants gave many examples of shaping at the phrasing level, and so sev-
eral examples are considered here. Jane discussed shaping phrases using vivid
imagery:
JANE: But I’m thinking of . . . not just the shape that’s up and down, . . .
I was thinking of shapes that swell. Again, it’s my three-dimensional
thing, something that swells out, like a kind of serpent with
swellings in its body! [laughs] . . . So it not just a slippery snake that
goes like that, it’s something that kind of opens out and expands. . .
RESEARCHER: Can you tell me where?
JANE: Where it is, I suppose again, it would come to the harmonic thing
[plays]. That’s a sort of [plays], that’s a ‘here I am’ [plays], a sort
of visible [plays]. That to me is where he’s swelling out . . . puffing
himself up, but still he’s got energy to carry on [plays]. Now that
could be either [plays]; that could be just going away to nothing so,
I suppose the shape of that, thin shape, fat bulbous shape, starting
fairly bulbously, getting thinner, more bulbous as it comes down
again, and then going off to, just disappearing off. . . Which is . . . the
way the harpsichord works; you could do it completely the opposite
on the piano, because of . . . the dynamics, so in a way, the lack of
dynamics, . . . means that you have to follow, what the instrument’s
telling you. . . While on the piano, I could play that [sings] at the end,
but on the harpsichord I can do the [plays]. Some holding, but, could
do that I suppose [plays]. . . And the fact that he’s put er, lines over
each one, shows a kind of gestural [plays], gestural shape [plays].
Slightly rounded at the end there.55
When represented on the model, this quotation highlights the phrase level,
triggers of harmony and melodic contour, heuristics of gesture, imagery,
instrument and shape, technical modifications relating to over-holding, and
timing as a change in sound (see Figure 7.6, available online).
Victor discussed shaping a phrase in slightly more prosaic terms, though
he too was frequently emotionally invested in the music he was discussing
and playing. He noted the musical triggers of harmony, melodic contour
and rhythm, using the heuristic of the audience (listener) and the metaphor-
ical imagery of communication to convey his ideas. The following quote is
represented in Figure 7.7, available online:
VICTOR: it’s about how the listener will receive something that makes
sense. So how the melody, how the phrase is made up, is completely
unique, and it’s made up of technical considerations of rhythm,
pitch, harmony, of where the top point is, where it’s going, how
fast it’s getting there, . . . how slowly or fast it unravels, how it does.
I think phrasing’s about being able to see that, from this [indicates
score].56
Tina discussed wide-ranging parts of the model when talking about shaping
a phrase. She discussed musical triggers of melodic contour, harmony and
dynamic markings, and heuristics of audience, direction, imagery and line, as
well as changes in sound relating to timing and dynamics. Her quote is represented
in Figure 7.8, available online:
TINA: Yes, I suppose the shape of a phrase, whether it goes up
or . . . down, for example, the first line, thinking of it generally,
growing up to the top, and down again . . .
RESEARCHER: So is it the pitch you’re thinking about, in terms of the
shape, or—
TINA: Pitch, and, well, the dynamic, which is written in anyway. And
direction, so . . . some sort of forward movement towards the higher
point of it, so sort of trying to reach the top of it and then perhaps
away, and relaxing on the way back down again.
RESEARCHER: Do you mean forward movement in terms of tempo, or a
combination of things, or—
TINA: Um, not exactly tempo, not an accelerando, but a sense of it.
Someone I know describes things as, you play them either in the
present tense, or the future, or the past, so I s’pose, if you play
something in the future, you’re sort of looking forwards . . . um,
which doesn’t exactly mean you play . . . faster, . . . it means you’re
sort of on the front edge of maybe, of what you think the tempo is,
rather than the back edge.
RESEARCHER: Yeah, OK. . . Would you describe that as rubato, or is it
not quite as much as that?
TINA: It’s not, no, not as much as that, just a general sense, I suppose
a sense of ‘line’ through . . . some kind of thread that, your, sort
of, intention, that comes across. . . I suppose if you’re speaking, if
you’re reading something out loud, you make sure the words within a
sentence carry on, even though you have to articulate each word and
things, but you don’t [pauses] pause [pauses] until you get to the end
of the sentence, you make sure you’ve got there, I suppose.57
Darragh discussed a technically and perceptually complex passage from
Bach’s E minor Partita which contains implied polyphony. He noted how for
this particular passage, little conscious shaping was required, an approach
which is supported by recent research (Davis 2009), whereas at the end of the
passage he would begin shaping the music once more. It was apparent from
his playing that he was referring not only to the timing fluctuations within
his performance, but also the dynamic range, the timbre or tone colour, and
vibrato.
DARRAGH: So, you know, in a sense, you know, here we are talking
about music and interpretation, and shape, and phrasing, and all
that, but you know, music’s always been there, because it’s Bach,
and it’s genius, and it’s probably the best composer ever, but, if you
just kind of can play well, in tune, for that particular passage, and,
you know, technically be in control of what you’re doing, already
a certain amount of the battle is won, isn’t it? . . . he’s written the
genius into it, hasn’t he? . . . all I’m saying is you don’t, there’s not
anything, in one sense, you need to add to this particular passage, for
instance, right from [plays]. You know, if I was just really struggling
with that, [plays] you wouldn’t have the right flow. . . But because I’m
lucky enough, and I gradually worked it out, uh, where my bow’s
meant to be, uh, [plays] then at the end you start, yeah, shaping
again, or, taking time for the music to breathe again.58
When represented on the model (see Figure 7.9, available online), this quotation high-
lights shaping at the phrase level, with musical triggers of melodic contour and
(implied) polyphony, heuristics of breathing, composer and shape, technical
modifications in both hands (the left hand has complex fingering patterns and
shifts, and the bow moves in changing patterns relating to string crossings),
affecting the timbre, the timing and probably the dynamic variation and vibrato.
Shaping at the note level

Elsie gave a very clear example of shaping a single note. Her quote features the
musical triggers of harmony and rhythm; heuristics of the audience, emotion,
instrument (specifically, the baroque bow) and shape; technical modifications
undertaken by both hands; and a change in sound in timbre, timing, dynamics
and vibrato (heard in her demonstrations). The following quote is represented
on the model in Figure 7.10, available online:
ELSIE: Well what I would say, is, um, look at the shape of the bow, and,
and how long you have to do that note, you know [plays] it’s about
[plays] two seconds’ worth, I suppose, if we’re going to be really
analytical about it, and . . . the note needs to have a shape, so where’s
the . . . middle of that note going to be? . . . And um, [plays] basically,
it’s the bar lines, ’cos you know [plays] to get the maximum emotional
impact, I suppose you have to time the middle of that note to
coincide with this [plays] the clash [plays]. Then you’re really gonna
get the audience going, ‘Oh wow!’ you know? ’cos dissonances are
much, much more interesting sometimes than consonances.59
Yoshi, too, described shaping a single note, providing considerable detail about
the physical interaction between her arms and fingers and the keyboard of the
harpsichord, and about the resulting differences in the sound produced:
YOSHI: I think sometimes, it’s the way you drop. [plays] If you just let
the weight of your fingers drop, or if you do it a little bit more [plays]
instant, not force, but just a little bit of ping on your finger, and then
you get more of a clear start to the sound. And if you, you can use
the flat bit of your finger, then it’s a little bit [plays] um, milder, a
little bit more sort of gentle, sort of plucking. . . I think the weight,
the speed, and also the angle . . . of the fingers will sort of, I think,
[plays] I guess you have more control [plays] when it’s flatter . . .
[plays] rather than that, but then, and then you sort of, sometimes,
just give it a little kick, and that’s a little bit . . . uh, it’s a little bit
more clear at the beginning, and somehow louder as well. [plays]60
When represented on the model, the heuristics of gesture, imagery and instru-
ment are highlighted, reflecting Yoshi’s discussion of the movements she is
making, the metaphorical ideas surrounding those movements (‘ping’, ‘little
kick’, etc.), and the technicalities of the instrument she is playing. She is dis-
cussing the attack of a single note, and therefore this is highlighted in the
musical level and technical modifications areas of the model. The change in
sound discussed concerns the timbre and dynamic of the note produced. This
can be seen in Figure 7.11, available online.
Discussion
It is clear from the examples shown that multiple components of the model are
frequently used by participants at once. Each broad category can be thought
about in isolation or considered in relation to another. Often, technical modifi-
cations may not be thought about on a conscious level, with performers thinking
instead of heuristics to achieve their desired change in sound. Nor are musi-
cal triggers always thought about consciously. Different participants seemed
to favour particular components, suggesting that, over time, performers may
develop their own preferred means of thinking about musical shaping that are
represented in numerous areas of the model. It is worth bearing in mind, how-
ever, that the model was built from data gathered in one interview with each par-
ticipant, and is unlikely to represent the full scope of the shaping experience. It
does, however, provide a picture of some of the ways in which these performing
musicians conceptualize and use the notion of musical shaping.
The model provides a new perspective on performance preparation, partly
because it is focused on musical shaping, rather than on performance preparation
in general. Some components seem to correspond with aspects of existing
research findings. In relation to research in expert practice (Chaffin et al.
2002; Chaffin et al. 2010), many of the musical triggers, some of the heuris-
tics, and many of the technical modifications may be involved in the formation
of interpretative performance cues. Some of the heuristics also seem likely to
be involved in the formation of expressive cues. Many of the participants discussed
the musical structure (and harmonic goals within the music, which are
necessarily related to musical structure), around which expert practice is often
organized. Parts of the model may also be considered in the light of research
in musical decision-making discussed earlier (Bangert, Fabian et al. 2014;
Bangert, Schubert and Fabian 2014). While musical triggers appear to prompt
all types of decisions concerning musical shaping, deliberate decision-making
seems likely to involve technical modifications, while more intuitive decisions
may involve heuristics.
It may be possible to relate particular components of the model to Juslin’s
(2003) GERMS model of musical expression. Many of the musical triggers are
likely to be generative features, and therefore are highlighted with some of the
changes in sound in the right-hand column. The heuristics of audience and emo-
tions are likely to aid the performer in generating an intended emotional expres-
sion. While random variations are not intended by the performer, this component
might perhaps be related to technical modifications. Motion principles may be
created with the aid of heuristics such as breathing, direction, gesture, line, natu-
ral and shape. Finally, stylistic unexpectedness may be aided with the heuristic
of style. Future research could use the research methods commonly employed in
performance-preparation studies to ascertain whether or not the above specula-
tions hold true, or whether there may be ways of combining this model of musical
shaping with other aspects of performance preparation to create an overarching
model. It might also be possible to develop ways of representing individuals’ per-
sonal preferences in their understanding of musical shaping, or the particular
components used within one practice session or rehearsal. These could be used in
studies of performance preparation to examine participants’ shaping focus and
how this changes over time. Such representations could also be used in a study
of ensembles to investigate the dynamics of musical shaping in a group setting.
Does the shaping within a rehearsal switch between the preferred modes of the
members, or does it remain more constant? Are the dominant shaping modes
related to the music performed or the members of the group, or both?
Although the model represents the findings of the ten interviews discussed,
it may have the potential to be applied to other performers, and this would
be desirable in the search for an overarching conceptual model of shape or
shaping for musicians. When assessing the model’s generalizing potential, we
need to take several considerations into account. The two sample instruments
are technically very different in how they generate sounds, in the techniques
required of performers, and in the sounds themselves. There were, however, few
(if any) systematic differences between the responses of violinists and harpsichordists
in terms of the levels at which musical shaping could be applied or discussed,
musical triggers for shaping, or the heuristics for performance. Rather,
it seemed as though participants had individual preferences for these features of
the model. These similarities suggest that another sample of classical musicians
would discuss shaping in the ways suggested here. A future study might look at
wind or brass players, or singers. Another interesting group might be players
of untuned percussion instruments: we could hypothesize that they might be
focused on rhythm, but to what extent do they shape what they play according
to the melodic and harmonic features of other parts?
Further studies might also establish whether or not the model has the poten-
tial to be generalized to western performers who are less reliant on a score, such
as musicians within the broad popular genre or jazz musicians. Within Chapter
8 of this book, Greasley and Prior argue that the performers of popular music
share responsibility for the shaping of the final sounds of the songs with others,
such as sound engineers, and indeed, classical musicians in recording settings
and certain live performance situations may also recognize this idea. The model
might therefore need to be extended to encompass the performers’ awareness
of and interaction with these other contributors; this is something that neces-
sitates further empirical study.
In its current form, this model offers an understanding of musical shaping
from the perspective of classical performing musicians. While the terms ‘shape’
and ‘shaping’ are commonly used by performers, their meanings have not previ-
ously been defined in relation to music. This model confirms the flexibility of
the term, highlighting its ability to be used in relation to all levels of the musical
structure; the influence of an array of musical triggers on performers’ shap-
ing decisions; the use of shape as one of a number of heuristics for expressive
performance; the technical modifications required to shape a note, phrase, sec-
tion, etc.; and the change in sound that results. At the very least, the data and
the resulting model have allowed some understanding of the commonly used
phrase ‘That was a beautifully shaped performance’, and that understanding
may perhaps help others to achieve that elusive goal.
Acknowledgements
This work was supported by the AHRC Research Centre for Musical
Performance as Creative Practice (grant number RC/AH/D502527/1). The
author is most grateful to David Mackin of Greengate Publishing Services for
producing the figures for this chapter.
References
Bangert, D., D. Fabian, E. Schubert and D. Yeadon, 2014: ‘Performing solo Bach: a case
study of musical decision-making’, Musicae Scientiae 18/1: 35–52.
Bangert, D., E. Schubert and D. Fabian, 2014: ‘A spiral model of musical decision-mak-
ing’, Frontiers in Psychology 5/320, https://doi.org/10.3389/fpsyg.2014.00320 (accessed
9 April 2017).
Barten, S. S., 1998: ‘Speaking of music: the use of motor-affective metaphors in music
instruction’, Journal of Aesthetic Education 32/2: 89–97.
Brenner, B. and K. Strand, 2013: ‘A case study of teaching musical expression to young
performers’, Journal of Research in Music Education 61/1: 80–96.
Cannam, C., C. Landone and M. Sandler, 2010: ‘Sonic visualiser: an open source applica-
tion for viewing, analysing, and annotating music audio files’, paper presented at the
ACM Multimedia 2010 International Conference, Firenze, Italy, 25–29 October 2010.
Chaffin, R., G. Imreh and M. Crawford, 2002: Practicing Perfection (Mahwah, NJ: Erlbaum).
Chaffin, R., T. Lisboa, T. Logan and K. T. Begosh, 2010: ‘Preparing for memorized cello
performance: the role of performance cues’, Psychology of Music 38/1: 3–30.
Davis, S., 2009: ‘Bring out the counterpoint: exploring the relationship between implied
polyphony and rubato in Bach’s solo violin music’, Psychology of Music 37/3: 301–24.
Doğantan-Dack, M., 2013: ‘Familiarity and musical performance’, in E. King and
H. M. Prior, eds., Music and Familiarity: Listening, Musicology and Performance
(Aldershot: Ashgate), pp. 271–88.
Ericsson, K. A., 2006: ‘Protocol analysis and expert thought: concurrent verbalizations
of thinking during experts’ performance on representative tasks’, in K. A. Ericsson, N.
Charness, P. J. Feltovich and R. R. Hoffman, eds., The Cambridge Handbook of Expertise
and Expert Performance (Cambridge: Cambridge University Press), pp. 223–41.
Ericsson, K. A. and H. A. Simon, 1993: Protocol Analysis: Verbal Reports as Data
(Cambridge, MA: MIT Press).
Fabian, D., R. Timmers and E. Schubert, eds., 2014: Expressiveness in Music Performance:
Empirical Approaches across Styles and Cultures (Oxford: Oxford University Press).
Gabrielsson, A., 2003: ‘Music performance research at the millennium’, Psychology of
Music 31/3: 221–72.
Juslin, P. N., 2003: ‘Five facets of musical expression: a psychologist’s perspective on music
performance’, Psychology of Music 31/3: 273–302.
Juslin, P. N. and G. Madison, 1999: ‘The role of timing patterns in recognition of emotional
expression from musical performance’, Music Perception 17/2: 197–221.
Juslin, P. N., A. Friberg and R. Bresin, 2002: ‘Toward a computational model of expres-
sion in music performance: the GERM model’, Musicae Scientiae (Special Issue
2001–2): 63–122.
Juslin, P. N., A. Friberg and E. Schoonderwaldt, 2004: ‘Feedback learning of musical expres-
sivity’, in A. Williamon, ed., Musical Excellence: Strategies and Techniques to Enhance
Performance (Oxford: Oxford University Press), pp. 247–70.
Karlsson, J. and P. N. Juslin, 2008: ‘Musical expression: an observational study of instru-
mental teaching’, Psychology of Music 36/3: 309–34.
Leech-Wilkinson, D., 2009: The Changing Sound of Music: Approaches to the Study of
Recorded Musical Performances, http://www.charm.kcl.ac.uk/studies/chapters/intro.html
Leech-Wilkinson, D. and H. M. Prior, 2014: ‘Heuristics for expressive performance’,
in D. Fabian, R. Timmers and E. Schubert, eds., Expressiveness in Music
Performance: Empirical Approaches across Styles and Cultures (Oxford: Oxford University Press),
pp. 34–57.
McPherson, G. E., J. W. Davidson and R. Faulkner, 2012: Music in Our Lives: Rethinking
Musical Ability, Development and Identity (New York: Oxford University Press).
Prior, H. M., 2012a: ‘Methods for exploring interview data in a study of musical shap-
ing’, paper presented at the 12th International Conference on Music Perception and
Cognition (ICMPC) and 8th Triennial Conference of the European Society for the
Cognitive Sciences of Music (ESCOM), Thessaloniki, Greece, 23–28 July 2012.
Prior, H. M., 2012b: ‘Multi-modal understandings of musical shape: a comparison of
violinists and harpsichordists’, paper presented at the SEMPRE 40th Anniversary
Conference, Institute of Education, London, UK, 14–15 September 2012.
Prior, H. M., 2012c: ‘Shaping music in performance: report for questionnaire participants
Smith, J. A. and M. Osborn, 2003: ‘Interpretative phenomenological analysis’, in J. A. Smith,
ed., Qualitative Psychology: A Practical Guide to Research Methods (London: Sage),
pp. 51–80.
Smith, J. A., P. Flowers and M. Larkin, 2009: Interpretative Phenomenological Analysis
(London: Sage).
Willig, C., 2001: Introducing Qualitative Research in Psychology: Adventures in Theory and
Method (Maidenhead: Open University Press).
Woody, R. H., 2002: ‘Emotion, imagery and metaphor in the acquisition of musical perfor-
mance skill’, Music Education Research 4/2: 213–24.
Reflection
Simon Desbruslais, trumpeter
Expressive freedoms in trumpet performance
During a rehearsal of Johann Sebastian Bach’s Cantata BWV 51 in Oxford in
2008, a colleague, whom I had invited to listen and to observe, advised quite
simply: more shapes. The aim, I believe, was to lift the notes further from the
page to create a more nuanced and stylish performance. This suited both the
contrapuntal edifice of Bach’s music and the period instruments that we were
using. On this occasion I was leading the ensemble and therefore in posses-
sion of greater authority than usual. I have nonetheless had similar subsequent
experiences of this piece, and of similar repertoire, where I have possessed artis-
tic licence to create a microcosm of musical shapes not found in the notated
score. Indeed, I have found that such practice continues to be strongly encour-
aged within this genre.
The emancipation of expression in performance has been a focus of my
early career, significantly contrasting with my formative musical training in
symphony orchestras where such freedom was at a greater premium. Certain
trumpet repertoires, roles and trumpet types invite more creative expression
than others, often as a consequence of notational detail, style and function.
Drawing on my experience, in this Reflection I introduce a selection of reper-
toires from baroque to contemporary music to examine these notions.
Trumpet sound and character
John Wilbraham (1944–98), a prominent British trumpeter, once remarked that
‘the trumpet is an inanimate object. It will only make a sound if you drop
it’. These words ring true for every trumpet (and brass) player, who is typi-
cally required to spend many years developing and maintaining an efficient
embouchure to ‘manufacture’ sound. The physical instrument merely acts as
a device for resonance, and while trumpets do have individual sound quali-
ties (particularly dependent on their alloy) the final sound product is primarily
determined by the physical and technical attributes of the performer. There
are limitless possibilities: the shape of the mouth and tongue, embouchure
strength, type of instrument and size, and mouthpiece depth and density can
have a direct impact on the tone quality.
Many trumpeters aspire to play in an orchestra. Perhaps this is due to the
empowering position a trumpeter commands at the helm of a large group of
instrumentalists, or, more circumstantially, to a lack of solo repertoire (George
Enescu’s Légende of 1906 was the first solo composition by a prominent com-
poser since Johann Hummel’s Trumpet Concerto in 1803). For a trumpeter
wishing to perform professionally as a solo or chamber musician there are very
few available opportunities; these are generally self-made, resulting in orches-
tral performance being, for many, the only realistic option. In an orchestra the
ability to play in a ‘section’ is paramount, which requires an awareness of other
performers and, most importantly, an ability to mirror the style of the princi-
pal player. The sound of a trumpet section should be homogeneous, meaning
that many forms of artistic creativity are sidelined. Particularly when playing
second trumpet, the role is to blend in: one must not ‘stick out’, but follow the
intonation, style and sound of the rest of the section as closely as possible,
often to the point of mirroring instrument and mouthpiece types.
However, while homogeneity is expected within a section, it is not an inher-
ent characteristic of the instrument. Different trumpeters produce different
sounds.1
When the role of a section is not required—such as a prominent orchestral
solo, concerto, recital or chamber music performance—the expressive oppor-
tunity to shape and colour the sound is presented. Accuracy (while important)
is no longer the overarching concern; originality and distinctiveness are paramount,
tailored by vibrato, tone, phrase shape, dynamics, articulation and
rubato. The following examples represent creative independence in a variety
of repertoire to illustrate how certain types and roles engender greater freedom
than others.
Expressive freedoms
Bars 57–61 from Gloria II of Bach’s B minor Mass illustrate an approach
to shape that has been influenced by the physicality of the baroque trumpet
and the HIP (historically informed performance) movement. The scarcity of
expressive markings, save the slurs in bars 57–58, encourages expressive freedom.
I approach this passage in five ways that conflict with the approach of
many trumpeters using modern instruments: (1) ‘phrasing-off’, (2) diminuendo
towards high notes, (3) upper-note trill, (4) no vibrato and (5) slurred semiquaver
couplets. Figure R.19a recreates the original markings from the Bach
Gesellschaft edition, followed by a transcription of one possible interpretation
(Figure R.19b). Notably, both second and third trumpets also have creative
roles in this style of writing.
FIGURE R.19 Bach, B minor Mass, Gloria II, bars 57–61: a) Gesellschaft edition, followed by b) a
notated interpretation
This interpretation could be related to the physical baroque instrument.
‘Phrasing-off’ slurred couplets (emphasizing the first note) helps stamina,
accords with extant treatises (such as Quantz [1752] 2001) and creates a
nuanced and layered musical character. The upper (clarino) register is quiet; we
know this both from experimentation with surviving instruments and from the
orchestration of works such as Bach’s Second Brandenburg Concerto, where
the trumpet must balance with the concertino of violin, oboe and recorder. Upper-note
lip trills, it may be argued, are easier to play. Particularly when using a
large baroque mouthpiece, vibrato is very hard to create and unnecessarily
exhausting. Finally, semiquavers are easier to perform when slurred in couplets.
While HIP informs performances (the approach described here is represen-
tative of many period trumpeters) the performer is still freer to choose than in
a detailed contemporary score. And although I have reached something of a
norm in my performance of this extract, I remain free to change; the scarcity of
dynamic markings encourages a creative approach to phrase shape.
The next passage (Figure R.20) is something that I have heard hammered out
in modern groups, where the triads are interpreted as loud articulations against
Bach’s complex counterpoint. However, this approach, perhaps influenced by
the sound of the piccolo trumpet, misses the opportunity to shape this phrase.
A diminuendo towards the final top C (a sounding D) provides an elegant alternative,
emphasizing instead bar 113 as the dynamic climax.
FIGURE R.20 Bach, B minor Mass, Cum sancto spiritu, bars 111–17
Though it creates an attractive musical shape, this approach is harder to
perform. Some shapes, however, can make a trumpeter’s life easier. I have heard
many times (and I am sure this is true for other instrumentalists) that one
should make a long note ‘travel’ or ‘go somewhere’. This can be a psychological
tactic to encourage the performer to breathe or bow ‘through’ a note—to work
harder as the note progresses—in order to sustain a long note where the effect
would otherwise be static. However, this can also form dynamic and colour
shapes. The example in Figure R.21 is taken from the third volume of Güttler’s
collection (1970) of Bach’s trumpet music, the definitive text for professional
performers. It is my personal copy, which I have used for many live perfor-
mances on the natural trumpet. I find that the imaginary slurs help to remind
me of the overall shape and direction of the phrase (it is no coincidence that
I am supportive of Schenkerian analysis). I want to see how the pitches relate
to each other: rather than symphonic technique, where pitches are accurately
punctuated with a uniform character, I want to understand, and remind myself
in performance of, the larger musical line.
The extended melodies of the nineteenth century, enabled by the invention of
the chromatic, valved trumpet, encouraged and lent themselves to more extensive
colouration. This period introduced the modern orchestral solo, which became
a platform for performers to show both technical assurance and the individual-
ity of sound colour. However, it was originally the cornet that was assigned the
freedom to perform expressive solos, while the baroque trumpet was demoted
to a less creative role with a function mainly to articulate (Lawson and Stowell
1999: 130–2). This was due in part to the loss of high baroque trumpet technique.
Nineteenth-century orchestral solos often have a sense of expressive freedom, of
which the famous cornet solo in Pyotr Tchaikovsky’s Swan Lake Suite is exemplary
(Figure R.22). These moments highlight the individuality of the principal
trumpeter’s sound, line and musical shapes. Generally, tied notes encourage both
a small dynamic change and a change in vibrato and colour.
FIGURE R.21 J. S. Bach, Complete Trumpet Repertoire, Vol. III with my annotations (used by kind
permission of Breitkopf & Härtel, Wiesbaden).
FIGURE R.22 Tchaikovsky, Swan Lake Suite Op. 20a, ‘Intrada’, rehearsal mark 13
FIGURE R.23 Pritchard, Skyspace (2012), third movement, notated for piccolo trumpet in A, bars 1–8
(used with permission).
The often complex styles of contemporary music have encouraged a
‘straighter’ approach which mirrors that of much baroque performance.
Playing with vibrato is more exhausting, requiring greater stamina, and the
high demands of contemporary music are more easily met with a straighter
sound. This is not to say that vibrato is completely avoided, but it is not the
primary colour. Similar physical strength is required to play the clarino trum-
pet and contemporary repertoire.
Individual nuances in musical phrase, however, are as important in many
contemporary works as they are in the baroque. While Deborah Pritchard’s
piccolo trumpet concerto, Skyspace (Figure R.23), contains meticulous atten-
tion to notational detail, I add my own character (indeed, I do not think that it
is possible to notate every expressive detail in a score). In the third movement,
although this is not notated, I ‘lift’ the second of each quaver-pair in a manner
related to my experience of baroque music.
Concluding remarks
Owing to the nature of trumpet sound production, shape is of crucial importance
to the way in which its performers think; indeed, I believe that shape is one
of the most important concerns of trumpet performance, after basic technical
competency has been secured. Furthermore, the infinite array of characters
and styles of trumpet playing means that the freedom to shape an ‘individual’
sound is highly valued. Such individuality is encouraged by function, genre and
notational practice, and is most prominent in baroque trumpet repertoire (par-
ticularly Bach and Handel), orchestral trumpet solos and contemporary music.
When the role of the trumpet is to articulate orchestral textures, or to conform
to a homogeneous section, individual shapes are subordinate to accuracy.
While vibrato is found in the performance of all trumpet contexts, it is note-
worthy that straight players—those whose base sound does not use vibrato, yet
who add it on occasion as a colour—tend towards baroque and contemporary
repertoires. It is no coincidence that several trumpet soloists have specialized in
period performance and contemporary music, such as Reinhold Friedrich and
Gabriele Cassone. I too aspire towards this approach. This is due not only to
the lack of solo repertoire in the nineteenth century, but also to the compat-
ibility of baroque and contemporary styles. The physicality of the trumpet also
makes expressive shapes particularly important and can reduce performance
anxiety; a focus on character helps to direct the mind away from accuracy alone.
The various roles of the trumpet strongly influence freedoms to create and
to express shapes. The variety of distinctive characters and sounds of trumpet-
ers should be valued rather than homogenized. Though there is less freedom
for the second trumpet in a symphony orchestra than for a solo performer in a
trumpet and piano recital, conductors would do well to value this freedom to
shape. It would also encourage and nurture national styles of orchestral performance,
which are becoming increasingly standardized. The trumpet itself makes
no sound, but the physiques and personalities of trumpeters make for infinite
shaping possibilities.
References
Bach, J. S., 1970: Complete Trumpet Repertoire, ed. L. Güttler, 3 vols (Wiesbaden: Breitkopf
& Härtel).
Lawson, C. and R. Stowell, 1999: The Historical Performance of Music: An Introduction
(Cambridge: Cambridge University Press).
Quantz, J. J., [1752] 2001: On Playing the Flute [Versuch einer Anweisung, die Flöte traver-
siere zu spielen], trans. E. R. Reilly (London: Faber & Faber).
Reflection
Malcolm Bilson, fortepianist
Defining musical shape
In a casual conversation in 2001 a very famous pianist asked me, ‘Why is there
no dot on the upbeat to the first movement of the Beethoven Piano Sonata in
F minor, Opus 2 No. 1?’ (Figure R.24). I was taken aback that anyone could
ask such a question, as every eighteenth-century source clearly states that all
upbeats are short and light unless otherwise marked. One doesn’t put an expres-
sive marking on notes that are akin to articles in speech (the, an, of, by, etc.).
I then listened to some twelve recordings by world-famous artists of the last
decades and was astonished to find that almost none played the note short or
light. Most played a heavy, long, even slurred upbeat (clearly not indicated by
the composer). Astonishing though it may seem, very little instruction if any is
given in conservatories around the world concerning the most basic expressive
devices used by composers: How long is a crotchet to be held that has no mark
of any kind (a slur, a tenuto)? What is the meaning of a slur? (Mozart never
wrote a sketch without indicating the slurs—they are the real soul of the music,
realized through the notes.) I made a video on these subjects called Knowing the
Score, which was released in 2006.1
In addition to the more basic questions of notation, one of the topics touched
on in Knowing the Score was performance information that can be gleaned from
composers who have recorded their own works (Bartók, Prokofiev, Elgar and
others). But rather than listening to the recording to see how the composer inter-
prets the score, we can assume that the score represents what these composers
heard, hence: How did they write it down? One of the recorded examples featured
in Knowing the Score was Sergei Prokofiev playing his little Gavotte, Op. 32 No.
3, in what is generally considered a personal, highly idiosyncratic manner. I made
the claim that if we know what a Gavotte is, and follow Prokofiev’s markings
carefully, his rendition will be clearly revealed in his notation. (See Video 1 at .)
FIGURE R.24 Beethoven, Piano Sonata in F minor, Op. 2 No. 1, first movement, bars 1–9
Proper realization of rhythmic conventions is essential for revealing the character
of any genre of music. It is obvious that no notation will be completely
up to the job, and good rendition will be impossible without prior acquaintance
with the particular idiom. For this the advent of recording represented a major
revolution (for example, Bartók recording folk music in Romania, New Orleans
jazz reaching millions in the United States and Europe, and so on). François
Couperin’s notes inégales are meticulously described by the composer, but imag-
ine trying to describe the jazz rhythms of the early twentieth century rather
than hearing their transmission by recording. One cannot help but wonder if
Couperin would recognize many of today’s realizations of his descriptions.
Yet at the same time Prokofiev, as we have shown, gives a remarkable amount
of quite detailed performance information, and the sources in the late eigh-
teenth and early nineteenth centuries convey far more precise details on proper
execution of the expressive markings of the period than is normally realized.
Since rhythmic notation is limited (there is no notational possibility between a
crotchet and a quaver, for instance) it is generally assumed, erroneously in my
opinion, that flexible inflection can at best be only implied.
The first task of any serious musician is to determine the character and par-
ticular idiom of the work in question. Carl Philipp Emanuel Bach tells us, in
his chapter on Vortrag (Performance), ‘What comprises good performance? In
nothing other than the ability through playing or singing to make the ear sen-
sible of the true content and affect of musical thoughts. By altering these one
can change the ear to such an extent that one will hardly recognize it as the
same thought.’ Bach continues: ‘The subject matter of performance is the loud-
ness and softness of tones, touch, the snap, legato and staccato execution, the
vibrato, arpeggiation, the holding of tones, the retard and accelerando. Lack
of these elements or inept use of them makes a poor performance’ (Bach [1753,
1762] 1949: 148). And Daniel Gottlob Türk, in his Klavierschule, divides his
chapter on Vortrag into two sections: Ausdruck (Expression) and Ausführung
(Execution; Türk [1789] 1967: 347–65). The meaning of any passage of music,
therefore, is revealed through its execution, and the tutors of Bach, Türk, Leopold
Mozart and others of the time instruct us how to realize the performance indi-
cations in the scores of the time in order to play in the kind of inflected manner
we observed in the Prokofiev example. There is no music anywhere in the world,
from the simplest folk tunes to the most sophisticated art music, that is played
in an even, uninflected manner, yet such a manner of playing is often accepted
and even cultivated today, as evinced by many recordings of important artists.
I was on the jury of the Leeds International Pianoforte Competition in
2000. What I heard was a phenomenal level of piano playing, and I was often
moved by very beautiful and insightful playing in a variety of repertoires. But
there was no Mozart or Haydn that came even close to the beauties I associate
with that music; it was all smooth and even, virtually uninflected. One work we
heard five or six times was the Haydn Sonata in C major, Hob. 50. In Video 2
you can hear the first few bars performed by the fine Hungarian pianist Dezső
Ránki. I chose his performance as emblematic of what I heard several times in
Leeds. His performance is by no means unmusical, and represents beautifully
a typical rendition of this score. But in the video I look at the detailed perfor-
mance indications in the score to demonstrate that Haydn’s expressive mark-
ings are at least as important for revealing the musical thoughts as the notes,
yet in Ránki’s, as in most performances today, they are simply glided over in
a smooth, uninflected manner. Musical shape—inflected, passionate and flex-
ible—is inherent in this music, and it is my belief that it can be regained by a
clear understanding of the expressive marks ubiquitous in music of the late
eighteenth and early nineteenth centuries that are today misunderstood or sim-
ply neglected. No one applauds more than I the wonderful new scholarly edi-
tions appearing, giving us access to every little aspect of Mozart’s or Schubert’s
notation. There are now at least seven so-called Urtext editions of the Mozart
Piano Sonatas; but are any of his clear articulations, the essence of his musical
language, being taught in those music schools and conservatories that insist on
their use?
Musical shape is defined by properly inflected realizations of rhythmic
motives as we have shown in this short Haydn example. But two further aspects
of time in musical performance are equally important: tempo fluctuation and
tempo rubato. I am often told that in Beethoven’s music no tempo fluctuation
is allowed, that one must keep a strict beat. Not only is there no basis for this
widely held assumption, but we know that Beethoven changed tempo a great
deal, and indeed much in his music virtually demands it. And tempo rubato,
prized by Caccini in the seventeenth century, Mozart in the eighteenth and
Chopin in the nineteenth, involves independence between a steady accompani-
ment and a freely flowing upper voice, be it violin, the voice or the right hand
at a keyboard. This feature as well is generally discouraged in today’s conser-
vatories and music schools, yet is often the very soul of moving performances
heard on earlier recordings.2
Note
This Reflection derives from a lecture given at the Liszt Academy, Budapest, on
4 March 2014. A video of the lecture is available from the companion website
References
Bach, C. P. E., [1753, 1762] 1949: Essay on the True Art of Playing Keyboard Instruments, trans.
W. Mitchell (New York: Norton).
Türk, D. G., [1789] 1967: Klavierschule, facsimile reprint (Kassel: Bärenreiter), pp. 347–65.
8
Shaping popular music

Alinka E. Greasley and Helen M. Prior
Much of the research concerning musical shaping in performance has focused
on the traditions of western classical music. Although this has increased our
understanding of musical shaping, it is questionable whether all the findings
may be directly applicable to western popular music or whether this broad
genre may engender other conceptions of the notion of shape or musical shap-
ing in performance. There are fundamental similarities between the musical
practices of popular and classical western musicians, such as musical materials,
instruments, and processes of collaboration and collective creativity, but there
are also many differences, one of which is the greater prominence of electronic
technology in popular music (Théberge 1997). This chapter investigates notions
of musical shaping from the perspectives of popular musicians performing
with a variety of purposes in mind. First, we discuss performers’ perspectives
on musical shape in live performance, drawing on evidence from popular musi-
cians who responded to a questionnaire study on musical shaping (Prior 2010,
2012b) and on work in the popular music field. Second, we examine the roles of
performer, producer and technology in shaping music in the recording studio,
drawing on existing literature in the field, which includes accounts provided by
professional popular musicians and music producers (Bayley 2010; Blake 2009;
Frith and Zagorski-Thomas 2012; Negus 1992; Théberge 2001; Toynbee 2000).
This includes an investigation of how popular music recordings are shaped by
recording techniques and technological practices more broadly, drawing on
the work of authors such as Katz (2004), Théberge (1989, 2001) and Warner
(2003), among others. We then discuss the ways in which popular music record-
ings are used in performance, with a focus on the perspectives of DJs (disc-
jockeys) using the idea of musical shaping in their work (Greasley and Prior
2013). A final section summarizes the varied notions of musical shaping that
arise from these different perspectives and explores their implications, as well as
the limitations of examining within a single chapter a flexible and widely applicable
metaphor such as shape in a genre as diverse as popular music.
SETTING THE BOUNDARIES
The term ‘popular music’ has been so widely used and defined that it is essen-
tial to begin with a brief discussion of the scope of the term as it pertains to
our work. In this chapter we are referring to popular music in contemporary
Britain, Europe and North America, mainly because most of the research to
date has been carried out in these contexts. In distinguishing between folk, art
and popular music within western culture, Philip Tagg (1982) observes that
popular music tends to be produced and transmitted primarily by professional
musicians; is mass distributed mainly through recorded sound;1 is a commod-
ity in an industrialized society; and tends to name composers or authors. Tagg
also notes the general lack of written theory and aesthetics, though this has
since developed (Bennett, Shank and Toynbee 2006; Brabazon 2012; Frith and
Goodwin 1990; Moore 2001; Negus 1996; Scott 2009). A useful definition of
popular music, and one that we will be adopting in the current chapter, is pro-
vided by Shuker (2013: 6):
In sum, only the most general definition can be offered under the general
umbrella category of ‘popular music’. Essentially, it consists of a hybrid
of musical traditions, styles and influences, with the only common ele-
ment being that the music is characterised by a strong rhythmical compo-
nent and generally, but not exclusively, relies on electronic amplification.
Indeed, a purely musical definition is insufficient, since a central charac-
teristic of popular music is a socioeconomic one: its mass production for
a mass, still predominantly youth-oriented market. At the same time, of
course, it is an economic product that is invested with ideological signifi-
cance by many of its consumers.
There is a common historical tendency to snub popular music (Middleton
1995), which may explain why popular musicians have responded with writings
with titles beginning with ‘The Art of . . .’, encompassing topics such as record
or music production (Burgess 2001; Frith and Zagorski-Thomas 2012; Gibson
2005; Moylan 2007), sound engineering (Horning 2004; Zak 2009) and DJing
(Broughton and Brewster 2002; Katz 2012). Shuker (2013) argues that there is
an ideological tension between the essential creativity of the process of making
popular music and its commercial nature, but most commentators agree that
considerable skill is required by all contributing parties for commercial success.
What is highlighted by these titles is not only the array of technology used in
the production of popular music (Théberge 1997), but also the number of peo-
ple and variety of skills required for success in popular music (McIntyre 2012).
These two features are highlighted throughout in relation to their implications
for shaping popular music.
It may be pertinent at this point to describe the lines of enquiry that we do
not pursue in this chapter. First, there is the question of the extent to which
music videos shape the listener or viewer’s perception of the music. Music
videos may consist of a mixture of small-scale film-like narrative, videoed musi-
cal performance and dance, while functioning both as music advertising and as
a product (Brabazon 2012). Cohen (2009) notes the strong effects of music
on various aspects of the perception and cognition of film (see also Reuben’s
and Mitchell’s Reflections later in this volume); however, fewer studies explore
the effects of watching a film on perception, cognition and other responses to
music (Boltz 2013). Various studies in musical performance have highlighted
the importance of visual information for the judgement of performance qual-
ity (Davidson 2006; Griffiths 2010; Tsay 2013) and for emotional responses to
music (Krahé, Hahn and Whitney 2015). Research on dance has concentrated
mainly on perceived congruency between dance and music, focusing on per-
formance art for the stage, rather than on a social dance situation or dance
within music videos (Cohen 2009). This combination of a scarcity of directly
related research and more abundant results from tangential areas suggests that
the study of the perception of popular music videos is potentially fruitful
for future research, as is the study of reenactments of music video
imagery in live performance (with either a prerecorded soundtrack or a live
musical performance; Kooijman 2006, Burns 2006) and also the perception of
visual turntablism (Brabazon 2012). However, there is not sufficient scope to
explore music videos fully here. While we consider the role of body movement
in performance briefly below, we do not consider how visual aspects of popular
music (in live performance or music videos) may contribute to understandings
of the notion of musical shaping in performance; for this the reader is referred
to Tan et al. (2013), in particular the chapter by Boltz exploring music videos
and visual influences on music perception and appreciation.
Secondly, within this chapter, we make generalizations about the genre of
popular music, despite the fact that research has shown that people are able to
categorize music at a very fine-grained level. In Greasley, Lamont and Sloboda’s
(2013) study, as few as twenty-three participants discussed more than 220
genres when talking about their musical preferences (for example ‘Rock’, as a
subgenre of popular music, was described as ‘Rock’, ‘Rock ‘n’ Roll’, ‘American
Rock’, ‘Christian Rock’, ‘Classic 60s Rock’, ‘Classic Rock’, ‘Funky Rock’,
‘Heavy Rock’, ‘Punk Rock’ and so on). In contrast, Prior’s (2010, 2012b) survey
research (described below) was carried out with popular musicians who per-
formed in around twenty popular music genres (e.g. ‘Rock’, ‘Jazz/blues’, ‘Pop’;
see Table 8.1). The conclusions we draw from their accounts are not genre-
specific (i.e. they are grouped under the broader umbrella of popular music),
partly because most of the musicians reported that they performed within more
than one genre. Greasley et al. (2013) also highlighted difficulties in definition
because of crossover in musical styles (e.g. ‘Folk Rock’, ‘Country Rock’, ‘Jazz
Rock’), which is why Shuker’s (2013) broad definition of popular music as consisting
of a hybrid of musical traditions, styles and influences is useful here.
Readers are invited to draw conclusions from the arguments we present that are
appropriate for the genres and subgenres in which they specialize.
TABLE 8.1 Number of popular musicians in Prior (2012b) who played each genre of music
Genre                                                      Number of participants
Musical theatre                                            3
Jazz/blues                                                 13
Pop                                                        9
Rock/metal                                                 5
Country/folk/gospel                                        3
Urban (hip hop, soul, RnB, etc.), dance/electronic
(house, techno, electronica, etc.)                         3
Contemporary/experimental                                  4
World                                                      4
Crossover                                                  3
We also make generalizations about the roles played by a number of contrib-
utors (e.g. performer, producer) in shaping popular music. Frith and Zagorski-
Thomas (2012: 5–6) note that ‘deciding who is responsible for what in the studio
is still a matter of record-by-record investigation (much of which remains to be
done) rather than, for example, genre generalization’. This can also be applied
to live music-making in popular music, with its frequent use of technology and
concomitant expert personnel. We aim here to present the potential for each
contributor to shape the music, rather than analysing existing recordings or
presenting any kind of blueprint for musical success.
The performer’s role in shaping music in live performance
In this section, we explore the performer’s role in shaping music in live per-
formances, drawing on evidence from popular musicians who responded to a
questionnaire study on musical shaping (Prior 2010, 2012b) and on work in the
popular music field.
Prior’s (2010, 2012b) questionnaire study provides insights into the use of
musical shape or shaping by musicians from a relatively broad range of back-
grounds. More than two hundred participants completed a mixed-response
questionnaire, which sought to establish some of the meanings and contexts
in which the idea of musical shape is used by musical performers. Participants
were asked about their musical background (e.g. main instrument, music
categories in which they performed),2 about whether and how
they used shape in thinking or talking about how to perform music, and about
any links between music and shape that participants could describe. They were
also asked to rate their agreement with a series of fifty statements that had
been developed through reference to existing written quotations from musi-
cians using the idea of shape (see Prior 2010).
In order to extract popular musicians’ responses for the purposes of this
chapter, we collapsed the music categories to form two main (albeit simplis-
tic) categories of ‘classical’ (orchestral, choral and chamber music, as well as
opera) and ‘nonclassical’ (music theatre, jazz, popular, world, folk and cross-
over) music. On the basis of this divide, around 60 per cent of the sample
performed music within the classical genre exclusively, 10 per cent performed
within the popular music genre exclusively, and around 30 per cent performed
music from both broad genres. General trends in the data from the fifty statements
suggested that musicians who played nonclassical music (both exclusively
and alongside classical music) gave slightly less positive responses to the
idea of using the notion of shape in rehearsals with others or in informal dis-
cussions with other musicians. They also gave slightly less positive responses to
the idea of musical shape following the melodic line of the music, and to the
idea of musical shape moving from left to right, something that might reflect
the lower dependence on the musical score in nonclassical traditions (Prior
2012b). Nonetheless, some of the responses to the more open-ended questions
merit further attention and provide insight into the use of the notion of shape
by popular musicians.
Twenty-five respondents provided specific examples of using shape when
thinking or talking about popular music, and their responses are explored
below using a conventional qualitative content analysis (Hsieh and Shannon
2005). A brief examination of some of the musical and demographic data of
these twenty-five participants provides some context for their responses. The
most commonly named main instrument was the voice (N = 5), followed by
guitar (N = 4) and piano (N = 4), and double bass (N = 2), trombone (N = 2)
and violin (N = 2). Other participants played the clarinet, euphonium, per-
cussion, saxophone or turntables, or conducted. Twenty-one had more than
ten years’ experience of playing their instrument (though only four partici-
pants had more than forty years’ experience), and nineteen described them-
selves as performers of a professional standard. Table 8.1 shows the numbers
of participants who reported playing each genre of music. This is intended
to indicate the breadth of experience of the sample group rather than provid-
ing the means to identify trends within subgroups of participants, especially
as participants commonly selected more than one genre. Most commonly,
participants reported playing jazz or blues, pop, or rock or metal music.
Fourteen participants were from the UK and four from other English-speak-
ing countries. Six participants were from other European countries and one
was from Malaysia.
There seemed to be three main ways in which the idea of musical shape was
used by these musicians. First, a few performers discussed their use of shapes
and images to overcome technical difficulties. Several singers described how
shape was helpful for themselves and their students in achieving the correct
pitch, tone colour and expression:
Teaching someone with pitching issues to create a sense of how to find
pitch and sing through the notes. The student has used images of shapes
in order to overcome his problems pitching—it has proved very successful
for him. (Professional singer and teacher)
It was taught to me that when visualising your voice you should think of
it as a shepherd’s hook—that it runs from your diaphragm, up through
your body, into your mind, resonating behind the nose, and out through
your mouth. (Professional-standard singer)
I always explain the use of the voice to my students with . . . pictures of
shapes. For a beginner it is usually difficult to sing high notes. What often
helps them is imagining the tone as an arch that is streaming out of the top
of their heads, like a rainbow. I also explain the process of breathing and
singing as a circle that should not be disrupted. . . When I sing I always
produce pictures in my mind to achieve a certain sound, tone quality or
emotion. . . Low, warm tones, the ones that are used in Jazz Ballads I
always see as dark blue bubbles or circles . . . high very powerful tones that
are used often in Pop music, but also funk, often look like bright yellow or
red triangles or just lines. (Professional singer and teacher)
An amateur guitarist also discussed the ‘big, round shape’ of the ‘expansive’
timbre he was trying to achieve. A similar technical approach was described by
a guitarist and a pianist. The guitarist described how he would visualize chords
as ‘shapes on the fretboard’ (professional-standard guitarist). The pianist took
this idea further:
While improvising over the tune, I imagine the chords not as abstract
notions, but like architectures—and my movement from one to another
involves drawing different shapes which I select as I go along. (Professional-
standard pianist)
This idea of shapes being formed over time by changing musical features such
as chords or melodic patterns leads to the second way in which the idea of
musical shape was used, that is, in reference to a musical structure or trajec-
tory. This was a more common idea, cited by ten of the participants and often
discussed in relation to composition or improvisation:
Shape would have been used to talk about a large scale structural/expres-
sive trajectory which can help to guide an improvisation. (Professional-
standard pianist)
Concentrating on the placement of one or more musical ideas and using
space/duration to create contrast between them. A piece that has ‘shape’
could be said to arise through this process. (Professional pianist)
(1) thought about the shape of my solo, started with short phrases with
repetitive rhythm then extended the phrases, and (2) thought about [the]
form of the piece (AABC), where to place solos, how many solos, whether
to have a ‘rhythm only’ chorus, how to end the piece. (Amateur saxophonist)
The idea was also discussed on a larger scale, in relation to the choice of music
over a whole performance:
So each piece had a different shape, to give variety to the performance.
Not just tutti, then individual jazz breaks, then chorus. We tried to vary
the structure and shape of the programme. (Professional double bassist)
We were discussing which track to open our set with, given the style of
the DJ who would be playing before us. We wanted to find an opening
track that would work well after the previous DJ and be significantly
different but also energetic enough not to clear the dance floor. We dis-
cussed it in terms of energy level, often using contour metaphors, which
are absolutely central to DJing (peaks and troughs, building it up, taking
it down). (Amateur DJ)
The way in which a set is compiled can create a very powerful performance, as
exemplified by Fast’s (2006) description of Queen’s performance at Live Aid,
in which the group abbreviated many of their most popular hits to create a
fifteen-minute act with a powerful emotional trajectory designed to enthuse the
audience and thereby raise as much money as possible.
The latter comments made by the questionnaire respondents form a useful
introduction to the third main way in which musical shaping was discussed,
that is, in relation to musical expression. This was discussed on a variety of
scales, in relation to both the whole piece and the shaping of individual phrases.
Sometimes these ideas were discussed in specific technical terms, with refer-
ence to phrasing and breathing, dynamics and tempo fluctuations, all of which
might vary according to the acoustic of the performance space:
How to craft phrases, the beginnings and ends of phrases, the swell of
dynamics, minute tempo changes bar to bar. (Professional-standard
euphonium player)
Tried to shape the line as I heard the song, giving breaks for breath at
what felt like natural points in the line and continuing through places that
needed a sense of continuation and flow. (Professional trombone player)
Shaping the music, rather like a sentence in poetry. Use of dynam-
ics to highlight the phrase. Reacting to new and unfamiliar acoustics.
(Professional-standard percussionist, in this instance conducting a choir)
Other participants would discuss this expressive shaping in a more metaphor-
ical way, in terms of contrast and shade, energy or climaxes within the music:
My thinking about ‘shape’ was used more in an overall way, i.e. how do I
represent contrast and shade within a song that contains light and dark
images, as well as scenes (e.g. ‘roaring traffic boom’, ‘silence of a lonely
room’). (Amateur singer)
The idea was to [have] a high energy beginning, dropping down pretty
quickly, then slowly building to climax. Sort of a backwards N shape.
(Professional-standard pianist)
These three ideas—technical shapes, shape as a formal or structural trajectory,
and shaping as expression—formed the majority of participants’ responses,
though one other participant mentioned gesture as a component of musical
shape. These are all ideas that were also discussed by classical musicians in
relation to musical shaping, both in this questionnaire study (Prior 2012b) and
in later interview studies (Prior 2012a). Indeed, there appears to be little dif-
ference between the ideas of these popular musicians and those of classical
musicians completing the same questionnaire, as revealed by the qualitative
responses. Although some of the quantitative responses to the questionnaire
may have indicated a different emphasis in popular musicians’ understanding
of musical shape, their qualitative responses are not markedly different from those of
the classical musicians. The interview study of classical musicians (Prior 2012a)
allowed a more in-depth study of musical shaping in a practical context and,
as a result of this, revealed some more sophisticated ideas surrounding musical
shaping. Not only were technical and expressive ideas discussed, but it became
apparent from participants’ responses that they were using shape-related ideas
heuristically, that is, using nonspecific (often metaphorical) terms as short-cuts
for complex technical ideas (Leech-Wilkinson and Prior 2014). It also became
evident that participants had a multimodal understanding of musical shape
that could be expressed verbally, through musical sound or through gesture,
and that it was possible for participants to feel that their musical shaping was
closely intertwined with their identity. The short quotations above may not be
of sufficient length and depth to provide conclusive evidence of these ideas,
but the metaphorical, non-technically specific nature of some of the comments
hints at heuristic thinking; the mention of gesture by one participant may stem
from a multimodal understanding of musical shaping; and the mention of per-
sonality in relation to musical shaping might relate to links between musical
shaping and identity. At the very least, there is evidence that further study of
popular musicians may reveal understandings of musical shaping that are similar
to, and as interesting as, those of classical musicians.
The role of technology in shaping popular music practices has been well doc-
umented (Cook et al. 2009; Frith, Straw and Street 2001; Frith and Zagorski-
Thomas 2012; Gracyk 1996; Katz 2004; Théberge 1997, 2001; Toynbee 2000;
Warner 2003), yet only a small number of respondents referred to their use
of technology in relation to musical shaping. Participants mentioned simple
techniques such as ‘reverb’ and ‘fading out’, as well as the use of previously
recorded performances of improvisations to aid the creation of new improvisa-
tions. The latter is best understood through one participant’s own words. He
described the situation as ‘rehearsing in a duo with a saxophonist I regularly
work with’, and the use of shape as follows:
Listening back to previous recordings to give an idea of where the key
ideas lie, the piece being rehearsed will begin from the key idea and prog-
ress to an end-point. The duration of these pieces are usually short and the
‘shape’ of such pieces that have been devised from this method are usually
more focused than ones that last longer in duration. (Professional pianist)
Several participants with interests in record production discussed technolog-
ically based shape-related ideas when asked about ‘other links between music
and shape’, as evidenced by these comments from a singer and a guitarist:
Sound has a shape (electronic and digital, i.e. sound-waves) and music is
the combination of sounds and silences (among other things), this can
extend to timbre (e.g. shrill, screeching), dynamic variation, phrasing
(legato, staccato, etc. /technique) and even possible mental images cre-
ated from a sound /song /words. (Amateur singer)
Shape for me is simply a handy way to visual[ize] what I hear. Use of
‘shape’ now extends more broadly to the use of software programs such
as Protools where visualizing a recorded performance will not only allow
rapid editing, amongst many other things, but also gives a differing
insight into things like song structure and arrangements—it also gives a
deep insight into feel or groove. (Professional-standard guitarist)
The above comments suggest that the availability of computer programs that
display music and sound as waveforms has added a visual element to these
participants’ understandings of musical shaping (see the Reflections by Savage
and by Reuben later in this volume), rather as notation seems to have done for
classical musicians’ conceptualizations of shaping (see Küssner, Chapter 2 of
this volume). Other technology has also had an influence. Two participants spe-
cifically mentioned mixing and equalization:
From a mixing point of view—EQ-ing tracks to blend together better
is very often a visual thing, i.e. different instruments’ contour shaped so
they don’t all compete for the same frequency ranges. (Amateur electric
guitarist)
When DJing you manipulate the equalization (EQ) of tracks in order to
make them blend as well as possible, which means thinking in terms of
frequency space, often on an up/down or left/right scale. You also think
in terms of acoustic space—tracks are produced for different spaces (‘big
room’ or ‘small room’ tracks) and also create different impressions of
acoustic space, or spaces, within themselves. (Amateur DJ)
The limited number of participants mentioning technology in their descrip-
tions of shaping may have been due to the nature of the methodology: ques-
tionnaires generally elicit shorter and less detailed answers than interviews.
Equally, the wording of the questionnaire, with its (albeit deliberate) focus on
performers and performing, may have influenced the types of responses given
by musicians. Had the questionnaire been directed explicitly towards producers
and recording engineers, it is likely that more technologically based conceptions
of musical shaping would have been found. There may also have been other fac-
tors influencing these participants’ responses. Although some described their
use of shape within recording sessions, they perhaps did not mention technol-
ogy because they were focused mostly on their own performance while some-
one else (i.e. the sound engineer) was (usually) dealing with that side of things.
It was surprising, for example, that the singers in Prior’s (2012b) study did
not describe the ways in which they use microphones to achieve certain vocal
effects. Several texts have explored the role of the microphone in popular musi-
cians’ practices, highlighting the extent to which it has influenced vocal style
(Campbell, Greated and Myers 2004; Frith 2001; Greig 2009; Horning 2004;
Théberge 2001). Musicians have built up knowledge of the types of micro-
phones available and how to employ these to help them to achieve particular
expressive goals. For example, pianissimo can be produced not only by singing
more quietly, adding more breath than tone and making less use of the
vocal tract, but also by moving the microphone away from the mouth (Greig
2009). Regulation of this distance (holding the microphone away for high, loud
notes; holding closer for quieter, low-register notes) can lend warmth and grain
to a vocal performance (Barthes 1990; Frith 1981; Théberge 2001), but it also
reveals the intricacies of a vocal performance and thus can highlight flaws as
much as it can highlight richness of tone (Lees 1987; Théberge 2001).
Microphones offer just one example of how popular musicians use tech-
nology in their live performances. Amplification has also changed popular
musicians’ practices (Frith 2001; Théberge 2001). Popular styles such as rock
and heavy metal have adopted extended amplification techniques (e.g. distor-
tion, feedback) which provide musical outputs distinctive to those styles (Poss
1998; Théberge 2001; Walser 1993). An illustration of the use of technologies
for expressive effect in live performance—from instrumental and vocal tech-
niques to extended amplification—can be found in Hughes’ (2006) analysis of
Nirvana’s live performance at the University of Washington in 1990. None of
the participants in Prior’s (2010, 2012b) questionnaire study mentioned their
use of amplification techniques.
Other aspects of live performance not mentioned by the popular musicians
were body movement and the audience. Research in the field of music psy-
chology shows that body movement plays a crucial role in the production and
perception of music (Davidson and Malloch 2009), and that performers move
their bodies in identifiably different ways according to expressive intentions
(Davidson 1993). The more highly expressive the piece, the larger and more pro-
nounced the movements (Davidson 1994). Descriptions of live performances
of popular music frequently highlight audience participation such as moving or
singing along to the music, often as an indicator of an audience’s engagement
with and enjoyment of the listening experience (e.g. see Inglis 2006). Some
performers actively encourage this participation through their gestures to the
audience. Fast (2006) suggests that an audience’s physical engagement with a
performance enables the audience to participate in, and thereby feel invested in,
the creation of the performance.
Above, we have discussed popular musicians’ perspectives on musical shap-
ing, and recent literature in the field of popular music, highlighting how per-
formers may use the notion of shape in relation to instrumental techniques and
the technology available to them. There were a number of similarities in the
classical (see Prior, Chapter 7 of this volume) and popular musicians’ ideas of
musical shaping, including the conceptualization of shape as relating to struc-
ture and musical expression, and as a means of working through technical dif-
ficulties. The next section explores musical practices in the recording studio.
Arguably, the powerful and intractable element of ‘liveness’ that is present in
live performance (see Fast 2006) is lost there; however, in popular music, in
particular, recording processes have the potential to generate a finished prod-
uct that surpasses the possibilities of live performance through the combined
creative input of the performer, sound engineer and producer and their use of
technology in the studio.
The roles of performer, producer and technology in shaping music in the recording studio
The demands of the recording studio with its concomitant customized environ-
ment and lack of audience require of performers a different understanding of
musical performance compared to their usual live performance situation (Blake
2009; Gander 2011; Horning 2012; Pras and Guastavino 2011; Williams 2012;
Zak 2009). While studies in the popular field typically present producers as hav-
ing most (or in some cases all) of the control over the finished musical product,
performers have responded creatively to both the technical restrictions (most of
which are now historical) and the opportunities afforded by the studio environ-
ment, and have generated new performance techniques as a result (Doğantan-
Dack 2008; Cook et al. 2009; Frith and Zagorski-Thomas 2012). Many of the
means of musical shaping at the disposal of musicians in live performance (if
we assume similar perspectives on ‘shaping’ as those in Prior’s Chapter 7, such
as the use of instrumental technique to bring out a particular emotional tone)
are of course available to them in a studio setting. There is, however, further
potential for performers to modify the sounds they are producing (in conjunc-
tion with engineers and producers) to create an experience for the listener that
goes beyond the possibilities afforded by live performance (Kania 2008).
The studio environment, with its more or less controlled acoustics, has the
potential to influence decisions made by performing musicians. For example, in
the same way that (classical) music performers in a live setting will adjust their
instrumental techniques to reflect the acoustics of the room in which they are
playing (see H9.13 in Table 7.H9, available online), for popular musicians
too, sound quality and corresponding perceptions that influence their musical
decisions will vary with the recording space (Gander 2011). Williams’ (2012)
ethnographic research into recording studio practices (carried out from the per-
spective of a musician, engineer and producer) demonstrates that even the use
of technology as seemingly straightforward as headphones can have far-reach-
ing consequences for the social and musical interactions taking place between
musicians performing together and between those musicians and engineers;
this in turn influences the creative process and aspects of the final recording.
In the studio environment, performers relinquish some of their control to
other personnel in the studio (e.g. sound engineer, producer). Music producers
aim to work with performers to overcome possible reluctance or insecurity,
challenge them artistically and steer them to reach for an imaginary world of
sound where technology emotionally enhances the original artist’s vision and
performance, instead of compromising it (Frank Duchene, personal commu-
nication). While this is a somewhat idealized view of the producer’s role, it
acknowledges the considerable interpersonal skills the producer needs in order
to work effectively with artists, as recent studies have shown (Bielmeier 2013;
Davis and Parker 2013). In many cases, performers and producers work col-
laboratively, with discussions leading to the modification of sounds and the
selection of performances for the final recording. However, sometimes produc-
ers conceal their decisions—such as modifying the mix sent to the musicians’
headphones without their knowledge (see Gander 2011: 149–53)—and there
are often tensions about relative contribution (Blake 2009). The work of musi-
cian Miles Davis and producer Teo Macero is a good example of the latter.
Davis described some of the thinking behind his studio recordings in his auto-
biography (Davis with Troupe 1989, in Brackett 2009). According to Davis, the
complexities of the arrangements of the improvisations on the album Bitches
Brew were determined by him with the musicians, without the influence of the
producer. This suggests a dominant role of the performers in the musical out-
come. However, the producer’s contribution is undocumented and simplified
by Davis. Other accounts (e.g. Blake 2009; Szwed 2002) suggest the process
between Macero and Davis had been a great deal more collaborative—that the
pair had listened to the many hours of takes together in order to make deci-
sions about which sections to edit, splice and cut in the production of the final
album.
Technological advances drive creative practices in the studio (Théberge 1989,
2001; Warner 2003): the microphone offers a crucial example (Horning 2002,
2004). Horning (2004) maintains that ‘the art of microphoning’ (see Canby 1956)
is a skill which evolved as a natural progression from the recording engineer’s
placing of performers before the acoustical recording horn, and one which is
acquired tacitly—by recording engineers and performers alike—through experi-
ence. It is a skill that can be used to achieve unique musical outcomes (Moorefield
2005), and that in some hands has been likened to ‘a painter mixing colours from
a palette’ (Horning 2002: 710). The increased role and responsibility of the engi-
neer for achieving musical balance through careful placement of microphones
led to the development of the multitrack studio, which was instrumental in the
development of popular and rock styles through the potential it offers for the
control and layering of sounds (Théberge 1989, 2001). Multitrack recording
was first used in popular music in the 1950s and is characterized by the separate
recording of multiple sound sources to a number of audio channels to create a
recording. This allows engineers to examine intricate details of timing and tuning
(Blake 2009; Frith and Zagorski-Thomas 2012) as well as the broader perspective
of the musical sound (Zak 2009), an approach which has again led to produc-
ers likening their role to that of an artist painting (Phil Harding, in Frith and
Zagorski-Thomas 2012). Such technology has increased the control of producers
over the recorded sounds, not only because of the detailed level at which they are
able to work, but also because of the necessary separation ‘of the artists from
each other, separation of their performances, and further a separation of the
artists from their song and even their performance’ (Gander 2011: 132). Gander
argues that this empowers the producer to make musical decisions.
Other technological advances have been seen in sampling and computer-
based sequencing, including signal processing, Musical Instrument Digital
Interface (MIDI) sequencing, and sound synthesis (Blake 2009; Katz 2004;
Théberge 2001; Warner 2003). Signal processing enables producers to add spe-
cial effects (e.g. reverb, delay, chorus, flange, compression) to tracks; MIDI
sequencers facilitate enhanced control over layering of sounds; and digital sam-
pling enables the manipulation of sound in a variety of ways down to the finest
detail without any discernible loss of sound quality (Goodwin 1990). The level
of control of the sound that these technologies afford provides many creative
opportunities: one only needs to think of the ‘Amen’ break—a four-bar drum
solo performed by Gregory Coleman in the 1960s song Amen, Brother which
has been used extensively in a range of electronic music styles such as break-
beat, hip-hop, hardcore, jungle, and drum and bass (Butler 2006)—to realize
the potential for the use of samples in creating new records. Some authors,
however, have noted the lack of expressive shaping in MIDI-sequenced music
and the effect this can have on co-performers, who try to imitate the precision
of the sequenced sound at the expense of expressive gesture (Warner 2003).
Nonetheless, technological advances have afforded performers, producers and
engineers greater control over the sounds they are producing.
A common theme throughout the literature is the conceptualization of the
‘studio as instrument’ or ‘studio as creative tool’ (Blake 2009; Hennion 1989;
Horning 2012; Zak 2009), with accounts of the day-to-day activities of produc-
ers, recording artists and musical directors generating a strong sense of
exploration and experimentation (Blake 2009; Hennion 1989; Thompson 2010; Zak
2009). In some cases this seems to lead to new performance practices in live per-
formance (Blake 2009). What constitutes a ‘recording studio’ has changed over
time (in response to technological advances) such that there is now a blurred
distinction between specific professional or commercial studio locations and
the home studio environment (Théberge 2012). Advances in digital technology
mean that records can now be compiled using different studios in remote loca-
tions (sometimes across the other side of the world), reducing the social nature of
musical practices (e.g. social interaction, personal exchanges, communication)
between musicians and producers (see Negus 1992). The increasing prevalence
of long-distance online collaboration and the growth of PC-based music produc-
tion have encouraged the development of remix sites through which musicians
share multitrack files and edit and discuss one another’s mixes (Théberge 2012).
This, Théberge notes, means that the artist’s mix becomes just one version of
the music, and that ‘fragments of music flow through a series of multivalent
exchanges only coming to completion when, and if, the participants decide to
bring the process to an end’ (ibid.: 87). Drawing on examples from a number
of popular genres, Moorefield (2010) outlines how multitrack recording tech-
niques facilitate remixing and ‘mash-up’ practices wherein elements of tracks
are reordered, rebalanced and recontextualized. Goodwin (1990: 271) argues
that digital samplers have played a key role in remixes because the differing
length of sounds that can be stored can ‘be used to manipulate, extend, and/or
condense the structure of a song, as well as its texture, arrangement and tim-
bre’. MARRS’s ‘Pump up the Volume’ (constructed using thirty samples from
other records) and The Avalanches’ ‘Frontier Psychiatrist’ (almost entirely con-
structed of samples from other records) are good examples of this.
The roles of the performer, producer and technology in the recording studio
are not easily separable:3 the construction of a popular music recording is a
collaborative process between artists, engineers and producers, and the tech-
nological equipment they use. As Phil Ramone (quoted in Massey 2000: 50)
asserts, ‘you, as the engineer, have to share in the painting with the artist’. There
is agreement that producers need significant interpersonal and leadership skills
to manage the production process (Jarrett 2012; Mike Howlett, in Frith and
Zagorski-Thomas 2012), and tensions that may arise can be found in accounts
given by musicians and producers throughout the literature. Moreover, the bal-
ance of performer, producer and technology will change with every record-
ing, not just as a result of a particular set of personalities involved. Björk, for
example, is said to have described her albums Post and Debut as ‘collections
of duets with the producers who had inspired her: Nellee Hooper, 808 State’s
Graham Massey, Tricky, Howie B.’ (Jonathan Van Meter, in Brackett 2009:
522). In contrast, she states that her later album Homogenic ‘is more like one
flavour. Me in one state of mind. One period of obsessions. That’s why I called
it Homogenic’ (ibid.).
The discussion concerning record producers’ shaping of popular music above
clearly simplifies the myriad influences on the producers themselves. Zagorski-
Thomas (2012) highlights the complex social, commercial and economic fac-
tors contributing to both the availability and the use of recording spaces and
technologies by record producers that can be perceived in the sounds of the
records they produced. The separation of shaping by performer(s) and producer
is somewhat artificial: often, the collaboration between the two is sufficiently
close that separation of the decision-making process is impossible. This situa-
tion is compounded when the performer becomes a sound engineer or producer:
Brabazon (2012) describes a situation where the performer undertakes some
of the production work herself, but later hands her materials over to another
producer to ‘clean up’ and combine with new recorded materials. Some artists
have recorded entire albums themselves: Moby recorded his album Play at home
using Cubase software (ibid.: 62). What is clear is that technological develop-
ments in the studio (whether professional studio or home studio) have been—
and still are—at the heart of the creative process in popular music.
DJs’ perspectives on musical shaping
While popular music recordings (e.g. original recording, remix) may be viewed
as a fixed ‘final product’ once released, they are also used by others to create
new musical experiences. One may be the creation of an accompanying video,
mentioned briefly at the outset. Another, which we explore here in more depth,
is the creation of a new live, ephemeral performance through the use of record-
ings by DJs. While some DJs perform solely with their own music productions,
most typically use combinations of others’ dance music recordings. This
involves a process of selecting records to be performed back-to-back as well as
consideration of the order in which the records will be performed overall. Some
DJs perform ‘live’ sets, creating dance music over the course of the set (which
may be one, two or three hours in duration) by improvising with beat loops
and samples (Collins 2007). In this instance, the music is still pre-recorded,
but the raw material is manipulated, transformed and recomposed in real-time
using programs such as Ableton Live, a music sequencer and digital audio
workstation (DAW). There is also live coding wherein programmers write code
in real time which translates into musical output (Collins et al. 2003; Eldridge’s
Reflection earlier in this volume), though a consideration of this is beyond the
scope of this chapter. Here we focus on the use of recordings by DJs, drawing
on our own recent research.
We conducted an interview study exploring DJs’ perspectives on musical
shaping (Greasley and Prior 2013). Using a similar methodology to the inter-
view study with classical musicians mentioned above (Prior 2012a; also her
Chapter 7), we asked DJs to perform a short mix in three conditions: first with
unfamiliar records, second (with the same records) while thinking about musi-
cal shaping or the shape of music, and third (again with the same records) with-
out musical shaping. The first author visited the DJs in their homes, and they
performed on their usual equipment (i.e. decks and mixer set-up). Responses
given by the three male DJs were then compared to the responses of a single
female DJ to the questionnaire study (Prior 2012b) to explore similarities and
differences in the DJs’ perspectives on musical shaping. Results showed a num-
ber of similarities. First, all of the DJs emphasized the importance of record
selection in shaping the overall contour of a performance, confirming previous
literature which has highlighted the centrality of choosing the ‘right’ tracks
in the ‘right’ order (Broughton and Brewster 2002; Straw 1993). As well as
reflecting their own musical preferences, the DJs reported that their choices are
shaped by other factors such as the type of club (e.g. capacity, sound system,
dance floor/seating), the specific night (e.g. single musical style or combination,
night’s reputation), the absence/presence of co-performers, and the audience.
According to the DJs in our sample, the expected audience exerts a power-
ful influence on the records chosen both before and during the performance.
Referring to popular music producers in the recording studio, Hennion (1989)
argued that the audience is never ‘left outside’, that the audience is always in
consideration. Similarly for these DJs, it was apparent that musical choices for
the performance are made with the audience firmly in mind.
A second similarity in perspectives on musical shaping was that all of the
DJs used the functionality on the turntables (e.g. pitch faders) and mixer (e.g.
up-fader, cross-fader, equalization) to manipulate the tempo, volume and fre-
quencies (e.g. bass, mid, treble) to modify the overall sound. The DJs reported
reducing elements of the outgoing tune and increasing elements of the incom-
ing tune, with a particular focus on bassline entries (a combination of which can
be ‘too much’ if equalization is not sufficiently balanced). They also discussed
their use of effects such as echo, flange and reverb to emphasize key structural
points (particularly the ‘drop’, where the bassline reenters after a breakdown).
These findings confirm previous research (Brewster and Broughton 1999, 2012;
Moorefield 2010) in that DJs go beyond the pre-recorded musical materials
they are working with, creating unique compositions, in real time, in the con-
text of the performance (see also Smith 2013).
All of the DJs emphasized that records had inherent shape (one DJ con-
trasted The 45 King’s The 900 Number with DJ Shadow’s Stem/Long Stem to
illustrate this; see Greasley and Prior 2013), and much like the classical musi-
cians in Prior’s (2012a) study, they argued that it was not possible to elimi-
nate shape entirely from a performance because of this. The DJs in our sample
discussed the ways in which they used the existing shape of records, such as
placement of ‘drops’, and equalization to balance overall sound intensity and
musical texture. In trying to perform without shape, the DJs were less likely to
use the existing shape of the record or key structural points, or to employ func-
tions on the turntables and mixer other than for beat synchrony.
Importantly, performing without shaping the music felt ‘unnatural’ to
these DJs, highlighting the seemingly implicit nature of musical shaping (see
Greasley and Prior 2013). This mirrors a key finding in Smith’s recent work
(2013) exploring the compositional processes of hip-hop turntable teams:
she found that decisions about which samples to use and which techniques to
employ were applied unconsciously by the teams in order to achieve the desired
musical outcome. There seems to be a tacit understanding that a ‘good’ per-
formance requires shaping. A further similarity in Greasley and Prior’s (2013)
study was the multimodal understanding of shape that the DJs expressed; they
used gesture and visual diagrams to explain their notions of musical shap-
ing. In particular, DJs reported using the visual diagram of the waveform (i.e.
through Serato, Traktor or the sound recording software) to help them shape
their performances.
The DJs worked with a variety of technology—ranging from Technics
1210 turntables and a Pioneer DJM600 mixer to the latest Denon SC3900
digital media players and a Rane 16 digital mixer—and this highlighted key
differences in their perspectives on musical shaping. The two DJs working
with digital systems discussed their ability to assign cue points on the records
and jump straight to those at any point during a performance. They discussed
how they are able to programme sections of the record or samples, use sam-
ples from the same record simultaneously (as if playing with two identical
vinyl copies at the same time) and loop segments of the musical material.
They reported that during a performance, some of this may have been pre-
pared in advance, while other choices will be made in response to audience
behaviour. The combination of software and hardware allows a greater range
of creative acts; as Poschardt (1998: 365) has argued, ‘technology for them
[DJs] is an integral part of life that offers virtually unlimited creative oppor-
tunities’. In addition, programs such as Serato and Traktor have a synchro-
nization button, which automatically synchronizes the beat on the records
playing. A DJ using this feature is spared the time required to beat match, and
can start applying various effects to modify the sound almost immediately.
The role of technology in DJ practices has been explored by Montano (2010),
whose research with DJs in the Sydney dance music scene provides evidence
for the practices employed by our participants. He notes that:
The development of technology has enhanced the work of the DJ, so that
tracks can be altered and reshaped in order to fit the specific requirements
of the DJ. Vocals can be added and tracks can be extended or shortened,
allowing the DJ to have more control over the actual ‘sound’ of their set,
which increases the extent to which they can impose their own personal,
unique ‘musical’ identity upon it. (ibid.: 404)
Both of the DJs working with digital set-ups in our study were also scratch
DJs, or as Katz (2012) would call them, ‘performative DJs’, who not only
select recordings but manipulate them in real time for audiences. Scratching is
a specific performance style which involves the use of turntables as a musical
instrument to create and manipulate beats, sounds and samples (typically from
a wide selection of popular music styles) for expressive performance (ibid.;
Hansen 2010; Poschardt 1998; Smith 2013). For these DJs, scratching tech-
niques were fundamental to their conceptualization of a shaped performance.
They emphasized the importance of identifying samples and employing vari-
ous scratch styles (e.g. crab, scribble, hydroplane) and turntablism techniques
(e.g. beat juggling).
In summary, for the DJs we interviewed, shape was related to musical struc-
ture, dynamics, using samples, selecting records and mixing styles. They also
expressed a multimodal understanding of shape including the visual represen-
tation of tracks and mixes, and indicating shape-related ideas using gesture
(see Greasley and Prior 2013 for a full write-up of the study). There were
therefore a number of overlaps between these DJs and the popular musicians
in Prior’s (2012b) questionnaire study, most notably relating shape to struc-
ture, to dynamics and to the multimodal understanding through visualizations
and gesture.
Summary: notions of musical shaping in popular music
This chapter has explored notions of musical shaping from the perspectives
of performer, producer and engineer, and through the contexts of live perfor-
mance and studio recordings. The practices of popular music are viewed in a
somewhat simplified, layered approach. We began with the performers’ per-
spectives in live performance, while noting the contribution of sound engineers
in this context. Recent questionnaire research has shown that popular musi-
cians use the notion of shape (and images) to overcome difficulties, in reference
to musical structure (or trajectory) and to achieve particular expressive goals
(Prior 2010, 2012b). Only a few references were made by the popular musicians
to their use of technology when thinking about or using the notion of shape.
This was surprising given that technology is an essential element in the
definition of musical sound and style, and that musicians’ varying use of
technology reflects diverse aesthetic and cultural priorities (Théberge 1997). Other
research highlights how performers use technology (e.g. microphone, distor-
tion) to achieve particular musical outcomes (Greig 2009; Théberge 2001) and
also how the audience may influence performers’ decisions (Fast 2006; Inglis
2006). Performers and sound engineers in live contexts work towards the pro-
duction of a transient, ephemeral listening experience, and audience members
may participate in this to a greater or lesser extent through singing along and
moving to the music.
We then discussed the potential contributions of performer, sound engineer
and producer in a studio context to create a recording, which is usually viewed
as a fixed, repeatable listening object. In the absence of research specifically
asking these individuals about their understanding of ‘musical shaping’ in the
studio, we made inferences from literature in the popular field. The role of
technology, such as placement of microphones, multitrack recording, MIDI
sequencing and digital sampling, is crucial to creative practice in the studio,
but other influences on the finished product, such as other personnel involved
in the recording industry and economic factors, are also important. It is also
worth noting the potential influence of the recording process on a subsequent
live performance, as highlighted by Blake (2009). It is not always clear whether
the performer(s) or the producer has the greatest influence over the creative
decisions in the studio (particularly given the increasingly blurred boundar-
ies between artist and producer), and we agree with Frith and Zagorski-
Thomas (2012) that assessing relative contribution requires record-by-record
investigation.
Finally, we discussed the use of popular music recordings in new contexts
focusing on the live performances of DJs. DJ performance requires the creative
use of records that have been previously shaped by others; it is not unusual for
DJs to play with more than thirty records in an hour-long DJ performance. In
this way, just as recordings are seen to enable musicians to go beyond the poten-
tial afforded by live performances, DJs go beyond the possibilities afforded by
the simple playback of existing recordings by reshaping them into a new per-
formance, or even a new piece of music, usually creating a transient listening
experience for an audience. These ideas about musical shaping from the differ-
ent perspectives are summarized in Table 8.2.
TABLE 8.2 Layers of shaping in popular music performances

Live performance
Final result: transient, ephemeral listening experience
Who may shape the music: performer(s)
Means by which the music may be shaped:
- Shape of set (choice of repertoire)
- Musical structure (through composition or improvisation)
- Instrumentation
- Instrumental/vocal technique to shape phrases
- Microphone technique
- Amplification techniques (e.g. distortion, feedback)
Who may shape the music: sound engineer (in dialogue with performer)
Means by which the music may be shaped:
- Positioning of microphones
- Mixing of sounds (e.g. EQ, balance)
- Use of reverb, fading in and out

Studio recording
Final result: fixed, repeatable listening object
Who may shape the music: performer(s)
Means by which the music may be shaped: as above
Who may shape the music: producer/sound engineer (in dialogue with performer)
Means by which the music may be shaped: as above, plus:
- Multitrack recording
- Selecting musical materials from multiple takes (editing and splicing
  musical materials)
- Special effects (e.g. reverb, delay, chorus, flange)
- MIDI sequencing
- Digital sampling
- Shaping the audio space using microphone positioning
- Overall sound (acknowledging technological possibilities, e.g. stereo vs.
  surround sound)
- Covers
- Remixing
- Mash-ups

Using studio recordings
Final result: transient, ephemeral listening experience
Who may shape the music: all of the above, plus performer (DJ)
Means by which the music may be shaped: all of the above, plus:
- Overall contour of the performance: choice of records (and specific order)
- Use of turntables and mixer to manipulate tempo, volume and frequencies to
  shape the overall sound
- Use of special effects to emphasize musical structure on record
- Scratching techniques
- Highlighting or manipulating existing shape of records
- Looping
Who may shape the music: video engineer
Means by which the music may be shaped:
- Shaping the visual accompaniments to the sound

To illustrate our conceptualization of layers in shaping popular music, it is
helpful to draw on an example in which a track has been performed live, recorded
in a studio, remixed, and then performed with by DJs. Sarah McLachlan’s track
‘Fallen’ from her album Afterglow (which sold more than five million copies
worldwide) is one such. She has performed the track in live contexts (notably
her 2004 summer tour ending in Vancouver) and has also recorded it in the
studio with producer Pierre Marchand. After the artist released the multitrack
stems of the studio recording, it was remixed by a number of artists (for
example, Josh Gabriel and Dave Dresden produced the ‘Anti-Gravity Mix’), and
these remixes were then used in live performances by DJs around the world, and
also in mixtapes (for instance, DJ ATB used the Anti-Gravity remix in his
continuous mix for the DJ2 series). There are many thousands of similar
examples wherein a piece originally written for acoustic live performance has
been shaped and reshaped over time by a range of musicians, producers and DJs.
‘Shape’ has been shown elsewhere to be a versatile term that is used
metaphorically and heuristically by classical performing musicians (Prior,
Chapter 7; Leech-Wilkinson and Prior 2014). Its versatility is also appar-
ent from the data and literature relating to popular music that have been
explored in this chapter. In the same way that some classical musicians can
understand shape to refer to both the large-scale structure of a piece of
music and the moment-to-moment changes they make to a single phrase
or note, shape for DJs seems to apply to an entire set or a track, but also
to small-scale changes made to the sounds produced (e.g. flange, reverb).
The role of the producer has been highlighted, extending further the col-
laborative shaping that may occur in creating a musical product. What may
differ between record production and live performance is the availability of
time to rework materials without public scrutiny, making the shaping pro-
cess of record producers perhaps more similar to that of an artist than to,
say, a dancer’s process (a similarity discussed by some classical musicians).
Indeed, we have noted the extent to which record producers liken their role
to that of an artist. Phil Harding observes that producing a pop record is
like painting a picture. Susan Horning comments that the placement and use
of microphones has been likened to a painter mixing colours on a palette.
In live sound production or DJ performance, this artistic process is recali-
brated so that it does occur in real time, and yet the analogy persists, albeit
with a focus on the process of collaborative painting rather than a finished
product. In Smith’s work on turntable teams (e.g. Mixologists), one DJ (Beni
G) stated, ‘it’s like two, three or four artists all holding the same paint brush
wanting to paint one picture and they’re all trying to paint it in a slightly
different way’ (Smith 2013: 62). The recurring analogy may provide an
alternative explanation to our proposition earlier in the chapter (namely, that
it could be due to the historical tendency to snub popular music) as to why so
many titles on record production and DJing start with ‘The Art of . . .’.
This chapter has highlighted some of the perspectives on music and shape
that are apparent within popular music. Some insights were gained through
empirical research; others came from examining existing literature in the field.
While participants in the survey and interview studies were able to talk about
shape and to refer to graphic metaphors for their musical sensibilities, it was
less clear that ‘shaping’ is the way they think about their music unprompted or
that ‘shaping’ is something that drives their practices. More research is needed
before arriving at any conclusions in this regard, if indeed this is a question that
can ever be fully answered given musicians’ tacit understanding of the concept.
Nonetheless, the chapter provides a starting point for future research studies,
highlighting how notions of musical shape may differ with roles and contexts
(e.g. performer in a live setting, producer in the studio). There is a need to
study a broader range of performing musicians in some depth, to investigate
not only the ways in which different instrumentalists and singers conceptualize
music and shape, but also the great diversity of genres. Future work
should also focus on the several roles in the studio (e.g. performer, producer,
sound engineer) given that there is no empirical work on this at present. Finally,
research could focus on the collaboration that comes to the fore in popular
music, attempting to determine how shaping decisions are distributed among
co-performers in a band or group, or between singers, instrumentalists, produc-
ers and sound engineers. It is clear that much remains to be done to understand
more fully the elusive yet prevalent, tacit and yet seemingly fundamental con-
cept of musical shaping within popular music.
References
Barthes, R., 1990: ‘The grain of the voice’, in S. Frith and A. Goodwin, eds., On Record:
Rock, Pop and the Written Word (London: Routledge), pp. 293–300.
Bayley, A., ed., 2010: Recorded Music: Performance, Culture and Technology (Cambridge:
Cambridge University Press).
Bennett, A., B. Shank and J. Toynbee, eds., 2006: The Popular Music Studies Reader
(Abingdon: Routledge).
Bielmeier, D. C., 2013: ‘Determining the relationship between new recording engineers’
perceived skill sets and those observed by their employers’ (PhD dissertation, Argosy
University).
Blake, A., 2009: ‘Recording practices and the role of the producer’, in N. Cook, E. Clarke,
D. Leech-Wilkinson and J. Rink, eds., The Cambridge Companion to Recorded Music
(Cambridge: Cambridge University Press), pp. 36–53.
Boltz, M. G., 2013: ‘Music videos and visual influences on music perception and apprecia-
tion: should you want your MTV?’, in S.-L. Tan, A. J. Cohen, S. D. Lipscomb and R. A.
Kendall, eds., The Psychology of Music in Multimedia (Oxford: Oxford University Press),
pp. 217–34.
Brabazon, T., 2012: Popular Music: Topics, Trends and Trajectories (Los Angeles: Sage).
Brackett, D., ed., 2009: The Pop, Rock and Soul Reader: Histories and Debates, 2nd edn
Brewster, B. and F. Broughton, 1999: Last Night a DJ Saved My Life (London: Headline).
Brewster, B. and F. Broughton, 2012: The Record Players: The Story of Dance Music Told
by History’s Greatest DJs (London: Virgin Books).
Broughton, F. and B. Brewster, 2002: How to DJ: The Art and Science of Playing Records
(London: Transworld).
Burgess, R. J., 2001: The Art of Music Production (London: Omnibus Press).
Burns, G., 2006: ‘Live on tape. Madonna: MTV Video Music Awards, Radio City Music Hall,
New York, September 14, 1984’, in I. Inglis, ed., Performance and Popular Music: History,
Place and Time (Aldershot: Ashgate), pp. 128–37.
Butler, M. J., 2006: Unlocking the Groove: Rhythm, Meter and Musical Design in Electronic
Dance Music (Bloomington: Indiana University Press).
Campbell, M., C. Greated and A. Myers, 2004: Musical Instruments: History, Technology
and Performance of Instruments of Western Music (Oxford: Oxford University Press).
Canby, E. T., 1956: ‘The sound man-artist’, Audio 44–5: 60–1.
Cohen, A. J., 2009: ‘Music in performance arts: film, theatre and dance’, in S. Hallam, I.
Cross and M. Thaut, eds., The Oxford Handbook of Music Psychology (Oxford: Oxford
Collins, N., 2007: ‘Live electronic music’, in N. Collins and J. d’Escriván, eds., The Cambridge
Companion to Electronic Music (Cambridge: Cambridge University Press), pp. 38–54.
Collins, N., A. McLean, J. Rohrhuber and A. Ward, 2003: ‘Live coding in laptop perfor-
mance’, Organised Sound 8/3: 321–30.
Cook, N., E. Clarke, D. Leech-Wilkinson and J. Rink, eds., 2009: The Cambridge Companion
to Recorded Music (Cambridge: Cambridge University Press).
Davidson, J. W., 1993: ‘Visual perception of performance manner in the movements of solo
musicians’, Psychology of Music 21: 103–13.
Davidson, J. W., 1994: ‘What type of information is conveyed by the body movements of
solo musician performers?’, Journal of Human Movement Studies 6: 279–301.
Davidson, J. W., 2006: ‘ “She’s the One”: multiple functions of body movement in a stage
performance by Robbie Williams’, in A. Gritten and E. King, eds., Music and Gesture
Davidson, J. W. and S. Malloch, 2009: ‘Musical communication: the body movements of per-
formance’, in S. Malloch and C. Trevarthen, eds., Communicative Musicality: Exploring
the Basis of Human Companionship (Oxford: Oxford University Press), pp. 565–84.
Davis, R. and S. Parker, 2013: ‘Creativity and communities of practice: music technology
courses as a gateway to the industry’, paper presented at the Proceedings of the Audio
Engineering Society 50th International Conference, Murfreesboro, TN, USA, 25–27
July 2013.
Doğantan-Dack, M., ed., 2008: Recorded Music: Philosophical and Critical Reflections
(London: Middlesex University Press).
Fast, S., 2006: ‘Popular music performance and cultural memory. Queen: Live Aid,
Wembley Stadium, London, July 13, 1985’, in I. Inglis, ed., Performance and Popular
Music: History, Place and Time (Aldershot: Ashgate), pp. 138–54.
Frith, S., 1981: Sound Effects: Youth, Leisure, and the Politics of Rock ’n’ Roll (New York:
Pantheon).
Frith, S., 2001: ‘Pop music’, in S. Frith, W. Straw and J. Street, eds., The Cambridge
Companion to Pop and Rock (Cambridge: Cambridge University Press), pp. 93–108.
Frith, S. and A. Goodwin, eds., 1990: On Record: Rock, Pop and the Written Word (London:
Routledge).
Frith, S. and S. Zagorski-Thomas, eds., 2012: The Art of Record Production: An Introductory
Reader for a New Academic Field (Farnham: Ashgate).
Frith, S., W. Straw and J. Street, eds., 2001: The Cambridge Companion to Pop and Rock
Gander, J., 2011: ‘Performing music production: creating music product’ (PhD dissertation,
King’s College London, Department of Culture, Media and Creative Industries). Available
at https://kclpure.kcl.ac.uk/portal/files/5341102/Gander_PhD.pdf (accessed 9 April 2017).
Gibson, D., 2005: The Art of Mixing: A Visual Guide to Recording, Engineering, and Production,
2nd edn (Boston: Thomson Course Technology PTR).
Goodwin, A., 1990: ‘Sample and hold: pop music in the digital age of reproduction’, in
S. Frith and A. Goodwin, eds., On Record: Rock, Pop and the Written Word (London:
Routledge), pp. 258–76.
Gracyk, T., 1996: Rhythm and Noise: Aesthetics of Rock (London: Tauris).
Greasley, A. and H. M. Prior, 2013: ‘Mix tapes and turntablism: DJs’ perspectives on musi-
cal shape’, Empirical Musicology Review 8/1: 23–43.
Greasley, A. E., A. Lamont and J. A. Sloboda, 2013: ‘Exploring musical preferences: an
in-depth study of adults’ liking for music in their personal collections’, Qualitative
Research in Psychology 10/4: 402–27.
Greig, D., 2009: ‘Performing for (and against) the microphone’, in N. Cook, E. Clarke,
Griffiths, N. K., 2010: ‘ “Posh music should equal posh dress”: an investigation into the
concert dress and physical appearance of female soloists’, Psychology of Music 38/2:
159–77.
Hansen, K. F., 2010: ‘The acoustics and performance of DJ scratching: analysis and mod-
elling’ (PhD thesis, University of Stockholm). Available at http://www.speech.kth.se/
~kjetil/thesis (accessed 9 April 2017).
Hennion, A., 1989: ‘An intermediary between production and consumption: the producer
of popular music’, Science, Technology and Human Values 14/4: 400–24.
Horning, S. S., 2002: ‘Chasing sound: the culture and sound of recording studios in America
1877–1977’ (PhD dissertation, Case Western Reserve University).
Horning, S. S., 2004: ‘Engineering the performance: recording engineers, tacit
knowledge and the art of controlling sound’, Social Studies of Science 34: 703–31.
Horning, S. S., 2012: ‘The sounds of space: studio as instrument in the era of high fidelity’,
in S. Frith and S. Zagorski-Thomas, eds., The Art of Record Production: An Introductory
Reader for a New Academic Field (Farnham: Ashgate), pp. 29–42.
Hsieh, H.-F. and S. E. Shannon, 2005: ‘Three approaches to qualitative content analysis’,
Qualitative Health Research 15/9: 1277–88.
Hughes, T., 2006: ‘Nirvana: University of Washington, Seattle, January 6, 1990’, in I. Inglis,
ed., Performance and Popular Music: History, Place and Time (Aldershot: Ashgate),
pp. 155–71.
Inglis, I., ed., 2006: Performance and Popular Music: History, Place and Time (Aldershot:
Ashgate).
Jarrett, M., 2012: ‘The self-effacing producer: absence summons presence’, in S. Frith and
S. Zagorski-Thomas, eds., The Art of Record Production: An Introductory Reader for a
New Academic Field (Farnham: Ashgate), pp. 129–48.
Kania, A., 2008: ‘Works, recordings, performances: classical, rock, jazz’, in M. Doğantan-
Dack, ed., Recorded Music: Philosophical and Critical Reflections (London: Middlesex
Katz, M., 2004: Capturing Sound: How Technology Has Changed Music (Berkeley: University
of California Press).
Katz, M., 2012: Groove Music: The Art and Culture of the Hip-hop DJ (New York: Oxford
University Press).
Kooijman, J., 2006: ‘Michael Jackson: Motown 25, Pasadena Civic Auditorium, March
25, 1983’, in I. Inglis, ed., Performance and Popular Music: History, Place and Time
Krahé, C., U. Hahn and K. Whitney, 2015: ‘Is seeing (musical) believing? The eye versus
the ear in emotional responses to music’, Psychology of Music 43: 140–8.
Leech-Wilkinson, D. and H. M. Prior, 2014: ‘Heuristics for musical expression’, in
D. Fabian, E. Schubert and R. Timmers, eds., Expressiveness in Music Performance:
Empirical Approaches Across Styles and Cultures (Oxford: Oxford University Press),
pp. 34–57.
Lees, G., 1987: Singers and the Song (New York: Oxford University Press).
Massey, H., 2000: Behind the Glass: Top Record Producers Tell How They Craft the Hits
(San Francisco: Backbeat Books).
McIntyre, P., 2012: ‘Rethinking creativity: record production and the systems model’, in
S. Frith and S. Zagorski-Thomas, eds., The Art of Record Production: An Introductory
Reader for a New Academic Field (Farnham: Ashgate), pp. 149–61.
Middleton, R., 1995: ‘The “problem” of popular music’, in idem, Musical Belongings:
Selected Essays (Farnham: Ashgate, 2009), pp. 75–88.
Montano, E., 2010: ‘How do you know he’s not playing Pac-Man while he’s supposed to
be DJing? Technology, formats and the digital future of DJ culture’, Popular Music 29:
397–416.
Moore, A., 2001: The Primary Text: Developing a Musicology of Rock, 2nd edn (Farnham:
Ashgate).
Moorefield, V., 2005: The Producer as Composer: Shaping the Sounds of Popular Music
(Cambridge, MA: MIT Press).
Moorefield, V., 2010: ‘Modes of appropriation: covers, remixes and mash-ups in contem-
porary popular music’, in A. Bayley, ed., Recorded Music: Performance, Culture and
Technology (Cambridge: Cambridge University Press), pp. 291–306.
Moylan, W., 2007: Understanding and Crafting the Mix: The Art of Recording, 2nd edn
(Oxford: Focal).
Negus, K., 1992: Producing Pop: Culture and Conflict in the Popular Music Industry (London:
Arnold).
Negus, K., 1996: Popular Music in Theory: An Introduction (Cambridge: Polity).
Poschardt, U., 1998: DJ Culture (London: Quartet Books).
Poss, R. M., 1998: ‘Distortion is truth’, Leonardo Music Journal 8: 45–8.
Pras, A. and C. Guastavino, 2011: ‘The role of music producers and sound engineers in the
current recording context, as perceived by young professionals’, Musicae Scientiae
15/1: 73–95.
Prior, H. M., 2010: ‘Links between music and shape: style-specific; language-specific; or uni-
versal?’, paper presented at Topics in Musical Universals: 1st International Colloquium,
University of Provence, Aix-en-Provence, France, December 2010.
Prior, H. M., 2012a: ‘Report for interview participants’, http://www.cmpcp.ac.uk/
Report%20for%20interview%20participants.pdf (accessed 9 April 2017).
Prior, H. M., 2012b: ‘Shaping music in performance: report for questionnaire participants
(revised August 2012)’, http://www.cmpcp.ac.uk/wp-content/uploads/2015/09/Prior_
Scott, D. B., ed., 2009: The Ashgate Research Companion to Popular Musicology (Farnham:
Ashgate).
Shuker, R., 2013: Understanding Popular Music Culture, 4th edn (Abingdon: Routledge).
Smith, S., 2013: Hip-hop, Turntablism, Creativity and Collaboration (Farnham: Ashgate).
Straw, W., 1993: ‘The booth, the floor and the wall: dance music and the fear of falling’,
Public 8: 169–82. Available at http://pi.library.yorku.ca/ojs/index.php/public/article/
viewFile/30160/27715 (accessed 9 April 2017).
Szwed, J., 2002: So What: The Life of Miles Davis (London: Heinemann).
Tagg, P., 1982: ‘Analysing popular music: theory, method and practice’, Popular Music
2: 37–65.
Tan, S.-L., A. J. Cohen, S. D. Lipscomb and R. A. Kendall, eds., 2013: The Psychology of
Music in Multimedia (Oxford: Oxford University Press).
Théberge, P., 1989: ‘The “sound” of music: technological rationalisation and the produc-
tion of popular music’, New Formations 8: 99–111.
Théberge, P., 1997: Any Sound You Can Imagine: Making Music/Consuming Technology
(Hanover, NH: Wesleyan University Press).
Théberge, P., 2001: ‘ “Plugged in”: technology and popular music’, in S. Frith, W. Straw
and J. Street, eds., The Cambridge Companion to Pop and Rock (Cambridge: Cambridge
Théberge, P., 2012: ‘The end of the world as we know it: the changing role of the
studio in the age of the Internet’, in S. Frith and S. Zagorski-Thomas, eds., The Art of
Record Production: An Introductory Reader for a New Academic Field (Farnham: Ashgate),
pp. 77–90.
Thompson, D., 2010: Wall of Pain: The Life of Phil Spector (London: Omnibus Press).
Toynbee, J., 2000: Making Popular Music: Musicians, Creativity and Institutions (New York:
Tsay, C.-J., 2013: ‘Sight over sound in the judgment of music performance’, Proceedings of the
National Academy of Sciences 110/36: 14580–5.
Walser, R., 1993: Running with the Devil: Power, Gender and Madness in Heavy Metal Music
(Middletown, CT: Wesleyan University Press).
Warner, T., 2003: Pop Music: Technology and Creativity—Trevor Horn and the Digital
Revolution (Aldershot: Ashgate).
Williams, A., 2012: ‘ “I’m not hearing what you’re hearing”: the conflict and connection
of headphone mixes and multiple audioscapes’, in S. Frith and S. Zagorski-Thomas,
eds., The Art of Record Production: An Introductory Reader for a New Academic Field
(Farnham: Ashgate), pp. 113–27.
Zagorski-Thomas, S., 2012: ‘The US vs. the UK sound: meaning in music production in
the 1970s’, in S. Frith and S. Zagorski-Thomas, eds., The Art of Record Production: An
Introductory Reader for a New Academic Field (Farnham: Ashgate), pp. 57–76.
Zak, A., 2009: ‘Getting sounds: the art of sound engineering’, in N. Cook, E. Clarke,
Reflection
Steven Savage, record producer and sound engineer
Creating unnatural shapes
When I was asked to write about music and shape for this volume I immediately
thought of the reverb programs that I use to add ambience to individual
tracks when I am mixing. Reverb presets often come in the form of
representations of physical space. General categories might include stadiums,
concert halls, churches, theatres, auditoriums, nightclubs, small rooms, etc.
Today’s sampling reverbs, which can translate specific acoustical spaces into
ambiences that can be used on any sound, include such presets as the Sydney
Opera House, St Paul’s Cathedral or the Ryman Auditorium at The Grand Ole
Opry, as well as less renowned, smaller spaces such as a closet, a tiled bathroom
or the interior of a Ford Econoline van. Some programs simulate very specific
types of spaces such as a medium concert hall with stage, which recreates the
effects of a typical concert hall including the anomalies created by the stage
area. Every acoustical space consists of shapes that inform our ear and create
mental images of environments.
How does all this inform my work as a professional music-mixer? One of
my primary tasks as a mixer is the creation of these acoustical environments.
Although some recordists are careful to create realistic environments (primarily
in the western art music world), those of us dealing in popular music tend to
create ‘impossible’ environments. They may be impossible simply because var-
ious elements are in different spaces (the guitar is in a small wood room while
the vocalist is in a concert hall); but they may also be impossible because we
can create ambiences that don’t actually exist in nature—a truncated reverb or
a ‘perfect’ digital delay, for example. Figure R.25 illustrates one such imaginary
acoustic environment in which a three-dimensional model represents the var-
ious qualities produced by spatial constructions that are a part of the mixing
process. The various blocks represent frequency content based on their height
(amount of high frequencies), panning is represented by the horizontal position
of the blocks, and depth or ambience is represented by the relative position
from front to back. Together the figure defines the sonic landscape in terms of
varying shapes and their relative position.
FIGURE R.25 Three-dimensional mixing metaphor (axes: Frequency, Ambience, Pan). This diagram indicates one way in which shapes
may be used to represent essential conceptual approaches to building audio mixes such as frequency
balance, panning position and the sense of depth created by the addition of ambiences.
(Figure credit: Iain Fergusson)
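As a purely illustrative aside, the three axes of the mixing metaphor can be written down as a small data structure. The sketch below is an assumption-laden toy, not a description of any mixing software: the class name `MixBlock`, the numeric ranges and the example values are all hypothetical, chosen only to echo the 'impossible' environment described earlier (a guitar in a small room alongside a vocalist in a concert hall).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MixBlock:
    """One element of a mix, placed on the three axes of the metaphor.

    All ranges are hypothetical conventions chosen for this sketch.
    """
    name: str
    brightness: float  # block height: 0.0 (little high-frequency content) to 1.0
    pan: float         # horizontal position: -1.0 (left) to +1.0 (right)
    depth: float       # front-to-back position: 0.0 (dry, close) to 1.0 (very ambient)


def back_to_front(blocks):
    """Order the blocks from the most distant (most ambient) to the closest."""
    return sorted(blocks, key=lambda b: b.depth, reverse=True)


# Hypothetical values: each element sits in a different acoustic space.
mix = [
    MixBlock("guitar", brightness=0.5, pan=-0.6, depth=0.2),  # small wood room
    MixBlock("vocal", brightness=0.7, pan=0.0, depth=0.8),    # concert hall
    MixBlock("bass", brightness=0.2, pan=0.0, depth=0.1),     # almost dry
]

for block in back_to_front(mix):
    print(f"{block.name}: pan={block.pan:+.1f}, depth={block.depth:.1f}")
```

Nothing in such a model requires the three depth values to belong to one plausible room, which is the sense in which the resulting environment is unnatural.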
In the context of this book I wonder what effect these unnatural spaces
have on the listener. Is there a disorientation that comes from one’s inability to
match the musical environment to one’s experience? If our mind cannot place
the music in a particularly shaped environment, do we lose part of the natural
connection to the music? I’m inclined to think that the evolution of contem-
porary recordings traces the evolution of the cultural ‘ear’. And so, over time,
we have come to accept and be comfortable with these unnatural soundscapes
to the point that they don’t disrupt our ability to enjoy the music. I’m not sure
of that: perhaps the oft-referenced dislocation caused by the experience of
recorded music may be exacerbated by the recordist’s lack of adherence to any
natural model. In any event, that isn’t going to prevent me from continuing
with the pleasures of building complex, if unnatural, soundscapes as an essen-
tial part of the sonic shaping process.
PART 4
Shapes seen
Reflection
Mark Applebaum, graphic composer
Handbook for The Metaphysics of Notation
The Metaphysics of Notation (2008) is a 72-foot-wide, hand-drawn pictographic
score divided into twelve continuous panels.1 It is accompanied by no instruc-
tion regarding its interpretation. The work aspires to elicit a musical response
from a performer, but despite its profusion of concrete, detailed glyphs it advo-
cates nothing specific about the nature of their aural realization. Furthermore,
I heard no sound in my head while composing the piece. This is a radical depar-
ture from the approach to composition I was taught, in which the composer’s
job is to imagine—preferably with exacting resolution—a sound object, and
then, through the deft application of the most relevant notation (whether traditional
or invented—but if the latter, surely a defined one), to produce a specification
from which a performer (burdened or invigorated by a marginal or
essential role as interpreter) can realize this imagined sound.
The score has inspired more than a hundred radically diverse musical inter-
pretations and remains the locus of a larger social project that invites questions
about musical ontology, the meaning of notation, the roles of composer and
performer, the boundaries of interpretation, the impact of context on musical
enterprise (it first appeared in a museum), and so forth. This Reflection focuses
on some of the musical attributes of the score’s notation. At its core, the piece is
a singular artefact narrowly defined by its frozen, circumscribed visual makeup,
however elaborate. It is a work that is self-consciously concerned with form
above all else—its particular shapes, geometries and contours. While these
shapes may at first seem unfamiliar or even exotic to most performers, I will
show that they can function in ways that are analogous to fundamental com-
positional conventions.
Although it is worthy of discussion, in this context I evade a deliberate
examination of my aesthetic motivation for such reckless creative enterprise.
I wish to point out, however, that it runs parallel with—it has not replaced—
an abiding interest in composing music in which I first pre-hear the result, as
well as the manufacture of scores whose symbols are assiduously defined (by
me or through communion with a common practice, both ancient and recent).
Instead, my point here is that, although the premise for The Metaphysics
of Notation is arguably unconventional (or at least not mainstream), it has
qualities that are common to all music. To wit, my determinate, nonpic-
tographic pieces, and most music for that matter, are also concerned with
form—through their shapes, geometries and contours. The composer Roger
Reynolds used to extol the idea of musical ‘profile’ considered in various
parameters. In lessons he would note the blandness of a particular ampli-
tude profile or the iconicity of a noteworthy rhythmic profile. The word ‘pro-
file’ immediately conjures visual imagery and ideas about shape. Pinocchio
has an especially memorable profile. By extension, one might consider the
‘silhouette’ of Mickey Mouse. Not only is it well known as a successful com-
mercial meme, but it is arguably a shape of intrinsic distinction, and this
gives it the kind of memorable quality that is often desirable to the composer
of musical material. Reynolds’ use of visual analogy is telling but not new.
One recalls the now esoteric Schillinger System of Musical Composition,
which allows a city skyline to be employed as a melodic template. And pic-
tographic notation goes back at least to the ars subtilior composers of the
fourteenth century.
Although I chose not to hear sound when composing the Metaphysics, I was
often conscious of the deliberate analogy between notational shape and musi-
cal discourse. There are two corresponding concerns that I will explore. First
is the idea of devices used for rhetorical development. Second is the idea of
large-scale formal connectivity.
Devices used for rhetorical development
Consider panel 4 of Figure R.26. At its left side a shield appears. (For ease of
expression I’m calling it a ‘shield’; but I think of it equally as ‘a shield-looking
thing that, to the broad-minded interpreter, may or may not evoke shieldness.’
The point is that an interpreter should not be limited by my verbal descrip-
tions.) Although partially obscured behind the first shield, a hook or letter J
appears to rotate from one shield to the next, thus implying (or again, more
accurately, ‘potentially inviting an inference of’) inversion (Figure R.27).
The incremental clocklike advance—as opposed to a sudden 180-degree
flip—may intimate either some kind of slow motion melodic inversion or the
inversion of a chord from root position to first inversion to second inversion,
etc. Meanwhile, the shields descend in the vertical dimension, thereby suggesting
transposition or sequence. Their descent by equal quanta evokes a
chromatic-like field as opposed to a diatonic one. That the shields’ patterns
are each unique—a display of autonomous, unrepeated vocabulary—implies
a kind of serial approach to class, or perhaps the succession of timbres heard
in Klangfarbenmelodie.
FIGURE R.26 The Metaphysics of Notation, panel 4
FIGURE R.27 The Metaphysics of Notation, panel 4 close-up: descending ‘shields’
Later, circles appear in a sinusoidal curve, potentially arguing for a conso-
nant timbre (Figure R.28). The circles grow in size, thus indicating the occur-
rence of augmentation. They become distorted, perhaps inviting an enrichment
of timbral character. Or perhaps it is not distortion, but rather dematerializa-
tion, elimination or fragmentation—like Beethovenian atomization or what
Messiaen called skeletonization.
As we continue to the beginning of panel 5, we find the faint detritus accret-
ing, materializing in retrograde fashion back into something concrete and
complete (Figure R.29). It reminds me of the ethereal, vaporous beginning of
Mahler’s First Symphony out of which eventually grows identifiable material
of chiselled suasion. But in contrast to panel 4, the opening material in panel
5 no longer repeats along the path of a sinusoidal waveform; instead, it merely
diminishes in descending glissando fashion (Figure R.30). More noteworthy,
its materialization is not a strict retrograde return to circular forms. Instead, it
employs orthogonal, rectilinear shapes as if a mode change has occurred, be it
from major to minor, or mean to just intonation; or perhaps a modification in
some other dimension altogether, say metre (e.g. duple to triple), dynamics (e.g.
soft to loud), personnel (solo to ensemble), poetics (e.g. aquarian to existential),
etc. The point is that some kind of modality seems to have changed valence.
FIGURE R.28 The Metaphysics of Notation, panel 4 close-up: sinusoidal curve
FIGURE R.29 The Metaphysics of Notation, panel 5
FIGURE R.30 The Metaphysics of Notation, panel 5 close-up: materialization of rectilinear forms
Returning to panel 4’s wave of circles (see Figure R.28), we see a counterpoint
emerging underneath in the form of the upward sloping diagonal (potentially
a portamento) comprising many small details (reminiscent of ornamentation,
decoration). Unexpected, irregular (syncopated) bits of varying length extend
above or dangle below the slope, thus suggesting accents, chordal congruencies
or multiphonics. This slope feels fundamentally independent of the circles (it is
both contrasting and non-accompanimental), thus establishing polyphony: an
expansion of voices and a richer texture. And the languages of the two voices
are so dissimilar in personality that one envisions Ivesian simultaneity or the
character patterns of Carter.
Just before the slope disappears, two new glyphs appear above it: a small cir-
cle and a small oval, both black. The circle echoes the genesis of the sinusoidal
wave of circles that have since evolved to their mature state of augmentation
and dematerialization. The oval is its squashed permutation, a kind of thematic
metamorphosis or, in its simplest sense, a variation. The circle and the oval are
far enough apart that they might appear atomic, isolated—or, to use Cage’s
language, unimpeded. But, because they are connected by slender lines, we are
compelled to see mutual belonging, a molecular constellation. Cage would call
them interpenetrated, and their connection affects how we understand them
and, presumably, how we might play them.
The downward glissando of materializing rectangles in panel 5 seems to ter-
minate in a point, a seed that grows into a flower (Figure R.31). This consti-
tutes a striking change, the sudden presentation of contrasting material soon
followed by more idiosyncratic references (e.g. a bell, an apple, a telephone).
Arguably, these materials possess stronger, more concrete cultural associations
than their more geometrically platonic neighbours. This might parallel an act
of musical quotation, or perhaps natural mimesis like birdcall in Beethoven or
Messiaen. I’m at a loss to suggest additional meaning for these icons. But by
now the reader is probably able to play this game without my help.
A vertical stripe of decorative embellishments appears next. If it seems
familiar, it is because it constitutes a reappearance of the irregular, syncopated
bits that extended above or dangled below the slope in panel 4. As such, it rep-
resents motivic recurrence, something that could arouse an emotional affect:
after the appearance of many contrasting novelties the return to the familiar
could be felt as a welcome tonic, economical relief, or perhaps even wistful
nostalgia. At the same time, the stripe has changed: it is elaborated by a bulbous
bottom and contextualized by a heart shape. Seeing the two together, my daughter
instantly identified a ‘heart guitar’. So perhaps a lyric song and a change of
instrumentation are in order at this point.
To the right of the heart extends a series of dots arranged in two paral-
lel rows. The dots embody syncopated repetition. If you look carefully, you
will notice that the rows contain identical proportions; they are just temporally
displaced as in imitation or canon. I chose dots as a deliberate homage to
Conlon Nancarrow’s temporal canons for player piano, their rolls methodically
punched with holes just so.
FIGURE R.31 The Metaphysics of Notation, panel 5 close-up: contrasting materials, ‘heart guitar’ and canonic dots
Above these piano-roll holes appear odd stalagmites crowned with unique
figuration. They contrast with the limited vocabulary found in the dangling
mobiles underneath, whose sundry angles (one looks like a hockey stick) are
simply axial inversions, retrogrades and retrograde inversions of one another.
Having commented on most of two panels, I will end my analogic exegesis
here. A more thorough evaluation is certainly possible, but this will suffice as an
introduction to the manner in which visual shapes can be considered analogous
to traditional musical devices.
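For readers who find it helpful to see the traditional devices stated operationally, the following sketch applies several of those named above (transposition, inversion, retrograde, retrograde inversion, augmentation and canon) to plain lists of numbers. It is only one possible reading and says nothing about how the glyphs should sound: the function names, the choice of MIDI note numbers for pitch and the example motif are all assumptions introduced for illustration.

```python
def transpose(pitches, interval):
    """Shift every pitch by the same number of semitones."""
    return [p + interval for p in pitches]


def invert(pitches, axis=None):
    """Mirror each pitch around an axis (by default, the first pitch)."""
    if not pitches:
        return []
    if axis is None:
        axis = pitches[0]
    return [2 * axis - p for p in pitches]


def retrograde(pitches):
    """Play the material backwards."""
    return list(reversed(pitches))


def retrograde_inversion(pitches, axis=None):
    """Invert the material, then reverse it."""
    return retrograde(invert(pitches, axis))


def augment(durations, factor):
    """Stretch every duration by the same factor."""
    return [d * factor for d in durations]


def canon(onsets, delay):
    """Return a leader and a follower with identical proportions,
    the follower displaced in time by `delay`."""
    leader = list(onsets)
    follower = [t + delay for t in onsets]
    return leader, follower


# Hypothetical four-note motif, written as MIDI note numbers.
motif = [60, 62, 65, 64]

print(transpose(motif, -1))         # [59, 61, 64, 63]
print(invert(motif))                # [60, 58, 55, 56]
print(retrograde(motif))            # [64, 65, 62, 60]
print(retrograde_inversion(motif))  # [56, 55, 58, 60]
print(augment([1, 0.5, 0.5, 2], 2)) # [2, 1.0, 1.0, 4]
print(canon([0, 1, 3, 4], delay=2)) # ([0, 1, 3, 4], [2, 3, 5, 6])
```

Each function keeps the outline of the motif while changing one dimension of it, which is the same relationship the drawn shapes hold to one another on the page.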
Large-scale formal connectivity
It should be evident that shapes move in a logical manner in the horizontal
plane, a visual rhetoric that suggests continuity. (Again, I’m not insisting that
an interpreter abide by or even consider this.) Logical congruence is present
both within a given panel and across the interstices between successive panels.
So the logic of the right edge of panel 4 continues on the left edge of panel 5
and so forth. Moreover, the glyph at the end of panel 12 is the same as the glyph
that appears at the beginning of panel 1, so the entire work forms a circle in
the horizontal plane. As such, there is no implied beginning or ending point; in
fact, my very use of panel numbers is only for ease of discussion.
But the logic also works in the vertical plane. That is, visual continuities
appear when panel 4 is stacked on top of panel 5. So the score can be read
up and down, as well as left and right. And it also forms a loop in the vertical
dimension: panel 12 can be placed above panel 1. Thus large-scale formal con-
nectivity is a deliberate design attribute of The Metaphysics of Notation. This
kind of interlocking, overly wrought, hyper-idealized formal plan is featured
in much of my conventionally notated, determinate modernist music. But in
pieces that are about sound (which is to say, virtually all music), any given
moment exists just for an instant, after which it can persist only in memory. In
contrast, the plastic, permanent surface of the page is not limited to sound’s
fleeting temporal essence, and it thereby affords the composer new structural
opportunities, such as the Metaphysics’ torus-like formulation.2
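The wrap-around in both directions can be stated compactly with modular arithmetic. The sketch below is a minimal illustration under the assumption that the panels are numbered 1 to 12 as in this discussion; the function names are hypothetical and nothing here is part of the score itself.

```python
PANEL_COUNT = 12  # the score is divided into twelve panels


def next_panel(panel):
    """Panel that continues the given one, wrapping from 12 back to 1."""
    return panel % PANEL_COUNT + 1


def previous_panel(panel):
    """Panel that precedes the given one, wrapping from 1 back to 12."""
    return (panel - 2) % PANEL_COUNT + 1


# As described above, the same successor serves both directions:
# panel 5 lies to the right of panel 4 and also directly beneath it.
right_of = next_panel
below = next_panel
left_of = previous_panel
above = previous_panel

print(right_of(4), below(4))    # 5 5
print(right_of(12), above(1))   # 1 12

# Walking twelve steps in either direction returns to the starting panel,
# so no panel is privileged as a beginning or an ending.
panel = 7
for _ in range(PANEL_COUNT):
    panel = below(panel)
print(panel)                    # 7
```

That a single successor function answers both "what is to the right?" and "what is underneath?" is one way of seeing why the vertical and horizontal readings can later contradict each other about what counts as past and present.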
In Figure R.32 we see panels 3, 4, 5, 6 and 7 in a stacked arrangement. From
this view one can track some of the mechanisms by which formal connectivity
appears in the vertical domain. First, let us reconsider the interpenetrated circle
and oval described in panel 4. We see that, from a horizontal perspective, not
only is the oval a local, adjacent permutation of the circle, but the two together
constitute a distant motivic echo of the similar circles and ovals sprayed on the
preceding panel. But when read in the vertical plane, they appear immediately
as the inversion of one particular corresponding circle and oval pair at the bot-
tom of the preceding panel (Figure R.33).
FIGURE R.32 The Metaphysics of Notation, panels 3, 4, 5, 6 and 7 in stacked arrangement
FIGURE R.33 The Metaphysics of Notation, close-up: circle and oval pair inverted across panels 3
and 4
Inversion can also be seen earlier in panel 3 at the bottom, where a kind of
scroll shape adorned with the number 5 is inverted onto the upper part of panel
4 (Figure R.34). The shading of the latter scroll is altered, thereby suggesting
a kind of mode change. Such mutation anticipates other variances: first, the
scroll outline inverts though the number 5 within it repeats without alteration;
but more significantly, eight ‘ribbons’ extend from the panel 3 scroll and five
appear in the panel 4 scroll, only three of which are in common as inverted
reverberations of one another.
The logic continues across panel 4 to panel 5, first by way of two inverted
shields, and then by the aforementioned ‘heart guitar’, which lines up precisely
with the first bit of dangling embellishment in panel 4 (Figure R.35).
Then panels 5 and 6 are conjoined in the vertical plane by one of the dan-
gling angles—a hockey-stick-looking doodle—that points directly to its double
mirror image (it is flipped both horizontally and vertically; Figure R.36).
Continuing downward, we see panels 6 and 7 linked by two pathways (Figure
R.36). First, a vertical chain of circles, themselves sequentially augmented in
size, is seen in flipped form on the other side of the divide. And later in panel 6
there appear twelve equal-sized dots arranged in a ring, like points on a clock.
This ‘clock’ is directly adjacent to an identical clock (one of two) in panel 6, a
kind of reflection through the looking glass.
FIGURE R.34 The Metaphysics of Notation, close-up: ‘scroll’ with number five inverted across panels 3 and 4
FIGURE R.35 The Metaphysics of Notation, close-up: panels 4 and 5 inverted shields, connection to the ‘heart guitar’
FIGURE R.36 The Metaphysics of Notation, close-up: panels 5, 6 and 7 dangling angles, chain of circles, dot clock
Let us reconsider one of the vertical linkages between panels 4 and 5 in
order to reflect on its meaning (see Figure R.35). As already mentioned, the
‘heart guitar’ in panel 5 lines up precisely with the first bit of dangling embel-
lishment in panel 4. The heart guitar appears in panel 5, but it is as if it is
hanging from—belonging to—the slope in panel 4. As such, its direct connec-
tion to the prior panel would seem to eliminate its identity as a distant motivic
recurrence. If its ancestry is of the present instead of the past, it should erase
our ability to feel things like nostalgia. In other words, if materials in panel 5
occur simultaneously with those of panel 4, they cannot be considered part of
panel 4’s future. So the vertical logic confounds the horizontal and vice versa.
This structural superfluity purposefully forces the interpreter to choose among
temporal constructs, or to ignore them entirely in favour of a different strategy
for harmonizing the inherent temporality of sound with the intrinsic stasis of
the drawn image.
Similarly, a strange temporal puzzle occurs across panels 9 and 10 (Figure
R.37). Starting in the second half of panel 9, a series of tiny repeating dots
curls from the bottom of the page, loops anticlockwise and straightens into a
horizontal comportment where it grows in size, diminishes and finally vanishes
off the right edge of the page. Continuing horizontally onto panel 10, we see
that these dots are the genealogical progenitors of those that begin on the
panel’s left side; after all, they line up in the horizontal plane with panel 9.
The panel 10 dots grow in size, multiply into three larger circles and lead to a
series of waves constituted by various polygons and simple shapes. Eventually
a curl of small shapes emerges near the middle of the panel. It arcs around
clockwise, anticlockwise, and then disappears off the top of panel 10. But this
trail connects—perhaps begets—the aforementioned series of dots on panel 9.
So where is the origin of this infinite, recursive visual rhetoric? It is a paradox
of chronology that is evident in the visual domain but cannot be rationally
represented in the time—one could say the shape—of musical sound.
Coda: a retroactive invitation
Is The Metaphysics of Notation merely visual data? By asserting that it is a
composed musical score, I invite a search for musical continuity and the dis-
covery of analogous traditional musical devices, several of which have been
proposed above. If we continued to probe this terrain, we would find a plethora
of additional devices on other panels: ostinato, sound mass, phase shifting,
pedal point, microtonality, isorhythm, clusters, elongation, rotation, melisma,
cadence, drone, non-retrogradable rhythm, metric modulation and stochastic
textures generated by aleatoric procedures. Climax can be found nearly every-
where: the highest point, the blackest field, the wiggliest line, and so on. One
can also observe that the work is graphically teeming with comparably generic
concerns that musicians and visual artists alike consider: symmetry and asym-
metry, juxtaposition and superimposition, consonance and dissonance, reso-
lution, interruption and the predictable satisfaction of propensity versus the
inhibition of tendency, to name a few. There are even a variety of musical tech-
niques whose linguistic genesis resides in visual art, such as the mobile form
and pointillistic texture.3
I’m not insensitive to the fact that, despite my elucidatory examination, the
score will still appear foreign to most musicians. Many will find its provocation
an insult to their years of tireless devotion to common-practice approaches.
The composition is not, however, intended for these ‘professionals’. Its fanciful,
idiosyncratic curiosities are directed to more ‘abnormal’ players, often ones
who have overcome their conservatoire training. This breed is game for such
creative enterprise, a collective of musicians who, while indeed a minority, form
a remarkably expansive and extraordinarily enthusiastic community.4
FIGURE R.37 The Metaphysics of Notation, panels 9 & 10 in stacked arrangement
But for both the inclined and the averse, my purpose here is simply to
recognize the kinship that this kind of artistic adventure has with tradi-
tional compositional devices. Josquin, Bach and Schoenberg use retrograde;
so do I. Counterpoint can be heard in the music of Palestrina, Brahms and
Ferneyhough; and while it may or may not be heard in Metaphysics (its sounds
are left to each interpreter), it can be seen clearly there. My score and those
of Frescobaldi, Beethoven and Messiaen employ augmentation. Palestrina,
Haydn, Wagner and I are concerned with cadences. Sequence is common to
Du Fay, Mozart, Chopin and my score. The Metaphysics of Notation didn’t
invent the canon; it is found in Ockeghem, Monteverdi and Nancarrow. My
point is that it is worthwhile to note the commonality among our composi-
tional tools, not only the obvious contrasts in notational vocabulary. And there
is another contention here: in all contexts, these compositional tools imply and
embody shape.
At the same time, I cannot stress enough that the aforementioned observa-
tions need not direct an interpreter. The project of the Metaphysics includes the
hope—abundantly fulfilled—that I would experience utterly novel and unex-
pected interpretative solutions to the work’s peculiar challenges. These wide-
ranging outcomes were mainly the consequence of the breathtaking scope of
the players’ imaginations. But I believe that they were also aided by my cautious
avoidance of providing hints. (It is a piece that I have vowed never to perform,
precisely in the hope of not suggesting authorial precedent.) For example, the
best performers do not assume that the score must be read from left to right
(even if my aforementioned description of, say, retrograde relies on such a con-
ception), or top to bottom, or even in a single direction. For that matter, it
doesn’t have to be read linearly at all; some have chosen to interpret entire
panels as a single gestalt—much as a quaver is not read up or down or side to
side but is simply grasped wholly as an indivisible symbol. The score needn’t
be considered in its entirety: some players have set up fixed instrumentation in
front of a single panel as opposed to taking a peripatetic tour of all of them.
A realization can be improvised or carefully predetermined.5 And the score
could stimulate responses that are not even conventionally musical (it has, for
example, been interpreted by spoken word poets and dancers).
So I write this with a degree of wariness, one alleviated mainly by an under-
standing that the intellectually intrepid, curious, creative musicians who are
attracted to Metaphysics will likely read this Reflection and simply ignore it;
they will accept the challenge of inventing their own solutions that are beyond
my limited conception. Cardew’s Treatise Handbook undertakes a seemingly
similar project, but it is markedly different: it collects competing solutions for
the interpretation of his nonstandard notation; that is, it is a post-mortem
account of diverse sonic production. In contrast, my comments merely observe
how my nonstandard notation is compositionally analogous to traditional
compositional technique; that is, it retroactively imagines how its shapes came
to be on the page for a musical purpose.
Reflection
I-Uen Wang Hwang, painter and composer
Tonality in soundscapes
In twenty-first-century atonal music, tonality is more than the organization of
musical components: it also includes timbres, texture, instrumentation, articu-
lation and additional elements which will be discussed in a moment. In modern
art since the era of impressionism, painting has developed towards a synaes-
thetic scheme in which tonality is considered to be a succession of movements.
Modern art also places a greater emphasis on the use of colours to depict emo-
tions. Traditional subjects are often superseded by geometric patterns. Abstract
art is thus akin to atonal music, which replaces tonal centres with alternative
methods of structuring the twelve notes.
Since I grew up in a family of artists, I have always been intrigued by the
visual impact of painting and how it is analogous to the ineffable power in
music. Both types of art can turn ordinary subjects into the extraordinary.
I often combine my interest in painting and musical composition, typically
starting the painting first to facilitate the generation of ideas. The painting
evokes the sound I want to convey in my music. The learning process to create
nuances of tonality is a wonderful journey.
The components of tonality include line, colour, morphology, space/volume
and texture. In the following, the relationship between these components and
my process of creation for both music and painting is discussed.
The concept of line in painting is somewhat obvious, as lines and contours
form basic components of paintings. However, line also refers to the path the
viewer’s gaze follows as it explores the entire visual composition. In music, line
is also an essential element for composition. The analogous component is mel-
ody or voice-leading. In atonal music, methods of structuring voice lines with-
out key centre include pitch-class sets, the specification of intervals, derived
series from twelve notes, or twelve-note rows. Music is a temporal art, relying
on the listener’s ears to accumulate the varying sounds. Line is the structure
that provides direction for the listeners, leading them to form an interpretation.
In both painting and music, colour provides energy and excitement. In
music, timbres arising from varying combinations of instruments and pitch
registers are analogous to the different paint colours. The orchestration of the
music is equivalent to the paint palette. As in painting, music may rely on a
well-demarcated contrast between the primary and complementary colours or
may be based on gradual variations, corresponding to the sfumato technique in
painting in which outlines are blurred by blending one tone into another. Both
techniques may have a powerful impact on the emotions conveyed. A skilful
composer arranges and combines all twelve notes together as a painter mixes
colours of paint. As with painting, mixing multiple colours together may result
in a greyish hue. However, this is not necessarily an undesirable result since a
greyish homogeneity forms a quiet background against which other voices or
colours are more emphasized and appear more luminous. For example, Marie
Laurencin’s portraits and the Symphonies Nos. 3 and 4 of Lutosławski both
use this concept of a grey canvas. In my own painting, rather than relying on
premixed complementary colours, I manipulate a few chosen hues of the pri-
mary colours, combining warm and cool variations with each other to form my
complementary colours, sometimes on the palette but, at other times, directly
on the paper or canvas.
Morphology, or shape, in music is formed by rhythms, repeating motives
and varying metres. The technique of counterpoint has been used for centuries
but is still one of the best examples to help understand the concept of musi-
cal morphology. The largest-scale classification for morphology consists of the
basic music forms, such as sonata and dance forms. Relying on simple shapes
within a painting is one method for facilitating the clear manifestation of a
subject. Paul Klee was a maestro in the application of simple shapes. Since he
had a background in music, his paintings provide an excellent example of the
relationship between music and the visual arts.
Space and volume consist of the three-dimensional content of music and
art. In painting, the three-dimensional perspective of a visual art is quite
intuitive. However, when I studied music composition, my teachers also
taught me to design the music with multiple layers, including a foreground,
middle ground and background. The presence of multiple layers is analogous
to perspective. In addition, the rests in music are analogous to the empty
spaces in a painting. The spaces and rests divide the artwork into important
sections and help the viewer or listener understand the structure as a whole.
Volume in the music is most often considered in terms of dynamic change,
piano versus forte. However, it also relates to the density of music, ranging
from minimal to rapid changes. In painting, volume refers to the degree of
contrast between the lightest and darkest regions. Contrast helps the viewer
focus on the composition.
Texture refines tonality in music, reinforcing the line, colour, morphology
and space/volume, and combining them to complete the soundscape. In paint-
ing, texture uses all the elements in the composition to enhance the effect of the
work. František Kupka’s painting Piano keys—lake exemplifies the effective
application of texture. Similarly, texture in music allures the listener by enhancing
the profoundness of the sounds. For example, Takemitsu’s November Steps
epitomizes the abundant use of textures.
In conclusion, although painting and music are clearly different types of art,
certain concepts are overlapping or analogous. Creating visual art and music
simultaneously is thus synergistic and facilitates imagination and the genera-
tion of new ideas.
Dream Garden, Series II
In the following, I briefly discuss one of my own compositions, Dream Garden,
Series II (2004) for two pianos, and how it relates to my painting. Figure R.38
shows the two accompanying watercolour paintings, which formed the inspira-
tion for the music (for colour versions, and the corresponding sound files, see
the companion website ). Each of the two movements corresponds to one of
the paintings and shares the same name. Each piano has its own character with
different timbres and rhythms, as if each part represented a unique colour or
(a) (b)
FIGURE R.38 Watercolour paintings: (a) Red and White and (b) Fireworks
Reflection: I-Uen Wang Hwang 305
space of the painting. Just as the articulation and textures combine to create a
painting, the two piano parts merge naturally to form a single musical image.
The first movement, ‘Red and White’, is slow and tranquil as conveyed by its
namesake painting. Piano 1 and Piano 2 represent the red and white flowers in
the painting, respectively. The delicate flower petals inspire the elegance of the
music. The timing of the interactions of the two piano parts is analogous to the
three-dimensional spatial arrangement of the flowers.
The second movement, ‘Fireworks’, is energetic and lively, expressing the
transparent quality of the watercolour painting in which the layers can be seen
through each other. To achieve this effect, each piano has its individual set
of rhythms, as if representing different firework effects combining to create a
magnificent display.
When I create my art, I do not formally or overtly consider each of the con-
cepts discussed here. This would run the risk of creating music that is too ‘stiff’.
However, as one paints and composes simultaneously, the concepts naturally
and subconsciously arise during the artistic process.
9
Music and shape in synaesthesia

Jamie Ward
I like the way the piano goes from dark brown to bright
yellow as the notes get higher. A quick succession of high
notes appears almost like a moving star cluster, and an
arpeggio like a printed pattern spooling in front of my
eyes. It can be very sensual—… an animated spectacle.
—Ward (2008: 74–5)
What is synaesthesia?
People with synaesthesia experience automatic ‘extra’ sensations elicited by a
given set of stimuli, common examples being experiencing vision from music,
colours from numbers, or flavours from words. They are ‘extra’ in the sense that
they are not universally shared by others in the population. They are automatic
and elicited insofar as they are driven by a stimulus (rather than ‘willed’ like a
mental image) with little top–down control (although attention can be used as
a filter to some extent: Mattingley 2009). Each type of synaesthesia can be con-
strued as having a dual component: an inducer that elicits the experience, and
a concurrent, the experience itself (Grossenbacher and Lovelace 2001). Types
of synaesthesia tend to be labelled as inducer–concurrent pairs: so, for instance,
grapheme–colour synaesthesia refers to grapheme inducers (letters and digits)
eliciting a colour concurrent. The most common synaesthetic concurrents are
visual, and their most common inducers are linguistic elements (words, letters,
numbers). The prevalence of synaesthetic colour experiences has been esti-
mated as approximately 4 per cent based on self-reported phenomenology and
confirmed by test–retest consistency (Simner et al. 2006). Intriguingly, the most
common inducers of colour are sequential concepts such as days of the week
(2.8 per cent), months of the year (1 per cent), and letters or numbers (1.4 per
cent), with music–colour being comparatively rare (0.2 per cent). This chapter
focuses primarily on auditory–visual synaesthesia and, in particular, on musi-
cal inducers and shape-based concurrents.
Synaesthetic visual experiences are often referred to merely as being
‘coloured’; historically, the term ‘coloured hearing’ (and its translated versions:
audition colorée in French, and Farbenhören in German) was used as an
umbrella term for all types of synaesthesia (e.g. Millet 1892). Colour is perhaps
a noteworthy feature of synaesthesia given that it is unique to the visual modal-
ity, unlike other attributes such as texture, shape, motion, size and location,
which can all be perceived from other senses (Aristotle’s ‘common sensibles’).
Colour also has the advantage of being relatively easy to measure. The most
common diagnostic test of synaesthesia involves selecting colours (from a com-
puterized colour picker) on multiple attempts and then determining whether
the colours are stable over time. Synaesthetes show strong stability in their col-
our associations relative to controls instructed to guess or use some kind of
mnemonic strategy (Eagleman et al. 2007; Rothen et al. 2013). However, in
almost all cases the synaesthetic experiences consist of far more than colour.
In grapheme–colour synaesthesia, for instance, the colour typically takes on
a shape (the shape of the inducer itself) and has a particular spatial location.
Some claim to see them externally (so-called projectors), whereas others claim
to see them internally or to just ‘know’ the colour (Dixon, Smilek and Merikle
2004). For instance, they may claim to see a coloured copy of the words on
some inner screen or in their mind’s eye. In addition to colour, the synaesthetic
experiences often have texture, sheen (e.g. dull or quasi-metallic) and opacity
(e.g. transparent or opaque; Eagleman and Goodale 2009). The same applies
to auditory–visual synaesthesia: colour is one common feature of the experi-
ence, but the sounds are almost always accompanied by visual shape, move-
ment and/or texture perceived in some spatial location (often extending in three
dimensions).
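The test–retest logic of the colour-consistency test described above can be illustrated with a short sketch. This is not the scoring procedure of the cited studies: the city-block distance, the scaling of RGB channels to the range 0 to 1, and the names colour_distance and consistency_score are all assumptions made for illustration. The idea is only that a participant who picks nearly the same colour for the same inducer on every attempt receives a low score, whereas a participant who guesses receives a higher one.

```python
from itertools import combinations


def colour_distance(c1, c2):
    """City-block distance between two RGB colours (channels assumed in 0..1)."""
    return sum(abs(a - b) for a, b in zip(c1, c2))


def consistency_score(choices_by_stimulus):
    """Mean pairwise distance between repeated colour picks for each stimulus.

    choices_by_stimulus maps an inducer (e.g. a letter or a note name) to the
    list of RGB colours picked for it on separate attempts.
    Lower scores mean more stable associations.
    """
    per_item = []
    for picks in choices_by_stimulus.values():
        pairs = list(combinations(picks, 2))
        if not pairs:
            continue  # a single pick says nothing about stability
        mean_distance = sum(colour_distance(a, b) for a, b in pairs) / len(pairs)
        per_item.append(mean_distance)
    if not per_item:
        raise ValueError("need at least one stimulus with two or more picks")
    return sum(per_item) / len(per_item)


# Hypothetical data: three attempts per inducer.
stable_picks = {"A": [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]}
varied_picks = {"A": [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]}

print(consistency_score(stable_picks))
print(consistency_score(varied_picks))
```

Any cut-off separating synaesthetes from controls would have to be taken from the published tests rather than from this sketch.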
Although there is general agreement as to the broad definition of synaesthe-
sia, and common examples of it, a precise definition remains elusive (Simner
2012). Ward (2013) has argued that synaesthesia has three core characteris-
tics: the concurrent experience is percept-like, it is elicited, and it is automatic.
The automaticity of synaesthesia is demonstrated by studies in which synaes-
thetes must name the true colour of a stimulus and ignore their synaesthetic
colour (Mattingley, Rich and Bradshaw 2001). Synaesthetes are slower when
the true colour is incongruent with their own associations than when it is con-
gruent. Demonstrating that synaesthetic experiences are ‘percept-like’ is less
straightforward but is corroborated by brain-imaging evidence showing that
regions of the visual cortex are engaged during synaesthesia. To give one exam-
ple, Nunn et al. (2002) studied synaesthetes with ‘coloured hearing’ for whom
spoken words elicited colours. During brain imaging (using fMRI, i.e. func-
tional magnetic resonance imaging) the synaesthetes, more than controls, acti-
vated regions of their visual cortex, including a region specialized for colour
perception (area V4), when listening to speech with their eyes closed.
The developmental form of synaesthesia, considered in this chapter, is
believed to emerge early in life and persist throughout the lifespan. A predisposition
to synaesthesia runs in families but the particular associations (e.g. ‘A’
being red or blue) do not (Barnett et al. 2008), and neither is the modality of
the concurrent strongly heritable (Ward, Simner and Auyeung 2005). Genetic
differences have been found to distinguish synaesthetes from unaffected family
members, but no specific genes have been isolated yet (Asher et al. 2009; Tomson
et al. 2011). One theory is that all infants start out in life as synaesthetic but lose
the ability as part of the normal maturation process (Maurer and Mondloch
2006). For instance, infants show far less neuronal specialization in response to
unimodal sensory stimuli and are born with an exuberance of connections that
are cut back during development (for a review see ibid.). So one possibility is
that this normal maturation process is disrupted in synaesthetes due to genetic
differences (Baron-Cohen 1996). Of course, the environment clearly plays a
role too, if only because the most common inducers (e.g. linguistic symbols) are
not themselves innate. What is less clear is whether this supplants other forms
of synaesthesia that may date back to infancy (e.g. sound–vision synaesthesia).
In addition to developmental forms, synaesthesia can be acquired temporarily
as a result of certain hallucinogenic substances or sensory deprivation includ-
ing damage to the sensory organs (summarized in Ward 2013).
Visual experiences to music in synaesthesia
The first recorded case of synaesthesia was that of Georg Sachs in 1812, who
briefly mentioned colour associations to music in addition to a more detailed
account of grapheme–colour synaesthesia (Jewanski, Day and Ward 2009).
However, the earliest detailed account of synaesthetic experiences to music
was that of the Nussbaumer brothers in 1873. Nussbaumer describes how
he, and his brother, experienced colours for different tones. Spatial aspects
(including shape) of the synaesthesia were not described in detail, but the
colour experiences were noted to be ‘inside his head, where it starts at the
temple as a band of colors, which go to the middle of the forehead’ (Jewanski
et al. 2011: 297). In the same year, Lussana (1873) describes colour experi-
ences triggered by the human voice. However, it was not until the 1880s that
multiple cases were documented and contrasted (Bleuler and Lehmann 1881),
and it was in this decade that the term ‘coloured hearing’ took root to describe
synaesthesia more generally.
Colour associations to music tend to be linked either to the pitch class (C, D,
F♯, etc.) including musical keys, or to the pitch height. For those in whom col-
our is linked to pitch class, this is invariably linked to the possession of perfect/
absolute pitch. In these cases, the same note in different octaves is linked to the
same colour (e.g. Carroll and Greenberg 1961; Rogers 1987). In some instances
the colours appear to be derived from the letter-names belonging to the notes
themselves: so, for example, both the letter ‘A’ in a passage of text and the note
‘A’ in a passage of music may be perceived as red. Note that in this case the red-
ness of the note derives from its culturally assigned label, not from any quality
of the sound. For those in whom colour is linked to pitch height, sounds of a
similar pitch height tend to have similar colours, but those that are more distant
(including the same note an octave apart) tend to be less similar. In these cases,
there is a general tendency for notes that are higher in pitch to be lighter in col-
our (more yellow, for example; Ward, Huckstep and Tsakanikos 2006). That
is, although synaesthetes differ greatly in their precise choice of colours, there
tend to be commonalities between synaesthetes (and, indeed, nonsynaesthetes).
Aside from pitch, the colour of musical notes may be influenced by timbre
(ibid.) or, in a few cases, solely determined by timbre (Day 2009). Famous cases
of music–colour synaesthesia include Messiaen (Bernard 1986) and, possibly,
Scriabin (Myers 1915).
Visual experiences aside from colour have been less well documented in
the literature. Zigler (1930) noted synaesthetic shapes associated with instru-
ments (colours were noted too). To document this, a variety of instruments
were brought to the laboratory and played at three loudness levels, in differ-
ent durations (staccato, sustained) and using various pitches. The synaesthetic
shapes had a three-dimensional and dynamic quality to them. The basic shape
tended to be determined more by the instrument than by other characteristics
(loudness, pitch, duration). It was also noted that the two synaesthetes tested
differed greatly in how the shapes were described (see Table 9.1). Increasing the
duration of a note led to elongation of the shape or duplication of the form
(e.g. spherical forms passing across the field of vision in a definite direction).
TABLE 9.1 The ‘tone shapes’ reported by Zigler (1930)
Instrument | Subject A | Subject B
Flute | Thimble or acorn cup | Hollow tube
Saxophone | Cup with solid inner core | Bursting of a mass into rough, jagged and splintery particles
Bugle | Morning glory or pipe | Sphere with opening on upper side
Harmonica | Series of spatially distributed discs | Flat rectangle
Jazz whistle | Thick waving streamer | Lumpy dough-like elongated mass
Simplexophone | Dagger | Megaphone of very vague outline
Musical saw | Elongated globule with jagged surface | Yards and yards of round ribbon-like material
Cello | Flat horizontal base with spring-like vertical projections | Thick ribbon
Violin | Tube with enlarged nodules | Ribbon much thinner and smaller than that of cello
Piano | Quadrangular blocks | Spheres
For instance, one synaesthete saw the forms pass from left to right starting from
two feet away from the face and ending much further away. In orchestral pieces,
the perceived shapes tend to reflect a small number of instruments that are
attended to (so the shapes of one instrument gain prominence). When listening
to a duet, one synaesthete stated that ‘both instruments of a duet united in such
a way that the larger form occupies a lower position and appears to support the
smaller one with which it is connected by a small thread-like projection’.
Karwoski and Odbert (1938) also emphasized aspects of vision other than
colour by requiring their participants to draw or describe visual experiences to
music, including whole pieces as well as isolated notes. Some of their synaes-
thetes reported rather localized visualizations such as ‘cloud, film or veil effects
which may billow vaguely’, or ‘points or limited areas of colour, which may
move or expand or contract with changes in the music’. These can be contrasted
with other synaesthetes who reported ‘developing bands or ribbons of colour’
that ‘almost always show a left-right development’ (ibid.: 9). These could be
single bands of colour that change colour or move with the pitch, or multiple
bands linked to different instruments, rising and falling with changes in pitch.
For such synaesthetes there are, apparently, shapes within shapes (the overall
shape of the melody, and the shape of particular instruments and notes).
In the contemporary literature, Mills, Boteler and Larcombe (2003) docu-
ment the shapes and colours of music and notes in a synaesthetic student (GS).
The synaesthesia operates at multiple levels such that individual notes have a
colour and shape, but longer sections of music exist as ‘landscapes’ or ‘maps’
of coloured lines and blocks: ‘It’s easier actually for me to read my colour map
than to look at the music, because I look at the map I came up with and it’s
kind of like a gauge of how it’s going to sound. Whereas I look at the music
and I can’t always tell what it’s going to sound like’ (ibid.: 1363). The tempo
of the music determines the dynamic changes within the landscape. The pitch
of a note determines its size (high pitch is smaller) and position (high pitch is
higher). Shape is strongly related to timbre, with similar instruments eliciting
similar shapes, and, to a lesser degree, to pitch.
Although synaesthetic experiences are vivid and highly variable, it has been
noted for many years that there are certain ‘rules’ by which sounds and vision
tend to be linked (Marks 1978). For instance, high pitches tend to be judged as
brighter, smaller, higher in position and less rounded in shape (e.g. Marks 2004).
These have recently been termed ‘correspondences’ and their relationship with
synaesthesia itself is debated (Spence 2011). One account suggests that syn-
aesthesia exists on a continuum from strong to weak, in which ‘weak synaes-
thesia’ is essentially normative experience (Martino and Marks 2001). Others
have argued that these correspondences provide a starting-kit for linking sen-
sory dimensions but with synaesthesia being a qualitatively (rather than quan-
titatively) different developmental outcome (Ward et al. 2006). It is certainly
the case that many of these correspondences can be observed in the first few
months of life and, hence, are unlikely to be a product of language or culture.
For instance, infants presented with a tone ascending in pitch will look towards
a visual object moving upwards (rather than downwards) or a shape that is
becoming spikier (rather than more rounded; Walker et al. 2010). Such effects
have also been argued to underpin sound symbolism in language, i.e. the non-
arbitrary pairing of shapes with names (Ramachandran and Hubbard 2001).
Finally, Ward et al. (2008) attempted to capture the multifaceted nature of
synaesthetic visual experiences to music using animated clips to sounds (single
tones) produced from detailed drawings and descriptions. These animated clips
captured colour, texture, shape, movement, size and location (albeit constrained
to two dimensions). Across the synaesthetes there was a general tendency for
music to be seen from left to right (i.e. following western reading conventions).
Sustained notes tended to produce elongated shapes, and staccato notes tended
to be more rounded. Ward et al. (2008) also presented the animated audiovi-
sual clips to nonsynaesthetic participants in various experimental settings. For
instance, the clips could be altered such that the colour was changed or the
orientation rotated and then presented alongside the original animations. In
these situations, there is a tendency for nonsynaesthetes to prefer the original
(synaesthetic) clip over the manipulated clip when asked which clip is a better
depiction of the sound. The same is also found when the clips derived from syn-
aesthetes are directly contrasted with clips derived from nonsynaesthetes asked
to imagine the sounds visually. Thus, the correspondences that link music and
vision in synaesthetes are readily accessible to those who lack synaesthesia and
are, in some sense, ‘preferred’.
First-person perspectives of synaesthetic music and shape
This section considers some first-person perspectives on synaesthetic shape
and music. It begins with quotations from the literature and is followed by an
interview with a synaesthete about her experiences. Most of the psychology
literature has considered shape responses to individual notes and instruments,
but synaesthetes themselves have discussed the shape of musical pieces and also
how shape may influence memory for music and performance.
The famous psychiatrist Eugen Bleuler described how his own synaesthetic
experiences to music served as a memory aid: ‘it is generally not the sounds
themselves, but only their … [associated visual experiences] that are remem-
bered because perception of the latter is much easier’ (1913: 11–12).
Söffing (2006: 104) describes how she can compare different musical perfor-
mances of a friend using her synaesthesia:
How was I supposed to compare the pieces? What did I hear yesterday?
She played more hectically, faster or slower, quieter. But what about a
precise description of how she played? I was unable to express it in musical
terms. At the point I suddenly realised that coloured shapes appeared.
I remembered these as visualisations. So when she played, I was able to
refer to my visualisations and see if she played exactly in accordance with
its appearance. If she played faster, harder or softer, the visualisation
would change: I was able to refer to the visualisation and tell her the exact
point that had been played differently.
Sinha (2009: 200–2) describes how she uses her synaesthesia to analyse music
structurally and decompose it into parts (multiple voices):
I have always had the ability to see the form and internal structure of a
piece of music on my internal screen like a chart or a three-dimensional
film… Even without being able to read music I can always tell precisely
from the beam of sound how a piece is constructed, and I always know
exactly where we are within it. The sound beam always comes towards
me at an angle from the right; just in front of me it makes a slight turn
and moves away to the left. (This happens even if the sound source is
located on my left…)… The first few times I heard the song ‘Walking
Down the Street’, without noticing the details, I saw a thick but not
particularly solid beam of sound. It can be most readily compared to
a thick rope formed of a combination of different materials (but not
twisted like a cord)… As soon as I turned my attention to listening to
each voice individually, the beam began to unravel, until the five vocal
parts and the sixth for the rhythm separated out quite clearly and dis-
tinctly… The ray of sound belonging to each one can be described
precisely.
Bass: consists of very long, soft fibres, like mohair—very soft to the
touch.
• Colouring: dark, almost black with a little dark grey.
• Consistency of the sound beam: not very firm: it would be easy
to squeeze it in the hand and it would immediately return to its
original size. Not spongy or grassy but rather fibrous, dry, has no
‘lubricant’.
• Diameter: fairly large, 4–5 centimetres.
Alto: not smooth, and fibres much shorter than for the men’s voices.
• Colouring: saturated mid-golden yellow with scattered dark golden
lights.
• Consistency of the sound beam: firm core, getting softer towards
the edge.
• Diameter: small, 1.5–2 centimetres.

AN INTERVIEW WITH A SYNAESTHETE
Some of the themes that emerged above reoccurred during a recent interview
between the author (JW) and a synaesthete (RP), which is worth quoting at
length here, both for its inherent interest and because it covers so many aspects
of synaesthetic experience. RP is a native-German-speaking woman with
grapheme–colour and auditory–visual synaesthesia.
JW: Do you have early memories or experiences of music and
synaesthesia?
RP: I am aware of synaesthesia since 2002 so when I was a child I didn’t
know anything about synaesthesia so I just took for granted all that
my senses would give me. It was kind of like normal but it felt like
I was no good because I really sucked at learning notes.
JW: In terms of your early motivation to become engaged in music, did
this come from your parents or from you being interested in certain
things?
RP: My parents are Baptists too so I was in church every Sunday until I
was twenty. I heard a lot of music. Imagine being a teenager in church
on a Sunday—that sucks—so you just flow away with the music. I
dreamed myself a way out of this boring situation, somehow… I love
music a lot, especially male voices, and my father was singing in a choir
and that I enjoyed a lot. So we all learned instruments, we have five
siblings, and I learned accordion and piano and flute. My music teacher
was a very nice person and she somehow felt that I learned in a way that
was different from other kids. So she did not concentrate so much on
the written notes but more played it to me several times, so that I could
play it without the [written] notes—just listening. I think even at that
time I did it with my inner picture that was creating in my mind, my
memory, and I used that as an orientation for the piece.
JW: What about learning formal music—musical notation and things
like this?
RP: I am unable to do that. I cannot read notes. That was because the
other types of synaesthesia come in the way. There are also colours
for the letters of the notes, and the lines [of the stave] are numbered
so they have colours too. My fingers are numbers, and they have
colours too, and the piano keys have colours too. So I have to
translate it, and that is too much colour stuff, and I was always trying
to find a bridge from seeing a black dot on a piece of paper telling me
which finger has to go on which key. That was like talking a language
I do not understand. I could not translate it.
JW: Can you describe what your synaesthetic experiences to music
are like?
RP: It is like ‘just there’. I have to relax and close my eyes to be aware of
it. It is like being on a train and looking out and being aware of the
reflections on the window. Normally, you would not do this as you
would just look out of the window. But when you do that you see
that there is a second layer, with a different movement.
JW: As you listen to an extended piece of music do you see the part that
you are currently listening to, or do you see the part that you have
been listening to—like the history of the music?
RP: It would be like a path, like you see on Google Earth, so you could
zoom in and see the part that is just happening right now but you
could also zoom out and see, kind of like, the whole piece (like the
melody line)…
JW: But you don’t see ‘the future’, i.e. the music that you haven’t
yet heard?
RP: Except if I know the piece perhaps. That would mean I have seen it
before.
JW: How much level of control do you have, either in terms of zooming
in, or if it is a very complex piece of music I guess you could focus on
different instruments or is [what you see] very fixed?
RP: If there are certain instruments I could focus on one or another and
the rest would, like, go into the background—greyed out or faded out
a little bit. So you can bring something to the foreground… You can
concentrate and filter something out if you wish or you can look at
the whole.
JW: If we think about music at the very small scale, so single notes and
phrases, would these have shapes?
RP: Yes, it has and these are the pictures I sent you—a single note, a
human voice, humming. He does this technique of overtone singing
and this is the moment that the overtone arrives. [Figure 9.1]
JW: If you have an ascending tone then does it [the synaesthesia] go up
in similar ways?
RP: Yes, similar to the melody. It goes up and down according to the
melody.
JW: If you had, say, a flute and double bass how would this separate out
in the line? Would you see different elements?
RP: They would be on different parts of this inner space. It is three-
dimensional. It would be like two beams, or flows, or flying scarves.
The lower instruments would also be on a lower place in that room,
in that space.
JW: What determines what is close to you in the depth plane?
RP: I don’t know. Usually I am able to move around in that space and
take a look from several points of view. But it also may happen that
suddenly it sucks me in and it goes through me. Imagine a huge
beam like a rainbow, it is like the game ‘Supermario in Rainbow
Land’, it showed these beams in the air that move and if something
moves I am standing in the middle of this thing. It all goes through
my body.
FIGURE 9.1 RP’s synaesthetic experience to overtone singing by Wolfgang Saus. This singing
technique filters parts of tones from the overtone spectrum of the voice so that they are perceived as
individual and separate tones. RP describes her experiences as follows: ‘The first picture [Figure 9.1a]
is a snapshot—a frozen tone. A single humming or tone is very dynamic, carrying and expanding. It
has a middle axis that is floating free through space, moving up and down according to the melody.
The wing-like shapes are not wings but more or less radiating beams of energy. The second picture
[Figure 9.1b] is the magical moment when the overtone appears. It takes off like a phoenix. The
overtone sounds similar spherical like a glass harp and for me even looks similar, but there is even
more: he changes the sound-image of the humming sounds below and creates a new unity/common
form. Most times the notes move from left to right, but I can also go around them and view them from
above, below, front or back. The third picture [Figure 9.1c] shows the same scenario from the front.’
Colour versions of these images are available at the companion website.
JW: To what extent is the colour of [heard] music as important as shape
and texture?
RP: It is the texture, the shape and the movement and from this snake—or
whatever you call it, flying through space—it has ‘energy beams’ going
up and down. So it’s like a three-dimensional wonder-dragon but
there isn’t really a single colour when I look at a melody. Yesterday,
a friend of mine did me a recording of Lorde’s song ‘Royals’. He
did a recording of the song in his studio with his friends and I was
listening to it and, at a certain point, it was like some pink missing,
and I realised that in the refrain there are usually four voices and one
of them was missing. Usually I would not say that a whole piece is
coloured.
JW: If music has lyrics then would the words be a separate line within the
music itself and would the words be a source of colour?
RP: So the words do have shapes too but the shapes would be integrated
in the melody beam. They would be inside.
JW: So would the words move in space depending on whether sung in
high or low pitch?
RP: Yes and even a single word can look very funny.
JW: So words can be stretched if you say them l-o-o-o-ng or whatever?
RP: Yes, of course.
JW: But you see the letters spelled out or the sound of the words?
RP: I tried to paint the word ‘security’ because it looked so funny, but
then I gave it colours of the letters of the word ‘security’.
At the end of the interview, RP was asked whether there was anything else that
she wanted to add.
RP: Are you only talking about the visual stuff rather than what you feel
in your body?
JW: You mean the ‘chills’ or you mean other things?
RP: The chills would be positive but there would also be the opposite. An
example would be that I hate Whitney Houston, I cannot stand it. If
it is anywhere, I go out of the room. I cannot stand the way that the
voice feels in my body. It is like tooth pain but it is not in my teeth, it
is somewhere else, like electricity, because it is too perfect and if I had
to paint it, it would be like the metal chrome; a chrome-coloured
freshly polished car bumper. That is what Whitney Houston looks
like but it also like how it feels—perfect but cold—and I really cannot
stand her voice.
DISCUSSION
There are several general points emerging from the interview that are echoed
elsewhere. First, the experiences themselves are very dynamic: changes in the
synaesthetic visions reflect changes in the temporal characteristics of the music.
This operates at multiple levels, from notes to single voices/instruments within
a more complex piece, to the piece as a whole. To some extent, the synaesthete
can attend to these levels and see the structure of the music come apart.
The interview with RP, and comments of other synaesthetes, also suggest
that their synaesthesia can act as an internal ‘map’ of the music that, to some
degree, makes an external representation of the music (a written score) less
important. There are certain structural similarities between typical synaesthetic
visualizations and western musical notation, including high pitch being higher,
and a tendency to read from left to right. However, in other ways they differ
(e.g. low pitch being bigger and darker), and it may often be the case that these
two musical systems (symbolic, synaesthetic) are not completely aligned.
Finally, the synaesthetic shapes of music typically are not just two-dimen-
sional but have solidity and texture that lends them an almost tactile qual-
ity. They are probably not providing a true auditory–tactile synaesthesia, but
rather there is a sense of what the visualizations would feel like if touched.
One of Zigler’s (1930: 278) synaesthetes describes the surfaces of the shapes in
terms such as ‘hard, soft, velvety, jagged, uneven, smooth, polished, streaked,
crystalline’. Sinha (2009: 201) describes it thus: ‘I can always describe exactly
what it would feel like in the palm of my hand and on the inside of my fingers
if I were to stroke the beam of sound: soft, prickly, pleasant, warm, grassy/airy,
or solid.’
Summary and concluding thoughts
In summary, although research into synaesthesia is enjoying something of a heyday,
the topic of shape remains largely neglected. This is perhaps because it is not
as easy to measure as colour. However, synaesthetes themselves often note that
shape is at least as important as colour when it comes to visualizing music. The
extent to which this leads to genuine benefits in musical cognition is unknown,
but it will almost certainly lead to individual differences in strategies for musi-
cal perception and performance (relying more on internal ‘maps’ of the music).
Nonsynaesthetic musicians are likely to use similar music–shape correspon-
dences. In effect, the same set of underlying principles might manifest themselves
in a number of ways depending on developmental trajectories (styles of learning,
genes). In synaesthetes, they are vivid, precise (e.g. in hue), automatic and subjec-
tively ‘seen’. In nonsynaesthetes, the use of shapes can range from the abstract or
metaphorical through to mental images that are under some degree of control.
References
Asher, J. E., J. A. Lamb, D. Brocklebank, J. B. Cazier, E. Maestrini, L. Addis, M. Sen,
S. Baron-Cohen and A. P. Monaco, 2009: ‘A whole-genome scan and fine-mapping linkage
study of auditory-visual synesthesia reveals evidence of linkage to chromosomes
2q24, 5q33, 6p12, and 12p12’, American Journal of Human Genetics 84/2: 279–85.
Barnett, K. J., C. Finucane, J. E. Asher, G. Bargary, A. P. Corvin, F. N. Newell and K. J.
Mitchell, 2008: ‘Familial patterns and the origins of individual differences in synaesthe-
sia’, Cognition 106: 871–93.
Baron-Cohen, S., 1996: ‘Is there a normal phase of synaesthesia in development?’, Psyche
2, http://hstrial-tridenttechnical.homestead.com/BaronCohen1996.pdf (accessed 9
April 2017).
Bernard, J. W., 1986: ‘Messiaen’s synaesthesia: the correspondence between color and
sound structure in his music’, Music Perception 4: 41–68.
Bleuler, E., 1913: ‘Zur Theorie der Sekundarempfindungen’, Zeitschrift für Psychologie
65: 1–39.
Bleuler, E. and K. Lehmann, 1881: Zwangsmaessige Lichtempfindungen durch Schall, und verwandte
Erscheinungen auf dem Gebiete der anderen Sinnesempfindungen (Leipzig: Fues’s
Verlag).
Carroll, J. B. and J. H. Greenberg, 1961: ‘Two cases of synaesthesia for color and musical
tonality associated with absolute pitch’, Perceptual and Motor Skills 13: 48.
Day, S. A., 2009: ‘In search of a means to explore’, in A. Dittmar, ed., Synaesthesia: A Golden
Thread through Life? (Essen: Die Bleue Eule), pp. 209–12.
Dixon, M. J., D. Smilek and P. M. Merikle, 2004: ‘Not all synaesthetes are created equal:
projector vs. associator synaesthetes’, Cognitive, Affective and Behavioral Neuroscience
4: 335–43.
Eagleman, D. M. and M. A. Goodale, 2009: ‘Why color synesthesia involves more than
color’, Trends in Cognitive Sciences 13/7: 288–92.
Eagleman, D. M., A. D. Kagan, S. S. Nelson, D. Sagaram and A. K. Sarma, 2007: ‘A stan-
dardized test battery for the study of synesthesia’, Journal of Neuroscience Methods
159: 139–45.
Grossenbacher, P. G. and C. T. Lovelace, 2001: ‘Mechanisms of synaesthesia: cognitive and
physiological constraints’, Trends in Cognitive Sciences 5: 36–41.
Jewanski, J., S. A. Day, J. Simner and J. Ward, 2011: ‘The development of a scientific under-
standing of synesthesia from early case studies (1849–1873)’, Journal of the History of
Neurosciences 20: 284–305.
Jewanski, J., S. A. Day and J. Ward, 2009: ‘A colorful albino: the first documented case of syn-
aesthesia, by Georg Tobias Ludwig Sachs in 1812’, Journal of the History of Neurosciences
18: 293–303.
Karwoski, T. F. and H. S. Odbert, 1938: ‘Color-music’, Psychological Monographs 50: 1–60.
Lussana, F., 1873: Fisiologia dei colori (Padua: Sacchetto).
Marks, L. E., 1978: The Unity of the Senses (London: Academic Press).
Marks, L. E., 2004: ‘Cross-modal interactions in speeded classification’, in G. Calvert, C.
Spence and B. E. Stein, eds., The Handbook of Multisensory Processes (Cambridge,
MA: MIT Press), pp. 85–105.
Martino, G. and L. E. Marks, 2001: ‘Synesthesia: strong and weak’, Current Directions in
Psychological Science 10: 61–5.
Mattingley, J. B., 2009: ‘Attention, automaticity and awareness in synesthesia’, Annals of
the New York Academy of Sciences 1156/1: 141–67.
Mattingley, J. B., A. N. Rich and J. L. Bradshaw, 2001: ‘Unconscious priming eliminates
automatic binding of colour and alphanumeric form in synaesthesia’, Nature 410: 580–2.
Maurer, D. and C. J. Mondloch, 2006: ‘The infant as synesthete?’, Attention and Performance
21: 449–71.
Millet, J., 1892: Audition colorée (Montpellier: Hamelin frères).
Mills, C. B., E. H. Boteler and G. K. Larcombe, 2003: ‘ “Seeing things in my head”: a syn-
esthete’s images for music and notes’, Perception 32: 1359–76.
Myers, C., 1915: ‘Two cases of synesthesia’, British Journal of Psychology 7: 112–17.
Nunn, J. A., L. J. Gregory, M. Brammer, S. C. R. Williams, D. M. Parslow, M. J. Morgan,
R. G. Morris, E. T. Bullmore, S. Baron-Cohen and J. A. Gray, 2002: ‘Functional mag-
netic resonance imaging of synesthesia: activation of V4/V8 by spoken words’, Nature
Neuroscience 5: 371–5.
Ramachandran, V. S. and E. M. Hubbard, 2001: ‘Synaesthesia: a window into perception,
thought and language’, Journal of Consciousness Studies 8: 3–34.
Rogers, G. L., 1987: ‘Four cases of pitch-specific chromesthesia in trained musicians with
absolute pitch’, Psychology of Music 15: 198–207.
Rothen, N., A. K. Seth, C. Witzel and J. Ward, 2013: ‘Diagnosing synaesthesia with online
colour pickers: maximising sensitivity and specificity’, Journal of Neuroscience Methods
215/1: 156–60.
Simner, J., 2012: ‘Defining synaesthesia’, British Journal of Psychology 103: 1–15.
Simner, J., C. Mulvenna, N. Sagiv, E. Tsakanikos, S. A. Witherby, C. Fraser, K. Scott
and J. Ward, 2006: ‘Synaesthesia: the prevalence of atypical cross-modal experiences’,
Perception 35: 1024–33.
Sinha, J., 2009: ‘E is not the same as E, or: 8 + 8 = 19’, in A. Dittmar, ed., Synaesthesia: A Golden
Thread through Life? (Essen: Die Bleue Eule), pp. 209–12.
Söffing, C., 2006: ‘Wenn das Leben grau erscheint, lege ich mir rot-orange-gelbe Musik auf’,
in J. Jewanski and N. Sidler, eds., Farbe—Licht—Musik (Bern: Peter Lang), pp. 104–9.
Spence, C., 2011: ‘Crossmodal correspondences: a tutorial review’, Attention, Perception, &
Psychophysics 73/4: 971–95.
Tomson, S. N., N. Avidan, K. Lee, A. K. Sarma, R. Tushe, D. M. Milewicz, M. Bray, S. M.
Leal and D. M. Eagleman, 2011: ‘The genetics of colored sequence synesthesia: sugges-
tive evidence of linkage to 16q and genetic heterogeneity for the condition’, Behavioural
Brain Research 223/1: 48–52.
Psychological Science 21/1: 21–5.
Ward, J., 2008: The Frog Who Croaked Blue: Synesthesia and the Mixing of the Senses
(London: Routledge).
Ward, J., 2013: ‘Synesthesia’, Annual Review of Psychology 64: 49–75.
Ward, J., B. Huckstep and E. Tsakanikos, 2006: ‘Sound-colour synaesthesia: to what extent
does it use cross-modal mechanisms common to us all?’, Cortex 42: 264–80.
Ward, J., S. Moore, D. Thompson-Lake, S. Salih and B. Beck, 2008: ‘The aesthetic appeal
of auditory-visual synaesthetic perceptions in people without synaesthesia’, Perception
37: 1285–96.
Ward, J., J. Simner and V. Auyeung, 2005: ‘A comparison of lexical-gustatory and grapheme-
colour synaesthesia’, Cognitive Neuropsychology 22: 28–41.
Zigler, M. J., 1930: ‘Tone shapes: a novel type of synaesthesia’, Journal of General
Psychology 3: 277–86.
Reflection
Timothy B. Layden, synaesthete
For me every sound has its own shape or form. This sense of shape is like
objects in my periphery. They move around me and change in size and struc-
ture depending on how the sounds change. The experience is more intense
with more complex sounds and when the source of the sound is not visible.
I wonder if it is my brain creating the visual for the sound. The shapes seem to
reflect the sound: liquid sounds often create fluid bubbly shapes; sharp clanging
sounds have more angular shapes like growing crystals; bass sounds are large
and expanding. When there is a loud, seemingly singular sound, this can create
a sense of space around me as if I were inside the shape itself. When many
sounds occur at once, the shapes often combine, creating a complex structure
or a texture. These shapes sometimes blend together, rather as sounds do in the
environment, creating a moving landscape. These experiences are part of how
I sense the world and rarely stand out as distractions. Sometimes, however, a
sudden, unexpected sound will evoke a synaesthetic experience that is distract-
ing, drawing my attention away from whatever I might be doing.
There are certain sounds and combinations of sounds that I particularly
enjoy and seek out. I am a lover of experimental and improvised music, as these
can create unexpected and exciting shapes. I also enjoy the sounds of a busy
urban environment with its bells, screechings, fountains and many pounding
feet. I enjoy industrial sounds at a distance but not close up: when sounds are
too loud and incessant I can become overwhelmed. I love the early morning
calling of birds in spring; their shapes are like a garden of wildflowers.
As an artist I have used my experience of synaesthesia as a source for my
work. I record and create sounds that I use to make music and soundscapes
filled with shapes that I draw, paint, sculpt and write about. Through the pro-
cess of observing my experience of sound shapes, I have become more aware
of them, and the experience has become more detailed. I also feel that I have
become a better artist by using my synaesthesia in my work.
FIGURE R.39 Timothy B. Layden, Dark Glistening. A colour version may be seen at .
A simple slow bass in my composition Dark Glistening (sound file at )
appears as oddly shaped weightless black stones floating in a liquid space.
Sharp bright lengths of thick thread, like tails or snakes, lash around the
stones in the way that flags flap in heavy wind. Deep vibrations interfere with
the sense of liquid space and turn the weightless stones to spinning orbs that
leave soft, dark trails tracing around my periphery. The movement accelerates,
creating choreography of transparent worlds: shiny, glistening. A slapping
sound creates an explosion of spiny needles as on a thistle. A soft hiss
of yellow and grey lines streaks almost imperceptibly between everything.
Reflection
Stephen Hough, pianist
Musicians are always talking about ‘shape’ in reference to phrasing, but mostly
I think they merely mean that something has shape rather than being ‘shape-
less’—always a derogatory term. Shape in this sense means direction, a start
and a finish with something pleasing in the middle. Seldom are the visual pat-
terns of a draughtsman or an artist relevant to a musician. Nevertheless, only
today I was doing an interview and trying to describe to the journalist a CD
I recently recorded of music by Scriabin and Janáček. Strange bedfellows, it
would seem, despite some parallels in their Slavic origins and their eccentric
visions. But what makes them so different from each other (and therefore fas-
cinating in juxtaposition) is actually related to shape. Scriabin is all seductive
curves (across the phrases and up through the exotic harmonies), whereas
Janáček is angular and fragmented, motives repeated and insisted on like an
army of elbows. Perfumed art nouveau versus bleak if passionate cubism.
Reflection
Alex Reuben, filmmaker
I have a background as a DJ and in art and design. I feel there’s a rhythm in a
group of people and how they move, a sense of contagion through improvisation
and structure passing from one to another, a group experience. My films
feature dance and are directed to and by sound. When I DJed, the music spa-
tially cut up the nightclub for me: it created a sculptural sensation, a form, a
conjoined physical, emotional weight.
I saw a dancer in a club with a unique, sensual fluidity. Louise James was
dancing to Latin-Jazz. She seemed to encompass a whole history and personal-
ity in the way she moved. At around the same time, I became aware of a physi-
cal reaction when I looked at large, abstract, American expressionist paintings,
a greater response than I would have to a sculpture. For example, looking at
a Rothko physically drew me in, to the point that I had to stop myself from
moving towards it.1 I tried to translate these sensations into a film, Que Pasa
(2001),2 by focusing on the core of Louise’s body and by isolating shape. What
you see is a moving line: your perception depends on whether you engage with
it as a figure or not.
Que Pasa is an uncut sequence, reframed in editing, with movement dictated
by the beautiful Horace Silver soundtrack I had played as a DJ when I origi-
nally saw Louise dance: the connection is important. The framing and matting
techniques offer a way of focusing physically on the connection between music
and dance. They choreograph by moving shapes, capturing an anthropology
encoded in the dancer’s way of moving that had so enchanted me. At some
points the image is actually blank, but audiences do not appear to experience
this: it’s the sound that is carrying our sense of movement. The shapes continue
in space and ‘off camera’, away from our visual frame of reference. It’s like
remote painting, projection.
In Big Hair (2003)3 there’s a sequence where the geometric set design becomes
full frame and is travelling very slowly across the screen. In effect, the shapes,
like the characters, are also dancing. The effect should be about as interesting
as watching paint dry, and yet audiences seem to like it. I think it’s the combi-
nation of the music and the clashing, bright red and pink that pulls us along,
engages us: it’s a conversation, mirrored by the performers. I composed the
soundtrack with a very simple, three-note phrase which repeats over and over
again—what cognitive scientists might describe as an ‘earworm’. I wanted the
film to be like a TV commercial and had written the music to be catchy. I also
used it in a radio advert. Of course, I’m not consciously thinking of any of this
as I make the films. I’m working instinctively and emotionally. It’s expressive
and I love doing it.
Afterwards I became interested in what was moving me and audiences. I’d
previously collected articles on evolution and began rationalizing what was
going on in my films, and this led me naturally towards research in cognition.4
I learned from the work of neuroscientist Chris Frith that we are far more
engaged in contagion, reacting to the actions and reactions of others, and
that we have less agency than we may like to believe, while the work of
neuroaesthetician Semir Zeki showed me that some of the most popular and
engrossing paintings are those that carry aesthetic questions—dark spaces where the
viewer must fill in the gap. This taught me something about Que Pasa, that
audiences went on a narrative journey through moving shapes, even when noth-
ing appeared to be happening.
I wondered why, while I feel a very powerful response to space in architec-
ture and the countryside, I have such a strong, physical reaction to paintings.
So in Line Dance (2004)5 I used twenty-four-camera, motion-capture technol-
ogy to create 2D figures in 3D space, a virtual space. My ‘choreogeography’,
co-performed in rehearsal with Afua Aweku, is re-choreographed on computer
with a ‘stick-figure’ model, using data from real dancers. I instinctively felt that
the simple shapes could best reveal movement: I created them in a computer
with the help of skilled technicians. Like the dance, it’s a collaboration, a digi-
tal continuation of the original choreography and improvisation. This digital
choreography is the manipulation of shape and colour. It gives me much the
same sensation as when I am drawing with pencil and paper, an exciting rhythm
and feeling that runs throughout my body, a sense of ‘being in the moment’
or ‘zone’. I have this same feeling whether it’s me operating the pencil or com-
puter, or when working with skilled craftspeople as in this case.
The ‘projection’ also happens when I compose music. I’m not a trained
musician and cannot play an instrument. I compose it in my head and hum it
to musicians. I can ‘audiolize’ exactly what I want into a material sound just
as I visualized the figures. When I edit the soundtrack with pictures, I feel the
same sense of composition.
In Line Dance, the lines and colours build, layer and overlap in much the
same way as a Jackson Pollock painting. If you look at a Pollock canvas
close-up, rather than a reproduction—Autumn Rhythm (Number 30),6 for
example—you begin to see layers of rhythm. It’s a very physical experience,
perhaps missed in a book or a postcard. When I look close up at his paintings,
I become lost or engulfed, as if swimming in rhythm. As we see on film,7
Pollock actually made his paintings very physically, moving around and over
the canvas, laid flat on the floor (‘action painting’). It’s a kind of dance and it’s
not haphazard. Pollock also did his paintings while listening to jazz 78s, par-
ticularly those of bebop pioneer Charlie Parker. His paintings have something
of the sophistication and skill of the musicianship at the heart of improvised
jazz. The core of my interest is my love of improvisation, and it’s fascinating
how the outcomes of this process can be seen in shapes, as if they contain raw
energy and magic.
Line Dance is a movie where shape in sound, movement and technology
dictates the film language. The motion-capture technology is the stalwart of
computer games and hi-tech cinema. It was originally developed by medical
practitioners to track and understand motion and injury—following on from
the pioneering work of photographers like Eadweard Muybridge. It strikes me
that perhaps the reason children and adults love computer games is that the
moving arrangement of sound in space, with travelling shapes on a screen, is
the nearest thing we have to a framed externalization of what the brain presents
to us internally as ‘consciousness’.
I find that with age, experience or interest in all these things, my perception
of the proportions of rhythm has got longer. So my films get longer as well, and
yet it’s still ‘the moment’ that I’m interested in. As my own love and work are in
cinema with its big sound and image, perhaps the attraction is the immersion in
the surrounding sound and silver screen: looking and listening instead to video
on a phone or laptop is a different experience, like postcard to painting. In
Routes (2008),8 made specifically for cinema, I feel as though I’m painting with
the microphone and camera in separate hands, often pointing them in different
directions. There’s the same sense of being ‘in the moment’, of art and design.
So when I’m recording live musicians and dancers I physically move: you could
say I’m improvising and creating a video choreography at the same time.
I’m instinctively interested in the movement of sound, and I think that
crosses over into shape and form. I’m predominantly responding to and being
led by music; likewise in editing. These longer films incorporate environment,
landscape and architecture. I call it ‘choreogeography’: they collect social
movement, not just dance. Audiences have said that the effect builds up and
that they are feeling narrative through their bodies throughout the films. I think
this audience experience is cumulative. Audiences are moved emotionally at the
end of the movies as if they’ve been told a story, except there are no words. The
narrative is music and shape, formed by movement that expresses creativity,
character and emotion. I’m interested in this narrative of moving sound and
shape, an emotional narrative, so that when you see a movie you have other
narratives going on somewhere in your brain: they might be formed by colours,
sunset, water, emotion, violence, sex. It might be any of the things that are in a
film, working at lots of levels. One of the things I’m trying to do is stretch the
‘now’ over the whole. And it almost feels like a dance itself, this relationship of
the whole and being in the moment.
When I teach sound and video projects to art students, something I empha-
size is texture. The relationship between texture in sound and image can be
found in shape. The easiest way to explain this is to illustrate how sound and
music have the equivalent of smooth and rough textures, like the surfaces of
three-dimensional materials; and then to say that there are as many variations
in sound qualities as there are in colours, and that screens have diverse, visu-
ally tactile properties, as in touch. They get it instantly. It’s then a small step
to discuss the relationship with technology and how music producers like Phil
Spector,9 Hank Shocklee (from ‘Public Enemy’)10 or Lee ‘Scratch’ Perry11 are
artists working in sound, musicians whose primary instrument is technology
and the recording studio. For me, in their recordings, the key element is the
texture and movement. That can move me as much as melody and rhythm.
In music and sound, production is almost like a feeling, a total whole when
I’m working with sound and image. I can’t distinguish them: it’s like painting
with sound. In editing there’s something quite visual in the pattern of film and
sound tracks; they make shapes that change according to the proportion of
your timeline. You can look at them on a big or a small scale, though when it
gets so big that I can’t see the overall form for long, I get uncomfortable. Once
I’ve done a focused bit of editing, I’ll always return to the longer scale so that
I’ve got a sense of proportion. It has to feel proportionate to something physi-
cal in my body, and in that sense it seems to have a shape that I feel.
I’ve expanded choreogeography from the one minute of Que Pasa into the
physical world around me. It’s very much to do with how I feel myself in the
landscape. I don’t just mean the physical landscape but also the landscape inside
our heads. I walk everywhere and have made the longer films with the same
sense of travel. I would put the graphic Que Pasa and Line Dance into that cat-
egory as well as Routes and Newsreel,12 which are shot quasi-realistically. The
former don’t have any spatial perspective or consistent figuration in them. It’s
intuitively to do with the line between consciousness and reality, or the physi-
cal world and the conscious world of sense data: visually making a fluid line in
Line Dance through the ever-present original two figures, without any cuts from
the figurative to the abstract, somehow seems to connect to that space between
consciousness and the physical world.
In my current work about improvisation, cognition and movement, I’ve real-
ized the importance for scientists of the fact that none of our senses works
independently. Our perceptions are fundamentally multisensory. This rhymes
for me because I’ve always felt that shape is something that I don’t just see: it is
something that I feel. For me, music is a very physical, emotionally embodied
experience. Perhaps thinking is moving, even when we are standing still.
10
Intersecting shapes in music and in dance

Philip Barnard and Scott deLahunta
Ubiquity, differentiation and ambiguity of shape
As the editors of this volume suggest in the Introduction, shape is a term more
directly allied with spatial geometry than with sound. Its referential range has
nonetheless usefully been repurposed to help share propositional and meta-
phorical understandings of musical theory and practice. These span levels
of musical form, connections of phrasing to meaning and embodied move-
ments that accompany musical performance. Movements that accompany
music naturally lead to questions about how ideas of shape in music and dance
intersect with each other. The co-occurrence of music and dance has been
universal across human history and cultures (Mithen 2005). That universal-
ity raises questions not just about multimodal perceptual integration and the
coordinated control of human action, but also about more ineffable qualities
of meaning and emotion. In addressing choreomusical1 relationships involving
shape or other properties, we must inevitably draw not just on theories of music
and dance but also on theories of mental and cultural systems.
While the idea of shape may well be ubiquitous in discourse about music,
the significance of the term inflates by an order of magnitude with its differenti-
ated use in dance. Body shapes created by dancers and observed by audiences
self-evidently lie at the heart of choreography. Posters, publicity materials and
reviews of specific choreographic works almost invariably use visual images
that draw attention to beauty or drama in the shape of a body (Figure 10.1a),
or to the configuration of several bodies shaped in space (Figure 10.1b). These
shapes may be embellished by a sense of the scenography within which the
movement is embedded. Figure 10.1c, for example, makes graphic reference
via a grid to the movement analyses of the nineteenth-century photographer
Eadweard Muybridge.2 For shapes in dance that are frozen in time, their visual
context inherently constrains the meaning space in which questions might be
posed and answers sought. Accompanying titling, captions and texts can signif-
icantly embellish and expand the range and specificity of insights, but explicit
photographic images of body shape and spatial surroundings in dance clearly
assume a ubiquity for which there is no simple equivalent in music.3
Any literature search will reveal the differentiated use of shape in discourse
about form and meanings in dance, particularly in the context of education.
In one dance curriculum,4 for example, the term ‘shape’ is taken to refer to
individual body shapes (the way in which dimensional space is used by the
body) and group shapes. The curriculum states that body shapes are present in
all actions in dance and that they convey meaning. Shapes with different but
specific attributes are identified, such as those with curving or organic shapes
(Figure 10.1a); open and closed shapes; symmetrical and asymmetrical ones;
harmonious and contrasting ones; centred and off-centred shapes (Figure
10.1b); or straight lines and angles (Figure 10.1c). The text then invites the
students to think about how differently nuanced meanings might, for example,
be expressed by curved versus angular bodily shapes as well as how shape for-
mation can be realized in personal space when moving on the spot or across the
space of a studio floor.
While many might agree about some basic aspects of the use of shape in
dance discourse, the picture becomes less clear when we segue to the dynam-
ics of movement over time. Music practitioners and theorists, across genres,
appear to share frames of reference provided by notation and nomenclature.
An associated body of music theory addresses hierarchical structuring of units
that are either parts of specific forms (e.g. exposition, variation, episode, fugue
or subject) or common to all tonal forms (e.g. phrase, period, theme or motive).
Since notational form in a music score indexes required properties of form over
time in sound, the common ground of shared notation and nomenclature in
music provides a scaffolding of huge value for discussing ideas about shape and
about how it relates to expressivity and meaning. Despite the presence of move-
ment notation systems such as Laban or Benesh, an equivalent scaffolding of
widely agreed nomenclatures does not exist in dance, a condition recognized
by humanities scholars (Jordan 2000: 84). This means that it may be easier to
discuss the dynamic properties of shape in music, and how these might relate to
nuances of meaning and interpretation, than to do so with the same precision
for dance.
On the other hand, avant-garde developments in both music and dance
practice evidence the degree to which shared nomenclature gives rise to
nontraditional concepts and formal innovation. For example, the idea of a phrase in
dance could be attributed to modern dance pioneer Doris Humphrey (1959: 68),
who stated that ‘the good dance should be put together with phrases, and the
phrase has to have a recognizable shape, with a beginning and an end, rises and
falls in its over-all line, and differences in length for variety’. Tracing developments
in dance after Humphrey, one quickly finds choreographers rejecting the
very notion, with Yvonne Rainer (1974) and Meg Stuart (reported in Burrows
1998) proposing either to minimize or to eliminate the use of the term ‘phrase’.
In developments in music in the twentieth century, radical experimentation
with and rejection of shared notation and nomenclature are evident in John
Cage’s 1969 book Notations, a collection of nontraditional graphic scores collected
from 256 individual composers. However, what Notations also makes
evident is the ubiquity of traditional historical and cultural notation and scoring
techniques in music, providing a clear point of departure for the musical
avant-garde and again pointing towards what appears to be lacking in dance.
FIGURE 10.1 Selected illustrations for the productions of (a) ATOMOS, (b) ENTITY and
(c) UNDANCE for Wayne McGregor | Random Dance (photos: Ravi Deepres)
Interestingly, various research projects initiated by contemporary choreographers
in the last fifteen years have sought to address this apparent gap in
dance’s notation and scoring traditions by seeking to combine visual recordings
of dance with annotation and other forms of image processing to produce
evidence of latent (present but not visible) form in shape and timing relationships
in dance. Figure 10.2a is a still from an annotated video illustrating alignments,
with which choreographer William Forsythe designs relationships in
space and time. Figure 10.2b, from the same project, looks at the dance from
above. Here patterns and shapes in time are illuminated through video processing
that reveals fleeting alignments, bursts of turn, clusters of action, and
horizontal and vertical flows. The Synchronous Objects website,5 and examples
from other choreographers developing similar resources based on their work
(deLahunta 2013), provide numerous additional illustrations of the intricacies
and challenges of developing documentation approaches, analytic and visualization
practices that begin to capture, for the purpose of communicating it,
what is present but perhaps latent and implicit in dance creation and performance.
The discourse surrounding these research projects often refers to making
‘choreographic thinking’ and ideas accessible.
FIGURE 10.2 (a) Still from video annotating form and flow in Forsythe’s One Flat Thing;
(b) Difference Forms in movement viewed from above (image credit: Synchronous Object Project,
The Ohio State University and The Forsythe Company)
What is ‘thinkable’ at the intersection of music and dance
The term ‘shape’ itself is, of course, used literally to underline intersections
between music and dance. A recent review of Tetractys: The Art of Fugue, a piece
choreographed by Wayne McGregor for the Royal Ballet to an arrangement of
Bach’s original material, provides a canonical example: ‘Osipova and Edward
Watson skate over the shape of the score as they pull themselves into extreme
poses—the audience gives a hissed intake of breath at her first long past six
o’clock arabesque’ (Anderson 2014; our italics). A reference to organization in
music is seen as driving what the dancers are doing, and what they are doing
is projected as affecting the emotional responses of the viewer linked to an
elegant but, in this case, extreme body shape. The idea of shape is admirably
flexible here, but it is difficult to differentiate, disambiguate and share system-
atically across the full range of ways in which the sense of shape is experienced
and expressed in dance.
The same applies to how shape in music might relate to more ineffable quali-
ties of meaning and emotion. While the more stable nomenclature in music
might be useful to humanities scholars, the properties of meanings in music
have been widely discussed from psychological and linguistic perspectives but
with little consensus (e.g. Barnard 2012; Cross and Tolbert 2009; Tan et al.
2010). Emotional attributes of music, in particular, have been the subject of
much analysis and debate (e.g. Juslin and Sloboda 2010). Sloboda (1985: 62),
for example, notes that the tonal system can offer analogies for how people
represent emotions in ‘some semantic space’ and that these support a partial
mapping of tonal relations onto emotions. Upward movements away from the
tonic are seen as suitable for expressing outward emotions while movements
towards the tonic signify rest or repose. Although meanings in dance have not
attracted such extensive debate in psychology, similar observations could be
made about both the emotional expressivity of specific properties of movement
and the wider ineffability of the meaning of movements.
We all have some grasp of what Sloboda might have meant by the idea of
‘some semantic space’ because his example gives that sense form. Likewise,
the idea of dancers ‘skating over the shape of the score’ makes communica-
tive sense but tells us little about whether an attribute of the score really did
underpin the choreographer’s decision-making or what the salient attributes,
if any, might have been. What the bigger semantic space or the detail it con-
tains might be is unspecified in both cases and therefore ineffable. When
cultural influences on making and experiencing performances are factored
into our wider equation, the domain to be addressed is potentially bound-
less. In this sense, music and dance are equally challenged to provide deeper
accounts of the impact of shape on processes and expressions of artistic
practice.
The sense of shape in music and dance is perhaps more readily ‘thinkable’ in
the context of a score as described by choreographer Jonathan Burrows (2010).
In A Choreographer’s Handbook, he distinguishes two definitions of a score. In one,
the score holds within it the detail of what you eventually see or hear. The
classical music score, as Burrows mentions, fits this definition, and as such the
classical score may be easier to study and discuss, as has already been indicated
by reference to shared notation and nomenclature. He points towards another
kind of score in which ‘what is written or thought is a tool for information,
image and inspiration, which acts as a source for what you will see, but whose
shape may be very different from the final realization’ (ibid.: 141). The idea of
shape in either music or dance as information that is mapped into a very dif-
ferent realization is clearly important, but how we flesh out a more enriched
understanding of what is involved in the mapping requires methods for probing
both processes and thinking. That in turn implies that researchers need
concepts and methods to study the processes and decision-making of artists
working in situ.
Applied psychology offers such concepts and methods. It will typically
seek to gain some understanding of a domain of practice, and then consider
how psychological theory and data might inform, and even modestly enhance,
aspects of that practice. To reveal senses of shape requires methods and instru-
ments that keep both implicit and explicit aspects in view. Such instruments
may also assist in filling in where there is an apparent lack of a shared notation
or nomenclature, thus coming into alignment with the artist’s research initia-
tives already mentioned.
Tools to support interdisciplinary collaboration and exchange
During ten years of collaborative research with Wayne McGregor | Random
Dance we developed two analytic lenses that helped us, over time, to make sub-
stantial contributions to the specific artistic research objectives of McGregor
and his company.6 The first lens examines contemporary choreography as a
process of design that bridges sources of inspiration and a finished artwork.
The second analytic lens is one derived from our understanding of the archi-
tecture of mind. Precisely formulated ideas about mental architecture provide a
language for informing discussions of multimodal synthesis, the use of mental
imagery, and their collective intersections with meaning and emotion.
The journey from an idea to a completed artwork typically involves much
time, thought and expertise. Figure 10.3 depicts our analytic lens for exploring
the making of dance as design (Barnard and deLahunta 2011).7 This lens has
previously been applied to other design domains including information tech-
nology (Long and Dowell 1989; Barnard 1991) as well as diagnostic tests and
therapeutic interventions for mental illness (Barnard 2004).
FIGURE 10.3 Relationships and representations that bridge sources of inspiration and a finished work
in contemporary dance
[Diagram: three linked panels. Left: representations of the world of contemporary dance, ‘the landscape of artistic production’ (e.g. dancers, performance spaces, composer and musicians, designers and scenographers, audiences). Centre: bridging representations, with a discovery representation and a production representation joined by design. Right: representations of ideational resources, ‘the landscape of foraging opportunities’ (e.g. other choreographers’ work, other arts, sciences and philosophy). Arrows between the panels are labelled analyse, assimilate, evaluate & select, assess & modify, synthesize and contextualize.]
In the right-hand box are representations of resources that might inspire
choreography. These capture the landscape in which the artist might forage over
existing ideas for inspiration and to which new knowledge and ideas can be
added. It represents the food for thought that surrounds a design space in focus.
In the left-most box are examples of representations within the real world of
contemporary dance. This is the cultural landscape within which a production
originates and is ultimately delivered. In the centre are bridging representations,
including a ‘discovery’ representation and a ‘production’ representation.
By analogy with diagnostic tests in medicine, discovery representations in the
arts might include descriptions of the processes and methods used in a studio.
Again by analogy with treatment manuals in medicine, production represen-
tations, among other things, might include scores, notes, directions, or sound
and video recordings of material. An activity of design interlinks the topics,
processes and tasks that a choreographer may employ to support discovery and
to feed into representations of a state of the production as it evolves over time.
The arrows between the boxes are mental processes that act on these rep-
resentations and interlink them. Collectively, processes of analysis, assimila-
tion, evaluation and selection, assessing and modifying, contextualizing and
synthesizing are the cognitive substrate of expertise for creative mental work
or crafting a performance. As a general rule, it is not too difficult to fill out the
boxes. We can draw on the likes of interviews, academic literatures, reviews,
educational curricula and so on. The mental skills involved in linking up the
parts of the diagram are central to making and performing. But these are much
harder to characterize unless we make reference to theories in cognitive science,
as we have done in our collaborative studies of how dancers direct attention to
and use various forms of imagery while they are creating innovative movement
vocabularies in the studio.
The bridging model in Figure 10.3 also provides a framework for integrating
other scholarly perspectives. To illustrate this, we can return to the intersec-
tions between music and dance. In her volume Moving Music (2000: 19–21),
dance scholar Stephanie Jordan reviews the overall historical spectrum of rela-
tions between music and choreography. Across a major partition of that spec-
trum, music has traditionally been seen as dominant, with the choreography in
some way describing or interpreting the music. Jordan’s choreomusical analysis
explores how choreographers work with an explicit coupling of meaning and
musical form on the one hand and form and meaning in dance on the other.
She provides us with numerous examples of such analyses from the work of the
choreographers Balanchine, Ashton and Tudor. In other studies (e.g. Jordan
2012) she has also examined the work of the American choreographer Mark
Morris, who is a modern advocate of choreographic strategies for linking dance
to musical form. Jordan (2000: 71) is very clear that she takes it as a fact that
we perceive relationships between music and dance, but that it is a dialectical
relationship in which they inform each other. At one level, music clearly acts as
a source that inspires the choreography and additionally provides a route into
understanding how the fine detail of movement material may have been selected
and assessed. Rather than using a generic term like shape, Jordan makes precise
use of formal categories such as rhythm or counterpoint, as well as scores for
both music and dance, to elucidate the practice of the choreographers on whom
she focuses. Here is just one illustrative analysis, in which she notes a relationship of
parallelism between music and dance in the work of Balanchine:
There is a clear example of close parallelism to quiet, lightly humor-
ous effect in Balanchine’s Agon (1957) Pas de Deux. Here we see a series
of staccato isolated gestures when the woman is in penché with one leg
around the man’s shoulder; they release the right hand clasp, she touches
the floor with her free hand, he releases his left hand to take his arm back
high. The gestures follow the rhythm and pitch contour of three notes in
Stravinsky’s score. They stand out as three ‘special’ moments within the
Pas de Deux. (Jordan 2000: 75)
With many such examples Jordan provides evidence of a rich picture of how
choreomusical relationships of this type can be expressed and formalized.
Pursuit of those relationships is one important methodology for adding gram-
matical and semantic detail into our understanding of intersections between
music and dance. For each of the choreographers whom Jordan discusses, or
any others inspired by her choreomusical thinking, the explicit products of her
scholarly work could populate the boxes in Figure 10.3, providing depth and
breadth to the representations that can be elaborated to fill out the right-hand
box and the mechanisms of discovery and production.
The properties of scores invoked in Jordan’s scholarly analyses are clearly
rather more specific than the generic idea of skating over the shape of the
score quoted earlier from the review of Tetractys: The Art of Fugue. The properties
brought into focus are dependent on the type of observer, the perspective
they adopt and exactly what they happen to be attending to at the time.
In our own experimental research at Wayne McGregor | Random Dance, we
found that elite dancers vary considerably in the features they attend to when
analysing the same short segments of movement material (deLahunta and
Barnard 2005). While such variation can be seen as exposing the many ways
in which latent properties of music and dance can be used, it equally well
underscores the point that intersections between music and dance identi-
fied by scholars, critics and even expert colleagues may not reflect the inten-
tions of choreographers or the actual cognitive bases of their decisions. To
find out about these, we need to probe the thinking of the choreographers
themselves.
A recent interview with McGregor on how the idea of shape in music contrib-
utes to his own choreographic practice can also be integrated into Figure 10.3.8
Because he is trained in music, a musical score can provide him with a window
on how the composer is thinking rather than with what might be thought of
as content to be described or visualized in dance. Among other things, he uses
the musical score as a discovery representation from which aspects of the music
can be assimilated into his knowledge resources and choreographic thinking
rather than as a production representation, where a score is used to support
performance. These remarks enable us to grasp what Burrows pinpointed as the
use of a score as information. Even if the scores being referred to here are classical
music scores, for McGregor they play a key role during creative decision-making
as sources of information:
So if I were working with a contemporary composer like Steve Reich
or Mark-Anthony Turnage, the notations in their score already give you
some sense of order and shape and form, things that they were think-
ing about. So in a Reich score, he will notate where there is a change of
thought, by number, so the score is all notated down. Obviously the music
is written but then he indicates massive changes of theme with a number
like five and then six and seven. So through that score you have got it
organized in a particular way that gives you some sense of shape.
… you see it even more when you have got the kind of reduction of the
orchestral score. If I were working with a score like that, I would look at
the anatomy of the score in terms of its shape, its architectural shape, to
inform what I might do physically. But when I’m working with something
which is say electronic music or music which might be counted in eights
or where there is not that kind of level of interest or range in the music,
I have to find other ways of intervening with it and that’s where I would
think about shape in a totally different way.
… Often what I do is get the conductor to go through the score with
me way, way in advance so you get their way of conducting the score
which is their phrasing and the counting and that shape, but often the
conductor’s shape for a score and other musically important things are
often very different from a choreographer’s or a dancer’s way of hearing
something.
The last point about counting is particularly interesting because it is a potential
instance where the micro organization of the music might actually interfere
with movement and where a focus on some broader property of shape in
music is a potential means of taking interfering counts out of the equation.
McGregor also prefers to work with living composers with whom he can inter-
act to avoid the constraints of a fixed score. Mirroring Jordan’s remarks about
a dialectical relationship between music and dance, McGregor finds that the
body has its own music and that this needs to be accommodated as a central
part of his discovery representations and choreographic thinking:
… so for example rhythm, obviously there is a way you can do rhythm
with a body, you know, making rhythm noises, but there is an inherent
rhythm to a body doing shapes, throwing shapes, there is a kind of music
to that which has its own coding and its own context and its own thing. It
becomes … it emerges and it forms itself and it becomes something that
you can’t really alter.
… I also think the body, when it’s tasked, will create something shape-
wise which has an inherent musicality to it. The actual thing that’s made
has a musicality to it and that thing that’s made on that task is different
from the musicality of the thing that’s made over here with this task, if
the tasks are working…
Taken together, implicit or explicit ideas about shape in music and in dance
intersect in several distinct ways that again touch on meaning, emotion and
the parts played by creative design processes. When viewed in the context of
creating a production, the kinds of representations and processes identified in
Figure 10.3 enable us to organize and categorize intersections between music
and dance rather than just list them. It is a descriptive framework allowing us to
characterize choreographic, and potentially choreomusical, practices.
Intersections of senses of shape in music and dance in creation work
From ethnographic studies, key aspects of McGregor’s discovery (e.g. Kirsh
et al. 2009) and production representations (e.g. deLahunta, McGregor and
Blackwell 2004) have been documented along with inferences about how he
selects and modifies candidate material. When McGregor is creating work for
his company, diverse imagery contributes to the development of his movement
vocabularies in the context of setting his dancers movement tasks to explore
and solve. McGregor’s practice and principles for developing visuo-spatial
imagery and translating it into movement are well illustrated in his 2012 TED
lecture.9 Musical imagery is one of many devices invoked as points of departure
for the entire company to spend a period generating movement material for
him to consider for further development. This example is taken from a research
exercise: ‘Think of a familiar song or piece of music. Focus on the memories,
feelings or sensation it evokes, in you or someone else. Translate it into 3D and
draw the meaning’ (May et al. 2011: 407).
In this case, the description of music through movement is not a choreo-
graphic end in itself. Rather, the music is information: it is a generative device
where the imagery and exploration are important means through which move-
ment habits can be overcome and innovations developed, evaluated and selected
on their merit and goodness-of-fit to a larger, dynamically evolving choreo-
graphic picture. In McGregor’s practice, tasks based on music constitute only a
very small part of what goes on in generating material: the stimuli and transla-
tions may just as well be sounds, words or graphic images. How such imageries
are translated into actions by his dancers involves a wide range of strategies and
mappings. As our own research using experience sampling with dancers work-
ing in the studio shows (ibid.), such tasks are mental work in which numerous
types of mental representation and imagery are called into play alongside fre-
quent shifts in attention among images in mind, body and external world. Here
the dancers are exploring movement design, assessing and modifying shapes
and timing, as they make their own decisions. In doing so, they traverse many
points in mental as well as physical space. What they attend to is as varied as
what was attended to in our study of observers viewing and parsing move-
ment material (deLahunta and Barnard 2005). The activities surrounding and
involved in task-related creation involve many of the mental processes (analy-
sis, assimilation, evaluation and selection, assessing and modifying, contextu-
alizing and synthesizing) used within the iterative cycles of design that lie at
the heart of Figure 10.3. These mental processes are open to systematic prob-
ing, again drawing on theories in cognitive science to help expose the decisions
behind the creative choices.
Mental architecture, shape and meaning
Shape or form in music clearly intersects with dance in powerful but quite dis-
tinct ways. It is information rather than a prescription. The bridging framework
offered in Figure 10.3 helps us to make sense of variation in the use of informa-
tion in music by pinpointing how it can come to be used in framing choreo-
graphic approaches, thinking, decision-making and even scholarly analyses of
them. The bridging model was one of the analytic lenses we used in the collaboration;
the second, and the one that supported most of the ongoing practical
work with McGregor and the company, is a macroscopic lens that looks at mental
architecture as a whole. As with Figure 10.3, this lens provides some core
categorical distinctions of relevance to the analysis of mental processes and
mental imagery systems occurring within creation and performance of dance.
The particular mental architecture we called on here, Interacting Cognitive
Subsystems (Barnard 1985; Teasdale and Barnard 1993), is not just about cold
analytic and rational processes. It charts the whole embodied mental landscape
and provides a vocabulary for addressing both cognitive and emotional mean-
ings including the essences of ineffable feelings and intuitions.
If you show small children from different cultures the two shapes in Figure
10.4 together with the words ‘ulumoo’ and ‘takete’, studies show they pair
them up systematically (Davis 1961). This indicates something connecting these
sounds and shapes that is deeply shared. We experience such cognitive–affective
patterns as systematic ‘senses or feelings’ that yield reliable and measurable
behaviours, yet we seem to have made this connection intuitively. The dance
curriculum quoted in our opening paragraphs invited teachers to get pupils
to think about the meanings of angular body shapes and curved ones. Just
like the children matching angular or curved shapes to words and modalities,
dance pupils can undoubtedly do something similar with the idea of different
body shapes with some measure of agreement. However, like wider meanings
in music and dance that make intuitive sense, as with our earlier discussion of
Sloboda’s (1985) notion of a semantic space in which emotional meanings and
musical patterns might exist, the items across modalities listed in the left- and
right-hand columns of Figure 10.4 share an ineffable quality. What underpins
the emotional responses to these shapes and nonsense words and other related
experiences involves a recipe that has something to do with sharpness, perhaps
an element of hardness or a dose of irregularity with a pinch of abruptness in
temporal patterning thrown in for good measure. And to understand this phenomenon
we may not need to reference any aspect of advanced cognition at all.
Shapes or objects:        (two shapes, not reproduced here)
Non-words:                Ulumoo                         Takete
Words:                    Words of love                  Swear words
Body:                     Soft cushions                  Piercing thorns
Taste:                    Hot chocolate                  Lemon juice
Sounds:                   Lapping waves                  Breaking glass
Music:                    Harmony                        Dissonance
Actions:                  Gentle caress                  Repeated prods
Emotional significance:   Neutral to probable comfort    Neutral to likely discomfort
FIGURE 10.4 Examples of deep patterning in multimodal fusion
Figure 10.5 illustrates the potential gains from viewing issues of shape and
meaning across modalities of human experience through the lens of mental
architecture. On the lower left is a dog, set in the kind of learning experiment
made famous by Pavlov. For the purposes of argument, let us suppose the dog
has three sensory subsystems, each with three kinds of resources. An ‘image’ in
each one contains the patterns of sensations the dog experiences as informa-
tion flows in from the eyes, the body and the ears. Over time the patterns of
what goes with what within each of these modalities are extracted and stored in
a ‘memory’. Moment-to-moment content changes are analysed by ‘processes’
that produce summaries of the information being attended to in the visual,
auditory and bodily landscapes. These are passed to the ‘multimodal’ compo-
nent where a new image is formed of what goes with what out there in the world
and within the dog’s body. This is not a sensory image, but what the dog might,
hypothetically, experience as what we humans might understand as ‘a feeling’.
Again what goes with what is extracted and stored in memory, and précis of
the deeper multimodal patterns are used to control somatic and internal bodily
responses as well as skeletal ones. Multimodal synthesis over sensory modalities
blends multiple ‘dimensions’ underlying patterns of information, including
emotional patterns. This enables us to understand why even such ‘simple’ significances
have an ineffable quality.
FIGURE 10.5 A core mammalian mental architecture with four subsystems, each with three
components (image, memory and processes). The arrows here indicate how information flows, from
sensory systems through to a single multimodal subsystem that in turn sends instructions to bodily
effectors like muscles (Som = somatic; Visc = visceral).
[Diagram: sights, body sensors and sounds feed the Visual, Body-State and Acoustic subsystems, each with Image, Memory and Processes components; these feed the MULTIMODAL subsystem, which sends somatic and visceral output to the EFFECTORS (tail, limbs, ears, head, mouth, lips).]
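The flow of information described for Figure 10.5 can be illustrated with a toy program. This is only a sketch under stated assumptions, not an implementation of Interacting Cognitive Subsystems: the names `Subsystem` and `core_mammalian_step` are hypothetical, sensations are reduced to text labels, and the work of the ‘processes’ component is reduced to joining those labels into a summary.

```python
from dataclasses import dataclass, field


@dataclass
class Subsystem:
    """One subsystem with the three components named in Figure 10.5."""
    name: str
    image: list = field(default_factory=list)    # current pattern of input
    memory: list = field(default_factory=list)   # record of past patterns

    def receive(self, pattern):
        """Form an image of the incoming pattern and store it in memory."""
        self.image = list(pattern)
        self.memory.append(tuple(pattern))

    def process(self):
        """Hypothetical stand-in for 'processes': summarise the image."""
        return f"{self.name}:{'+'.join(self.image)}"


def core_mammalian_step(sights, body, sounds, subsystems):
    """Pass one moment of sensation through the four subsystems."""
    visual, body_state, acoustic, multimodal = subsystems
    visual.receive(sights)
    body_state.receive(body)
    acoustic.receive(sounds)
    # Summaries from the three sensory subsystems converge on the multimodal one.
    summaries = [s.process() for s in (visual, body_state, acoustic)]
    multimodal.receive(summaries)
    # The multimodal summary is what would go on to drive the effectors.
    return multimodal.process()


# Illustrative inputs only: a Pavlov-style moment of light, hunger and a bell.
subsystems = [Subsystem(n) for n in ("visual", "body-state", "acoustic", "multimodal")]
output = core_mammalian_step(["light"], ["hunger"], ["bell"], subsystems)
print(output)
```

The point of the sketch is structural: nothing reaches the effectors except through the single multimodal summary, and each subsystem keeps its own memory of what has gone with what.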
The human mind is not good at thinking about many things at the same time,
but that is what the neural networks that underpin mental architectures do well:
they pull out invariant aspects over many dimensions, like those included in
the ‘recipe’ for understanding what items in the two columns of Figure 10.4
shared, and smooth over the irrelevant variations, or ‘noise’. In dance research,
for example, a recent study by Afanador et al. (2008) tested cross-modal per-
ception of musical tempo and speed of dance movement. The results suggested
that, through ‘aural capture’, different tempos change perceptions of (and feel-
ings in our bodies about) observed movement speed. But music might also cre-
ate an impression of greater movement impact, or heightened or sharpened
movement dynamics. The mental architecture of Figure 10.5 enables us to
address phenomena like these or many of those linked to the entrainment of
actions in music and dance (e.g. Clayton, Sager and Will 2004), without invoking
properties of higher-level cognition.
But for other concerns we obviously must take higher cognitive functions
into account. There are simply no resources in a core mammalian mind to
enable us to address the kinds of choreographic thinking addressed in the pre-
vious section. There is no mind’s eye or mind’s ear in which images of scores or
actions can be internally generated and manipulated. There are no resources
to support more linguistic or propositional understandings to underpin artis-
tic decisions. We humans do not just react to rhythm and counterpoint; we
can understand the concepts that underlie their physical realization in either
music or dance.
Elsewhere Barnard et al. (2007) have argued that evolutionary differentia-
tion successively added subsystems of mind that augmented core mammalian
mechanisms with increasingly advanced mental functionality, culminating in a
nine-subsystem architecture (Figure 10.6) capable, among other things, of cre-
ating music and dance. All five new subsystems have exactly the same internal
elements as those in Figure 10.5. In much the same way as the bridging diagram
of Figure 10.3 enabled us to organize observations about choreographic prac-
tice, this more complex mental architecture enables us to address how mental
representations underlie choreographic thinking and decision-making.
Notice first that the architecture of Figure 10.5 is subsumed within Figure
10.6. The three sensory subsystems are shown in exactly the same way in both
figures. The multimodal subsystem of Figure 10.5 has now been renamed as
the ‘implicational’ subsystem and shaded in black in Figure 10.6 to reflect key
augmentations that humans have evolved for processing meaning. In addition,
the approach also proposes that specialist subsystems of mind have evolved to
coordinate the enormous degrees of freedom in skeletal effectors that humans
share with other primates. Likewise, another subsystem is proposed to have
evolved to coordinate musculature required for the articulation of speech. For
the present discussion, we focus on the three other additions, a morphonolexi-
cal subsystem, a spatial-praxic subsystem and a propositional subsystem, all
in the centre of this diagram. The morphonolexical subsystem is specialized
for dealing with the deeper patterns and structures over time in the auditory–
vocal domain, including ‘phrases’ in language and music. The spatial-praxic
subsystem serves an equivalent function for patterns and structures over time
for objects and actions in space. The propositional subsystem is specialized
to deal with what humans think of as specific ideas, concepts, properties and
inter-relationships. While these systems do real work in interpreting the world
and in controlling and coordinating response, they also play a vital part in sup-
porting thinking, imagery and imagination. Notice that none of these three
systems is connected directly to sensory or muscular mechanisms, but they do
interact with one another, as indicated in the diagram by the reciprocal arrows
passing information to and fro among them. While there is much technical
detail to add, these interactions, together with that between the propositional
and implicational subsystems, collectively give rise to our more advanced cognitive
capabilities.
FIGURE 10.6 Interacting cognitive subsystems: a nine-subsystem architecture for the human mind
(Barnard 1985; Teasdale and Barnard 1993)
[Diagram: the Visual, Body-State and Acoustic sensory subsystems of Figure 10.5, together with the Spatial-Praxic, Propositional, Morphonolexical, Implicational, Skeletal and Articulatory subsystems, each with Image, Memory and Processes components; outputs run to somatic and visceral responses and to effectors including legs, arms, hands, eyes, head, mouth, lips and breath.]
As with our first analytic lens, this specification of types of mental image
and mental processes fractionates the mind in a way that enables us to be very
specific about properties of representations ‘in mind’ and how they depend one
on another. In the case of music and dance, we can illustrate how this kind of
fractionation enables us to address the ambiguities raised earlier about shape
and meaning. Try imagining in your ‘mind’s eye’ either a very simple score for
the scale of C or a dancer spinning on the spot. The image you create would be
assigned to the image component of the spatial-praxic subsystem, and it would
have been generated from a mental image of an idea with specific properties
and relationships (a proposition), which, in turn, was derived from this textual
instruction.
Now imagine in your ‘mind’s ear’ the sound of a flute playing that scale, or
of a vocalist singing the same scale but in the form of do re mi, etc. This time
the images would again be generated in the propositional subsystem but real-
ized in the image component of the morphonolexical subsystem. Ideas have
propositional form, but their content needs to be seriated over time within
the processes that project into the morphonolexical and spatial-praxic images.
These images have different properties from sensory images of stuff out there
in the world: they are more abstract and lack the full detail of sensations in the
moment. Hence, we can talk about shape within a temporal envelope in a way
that has different properties from those we addressed in the context of the four-
subsystem mind of Figure 10.5. In effect this more differentiated schema allows
us to organize explanatory possibilities in relation to what we mean by shape
and how shape relates to meaning.
Indeed, a central aspect of this architecture is that it provides a clear basis
for confronting the parts played by meanings in mental life in general and artis-
tic practice in particular. Embodied, perceptually derived meanings, coupled to
emotion of the ineffable kind discussed earlier for a four-subsystem mind, are
now subsumed in the implicational subsystem. Note, however, that the propo-
sitional subsystem also feeds into implicational meanings, and hence these now
blend perceptual, embodied, and also semantic dimensions with emotion. The
meanings that started out as ineffable are now ‘ineffable plus’. Implicational
meanings are where we locate our mental models of self, the world, others,
cultures and so on (Teasdale and Barnard 1993). Its image is where we experience
feelings, intuitions, and knowing or wisdom. In the current framework it
provides clear constraints on the ‘semantic space’ (see also Sloboda 1985), in
which emotional, musical or balletic attributes are intertwined. In contrast, it is
important to note that propositional meaning takes no direct input from sensa-
tion or embodiment (the arrows from the sensory subsystems pass behind the
propositional subsystem rather than entering it). The presence of two recipro-
cal arrows that interlink implicational and propositional meanings identifies
these interactions as the central engine of human ideation and innovation, and
its couplings to emotion.
Although its ineffability is far from easy to unpack, this architectural lens
provides a principled logic that enables us to say something specific about prop-
erties of meaning that can be tested against empirical findings. Notions such
as climax, or closure, of ‘shapes’ in music or dance may derive some dimen-
sions from the core multimodal fusion of our four-subsystem architecture (e.g.
energy build-up or release), but others may come from an overlay of meaning
that emerges with ‘ineffable plus’ and metaphor (e.g. climaxes or closures in
episodes of narrative or balletic form). Yet other properties may derive from
the use of propositional meanings to craft images in the mind’s eye or ear and
to translate those into movements for performing music and dance as well as to
write down notes or musical scores.
In the context of our research on dance, we have used this analytic lens
to study how dancers direct attention to and use different forms of imagery
while they are creating innovative movement vocabularies in the studio. Our
studies used a technique called experience sampling and examined a range of
tasks used in Wayne McGregor’s practice. From this we were able to establish
a great deal about the distributions of dancers’ habits of mind during their
creative work (May et al. 2011). Building on this empirical knowledge, we
were then able to develop strategies for enriching their imagery to break those
mental habits—choreographic thinking tools. A more elaborate discussion of
the theoretical basis of these tools is provided elsewhere (deLahunta, Clarke
and Barnard 2012).
Augmenting practice: choreographic thinking tools
Our particular interdisciplinary journey brought together practice in a contemporary
dance company with the methods of applied psychology and a theory
base from cognitive science. One outcome was the development and testing of a
teaching resource for dance education called Mind and Movement (McGregor
et al. 2013).10 This resource was based on the Choreographic Thinking Tools
that have emerged from our collaborative research around Wayne McGregor’s
methods of working together with his dancers in the creation of movement
material for his performance works. This practice has undergone useful aug-
mentation via the application of our two analytic lenses and by empirical meth-
ods that probed several aspects of his and the dancers’ choreographic thinking.
The decision to develop a teaching resource for schools served to connect both
the research (R-Research) and the education (Creative Learning) departments
of Wayne McGregor | Random Dance.
With funding from the Paul Hamlyn Foundation, Mind and Movement was
developed over a two-year period and was iteratively tested with several candi-
date schools and through ongoing consultation with a specialist research group
from the Graduate School of Education, University of Exeter. With the aim to
‘develop students’ personal imagination skills in order to enhance the creation
of new and original dance movement’, the resource was geared mainly towards
secondary-school teachers, but it has proved to be valuable for teachers, students
and practitioners of almost any age and level. A key feature of the Mind and
Movement resource is that it exercises mental as well as physical skills. The
resource includes a teachers’ guide with five lesson plans that focus on transmit-
ting a set of twelve principles to be used in the context of imagination exercises.
In addition to these exercises, it contains a set of clear tasks for developing and
structuring movement material, and questions to encourage reflection.
The lesson plan traverses selected forms of imagery and puts in place numer-
ous strategies and principles for breaking mental habits, for enriching the
content of imagery, and for scaffolding the translation of image content into
movement material. Since this chapter is concerned with intersections between
music and dance, Figure 10.7 reproduces two parts of the lesson plan in Mind
and Movement specifically directed at the use of auditory and musical stimuli.
(a) Extract from Lesson 5: Exercising Imagery based upon External Acoustic Images
Imagine
• While listening to the music, select one of the properties and create a Visual
Score on paper that follows the changes in this property, e.g. the pitch goes from
high to low. Make short strokes rather than a continuous line.
• Repeat while listening to another property to add to your visual score.
• Imagine another piece of music which this music reminds you of that is
personally important to you.
• Now imagine the colour that you would associate with this other piece of music.
Graphic labels: Images of idea in mind’s eye; A specific idea; Images of idea in
mind’s ear; Schemas, intuitions, feelings and emotions.
Graphic caption: Figure of three imagery loops from Teacher’s Guide used to enrich
teacher and student understanding of mental imagery.
(b) Extract from Lesson 5: Using imagery developed from External Acoustic images
in movement creation
Create
Using the visual score to create movement
• Select one of the lines from your Visual Score and give it an action, e.g. a straight
line that crumples, a curved line that rotates, a zigzag line that straightens.
Without listening to the sound, explore the action with movement.
• Repeat with all the lines or shapes on your score to create a movement
phrase you can repeat.
• Think back to the colour you associated with your imagined song and use this
to layer a quality onto your created movement.
Example Principles used in this extract
• Select a drawn element uses the Assign/Choose principle
Graphic: Assign
Other examples: Select a part of an image to start from; Make a decision about
the shape you want to use; Pick a word you want to work with; Choose a
connection between image and movement
• Give it an action uses the Exemplify/Make It Specific principle
Graphic: Exemplify
Other examples: Name the kind of object, e.g. a hat becomes a baseball cap;
Decide that a sound is percussive; Give a surface a rough texture; Put lemonade
in a glass
• Using a colour associated with a song uses the Superimpose/Layer principle
Graphic: Superimpose
Other examples: Combine a sound image with a visual image; Connect something
in the mind to something ‘out there’ in the world; Bring things that were apart
together; Colour in a shape
FIGURE 10.7 Extracts from the Mind and Movement educational resource that illustrate (a) the
development of imagery based upon musical stimuli and (b) the translation of that imagery into
innovative movement material (content and graphics credit jointly to Wayne McGregor | Random
Dance and Magpie Studios, London)
The upper panel of this figure reproduces a segment dealing with the devel-
opment of imagery in response to a musical stimulus, and this section occurs
in the plan after students have been asked to identify a number of properties
of the music. From that point of departure they are asked to develop a visual
score and then layer in additional properties. Attributes of sounds are trans-
lated into shapes and emotionally enriched by meaning (through the ‘person-
ally important’ probe) and colour.
The upper panel of the figure also includes a graphic representation, taken
from the teaching resource, that is directly derived from the mental architecture
of Figure 10.6. It shows three mental processing loops—a representation of
the reciprocal arrows from the earlier, more detailed diagrammatic form. One
loop involves relating intuitions and feelings to specific ideas, a second relates
specific ideas to content in the mind’s eye, and the third relates specific ideas
to content in the mind’s ear. The imagery task is all about strategies for differ-
entiating and enriching image content—a key element of breaking habits and
generating innovations in their choreographic thinking. While this particular
example actually invites the students to generate a visual score, the primary
task is to be conducted within the mind alone. It is helpful to construe the
imagine task as an ‘attentional score’ that directs attention to move among the
three imagery loops and to delve into, and differentiate, the content of specific
images.
The lower panel of Figure 10.7 contains extracts from the same lesson and
illustrates some of the many mental steps involved in taking properties of men-
tal images, translating them into innovative movements and exploring how they
feel. The right-hand side of the lower panel provides examples of the twelve
general principles employed in this mapping, and additional examples of how
the same principles can be called into play across a number of modalities of
imagery. It is also noteworthy that the graphics, used to illustrate and index
each principle, were carefully designed to enhance senses of meaning through
visual shapes. Superimposing one property over another is a principle that can
be applied to any class of information or action. Looking back to our earlier
discussion about classes of intersection between music and dance within choreographic
thinking, we can think of the visual scores derived from properties
of music as media for realizing information to underpin decision-making in
choreography or to constrain performance improvisations.
Mind and Movement is designed to be usable by teachers who are not spe-
cialists in theories from cognitive psychology. Therefore the model of mental
architecture that we have used throughout this collaborative research has been
carefully integrated into the structure of the lesson plans, the progression of the
exercises, and the questions that encourage the students to reflect on their experience.
The creative products of engaging with Mind and Movement clearly have
artistic potential, but these products may have wider impact for others curious
about collaborative research of the kind summarized here.
Conclusion
Choreographers, observers and critics use many forms of information and
imagery to guide their decisions, observations, interpretations or criticisms.
The term ‘shape’, with all its limitations, has generic communicative utility
for indexing some property of music or of movement, or even to index the
more ineffable meanings and relationships that are intuited to ‘make sense’
in an artistic context. The general thrust of our explorations of intersections
between shape in music and dance is that we need to probe choreographic
and musical thinking to establish a basis for deeper shared understandings
of exactly what types of information are used and how information is used in
varying contexts.
Our two lenses, drawn from applied psychology and cognitive science, offer
a menu of options for consideration by artists at almost any interdisciplinary
table. The analysis of bridging and choreographic processes suggests that the
artistic decision-making space into which diverse information is factored is not
just large and resistant to differentiation. A structured set of processes and representations
underlies all forms of design, and these can be probed in ways that
have complementary properties to those identified in music theory or tradi-
tional dance scholarship. Our second lens, which addressed the internal behav-
iours of the mind, offers another form of scaffolding for augmenting practice.
It provides specific concepts and allied principles to share and extend under-
standings both of the creative use of imagery and of how attention is directed.
It can even support quite sophisticated discourse concerning the details of our
mental currency of embodied meanings, intuition, feelings and emotions. That
discourse can potentially enrich and train artistic endeavour without threaten-
ing to explain it away in oversimplifications.
The work we have focused on here has drawn heavily on practice in just one
company. However, it is exciting to note that other contemporary choreogra-
phers and companies have been developing a variety of publications and educa-
tion platforms with the aim of furthering understanding of their choreographic
ideas and processes and of ‘bringing these into newly productive relations with
both general audiences and other specialist practices’ (Leach, Whatley and
deLahunta 2009). Some of these distinctive publications are closely coupled
to musical understandings, for example, A Choreographer’s Score by Brussels-
based Anne Teresa De Keersmaeker (2012).11 A Choreographer’s Score is suf-
ficiently detailed and comprehensive to serve as an appropriate study object
of ‘choreographic thinking’ for scholars and scientists wishing to further their
understanding of latent form and ineffable meaning at the specific intersection
between dance and music.
The ‘applied’ orientation of our analytic lenses has been vital. Neither lens
would have accomplished much, left in the purely academic environment of
basic research. The utility and validity of the research outcomes have been
grounded in long-term, cumulative and intense access to source material about
practice and the embedding of research within that practice. A key feature of
probing how information about shape or other properties is factored into artis-
tic decisions about movement and music has been to understand choreographic
thinking. In order to identify the patterns reported in this chapter, long-term
research access to a particular choreographer and his company was essential,
providing opportunities to document and analyse creative processes in direct
collaboration with the artists.
References
Afanador, K., E. Campana, T. Ingalls, D. Swaminathan, H. D. Thornburg, J. James, J. Mumford,
G. Qian and S. Rajko, 2008: ‘On cross-modal perception of musical tempo and the speed
of human movement’, in R. Kronland-Martinet, S. Ystad and K. Jensen, eds., Computer
Music Modeling and Retrieval: Sense of Sounds, 4th International Symposium, CMMR
2007, Copenhagen, Denmark, 27–31 August 2007 (Heidelberg: Springer), pp. 235–45.
Anderson, Z., 2014: Review of ‘Tractys: The Art of Fugue’, The Independent, 10 February
2014.
Barnard, P., 1985: ‘Interacting cognitive subsystems: a psycholinguistic approach to short
term memory’, in A. Ellis, ed., Progress in the Psychology of Language, 2 vols. (London:
Erlbaum), vol. 2, pp. 197–258.
Barnard, P., 1991: ‘Bridging between basic theories and the artefacts of human-computer
interaction’, in J. M. Carroll, ed., Designing Interaction: Psychology at the Human-
Computer Interface (Cambridge: Cambridge University Press), pp. 103–27.
Barnard, P., 2004: ‘Bridging between basic theory and clinical practice’, Behaviour Research
and Therapy 42/9: 977–1000.
Barnard, P., 2012: ‘What do we mean by the meanings of music?’, Empirical Musicology
Review 7/1–2: 69–80.
Barnard, P. and S. deLahunta, 2011: ‘Creativity and bridging’, Creative Research Center Guest
Blog, Montclair University, https://blogs.montclair.edu/creativeresearch/2011/04/19/
creativity-and-bridging-by-philip-barnard-and-scott-delahunta/ (accessed 9 April 2017).
Barnard, P., D. Duke, R. Byrne and I. Davidson, 2007: ‘Differentiation in cognitive and
emotional meanings: an evolutionary analysis’, Cognition and Emotion 21/6: 1155–83.
Burrows, J., ed., 1998: Conversations with Choreographers (London: South Bank Centre).
Burrows, J., 2010: A Choreographer’s Handbook (Abingdon: Routledge).
Cage, J. and A. Knowles, 1969: Notations (New York: Something Else Press).
Clayton, M., R. Sager and U. Will, 2004: ‘In time with the music: the concept of entrain-
ment and its significance for ethnomusicology’, ESEM Counterpoint 1: 1–82.
Cross, I. and E. Tolbert, 2009: ‘Music and meaning’, in S. Hallam, I. Cross and M. Thaut, eds.,
The Oxford Handbook of Music Psychology (Oxford: Oxford University Press), pp. 24–34.
Davis, R., 1961: ‘The fitness of names to drawings’, British Journal of Psychology 52: 259–68.
De Keersmaeker, A. T. and B. Cvejić, 2012: A Choreographer’s Score: Fase, Rosas danst
Rosas, Elena’s Aria, Bartók (New Haven: Yale University Press).
deLahunta, S., 2013: ‘Publishing choreographic ideas’, in E. Boxberger and G. Wittmann,
eds., pARTnering Documentation: Approaching Dance, Heritage, Culture. 3rd Dance
Education Biennale 2012 Frankfurt am Main (Munich: E podium), pp. 18–25.
deLahunta, S. and P. Barnard, 2005: ‘What’s in a phrase?’ in J. Birringer and J. Fenger,
eds., Tanz im Kopf/Dance and Cognition: Jahrbuch Nr. 15, Gesellschaft für Tanzforschung
(Münster: LIT), pp. 253–66.
deLahunta, S., G. Clarke and P. Barnard, 2012: ‘A conversation about Choreographic
Thinking Tools’, Journal of Dance and Somatic Practices 3/1–2: 243–59.
deLahunta, S., W. McGregor and A. Blackwell, 2004: ‘Transactables’, Performance
Research 9/2: 67–72.
Hodgins, P., 1992: Relationships Between Score and Choreography in Twentieth-Century
Dance: Music, Movement, and Metaphor (Lewiston, NY: Mellen).
Humphrey, D., 1959: The Art of Making Dances (New York: Grove Press).
Jordan, S., 2000: Moving Music (London: Dance Books).
Jordan, S., 2012: ‘Mark Morris marks music, or: what did he do to Bach’s Italian Concerto?’,
in S. Schroedter, ed., Bewegungen zwischen Hören und Sehen. Denkbewegungen zu
Bewegungskünsten (Würzburg: Königshausen & Neumann), pp. 219–36.
Juslin, P. and J. A. Sloboda, eds., 2010: Music & Emotion: Theory and Research, rev. edn
Kirsh, D., D. Muntanyola, R. J. Jao, A. Lew and M. Sugihara, 2009: ‘Choreographic
methods for creating novel, high quality dance’, in L. L. Chen, L. Feijs, M. Hessler, S.
Kyffin, P.-L. Liu, K. Overbeeke and B. Young, eds., Design and Semantics of Form and
Movement, DeSForM 2009: 188–95. Available at http://www.northumbria.ac.uk/static/
5007/despdf/designres/2009proceedings.pdf (accessed 9 April 2017).
Leach, J., S. Whatley and S. deLahunta, 2009: ‘Choreographic objects: traces and arte-
facts of physical intelligence’, http://projects.beyondtext.ac.uk/choreographicobjects
Long, J. B. and J. Dowell, 1989: ‘Conceptions of the discipline of HCI: craft, applied sci-
ence and engineering’, in A. Sutcliffe and L. Macaulay, eds., People and Computers, Vol.
V (Cambridge: Cambridge University Press), pp. 9–32.
May, J., B. Calvino-Merino, S. deLahunta, W. McGregor, R. Cusack, A. Owen, M.
Veldsman, C. Ramponi and P. Barnard, 2011: ‘Points in mental space: an interdisciplin-
ary study of imagery in movement creation’, Dance Research 29/2: 402–30.
McGregor, W., P. Barnard, S. deLahunta, J. Wilson and E. Douglas-Allan, 2013: Mind and
Movement: Choreographic Thinking Tools (London: Wayne McGregor | Random Dance).
Mithen, S., 2005: The Singing Neanderthals (London: Weidenfeld & Nicolson).
Rainer, Y., 1974: ‘The mind is a muscle: a quasi survey of some “minimalist” tendencies in the
qualitatively minimal dance activity midst the plethora or an analysis of Trio A’, in Yvonne
Rainer: Work 1961–73 (Halifax: Press of the Nova Scotia College of Art and Design), p. 63.
Sloboda, J., 1985: The Musical Mind: The Cognitive Psychology of Music (Oxford: Clarendon
Press).
Tan, S.-L., P. Pfordresher and R. Harre, 2010: The Psychology of Music: From Sound to
Significance (Hove: Psychology Press).
Teasdale, J. D. and P. J. Barnard, 1993: Affect, Cognition and Change: Re-modelling Depressive
Thought (Hove: Erlbaum).
Reflection
Richard G. Mitchell, film music composer
As a practitioner of film music composition, I make music that is shaped by
its relationship with a moving picture. Music’s role in most commercial films,
whether documentary or drama, is to help direct the audience’s emotional
response in synchronization with the action. In this sense music shapes the
audience’s feelings every bit as much as the action.
Film directors see music as an aid to help the audience perceive their films in
the same way they do. They talk about their work having story arcs and narra-
tive curves, and in this sense, although I am a musician and composer, I think
of myself as a film maker together with them.
The established convention in film-making is that the musical elements are
usually composed and produced in post-production. At this point, the film is
shot and it is too late to change its content, although re-editing the constituent
parts is often still ongoing. The score is usually crafted at a point where the film
is locked as a ‘fine cut’ (i.e. it’s not possible to change the film’s picture edit, and
every frame of the running order is finalized).
Composing the score for a movie’s narrative structure involves the ability,
therefore, to shape music to the absolute time frame of the film. Although
there are as many approaches to using music in film as there are styles of film-
making, the fact that a film starts at time X and ends at time Y means there
is an emphatic structure imposed on any music that is made to work with it,
which will be dictated at the very least by its overall length.
The shape of the music is affected by many considerations within the scenes
that make up the film. Before the content of any music is determined, work on
a project involves lengthy discussion about where to put music. The principle
of where to start and stop a piece of music within a film can itself engage
many possible responses from an audience. Whether the film-makers wish to
use music to prepare an audience emotionally or perhaps simply geographically
for an incoming scene or thought, or whether they require music to inform the
audience as to how they’d like them to react, a variety of simple techniques
involving the shape of the music are used to try to provoke specific responses.
So the way the music works in terms of its emotional development and
impact is shaped around the events within the film. ‘Spotting’ is the process of
discussion where we define this. In the spotting session the film-makers (usually
the composer, the editor and the director) make decisions about how they think
the various music segments (‘cues’) might work and often where they think the
film needs specific help from a music score. They discuss when each cue will
start and stop to the nearest frame throughout the film and what its purpose is.
In addition, there may be specific sync spots within the cues where the music
has to mark very specific moments, often of an emotional nature.
However—and just as important as deciding how to shape the music in
the context of a film—much time is spent determining when not to use music,
because its emotional potency can be much greater if music is targeted and
used sparingly. In addition, the impact of music can be just as significant at the
point where it stops as where it starts.
The length of a film frame in cinema is 1/24th of a second, and during a
montage of scenes edited in quick succession how the rhythm of the music
works against the cuts can change the audience’s perception of the picture in
a variety of ways. A simple example is what’s known in film music as ‘pushing a
cut’—making a dramatic phrase fall rhythmically a frame or two before a cut to
some significant scene change or a piece of action. This will have a very differ-
ent emotional effect on the audience compared to the same musical statement
falling after the cut or sync point.
The positioning of music in synchronization with a film has important con-
sequences for decisions made about its shape. Thoughts about what is required
from moment to moment within a movie then form the basis of a sort of map
by which the composer will be guided in interpreting a particular vision of the
narrative.
There are myriad techniques and theories about how music works with film
that can be taught or learned with experience, but the effect of music playing in
conjunction with a scene can never be completely anticipated, and the results
often surprise. When placed in sync with a specific piece of action, a slow ada-
gio might induce a faster feeling of pace than running a cue with an intense,
fast rhythmic feel. Even experienced practitioners of the medium sometimes
find it difficult to know for sure quite how to resolve a music cue.
When the first sound films were introduced in the early 1930s, there quickly
developed a list of simplistic Hollywood clichés that followed the use of music
accompanying the silent movies. Bright and breezy pieces were used to mirror
happy feelings when the action was fun, and sad, slow, tear-jerking music was
scored when the audience was to feel miserable. Then through the 1940s and
1950s that tradition slowly evolved into the more complex and sophisticated
use of music in film we find today.
It is feasible to shape music that clearly points to an emotion that was
intended in the script but wasn’t conveyed by a performance. With a specific
facial expression, one of the characters might have been thought to misinter-
pret a line of dialogue, and a decision might be made to let the music rework
that moment in a totally different way from what is being suggested on screen.
The score might be required to act in juxtaposition to what is implied by
the dialogue or visual language. A moment of anger might on the one hand be
reinforced by the music because it is thought that the emotion wasn’t acted out
forcefully enough on screen, or indeed the same anger might require a more
sympathetic passage of music in order to inform the audience that there is some
sad justification for it. An odd, simple rhythmic device might run through a
scene in order to make an audience perceive an element of suspense or unease,
when the scene itself may not actually exhibit anything associated with such a
feeling.
Music can also be used to subvert a scene or a moment in a film. It’s pos-
sible to wrong-foot an audience by emotionally manipulating them to feel
exhilarated, using music with a positive, major, upbeat style that will ultimately
heighten a feeling of despair when used to contrast with a dark payoff as the piece
ends. A feeling purposefully generated by music within a scene might have the
intention of acting as a contrast to what’s coming, setting up a powerful emo-
tional twist.
Whether we push the action and use music simply to excite, thrill and propel
us through events, or whether we decide to use some slow and sustained musi-
cal device informing us of something more cerebral about the underlying emo-
tional motivation, in contradiction to what is being presented on the screen,
making music in film has the purpose of shaping the audience’s response to
what they are seeing.
Since I was a teenager, I’ve been driven by the notion of composing and pro-
ducing music to some sort of moving image, so I’ve always created music that
has its shape dictated by the content of a film. However, precisely because my
work belongs within a collaboration, I have found cinema a great medium for
contemporary composers to work in. While allowing certain aspects of their
work to be shaped by the restrictions of film, it offered composers in this field
during the second half of the twentieth century the opportunity to earn a living
while experimenting with interesting musical approaches that few members of
the public would have gained access to in the concert hall. Commercial cinema
changed the public’s acceptance of new musical forms by osmosis. Some styles
of musical experimentation had their day while shaping commercial cinema. So
in the movie 2001 the director, Stanley Kubrick, introduced several pieces by
Ligeti to a huge cinema-going public who may not have chosen to sit through a
concert of his music. In the context of a commercial movie the public embraced
a new musical language that would otherwise have been inaccessible to many.
And although the new musical language used in scenes like this had an effect
on the audience, some of this musical experimentation in turn affected the way
movies came to be shot and edited, and thus the shape of a new breed of cinema
was affected by new music.
As an alternative to the modus operandi of most composers who only ever
allow the shape of their music to be dictated by film, I prefer, when given the
opportunity, to compose and produce music in response to a script. So before
a single frame is shot and the production has got under way, I can, in discus-
sion with the director, influence the shape of a film in terms of how it is both
shot and edited. There are directors I’ve developed relationships with who will
take music I’ve prepared for them and construct scenes in which they move the
camera in response to my music, making the notion of shaping a two-way rela-
tionship. The shape of music in cinema doesn’t always have to be governed by
the shape of the film.
In sum, for me the notion of shape is embraced in my composition process,
which involves the creation of music that is sculpted and moulded around film.
A movie score forms the emotional shape around the film’s narrative, through
which, using the power that music has to shape human emotion, my music aims
to help its audience understand both the basic and the sometimes sophisticated
underlying themes expressed in the storytelling.
As I’ve tried to suggest, music and film have a mutually beneficial rela-
tionship in shaping each other in the minds of those who watch and listen,
and when that really works well it gives composer and film-makers together a
uniquely satisfying creative experience.
PART 5
Shapes felt
Reflection
Julia Holter, singer and composer
Most of the time I find that once a piece of music I have made has a ‘shape’—
or as I say it, has a ‘form’—it is finished, regardless of what shape it is. But it’s
hard to say how I know at what point it has a shape—it’s obviously a subjective
thing. I think I have in my mind a kind of closed rounded figure whose shape
changes continuously, like an amoeba or something. But it can (and always
will) stretch and morph into something new with every experience of listening
to the piece; all the parts inside are alive and will move around and change. It’s
just important that it is closed. That closure and the fact that things within it
can change but always remain within is what makes it a piece.
Again, it is hard to know how to explain the moment at which I decide
something I’ve written ‘has a shape’ and thus is finished, but a lot of times
what I notice the most is a consistent timbral blend. Once the individual sound
sources—the different instruments as well as ambient noises—start intertwin-
ing and feeling ‘like one’, that is like the closing of the shape. It is the point
at which I no longer can distinguish one part from another, because, even if
I think I can, that part that was recorded will never actually be what it was in
the moment it was recorded. For instance, maybe it will be clear that the bass
is the bass, but once the recording is finished, what was once ‘the bass’ will be
forming endless numbers of relationships within the song with other sounds,
every time I listen. Of course, those interactions would happen the moment the
bass was introduced into the recording, whether the piece was finished or not,
but when out of the chaos of all these relationships I get a feeling of oneness—
that’s when I know it’s done. I think this feeling of oneness comes when the
relationships between sounds grow so strong that they start forming offspring
sounds (artefacts, noise, interesting harmonics) and the connections between
all the sounds become denser, so that they are no longer independent.
Today I had an experience where I started hearing things that were never
recorded in a song I was working on—phantom sounds. I think I even heard
people talking, and there were no actual people ever recorded talking. It was
because there was so much going on in the song during a particular climactic
moment that there was a lot of distortion and there were complicated interac-
tions between frequencies. It created noise and these artefacts, and it is what
made me love it even more. Everything was blending together and, in so doing,
seeming to emphasize new sounds. I think almost all pieces of music I like do
this to me, so I was glad it was happening.
I’m not a mathematician, but I am pretty sure there will always be an infinite
number of sounds that possibly could be heard while listening to a recorded
piece of music, because every time you hear it you might hear different fre-
quencies brought out in the harmonic spectrum. So when I say that the sounds
might ‘blend’, I don’t mean that when they blend and become ‘like one’ there
are fewer sounds or that they become just ‘one sound’. But that ‘feeling’ I get
when I hear outcomes of relationships between sounds—whether it’s people
talking or noise or whatever—is like an acknowledgement that the sounds have
become self-sufficient creatures, breathing on their own and reproducing, but
all within the world of a particular piece, a particular amoeba.
11
Musical shape and feeling

Daniel Leech-Wilkinson
Prior (2010) and, in this volume, Prior (Chapter 7), Greasley and Prior
(Chapter 8) and the many Reflections offered by practitioners all show how
widely, easily and variously the concept of shape is used by musicians. And yet
what do sound and shape have to do with each other? I argue in this chapter
that, more than anything else it affords, shape functions as a quasi-submodal
concept, common to all the senses, that readily links musical sounds to the
feeling responses of listeners. Music invokes feeling states through modelling
their dynamic properties, and in turn those dynamic properties are used by
musicians to help them give lifelike qualities to music. In speaking of shape,
musicians are indicating, by a highly efficient means, the character of, or the
need for, dynamic patterns in sound that can model states of movement and
feeling, or indeed of anything that changes over time. To make the case, I draw
on recent work in philosophy, psychology and neuroscience in search of mech-
anisms capable of affording the apparent interconnectedness of shape and
musical sound. The chapter builds through a survey of relevant recent work,
climaxes in the middle with a view of the late work of Daniel Stern, and winds
down through thought about underlying mechanisms.
Automatic and learned responses
Synaesthetes who see colours on hearing sounds sometimes report that the
colours appear in specific shapes, including mobile shapes (Ward, Chapter 9
in this volume). Nonsynaesthetes, presented with animations representing syn-
aesthetes’ experiences in original and altered forms, tend to prefer the originals,
suggesting that the mappings, while highly personal, are not arbitrary (Ward
et al. 2008) and recruit some of the same mechanisms as normal perception
(Ward, Huckstep and Tsakanikos 2006). That a music–shape relationship is
used so widely by musicians (Prior 2010, 2012) suggests that the mapping is
very easy to make, and therefore that the capacity is to some extent structural
or at any rate very thoroughly learned.
Walker et al. (2010) showed that infants as young as three or four months
prefer matchings of visual and pitch direction (both moving up, or both mov-
ing down) and pitch and sharpness (rising and pointed, or falling and rounded)
that correspond to the mappings reported in the extensive literature on adult
cross-modal mapping and synaesthetes, implying that infant perception might
be synaesthetic at birth though, in most people, later unlearned (Maurer and
Mondloch 2006). But how fully unlearned?
Ramachandran and Hubbard famously demonstrated (2001; see also
Maurer, Pathman and Mondloch 2006) that almost everyone agrees that
rounded and pointed shapes map onto the nonsense words ‘bouba’ and ‘kiki’
rather than ‘kiki’ and ‘bouba’, confirming the findings of a similar experiment
by Köhler (1929) with ‘baluma’ and ‘takete’ (see Spence 2011 for a review of
these and other studies, including cross-cultural ones). Exceptions have been
found among children with autism spectrum disorder and people with damage
to certain brain areas, suggesting ‘that crossmodal correspondences, at least
those involving sound symbolism, can occur at quite a high level’ (ibid.: 974).
On the other hand, Dolscheid et al. (2013, discussed from a similar perspec-
tive in Eitan 2013) provided good evidence that the metaphors for pitch posi-
tion available in one’s native language influence one’s sense of pitch. For Dutch
speakers, high and low visual stimuli influenced the pitches they sang, while
thin and thick did not, and for Farsi speakers, for whom pitches are thin and
thick rather than high and low, it was the thickness of the visual stimuli that
had the effect, to approximately the same degree. In this respect, at least, language
use apparently feeds back into music cognition. Training Dutch speakers
in the use of the Farsi terms produced a similar result, but training them in the
opposite terms (thick for high and thin for low) did not. It seems, then, that
language has an effect only in the direction already established in pre-linguistic
infant synaesthesia: it can reinforce but not fully determine our responses
(Eitan 2013). Similarly, the linguistic terms (high, low, thick, thin) appear to
have been adopted by languages because they have pre-linguistic origins.
Many of the synonyms for musical shape offered by participants in the
research leading to Prior (2012) can be classified linguistically (though not
necessarily according to participants’ intentions) as drawing on images of
either intensity/quantity or trajectory/direction or both (Table 11.1). Images
suggesting simple mapping onto two-dimensional space—shapes as visualized
forms—are less frequent than those suggesting motion with change, which is
perhaps not surprising given the nature of music, but it nonetheless suggests
that there is much more going on here than simply the pitch content of music
being imagined as height in relation to time. (More sophisticated uses are exam-
ined by Leech-Wilkinson and Prior 2014.)
TABLE 11.1 Some of the synonyms for ‘shape’ collected for Prior (2010)
Quantity            Both                         Direction
Large and small     Expansion and contraction    High and low
Light and shade     Growth and decay             Backwards and forwards
Fast and slow       Swell and dying away         In and out
Thick and thin      Ebb and flow                 Up and down
Give and take       Peaks and troughs            Left to right
Anticipation        Crest and trough             Ascend and descend
Tension             Push and ease                Driving and following
Intensity           Stretching and relaxing      Movement
Climax              Build and release            Flow
Contrast            Impetus                      Moulding
Timing or pacing    Urge                         Direction or energy
Change over time    Impact                       Momentum
Rhythm              Dynamics                     Growth
Many studies have examined pitch/height perception, and taken together
they offer considerable support for the idea that this is a relationship that is
stronger for those used to seeing music notated with pitch on the y axis and
time on the x than for others (Eitan and Granot 2006; Antovic 2009, compar-
ing Serb and Romani children; Athanasopoulos and Moran 2013, comparing
British, Japanese and BenaBena; Küssner and Leech-Wilkinson 2014; Küssner
et al. 2014). At the same time, the notion that pitch has height may, as we
have seen in reporting Walker et al. (2010), rest on deeper foundations. Prince,
Schmuckler and Thompson (2009: 42), surveying previous studies, report that
‘both musicians and untrained listeners exhibit activation in visual cortex while
attending to pitches within a melody’, but they go on to find that ‘convert-
ing contour information from the auditory to the visual domain exploits the
skills that musical training confers’. In other words, musical training sharpens
a sense that pitch maps onto space, but (just as we saw in assessing the role of
language) this particular cross-modal mapping may call on a more widespread
ability.
Perhaps a more challenging question to ask is how the dimensions of music
other than pitch are imagined in terms of space. Can we extend the notion of
shape in its visuo-spatial sense to more aspects of musical sound? And if so,
can they be combined to generate the strength of association that we seem to
find ‘shape’ having for many musicians?
Eitan and Granot (2006) investigated perceived similarities between sev-
eral musical parameters (including ‘dynamics, pitch contour, pitch intervals,
attack rate and articulation’) and found human motion associated with them
all, though in varying degrees and directions, as summarized in Table 11.2.
The finding on speed is revealing. Acceleration might associate with descent
through our experience of gravity, whereas deceleration might associate with
descent through the quite different experience of increasing exhaustion or
declining intensity of feeling or engagement. The embodiment of music is
complex and multifaceted, as we can see from the subtlety and accuracy with
which music can associate with other phenomena (Leech-Wilkinson 2009a;
Juslin and Lindström 2010).
TABLE 11.2 Associations reported in Eitan and Granot (2006)
Loudness
  Crescendo                    Coming closer, acceleration, increasing energy (due to external force in slow tempos only), running motion
  Diminuendo                   Moving away, deceleration (at slower tempos only), falling pitch, falling or sliding motion
Pitch contour
  Rise                         Acceleration, spatial ascent, moving away (small effect), higher energy, running or walking
  Fall                         Deceleration, spatial descent, lower energy, leftwards motion, falling
Speed
  Acceleration                 Descent
  Deceleration                 Descent, moving away
Articulation
  Legato changing to staccato  Moving away, slowing (at slow speeds)
Note: Findings listed here combine some of their categories.
While Eitan and Granot (2006) found that associations involving verticality
and speed change are stronger for musicians (i.e. those with musical training)
than for nonmusicians (those without), other associations—including loudness
with motion and energy, and pitch direction with energy—seem equally strong
for both groups. Küssner and Leech-Wilkinson (2014) found 83 per cent of
nonmusicians matching pitch and height as against 98 per cent of musicians,
and 73 per cent of nonmusicians matching loudness and energy as against 93
per cent of musicians. The 73 per cent figure for the nonmusicians is not trivial,
even though their responses are more varied. It is hard to say, then, that shape
is linked to pitch height (still less loudness to energy) largely through training.
However strengthened the association may have become through practice, it
seems distinctly possible that it has deeper roots, though whether biological
(inherited) or embodied (acquired through physiological interaction with the
environment, Johnson’s sense of ‘embodied’; 2007) remains an open question.
A sense of musical shape, then, may have more to do with the multiply chang-
ing dynamics of an event than with its pitch contour (though that too is impor-
tant for musicians). The importance of this will become clear when we look at
Stern’s work on the dynamics of everyday life.
So far we have considered relationships between visualizable shape and
music. But what of other shaped experiences? Intensity, which is found in all
dimensions of musical sound, was identified by Marks (1978) and Eitan and
Granot (2006) as a key attribute through which we associate auditory and
nonauditory domains, while Eitan and Granot (2007) showed how easily it
is mapped across musical parameters. Stevens et al. (2009: 806, 809) suggest
that arousal response is more consistent than valence, which may be more
subject to cultural norms, which in turn suggests that response to intensity
(in so far as arousal is intensity) may be the more fundamental mechanism.
We may go further and suggest that intensity comes closest (among quali-
ties accessible to conscious awareness) to the unidentified common parameter
among the terms Eitan and Timmers (2010) collected for pitch differentia-
tion: the young, the high, the sharp, the bright, the fast, the strong, the awake,
the active provide us with more intense experiences (i.e. greater disruption
of a comfortable state) than the old, the low, the blunt, the dark, the slow,
the weak, the sleepy, the passive. Intensity maps easily onto these as well as
onto musical components, and also onto the underlying processes of ten-
sion, surprise, disruption and change that are fundamental to that sense of
directed motion found in musical structures and performances and, indeed,
everyday life.
Most of the studies cited so far, and many more, give us good reason
to think of music as lifelike. That is to say, how sounds interrelate (with
what was just heard, with what happened earlier, with what happens next
in the sense that in the light of that they seem more meaningful) and the
dynamics of the succession of sounds cause them to seem alive and con-
struct a vivid sense of ‘now’ that seems to move with them through time
(Johnson 2007: MOVING TIMES schema; Stern 2004). Music is lifelike,
then, through the constantly changing shapes of the experiences it engenders
in a listener. It is lifelike, too, through our tendency to chunk events into nar-
ratives, separate from the events around them (Lavy 2001; Murray 2003). It
is not that music does anything as crude as to tell a story, but rather that as
one chunk or phrase or event is followed by or transformed into the next, a
sense of narrative develops which in its own musical terms is coherent and
directed (by the composer or performer or both, in collaboration with the
listener). The narrative has shape at a higher structural level than the shape-
liness of moment-to-moment progressions; but at the moment-to-moment
level it is made up of lifelike behaviour, its life played out in sounds and in
our responses to them.
Both music and life can be summarized formally as narrative structures, but
both are experienced in time through sequences of events whose most strik-
ing characteristic is their vitality, created through the manner in which they
change (Stern 2004). Experiences (the events that in sequence accumulate into
the narrative of our lives) begin either gradually or suddenly, and typically
develop either by building to a point of maximum engagement and then tailing
off, or just by tailing off gradually as one deals with the consequences of
a sudden start. In this important sense experiences have shape; indeed, that
they are shaped is one of their most evident characteristics. What shapes them
are changes in all aspects of perception and response as they develop. Because
it is constantly changing its dynamic characteristics, and because of the ease
with which it maps across domains, music models or at any rate behaves like
the experience of events. And we map ourselves onto it, experiencing it like a
sequence of lived events with the feelings they arouse (Johnson 2007: esp. 236).
In a number of respects, then, music shapes and is shaped through the dynam-
ics formed by its changing sound.
If we look at music from these perspectives, it becomes very easy to sense
why music and shape are so closely associated, but it is also clear that this is
less directly a matter of shapes that may be visualized in space (though one
could make drawings or diagrams of many of these things) and more a matter
of changing shapes of feeling, that is to say, change in the intensity and qual-
ity of experience over time. Susanne Langer seems to have been speaking of a
relationship between shapes of feeling and shapes of music when she famously
concluded that music’s intrinsic properties ‘sound the way moods feel’ (Langer
1963: 244–5; her italics). And so to make progress in understanding the ease
with which the dynamic content of music maps to and from that of lived
experience, in other words for an effective approach to analysing musical shape, we
need a theory that deals with the psychodynamics of everyday life. We find one
in the work of Daniel Stern. (On Stern, see also Kim 2012, 2013.)
The dynamics of experience
Stern’s final work is concerned with The Present Moment (Stern 2004) and with
dynamic experience or Forms of Vitality (Stern 2010). Phenomenology has long
had related concerns, but Stern’s particular focus is on the experience of ‘now’,
the ever-present moment. Stern sees experience as characterized by a sequence
of present moments, each no more than a few seconds in length, which are
shaped by feeling responses to incoming perceptions, and which group together
to form dynamically shaped mini-dramas, sensed as a gestalt, through which
one lives. Stern records examples for analysis by asking participants to describe
episodes in their everyday lives. Here, a participant describes breakfast.
Present Moment 4
I am holding a slice of bread, not yet spread with honey. But it is a differ-
ent kind of bread than I normally buy. It feels strange and I am surprised
by it. I think, ‘What do I do with this bread?’ A mild negative feeling
arises.
This moment took about three seconds. She then spreads honey on the bread
without paying conscious attention to the act. A new moment begins adjacent
to the previous one.
Present Moment 5
I am then aware of biting into the honeyed bread. I like the texture and
think, ‘It’s not so bad.’ And with that a sense of feeling better builds up.
I then become conscious of the radio interview again. (Stern 2004: 13–14)
What we have here is a sequence of remembered experiences split into their
constituent moments; each little scene has a thematic unity, dealing with one
main event plus the thoughts and background events that accompanied it. The
description is necessarily constructed after the event—to self-report
simultaneously would be to alter the experience even more—and therefore an
element of composition is inevitably involved. In a sense, this is not unhelpful for a
comparison with music, where, in the absence of semantic meaning, narratives
are often consciously constructed as part of the process of finding meaning in
sounds (Wingstedt, Brändström and Berg 2010; Tarasti 2004). Stern provides
a map of the feeling shapes that constitute these two ‘moments’, using curves
to represent the changing intensity of each feeling shape (‘surprise’, ‘feel nega-
tive’, ‘bite + chew’, ‘feel resistance’, ‘surprise’, ‘pleasure’), using the changing
shape and thickness of the curve to give an impressionistic sense of what the
participant reported having experienced (Stern 2004: 15).
Stern derives from his case studies conclusions about the present moment
that are very easy to apply to our experience of music.
• Present moments [read ‘musical phrases’] are unbelievably rich. Much
happens, even though they last only a short time.
• Present moments occupy the subjective now…
• The moment is a whole happening, a gestalt. The psychological
subject matter is the whole, not the smaller units that make it up.
• When she experienced these events it occurred in a now that she could
identify and put boundaries around.
• The present moment is short. In this case … each lasted between
three and five seconds, as estimated by the subject.
• Consciousness is the main criterion used to identify episodes
containing present moments…
• The feelings experienced … trace a time-shape (a temporal profile)
of analogic risings and fallings. In other words, they are carried on
vitality affects (dynamic time-shapes) that contour the experience
temporally.
• A lived story unfolds within each present moment. It is made of many
small experiences that are put together in the subjective present. The
storyline, even if minimal, rides on the temporal feeling shape of the
contoured affects. The unfolding micro-story resolves the novelty or
problem.
• Such moments are not cut off from the rest of life, isolated and
unconnected. Rather, they capture a sense of the subject’s style,
personality, preoccupations, or conflict—in other words, their
experience of the past. Each such moment is psychodynamically
relevant. (2004: 14–16)
So too music, when one focuses one’s attention on it, can be experienced as
wholly engrossing, ‘unbelievably rich’. It occupies the subjective now. It cre-
ates a sense of stylistic and psychological wholeness (the one leading to the
other). Its phrases and sections have definition and gestalt-like qualities of self-
containedness within a greater whole; and at the local level they can be quite
short, short enough to be experienced within a psychological ‘now’. We identify
and understand music in terms of these experienced ‘moments’ which give it its
meaning for us. The feelings it generates ‘trace a time-shape of analogic risings
and fallings. In other words, they are carried on vitality affects (dynamic time-
shapes) that contour the experience temporally’. We sense musical continuity as
having narrative qualities, although it usefully lacks the precision of the ‘lived
stories’ which Stern finds make up everyday life. They ‘capture a sense of the
[music]’s style’ and analogously behave as if they were people, indeed as if they
were us. Thus musical units are indeed ‘psychodynamically relevant’.
Similarly, the curves Stern uses to represent these feeling shapes could
equally well be mapping aspects of a phrase, or melody, or loudnesses, texture,
rhythm, or also qualities like expectedness, complexity, mood, character, edgi-
ness, tension, all adding up in complex ways to give a sense of shape. This itself
emphasizes how directly the musical features (contour, loudness, speed) model
the qualitative (mood, tension and so on).
Stern’s ‘present moments’ bring together things one experiences as ‘now’ in
time spans that are only as long as the 2–5 seconds our perceptual systems allow
(Fraisse 1984). Stern here draws on Husserl’s phenomenology of time: the past
of the present moment, not remembered but rather still experienced because
still fading from the present; the present of the present moment; and the antici-
pated, expected, immediate future. This ties in usefully with existing music per-
ception and analytical theory, which argues that expectation is essential in the
creation of musical meaning (Huron 2006; Stern’s particular reference point is
Narmour 1990). But Stern’s point is that all three stages are taken together as
‘now’. This offers a more ecologically plausible notion of what it is like to hear
and to perform music, both performer and listener aiming always to treat what
comes next as a good continuation of what just happened. It follows, both for
Stern describing life and for us thinking of music, that the present moment is
not always the most intense; it may be happening in the shadow of what hap-
pened a moment ago, but what happened a moment ago is also something that
can define ‘now’ and give it shape. Indeed, ‘now’ can be relatively trivial so long
as one understands it in relation to the past and the anticipated future. And
Stern goes on to make precisely this analogy with music, noting how ‘much of
the richness of music lies in the fact that each subsequent phrase recontextual-
izes the previous one’, but at the same time, ‘A coherent experience was grasped
during the present moment, even though that experience may have multiple
fates’ (2004: 30). There is thus a constant ‘trialogue’ between past, present and
future characteristic of music and life (ibid.: 31).
But there are important differences. Music has hierarchical levels, for one
thing. It has an orderliness, in other words, that we do not find in life. In music
each event leads coherently into the next, or contrasts with it in a way that will
later be reconciled or resolved. There are no loose ends in music, or if there are
we tend to fault it, saying it is badly composed. Music is lifelike in a utopian
fantasy world where everything that happens makes sense and where, whatever
conflicts we may experience along the way, everything turns out for the best.
The entire tradition of music analysis is directed at proving this. (On music as
utopia, see also Levitas 2010.)
In Stern’s view, present moments characterized by ‘temporally contoured
feelings’ (2004: 36) are best thought of as ‘vitality affects’: ‘these temporal
contours of stimulations … are transposed into contours of feelings in us’
(ibid.: 64). Thus ‘temporal contours’ are the objective changes in intensity or
quality of the stimulations; ‘vitality affects’ are the subjectively experienced
shifts in internal feeling states that accompany the temporal contour, their
‘vitality’ being that sense in which they are the most characteristic aspect of
the affective experience of living. Most stimulation of the nervous system,
Stern says, whether it comes from within or without, ‘has a temporal shape
or contour that consists of analogic shifts in the intensity, rhythm, or form
of the stimulus’ (ibid.: 62). And this, of course, is exactly what happens in the
performance of a (western classical) score. Through the score, a roadmap is
already provided by the composer—the events of a life are determined—but
what the performer does is precisely to provide a temporal shape through
moment-to-moment adjustments in intensity, rhythm and pitch (or which-
ever of these dimensions is available through the instrument in use) that
provide triggers for the feelings that seem best (most satisfyingly) to arise
from those events. One can think of these as ‘expressive gestures’ (Leech-
Wilkinson 2006, 2009a). Expressive gestures generate vitality affects: it is
through expressive gestures that musical scores, which without them are so
dull as barely to be music at all,1 come to life. And what musicians over-
whelmingly mean when they talk about ‘shaping’ a phrase given in a score is,
as we have seen, that process of enlivening that is achieved through the use
of expressive gestures.
For Stern, though, the key idea is that the dynamic qualities of vitality
affects may be linked across modalities (Stern 2004: 37, 64–5). Vitality affects
occur in many modes and can map easily from one mode to another. This
is certainly what has been repeatedly suggested for expressive gestures, which
index other things, or behave like them, and take meaning from the likeness
(Leech-Wilkinson 2006). As Kim (2013) points out (in her essay for the special
issues of Empirical Musicology Review on music and shape), for both Stern and
his predecessors, Hausegger (1887) and Truslit (1938), the experience of expres-
sive shape (vitality affect) is ‘understood in a broader sense than emotions’
(Kim 2013: 165). The emphasis here is on dynamics, not states, and in that
sense (shaped) forms of vitality may offer a more appropriate way of thinking
about the experience of music than seeing music as a sequence of inductions
or representations of emotional states (ibid.). A shape-focused view of music,
therefore, might offer a more ecologically valid way of understanding feeling
responses to music than do attempts to see music as expressive of particular
emotional states.
A final point of Stern’s (2004), though he intended it as a key to psychoana-
lytic practice, ties in with Kim’s (2013) focus on the aesthetics of empathy and is
especially relevant to our understanding of musical performance and response.
For Stern (2004: 22, 172–3) present moments shared, where analyst and analy-
sand seem to understand each other’s feelings with unmediated clarity, can be
especially intense and life-changing. Stern describes these as ‘a shared feeling
voyage’, when ‘two people traverse together a feeling-landscape as it unfolds in
real time.’ It seems possible that something like this takes place between musi-
cians performing well together and is imagined as happening also for listeners
attending to a performer with exceptional concentration and sympathy. In the
latter case, the listener, and perhaps the performer, feels that they understand
the other and become one with their music.2 It may be an illusion, but there is
intersubjectivity concentrated in these moments, as music happens, of a sort
that may not be unrelated (though this needs focused research) to song as sex-
ual attractor (Miller 2000).
Stern (2010) takes the notion of vitality form (a more abstract conception of
the form underlying a vitality affect) and sees it now as the most fundamental
percept characteristic of life: ‘Subjectively, a thought can rush onto the mental
stage and swell, or it can quietly just appear and then fade. It has a beginning,
middle, and ending… Mental movement, while it is happening, traces a profile
of its rising and falling strength as it is contoured in time. This is its dynamic
form of vitality’ (ibid.: 21). At the same time,
Vitality forms are hard to grasp because we experience them in almost all
waking activities. They are obscured by the felt quality of emotions as it
accompanies them. They are absorbed into the explicit meaning as the
vitality form accompanies a train of thought, so we do not pay attention
to the feel of the emergence of the thought, but only to its contents. It [the
vitality form] slips through our fingers. (ibid.: 10)
And so it is with musical shape. Our attention is so focused on what is happening
among the notes—harmonies, rhythms, themes, events, qualitative effects—that
we normally pay no attention to the shapes traced by these sounds and by the
feelings they engender. Yet it is their shape that carries all these experiences we
notice so directly. This perhaps begins to explain why musical shape, until recently,
lacked a bibliography. On the one hand, as an attribute of the attention-grabbing
musical surface, it is rarely noticed for itself; on the other, among musicians the
term is so easily used as a shorthand for expressivity that it hardly needs explana-
tion for it to work perfectly well as a heuristic (Leech-Wilkinson and Prior 2014).
As far as the underlying mechanism is concerned, ‘Once an experience
activates the brain, it will leave a purely vitality dynamic representation and
a content representation.’ (The content representation refers to the causes of
the stimulus and their implications for us.) ‘The dynamic representation must
encode the speed and its changes, the intensity (force) and its changes, and the
duration, and the temporal stresses, rhythm, and directionality’ (Stern 2010:
25). So ‘[v]itality forms are modality non-specific. They belong to no one
sensory modality but to all…’ (ibid.: 26).
This helps to explain why music expresses vitality forms so easily and so
clearly. Like vitality forms, music’s modality is also in important senses non-
specific; it can apply to anything that changes over time or (if we wish) to little
beyond itself. And within this abstract medium of music the notion of shape
is abstract too: it offers a concept common to, and realizable through, many
aspects of music, just as music can model many aspects of life. In this sense,
shape is close to being a suprasensory attribute, linked to the most evident of
all suprasensory modalities, which (Marks 1978 argued) is intensity. But where
intensity characterizes a moment, shape summarizes a process, and it is that
that gives it outstanding heuristic value for musicians. So we can say that shape
describes the changing intensities that model one kind of experience in the
domain of another. For Stern, ‘The concept of dynamic vitality forms brings
together four converging lines of thought, namely intersubjectivity, cross- and
meta-modality, the dynamic features of experience, and a phenomenologi-
cal focus on subjectivity’ (2010: 44). And these are exactly our concerns too.
Musical shape communicates among performers and listeners; it maps between
modalities and can be conceived as beyond them; as a concept it encapsulates a
fundamental aspect of lived experience; and it characterizes everything we feel
happening within ourselves.
Stern’s work, then, helps to explain how it is that the notion of shaping
music belongs not just to musical practice, nor to a particular tradition in
which, thanks to notation, pitch is conceived as moving up and down through
space. Rather, it offers a way of understanding how it is that music maps so
easily onto our experience of changing affect. Like musical sound, our feelings
change in intensity, in complex polyphonic patterns, from moment to moment,
experienced always as ‘now’ with a brief, still-vivid past (preceded by a longer,
increasingly hazy tail) and an implied immediate future which may yet sur-
prise us. Shape conceptualizes all these aspects of music and feeling. More than
that, it carries them along in us. As Stern emphasizes, his work has significant
commonalities with phenomenology, sharing its prioritizing of experience over
structure or language as the dominant reference domain for cross-modal sig-
nification. What seems to be implied by the prioritizing of lived experience is
that knowledge of music acquired through our bodies and their preconscious,
unreflective motor and limbic responses underlies cognitive responses, provid-
ing a frame within which they operate and in relation to which they are selected
and form as thoughts about music. Underlying Stern’s work, therefore, and also
the work on embodiment that we consider next, is the wealth of recent research
on music’s interaction with the brain’s limbic and motor systems (Koelsch
2010; Altenmüller, Wiesendanger and Kesselring 2006). A full understanding
of music and shape, in so far as present research allows, would need to take
detailed account of that work.
Embodiment
While the foundation provided by underlying neural mechanisms is acknowledged,
to understand shape in terms of musical experience we need an
approach to thinking about music and bodily response that illuminates par-
ticular cross-domain mappings. Stern offers one such approach. Another valu-
able source for thinking about musical shape is the ever-growing literature on
embodiment. Mark Johnson’s The Meaning of the Body (2007) is particularly
helpful here. Johnson argues that the movement of our bodies through our
environment, and our daily experience of other moving objects, are fundamen-
tal to our understanding of the world and provide us with ways of thinking
about everything that involves a sense of motion: ‘For example, tension has
a meaning grounded in bodily exertion and felt muscular tension. Linearity
derives its meaning from the spatial, directional qualities of bodily motion.
Amplitude is meaningful to us first and foremost as a bodily phenomenon of
expansion and contraction in the range of motion’ (Johnson 2007: 25).
These are not just abstract qualities: ‘they are qualities of organism–envi-
ronment interactions’ (ibid.). We might add that they are essential features of
music and that they are also qualities in which change can be experienced more
holistically as shape. Drawing on Johnson’s discussion of Dewey’s theory of
qualities—the overriding feeling of a situation—which Johnson sees as being
created by a collection of cross-domain mappings (ibid.: 77), one may go on
to propose that we understand music crucially (though not wholly) as the con-
stantly changing, dynamic, felt quality created by cross-domain mapping from
the sounds of a particular performance. This seems to make sense of everyday
experience of music. The quality arising from embodiment is modulated,
of course, by many additional factors including all kinds of autobiographi-
cal and learned associations and intellectual thought about the music. But it
remains a crucial ingredient in the mix.
With George Lakoff, Johnson had already done important work on the
schemas that we use in thinking about our interaction with the environment
(Lakoff and Johnson 1980), and a number of those that Johnson sees as cru-
cial to the construction of meaning through embodiment are closely concerned
with aspects of musical shape. Especially relevant is SCALARITY: ‘Because
we must continually monitor our own changing bodily states, we are exqui-
sitely attuned to changes in degree, intensity and quality of feelings. Such
experiences are the basis for our sense of the scalar intensity of a quality (the
SCALARITY schema)’ (Johnson 2007: 138). What we gain by relating shape
to the SCALARITY schema is an increased sense of the multiplicity of ways
in which changing intensity of feeling, arising from changing intensity of musi-
cal score and performance, acquires further depth and affords varied mean-
ings from the ease with which a performance maps onto many other aspects
of experience. It is not necessary to assume that we inherit these schemas: we
simply cannot fail, given the bodies we have and the environment in which we
find ourselves, to learn them (ibid.: 178). Embodied schemas may well form at
a level between the biological and the cultural: embodied knowledge is neither
inherited nor indoctrinated; it is acquired simply by living and moving.
This helps us tease out the difference between schemas, like SCALARITY
or MORE IS UP/LESS IS DOWN, which are embodied, and the association
that is probably learned by western musicians between pitch and height, which
for them is also connected with shape. The sight of notated sounds rising and
falling across the page, repeated daily from childhood, will inevitably create a
deep-seated sense of rising and falling pitch which would give the notion of
shape one particular intensifying aspect for western classical musicians that it
might not have everywhere (as found by Athanasopoulos and Moran 2013).
To western musicians it would come to seem as natural as a schema and would
increase the sense that music is shaped by verticality as well as by intensity.
Further image schemas that bear on musical shape include MOVING
MUSIC with its underlying schemas MOVING TIME (we imagine time, and
music, moving past us) and MOVING OBSERVER (we imagine ourselves
moving through time with the musical now moment), the latter with its associ-
ate MUSIC AS MOVING FORCE; and also MUSICAL LANDSCAPE (we
imagine a performance of a score as simultaneously present, spread out around
or before us, the schema underlying images of musical compositions as formal
structures fully present independent of performances; Johnson 2007: 29–31,
244–54; Johnson and Larson 2003). Shape is used by musicians, as Prior (2010,
2011) has shown, through all these notions. Seen as ways of thinking about
music, they can easily seem contradictory, even mutually exclusive. How can
you think of music as moving force and also as landscape? Understanding the
rootedness of these conceptualizations of music in low-level image schemas,
each with a firm basis in embodied experience, explains easily how all can be
drawn on in understanding music. Rooted in our bodies, they each arise reasonably
from our daily experience, and as such there is no functional conflict
between them (Johnson and Larson 2003: 80).
Underlying principles
Stern and Johnson are in many ways offering similar views, albeit with differ-
ing foci.3 Stern is concerned with the dynamics of feelings as they are experi-
enced phenomenologically, Johnson with the process by which those feelings
have meaning. Together they offer us the beginnings of a coherent explanation
for the meaningful interaction of music and shape. They also suggest how it
is that shape seems to represent so many aspects of our experience of music
both as sound and as meaning—melodic contour, harmonic tension, loudness
envelopes, pitch inflections, textural change, performance actions, affective (in
the right social context, physical) response—most of which can function simul-
taneously on many structural levels, shapes nested within shapes within shapes
and so on from a single note to a whole piece. It may be possible to take our
understanding of this phenomenon one step further, however, by going back to
some of the research on which Stern and Johnson draw, and supplementing it
with findings from more recent work.
Two research themes are of particular interest: suprasensory modalities
and multisensory perception. Both offer ways of understanding the mechan-
ics of cross-domain mapping. First, though, we need to consider the level on
which mapping takes place. As Stern points out, Lawrence Marks in his classic
study of the theories then relevant to synaesthesia, The Unity of the Senses:
Interrelations among the Modalities (1978), traced ideas about suprasensory
attributes as far back as Democritus (c. 460–370 BCE) and Aristotle (384–22
BCE). For Aristotle a sixth, common sense, integrated the outputs of all the
others, allowing them to be perceived in terms of general attributes: ‘motion,
rest, number, form, magnitude, and unity’ (Marks 1978: 4; Doğantan-Dack
2013 points out a predecessor in Ehrenfels). Marks proposed that a level on
which (perhaps due to a common phylogenetic heritage) the senses overlap, and
where they could find features in common, could explain the ease with which
we make analogies between different sensory experiences (ibid.: 182–5). To
clarify what kinds of common features these might be, consider the process of
analysing auditory data into perceptual features—including melody, contour,
rhythm—demonstrated by the results of selective brain injury (Peretz 2003).
This necessarily involves the process, already well documented in studies of
auditory perception (summarized in Rees and Palmer 2010), by which the audi-
tory cortex analyses incoming sound. The question is what happens between
that initial analysis and our perception of melody, contour, rhythm and so on,
synthesized out of them.
It is easy to see that contour, for example, combines aspects of frequency,
relative quantity, speed and time, and is already quite a complex phenomenon.
Frequency, quantity, speed and time are not entirely straightforward either,
perhaps involving signals registered in a simpler form. So in constructing a
sense of contour the brain has to analyse out of the incoming sound data, and
then combine into a contour gestalt, components that are very much simpler.
Alternatively, if the gestalt is perceived first (which is also possible) it may then
be deconstructed into its components in order to search for them in gestalts
experienced before, identifying them and giving them meaning through previ-
ous experience. Either way, in their simplest, lowest-level form, we have no per-
ceptual access to these simpler components. We perceive them only within more
complex phenomena. However they are registered, each has a role in other per-
cepts too: frequency and quantity are necessary components of loudness, for
example, since without frequencies there is nothing to be heard, and without a
variable response to quantity there is no way of loudness being a percept. These
simpler components would not be consciously perceptible on their own, but
precisely because they are so simple, not yet combined into specific percepts,
they could be simple enough to be shared across several (or all) sensory modali-
ties. All that is then required for cross-domain mapping is a mechanism in the
brain that habitually compares one data stream with, or that has a response
mechanism sensitive to, a memory of others.
Both the simpler components and the mapping mechanism are implied in
studies by, respectively, Näätänen and Winkler (1999) and McLachlan and col-
leagues (2010, 2011). Näätänen and Winkler propose ‘sensory feature traces’
assembled during a pre-representational phase of sensory information pro-
cessing which then, in the following representational phase, are mapped onto
time and compared with the contents of long-term memory. It would be at this
second stage that features such as frequency, relative quantity, speed and time
become perceptible. First-stage features, Näätänen and Winkler suggest, are
simpler and inaccessible to conscious perception, which is exactly what we need
to explain cross-modal commonalities. Refining Näätänen and Winkler’s work
in a major survey of recent research on auditory processing, McLachlan and
Wilson (2010) propose that incoming auditory information is stored in a multi-
dimensional array in short-term memory, enabling multimodal similarities with
data in long-term memory to be identified and to contribute to sound source
identification and association (further developed in McLachlan et al. 2011).
This process, or something like it, allows us to make sense of the very high
rates of agreement that Eitan and Timmers (2010) found among participants
faced with having to choose which of two terms was most like what in the West
we call ‘high’ pitch. Table 11.3 shows the terms that produced most agreement.
I have added a column proposing an association learned through everyday
experience which may underlie the participants’ preferences. Thus, old people
have lower voices than young, so high pitch is female or a granddaughter, while
low is male or a grandmother and so on. But other examples have no such
everyday explanation. What has pitch to do with brightness in our experience
of the world (an association we may share with chimpanzees: Ludwig, Adachi
and Matsuzawa 2011, though see also Spence and Deroy 2012)? In this case, it
seems necessary to hypothesize that an aspect of pitch, too low-level for us to
perceive, is being found to correspond to (or even be identical to) an aspect of
light. And the analysis of, for example, sound and light into very basic com-
ponents seems necessary for the construction of the gestalts evidenced by the
selective results of brain injuries (Peretz 2003) and for the process suggested by
Näätänen and Winkler (1999).

TABLE 11.3 Highest-scoring results from Eitan and Timmers (2010; Table 1), with a
proposed environmental cause for the participants’ preference

Highest-scoring Test Terms     Consensus        Experience Underlying
for ‘High’ Pitch               High    Low      Preference
Alert–sleepy                   1.0     .10      voice
Sharp–heavy                    .98     .00      size
Thin–thick                     .98     .03      size
Sharp–blunt                    .97     .08      struck sound?
Young–old                      .97     .10      age
Happy–sad                      .95     .03      voice
Light–dark                     .95     .05      ?
Fast–slow                      .95     .05      size
Light–heavy                    .95     .11      size
Granddaughter–grandmother      .93     .08      age
Feminine–masculine             .90     .07      gender
Small–large                    .90     .15      size
Tense–relaxed                  .90     .17      body

The same mechanism seems likely to underlie Marks’ (1978) suprasensory
attributes: ‘Suprasensory attributes are those categories or dimensions of expe-
rience that … apply to most or to all modalities. Intensity is a classic exam-
ple, to which duration must also be added. Size (extension), brightness, and
hedonic tone are other candidates, though perhaps not universally applicable’
(ibid.: 5). As we have seen in discussing automatic and learned responses, inten-
sity comes closest to the generalized quality that seems necessary to link pitch
and brightness, for example. Nonetheless, it may already be too complex to be
fundamental to cross-domain mapping. Shape–sound relations seem to depend
less on intensity as a basic category than on a lower-level aspect of changing
quantity that can behave identically in awareness of physical space (allowing
sound to be sensed as having height or proximity) and of intensity of feeling.
Bueti and Walsh (2009) offer a mechanism by which a sense of magnitude may
be acquired from infancy, building on Walsh’s (2003) work on quantity. But
all these could be synthesized out of Näätänen and Winkler’s (1999) ‘sensory
feature traces’.
Martino and Marks (2000) introduce key additional points. First, relation-
ships between modalities depend on an awareness of the relative position of a
stimulus on a scale defined by experience. We can say that a sound is bright only
because we can compare it to other sounds we have heard that are darker. No
sound can seem one or the other without that experience being accessible to us.
Here is where synaesthesia differs from cross-modal experience, however, since
for synaesthetes these relationships are automatic and invariant. This relativity
of attributes for most listeners is entirely compatible with the view I have sug-
gested: comparison is made between incoming data and things already known
(whether known through inheritance or, much more usually, through embodi-
ment, enculturation or learning). There is no need to assume measurement
against a baseline.
Secondly, Martino and Marks point out that a post-perceptual represen-
tation of stimuli (analysis into constituent parts and cross-domain mapping
after perception of domain-specific gestalts) would explain recent findings that
semantic relationships between stimuli can also produce congruence effects
(cross-domain mappings). McLachlan and Wilson (2010: 181) propose a simi-
lar role for verbal labels at an early stage in sound identification. Again, this
allows (indeed requires) embodiment, enculturation or learning to play a pow-
erful role in establishing mappings: experience creates rapid matching of stim-
ulus and meaning involving whichever domains have repeatedly been found
relevant (Yu 2008). Thus, music getting quieter maps to increasing distance
because of repeated experience in the real world (embodiment); music increas-
ing in frequency maps to increasing height because of repeated linguistic expe-
rience of ‘high’ and ‘low’ as descriptors of musical sounds (enculturation), or
more specifically for western classical musicians because of repeated experience
of notation (learning). Or to take a different kind of example, violent music is
embodied, martial music is encultured, cigar music is learned (see
www.youtube.com/watch?v=NIckHmwZAeI).4
The other very relevant body of work that we need to consider (as Stern 2010:
49 suggests) concerns the possible coupling of motion and thought within the
sensorimotor system. Gallese and Lakoff (2005), building on Gallese’s earlier
work on mirror neurons and Lakoff’s on embodied concepts, offer an explana-
tion for the strength of couplings that bring together emotional response with
thought and motion. Musical shape, which involves all three, would provide an
outstanding example. They argue that ‘sensory modalities like vision, touch,
hearing and so on are actually integrated with each other and with motor con-
trol and planning’ (ibid.: 459). The same areas that control action also con-
struct ‘an integrated representation of (1) actions together with (2) objects
acted on and (3) locations toward which actions are directed’ (ibid.: 460). In
other words, they argue that action, imagined action and understanding are all
tied together through using the same neural systems. The same might very well
be true for musical performance, listening to musical performance and under-
standing it. A very similar point is argued by Molnar-Szakacs and Overy (2006:
236): ‘according to the simulation mechanism implemented by the human mir-
ror neuron system, a similar or equivalent motor network is engaged by some-
one listening to singing/drumming as the motor network engaged by the actual
singer/drummer; from the large-scale movements of different notes to the tiny,
subtle movements of different timbres.’
Recent work on multisensory perception offers further ways in which sound
might generate a sense of shape. A review by Stein and Stanford (2008) con-
cludes that many or even most neural systems may be multimodal. One of their
sources, Ghazanfar and Schroeder (2006: 284), seems especially pertinent:
Traditionally, it has been assumed that the integration of such disparate
information at the cortical level was the task of specialized, higher-order
association areas of the neocortex. In stark contrast to this assumption,
the neurobiological data reviewed here suggest that much, if not all, of
neocortex is multisensory… The world is [a] barrage of sensory inputs,
our perception is a unified representation of it, and the neocortex is
organized in a manner to make the underlying processes as efficient as
possible.
Certainly this makes excellent sense of many aspects of musical experience.
Music maps onto so much else, in this case, not only because sounds are not
tied to particular semantic meanings but also, crucially, because so much of the
neocortex can do something with it. The key feature is change in experience
over time, tied (via the sensorimotor cortex) to a sense of movement. That links
together changes in space with changes in sensory and emotional experience,
just as we find in the case of musical shape. But shape is only one instance,
albeit a particularly general and semantically flexible one, of a connection that
a multisensory capacity in the sensorimotor cortex would facilitate: we easily
find in music the characteristics of many very specific kinds of motions (stum-
bling, plodding, jerking, drooping and so on) or behaviours (clumsy, tentative,
alluring, rapturous, imploring), all combining aspects of movement with sensa-
tion and narrative implication (again we have action, imaginative simulation
and understanding).
So the sensorimotor system begins to look like a metaphor engine for any-
thing that involves an experience of motion in relation to imagination and
meaning. It receives input from the other modalities and conceives it (models it)
in terms of dynamics with potential for meanings. It generates a ‘vitality form’
in Stern’s sense, or a shape in ours, a core experience with dynamic character-
istics independent of any particular modality; and the same form or shape can
easily be perceived as like the shape or form of experiences in other modalities.
Here metaphor becomes embodied, or (perhaps better) the embodied becomes
metaphor.
In conclusion
The kinds of processes that are being revealed by research in the multisensory
brain go a long way towards explaining how sound could so easily be mapped
onto shape and why it might seem so ‘natural’ to musicians (and especially
western musicians) to think of musical performance as giving shape to the
notes in the score. It is an immensely flexible concept: because it involves
changing quantity and intensity, it is very easy to apply to any aspect of sound and
experience where it can do some useful work. In performance, as in other kinds
of experience, shape can apply to anything that changes. In shaping a score,
performers can select one or more of many available dimensions in sound. And
this multi-applicability of shape to sound enables them to be responsive to con-
text, shaping different dimensions from moment to moment. It enables them to
define a personal approach to shaping as part of their own performance style,
identifying them, communicating their way of understanding music. It enables
them to work personally within a constrained period style, itself defined by
particular ways of shaping sound. But this same flexibility also affords the pos-
sibility of massive change in period style over time (Leech-Wilkinson 2009b).
This is exactly what we find now that we have well over one hundred years of
recorded performance.
The fundamental level on which a sense of shape can be passed around the
brain could be a crucial factor in allowing music to be made expressive by per-
formers in a shifting variety of complex and interesting ways, and it explains
how the concept of shape can be used to think about and act within all of them.
Shape gets us about as close as we can get to suprasensory modalities. It works
in every dimension that can change over time. And this is why it is so useful to
performers in thinking and talking about how to make music. It is one of the
most powerful ways we have of making sense of the experience of music with-
out having to be too specific.
A performer in a recent interview for Prior (2011) said: ‘And that’s the thing
about music, if you use imagery it makes your muscles do all kinds of things
that you don’t necessarily have to describe in a minutely physical way.’ This
is a crucial point. Instead of having to say, ‘I want you to make the A in bar
3 slightly softer and maybe 40 milliseconds longer than the previous G, and
then the B semiquaver a little bit longer, maybe another 20 ms, and about 20
dB quieter than you’d expect, or alternatively you could make the A slightly
louder and shorter and the G very short, or …’ and so on, you can simply say,
‘I’d like you to shape that phrase a little more’, and then the performer does
whatever s/he feels works. Shape is great value for performers, communicating
the effect that is required, leaving them free to produce it through feeling, not
analysis (Leech-Wilkinson and Prior 2014; perhaps assisted by auditory imag-
ery: Keller, Dalla Bella and Koch 2010), and thus free to use their experience,
judgement and taste to express their musicianship. And we as listeners align
ourselves to the shaped template that the musician supplies (DeNora 2004).
Performed shapes, arranged in a persuasively and movingly managed sequence,
become our felt experience (Johnson 2007: 238). I suggested above that music is
lifelike in a utopian fantasy world where everything that happens makes sense
and where, whatever conflicts we may experience along the way, everything
turns out for the best. I was really speaking of composition then. But we can
say something very similar about performance and the way it uses the notion
of shape: to speak of shape in performance is to speak of the way music, given
a highly skilled performer, enacts an idealized image of the feeling experience
of our everyday lives.
In sum, we have seen that shape is a highly flexible concept widely used
by (especially, but not only) western musicians (Prior, Chapter 7) to talk
about the expressive qualities of a performance. It relates closely to other
concepts involving real or imagined motion through space (including gesture
and trajectory) or across terrain (landscape, contour). At a more general
level it conceptualizes change over time. But fundamentally, in all this dis-
course, shape is modelling changing feelings, and it is that mapping between
the dynamics of musical sound and the dynamics of feelings that allows
shape to function so effectively as a way of thinking and speaking about
musical expressivity. The dynamics of musical sound are easily analysed and
visualized with sound visualization and mapping software (such as Sonic
Visualiser), which gives some access to the shaped nature of performance and
the expressive work it does for the listener. But the underlying mechanisms
need to be teased out through other kinds of research. Stern’s ‘forms of
vitality’ offer a powerful means of thinking further about the shape of feel-
ing, while Johnson’s work on embodiment and image schemas helps to show
how the relationship between shape and feeling is grounded in bodily experi-
ence. Research suggesting the existence of modes of suprasensory perception
can be linked to work on the neural mechanisms of sensory perception that
arrives at similar conclusions. Research on multimodal perception offers a
neural mechanism by which a sense of shape may be generated simultane-
ously in sound, vision and motion, and sheds additional light on the ease
with which music is described using metaphor and in terms of its likeness
to other things, the means by which it acquires so many of the meanings
attributed to it.
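As a rough illustration of what it means to say that the dynamics of musical sound are easily analysed, the following minimal sketch reduces a sound to a loudness contour, one level per short frame, so that a swell and fade becomes visible as a rising and falling shape. It is not a description of how Sonic Visualiser or any other package works: the function names, the synthetic test tone and the frame length are all assumptions chosen for the example, and loudness is approximated crudely by root-mean-square level in decibels.

```python
import math


def rms_envelope_db(samples, frame_size, floor_db=-120.0):
    """Return one RMS level (dB relative to amplitude 1.0) per non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(x * x for x in frame) / frame_size)
        # Silent frames are given an arbitrary floor rather than minus infinity.
        levels.append(20.0 * math.log10(rms) if rms > 0 else floor_db)
    return levels


def synth_swell(duration_s=2.0, sample_rate=8000, freq_hz=440.0):
    """Hypothetical test signal: a sine tone that swells to a midpoint and fades."""
    n = int(duration_s * sample_rate)
    return [
        math.sin(math.pi * i / n) * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n)
    ]


signal = synth_swell()
contour = rms_envelope_db(signal, frame_size=800)  # 0.1-second frames
peak_frame = max(range(len(contour)), key=contour.__getitem__)

# Crude text rendering of the contour: longer bars mean louder frames.
for level in contour:
    print("#" * max(1, int(level + 40)))
```

The contour rises towards the middle of the tone and falls away again, which is the simplest case of a shape in one dimension of sound. A real recording would need a proper loudness model and attention to the other dimensions (timing, pitch inflection, timbre) discussed above.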
Bringing this work together under the umbrella of shape brings us closer to
understanding both how a sense of shape is generated by musical sound and
how and why it is so effective as a tool for musicians. For the end user, thought
about and talk of shape enables the defining and sharing of ideas about an
activity (expressive performance) largely carried out within the domain of
feeling and intuitive response, a domain otherwise inaccessible to analysis or
discussion. Shape functions, then, on at least two levels. As a way of talking
generally about the dynamics of performance, it affords efficient and commu-
nicative teaching and rehearsing. As a model of the dynamics of performance,
it encapsulates the changing magnitudes of sound during performance, afford-
ing a sense of contoured trajectory through which feeling and sound can be
aligned.
References
Aksnes, H., 2001: ‘Music and its resonating body’, Danish Yearbook of Musicology 29:
81–100.
Altenmüller, E., M. Wiesendanger and J. Kesselring, 2006: Music, Motor Control and the
Brain (Oxford: Oxford University Press).
Antovic, M., 2009: ‘Musical metaphors in Serbian and Romani children: an empirical
study’, Metaphor and Symbol 24/3: 184–202.
Bueti, D. and V. Walsh, 2009: ‘The parietal cortex and the representation of time, space,
number and other magnitudes’, Philosophical Transactions of the Royal Society B
364: 1831–40.
Currie, G., 2011: ‘Empathy for objects’, in A. Coplan and P. Goldie, eds., Empathy:
Philosophical and Psychological Perspectives (Oxford: Oxford University Press),
pp. 82–97.
DeNora, T., 2004: ‘Historical perspectives in music sociology’, Poetics 32: 211–21.
Doğantan-Dack, M., 2013: ‘Tonality: the shape of affect’, Empirical Musicology Review
8/3–4: 208–18.
Eitan, Z., 2013: ‘Musical objects, cross-domain correspondences, and cultural choice: com-
mentary on “Cross-cultural representations of musical shape” by George Athanasopoulos
and Nikki Moran’, Empirical Musicology Review 8/3–4: 204–7.
Eitan, Z. and R. Y. Granot, 2006: ‘How music moves: musical parameters and listeners’
images of motion’, Music Perception 23/3: 221–47.
Eitan, Z. and R. Y. Granot, 2007: ‘Intensity changes and perceived similarity: interpara-
metric analogies’, Musicae Scientiae, Discussion Forum 4A: 39–75.
Eitan, Z. and R. Timmers, 2010: ‘Beethoven’s last piano sonata and those who follow
crocodiles: cross-domain mappings of auditory pitch in a musical context’, Cognition
114: 405–22.
Fraisse, P., 1984: ‘Perception and estimation of time’, Annual Review of Psychology 35: 1–36.
Gallese, V. and G. Lakoff, 2005: ‘The brain’s concepts: the role of the sensory-motor system
in conceptual knowledge’, Cognitive Neuropsychology 22/3: 455–79.
Ghazanfar, A. A. and C. E. Schroeder, 2006: ‘Is neocortex essentially multisensory?’, Trends
in Cognitive Sciences 10/6: 278–85.
Hausegger, F. v., 1887: Die Musik als Ausdruck (Vienna: Carl Konegen).
Huron, D., 2006: Sweet Anticipation: Music and the Psychology of Expectation (Cambridge,
MA: MIT Press).
Johnson, M., 2007: The Meaning of the Body: Aesthetics of Human Understanding
(Chicago: University of Chicago Press).
Johnson, M. L. and S. Larson, 2003: ‘ “Something in the way she moves”: metaphors of
musical motion’, Metaphor and Symbol 18: 63–84.
Juslin, P. N. and E. Lindström, 2010: ‘Musical expression of emotions: modelling listeners’
judgements of composed and performed features’, Music Analysis 29/1–3: 334–64.
Keller, P. E., S. Dalla Bella and I. Koch, 2010: ‘Auditory imagery shapes movement timing
and kinematics: evidence from a musical task’, Journal of Experimental Psychology:
Human Perception and Performance 36: 508–13.
Kim, J. H., 2012: ‘What music and dance share: dynamic forms of movement and action-
based aesthetic empathy’, in S. Schroedter, ed., Bewegungen zwischen Hören und Sehen:
Denkbewegungen zu Bewegungskünsten (Würzburg: Königshausen & Neumann).
Kim, J. H., 2013: ‘Shaping and co-shaping forms of vitality in music: beyond cognitiv-
ist and emotivist approaches to musical expressiveness’, Empirical Musicology Review
8/3–4: 162–73.
Koelsch, S., 2010: ‘Towards a neural basis of music-evoked emotions’, Trends in Cognitive
Sciences 14/3: 131–7.
Köhler, W., 1929: Gestalt Psychology (New York: Liveright).
Küssner, M. B. and D. Leech-Wilkinson, 2014: ‘Investigating the influence of musical
training on cross-modal correspondences and sensorimotor skills in a real-time drawing
paradigm’, Psychology of Music 42/3: 448–69.
Küssner, M. B., D. Tidhar, H. M. Prior and D. Leech-Wilkinson, 2014: ‘Musicians are more
consistent: gestural cross-modal mappings of pitch, loudness, and tempo in real-time’,
Frontiers in Psychology 5/789, doi: 10.3389/fpsyg.2014.00789 (accessed 9 April 2017).
Lakoff, G. and M. Johnson, 1980: Metaphors We Live By (Chicago: University of Chicago Press).
Langer, S. K., 1963: Philosophy in a New Key, 3rd edn (Cambridge, MA: Harvard University
Press).
Lavy, M. M., 2001: ‘Emotion and the experience of listening to music: a framework for
empirical research’ (PhD dissertation, University of Cambridge).
Leech-Wilkinson, D., 2006: ‘Expressive gestures in Schubert singing on record’, Nordisk
Estetisk Tidskrift 33–4: 50–70.
Leech-Wilkinson, D., 2009a: The Changing Sound of Music: Approaches to the Study of
Recorded Musical Performances, http://www.charm.kcl.ac.uk/studies/chapters/intro.html
(accessed 9 April 2017).
Leech-Wilkinson, D., 2009b: ‘Recordings and histories of performance style’, in N. Cook,
E. Clarke, D. Leech-Wilkinson and J. Rink, eds., The Cambridge Companion to Recorded
Music (Cambridge: Cambridge University Press), pp. 246–62.
Leech-Wilkinson, D. and H. Prior, 2014: ‘Heuristics for expressive performance’, in
D. Fabian, R. Timmers and E. Schubert, eds., Expressiveness in Music Performance:
Empirical Approaches Across Styles and Cultures (Oxford: Oxford University Press),
pp. 34–57.
Levitas, R., 2010: ‘In eine bess’re Welt entrückt: reflections on music and utopia’, Utopian
Studies 21/2: 215–21.
Lipps, T., 1903: ‘Einfühlung, innere Nachahmung, und Organempfindungen’, Archiv für die
gesamte Psychologie 1: 185–204.
Ludwig, V. U., I. Adachi and T. Matsuzawa, 2011: ‘Visuoauditory mappings between high
luminance and high pitch are shared by chimpanzees (Pan troglodytes) and humans’,
Proceedings of the National Academy of Sciences 108/51: 20661–5.
Marks, L. E., 1978: The Unity of the Senses: Interrelations among the Modalities (New York:
Academic Press).
Martino, G. and L. E. Marks, 2000: ‘Cross-modal interaction between vision and touch: the
role of synesthetic correspondence’, Perception 29: 745–54.
Maurer, D. and C. J. Mondloch, 2006: ‘The infant as synesthete’, Attention and Performance
21: 449–71.
Maurer, D., T. Pathman and C. J. Mondloch, 2006: ‘The shape of boubas: sound–shape
correspondences in toddlers and adults’, Developmental Science 9/3: 316–22.
McLachlan, N. and S. Wilson, 2010: ‘The central role of recognition in auditory percep-
tion: a neurobiological model’, Psychological Review 117/1: 175–96.
McLachlan, N. M., L. J. Greco, E. C. Toner and S. J. Wilson, 2011: ‘Using spatial manip-
ulation to examine interactions between visual and auditory encoding of pitch and
time’, Frontiers in Psychology 1/233, doi: 10.3389/fpsyg.2010.00233 (accessed 9 April
2017).
Miller, G., 2000: ‘Evolution of human music through sexual selection’, in N. L. Wallin,
B. Merker and S. Brown, eds., The Origins of Music (Cambridge, MA: MIT Press),
pp. 329–60.
Molnar-Szakacs, I. and K. Overy, 2006: ‘Music and mirror neurons: from motion to
“e”motion’, Social Cognitive and Affective Neuroscience 1: 235–41.
Murray, M., 2003: ‘Narrative psychology’, in J. A. Smith, ed., Qualitative Psychology:
A Practical Guide to Research Methods (London: Sage), pp. 111–31.
Näätänen, R. and I. Winkler, 1999: ‘The concept of auditory stimulus representation in
cognitive neuroscience’, Psychological Bulletin 125: 826–59.
Nakamura, J. and M. Csikszentmihalyi, 2009: ‘Flow theory and research’, in C. R.
Snyder and S. J. Lopez, eds., Oxford Handbook of Positive Psychology (Oxford: Oxford
Narmour, E., 1990: The Analysis and Cognition of Basic Melodic Structures: The Implication-
Realization Model (Chicago: University of Chicago Press).
Peretz, I., 2003: ‘Brain specialization for music: new evidence from congenital amusia’, in
I. Peretz and R. Zatorre, eds., The Cognitive Neuroscience of Music (Oxford: Oxford
Prince, J. B., M. A. Schmuckler and W. F. Thompson, 2009: ‘Cross-modal melodic contour
similarity’, Canadian Acoustics 37/1: 35–49.
Prior, H. M., 2010: ‘Links between music and shape: style-specific; language-specific;
or universal?’, paper presented at ‘Topics in Musical Universals: 1st International
Colloquium’, Aix-en-Provence, France, December 2010.
Prior, H. M., 2011: ‘Exploring the experience of shaping music in performance’, paper
presented at the Performance Studies Network First International Conference,
Cambridge, UK, 14–17 July 2011.
Prior, H. M., 2012: ‘Shaping music in performance: report for questionnaire participants
Ramachandran, V. S. and E. M. Hubbard, 2001: ‘Synaesthesia: a window into perception,
thought and language’, Journal of Consciousness Studies 8/12: 3–34.
Rees, A. and A. R. Palmer, eds., 2010: The Oxford Handbook of Auditory Science: The
Auditory Brain (Oxford: Oxford University Press).
Spence, C., 2011: ‘Crossmodal correspondences: a tutorial review’, Attention, Perception,
and Psychophysics 73/4: 971–95.
Spence, C. and O. Deroy, 2012: ‘Crossmodal correspondences: innate or learned?’,
i-Perception 3: 316–18.
Stein, B. E. and T. R. Stanford, 2008: ‘Multisensory integration: current issues from
the perspective of the single neuron’, Nature Reviews: Neuroscience 9: 255–66 and
corrigendum.
Stern, D., 2004: The Present Moment in Psychotherapy and Everyday Life (New York: Norton).
Stern, D., 2010: Forms of Vitality: Exploring Dynamic Experience in Psychology, the Arts,
Psychotherapy, and Development (Oxford: Oxford University Press).
Stevens, C. J., E. Schubert, R. Haszard Morris, M. Frear, J. Chen, S. Healey, C. Schoknecht
and S. Hansen, 2009: ‘Cognition and the temporal arts: investigating audience response
to dance using PDAs that record continuous data during live performance’, International
Journal of Human-Computer Studies 67: 800–13.
Tarasti, E., 2004: ‘Music as a narrative art’, in M.-L. Ryan, ed., Narrative across Media: The
Languages of Storytelling (Lincoln: University of Nebraska Press), pp. 283–304.
Truslit, A., 1938: Gestaltung und Bewegung in der Musik (Berlin: Chr. Friedrich Vieweg).
Walsh, V., 2003: ‘A theory of magnitude: common cortical metrics of time, space and quan-
tity’, Trends in Cognitive Sciences 7: 483–8.
Ward, J., B. Huckstep and E. Tsakanikos, 2006: ‘Sound-colour synaesthesia: to what extent
does it use cross-modal mechanisms common to us all?’, Cortex 42: 264–80.
Ward, J., S. Moore, D. Thompson-Lake, S. Salih and B. Beck, 2008: ‘The aesthetic appeal
of auditory–visual synaesthetic perceptions in people without synaesthesia’, Perception
37: 1285–96.
Wingstedt, J., S. Brändström and J. Berg, 2010: ‘Narrative music, visuals and meaning in
film’, Visual Communication 9: 193–210.
Yu, N., 2008: ‘Metaphor from body and culture’, in R. W. Gibbs, ed., The Cambridge
Handbook of Metaphor and Thought (Cambridge: Cambridge University Press),
pp. 247–61.
Reflection
David Amram, composer, conductor,
jazz French horn player
Music already tells a story. Any good teacher always talks about the shape of
the phrase and how that individual moment relates to the whole picture. There’s
an arc to a piece of music, just like in classical theatre where there’s a begin-
ning, a middle and an end, and within those three essential areas of any form
of expression there’s a presentation of themes and variations, recapitulation,
dénouement and conclusion. Basic structure and symmetry are essential in both
art and life. I think most people, even if they don’t know there is such a thing,
have a much better experience when there is a structure, and then of course
within that structure you can do just about anything. Thinking in terms of sym-
metry and construction and shape and form, you’re able to deal with any kind
of music and understand as a performer, a composer or a conductor that what
you’re here to do is supposed to help the listener paint a picture themselves;
or when composing in the silence of your own space, make your composition
like a perfect building to inhabit, so that during the time that the piece is being
performed it gives the listener a chance to create some order out of the chaos
of their own life by following what it is you have given to them in the musical
journey you have created to be shared.
Music is visual as well as aural. Until modern times, people saw as well as
heard the musicians performing. Charlie Parker’s famous song Now’s the Time
expressed the whole idea of what music has always been about: a celebration
of the moment and the sanctity of what is transpiring at that moment which
may never happen again. Real-life experiences are of course as visual as they
are aural. That is why everyone involved in making music has to be concerned
with shape, form, movement and overall structure.
I shall always remember a campfire in the Menominee Indian reservation out-
side of Green Bay, Wisconsin. I was playing with the great Lakota Sioux singer
Floyd Red Crow Westerman. After we were done with the concert, we went
off to this big bonfire and all the young Indian men were sitting there around
this huge campfire, probably as their great-great-great-great-grandparents had
done, and they were singing. They said, ‘David, how does this make you feel?’
I said, ‘Well, I just feel like I’m here thousands of years ago with your ancestors
and they’re here with us right now.’ And they said, ‘Well that’s why we do it.
This music is being sent by us directly up to the Great Creator.’ Then several
of them made a gesture with their hands over the fire as if they were fanning
both the smoke from the fire and the song itself upwards towards the sky. At
that moment I understood why and how this music should be played and how
it should be listened to and appreciated. The men at the fireplace had painted
me a picture of the shape and the form and the direction of what this music
was all about.
Musicians in my experience play differently when they watch dancers,
because just as the musicians inspire the dancers to dare to travel into the
unknown, playing with and for dancers enables you to bring out new things
in the music. I was at the Marlboro Music Festival in Vermont as composer in
residence in the summer of 1961. They had a famous flute player named Marcel
Moyse; he was legendary among flute players. He would get these phenomenal
classical players and he would take out a sheet of music for some old folk song,
or an old cowboy song or a German drinking song or a little French folk song,
and he would say, ‘Play this’. As the musician began to play, Marcel Moyse
would get up and start dancing round the room like a ballet dancer, singing
the melody of the song and waving his arms around like a crazed conductor.
With his unusual croaking style of singing and bizarre dance moves, it became
obvious that he was painting a picture for us of what the music was supposed
to do to those listening so that they could feel the music and the shape of the
piece being played. It’s just like what happens in the New Orleans Second Line
marches when you are as lucky as I was to be invited to march with the bands
who are playing the music that reflects their community. The singing, dancing,
playing and group collective feeling has its own unique sense of shape and
tempo.
The participatory nature of the Second Line does not promote the idea of
invading someone else’s turf. Quite the opposite: it shows that if you’re a seri-
ous musician and a respectful person, you don’t have to be terminally incarcer-
ated in your assigned slot for the rest of your life and never venture out of your
comfort zone. Once you see that music is a gateway to achieving a higher level
of understanding which creates a desire to learn more about all that transpires
in the rest of the world, musicians can begin to think of themselves as being
like a person painting a picture, who while creating is often obliged to start and
stop time. The silence between the stops and starts makes a sacred space for all
of us to be in, giving us the chance to store this magic moment in our memory
bank and in our hearts.
When we realize that music opens the doors to feelings, forms and history
that connects us to every other person on the planet, we become much more
comfortable when spending time with people from every walk of life, from street
people to architects, visual artists, brain surgeons and lawyers, astronauts, bar-
tenders, postal workers and athletes. Music shows us about all the things we
have in common and provides access to knowing how to ask others about all
the things we don’t know much about. And when we can put all this knowledge
in some kind of subjective storage space, we can become tourist guides for the
shapes and forms we have come to understand and, while sharing this informa-
tion, continue to learn more ourselves.
Reflection
Antony Pitts, composer and producer
Towards an outline …
Today I’m struggling with a piece that should have taken an afternoon to write
down. It appeared in the mist when summoned, almost on cue and apparently
fully formed, but it has taken another few months to grasp once more the
geometry of its form, the ratios and rationality of its quixotic light and shade.
The piece is a gift-cum-commission for Edward Higginbottom, at the end of
his long tenure at New College, Oxford. It’s a short setting of George Herbert’s
‘Love bade me welcome’ for unaccompanied choir, and from the moment
I started working on it, it was clear in my mind that this piece existed—com-
plete, perfect and (to me at least) unutterably beautiful and heart-rending.
A murmur, a hue, a shadow, an outline, a pang (what C. S. Lewis might well
term ‘Desire’)—these are the beginnings of creativity that I’m aware of: but
then I’m generally not aware of the real genesis of a piece of music, only
sometimes; often the beginnings are as impenetrable as forgotten dreams.
When I do know I’m thinking about a new piece, I sometimes feel sure that
I’m seeing or feeling it rather than hearing it. It—whatever it is—is amodal or
multimodal, an Ur-expression of some deeper confluence of ideas or tangling
of neurons.
In fact I wrote down the outline of ‘Love bade me welcome’ in an afternoon,
but in the weeks since I have struggled to agree with myself on its final form
(I’ve been seriously tempted to produce a folk-rock version). It’s the writing
that both destroys and captures the original idea in pinning it to the manu-
script: much of the struggle has been to reconcile what I consider the aesthetic
perfection of the original apparition, itself the ghostly flesh on the exquisitely
proportioned skeleton of Herbert’s poem, with the necessity of making it sing-
able while avoiding the safe danger of quantizing its chaotic edges down to an
excessively crystalline beauty.
The fourteenth-century theorist (and I feel a connection with the
fourteenth-century state of musical affairs—the groping after new modes of expression
and new combinations of shape and colour as openly described by Machaut in
his quasi-autobiographical Le Voir Dit) Johannes de Muris talks about shape
on the one hand and sound on the other, and by ‘shape’ he means notation:
‘no musical relation exists between shape and shape, but musical harmony is
created by the relation between the sounds. For it is not the shape that is dimin-
ished or increased by another shape, but the sound which is signified by the
shape’ (Gallo [1977] 1985: 114). But the shape of notation is an attempt to cap-
ture and convey the shape of musical sound—a kind of translation. And there-
fore the available shapes in notation, of which there are many but not infinitely
many, inevitably reduce the initial mental conception to an essentially digital
and coarse-grained reproduction.
Barlines and time signatures become the bars of a prison containing origi-
nally free ideas awaiting their controlled exit into the hands or larynx of official
executors. I find particularly in this piece, as I often find, that I don’t want to
have to prescribe a specific barring system, since the music has a much more
fluid, overlapping set of pulses which the performer can tap into. But on the
plus side, today I made a discovery about the structure of those beats which
I could not have determined without the forced labour of (twenty-first-century)
notation: a pair of double-digit numbers to do with the circumstances of the
commission—which I had vaguely considered appropriate as acceptable extra-
musical impositions—suddenly turned up without asking in what I currently
perceive as the best rendering of my original vision. How does that happen? Is
the original inspiration causally affected by my subsequent work on it? Have
I influenced history, rather than the other way round?
I’m also concerned with another spectre—that of much greater but more
nebulous proportions—the ‘meta-work’ that my work will put on and never
take off. I mean that cloud of influences, references, juxtapositions that lie in
wait for my intellectual baby and which—once engendered—like one’s online
history—cannot be revised, rewritten. The life of the work over which I now
slave so assiduously will have a shape free from its creator’s legal reach: I cannot
say how it will be interpreted and received, however hard I try. I can, however,
attempt to map that penumbra of the meta-work, and possibly to influence and
leverage its course through the musical ocean … but that’s another narrative
for another time.
Reference
Gallo, F. A., [1977] 1985: Music of the Middle Ages II [Storia della Musica: Il Medioevo II],
trans. K. Eales (Cambridge: Cambridge University Press).
NOTES
Preface
1. Perhaps this explains why the marking criteria for the Associated Board of the Royal
Schools of Music state: ‘Candidates will be marked under five categories: pitch, time, tone,
shape and performance.’ Associated Board examiners, at any rate, need read no further.
http://gb.abrsm.org/en/our-exams/information-and-regulations/graded-music-exam-marking-criteria/.
2. As well as those mentioned below, from the previous essays by Doğantan-Dack (2013)
and Kim (2013), precursors with somewhat related ideas are well surveyed in Rothfarb
(2001) on ‘energetics’.
3. As a metaphor, ‘shape’ is not merely an interesting linguistic feature: as Gibbs states,
‘Metaphors are not simply an ornamental aspect of language, but a fundamental scheme
by which people conceptualize the world and their own activities’ (2008: 3). Even research-
ers who are unwilling to accept the theory of conceptual metaphor agree that the use of
metaphors provides information about thought processes (Cameron 2010).
4. http://www.cmpcp.ac.uk (accessed 9 April 2017).
5. http://www.charm.kcl.ac.uk (accessed 9 April 2017).
Chapter 1
1. Such ‘octave compression’ can be easily done and listened to with a modulo 12 opera-
tion on MIDI note data, effectively compressing everything into the confines of one octave.
Also, other shape manipulations can easily be done on MIDI note data and listened to
in view of shape features, such as so-called modus quaternion variants of pitch contours
(i.e. mirrored, retrograde and retrograde-mirrored in addition to the original).
2. In the original: ‘le premier objectif consiste à caractériser un phénomène en tant que
forme, forme “spatiale”. Comprendre signifie donc avant tout géométriser.’
3. Sound examples together with an overview of these typology and morphology prin-
ciples are available on audio CDs in Schaeffer (1998).
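The pitch manipulations described in note 1 above can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's own procedure: the function names, the choice of middle C (MIDI note 60) as the base of the target octave, and the use of the first note as the mirroring axis are all assumptions made here for the example. Actually listening to the results would require a MIDI library or sequencer, which is left out.

```python
from typing import Dict, List


def octave_compress(notes: List[int], base: int = 60) -> List[int]:
    """Fold MIDI note numbers into the single octave starting at `base`.

    The modulo-12 operation keeps each note's pitch class and discards its
    octave. `base` = 60 (middle C) is an assumed default.
    """
    return [base + (n - base) % 12 for n in notes]


def contour_variants(notes: List[int]) -> Dict[str, List[int]]:
    """Return the four contour variants of a pitch sequence.

    Mirroring is done around the first note (an assumption; any axis could
    be used). Mirrored values may fall outside the MIDI range 0-127 for
    extreme inputs, which this sketch does not guard against.
    """
    if not notes:
        return {"original": [], "mirrored": [],
                "retrograde": [], "retrograde_mirrored": []}
    axis = notes[0]
    mirrored = [2 * axis - n for n in notes]  # reflect each pitch about the axis
    return {
        "original": list(notes),
        "mirrored": mirrored,
        "retrograde": list(reversed(notes)),          # same pitches, backwards
        "retrograde_mirrored": list(reversed(mirrored)),
    }


if __name__ == "__main__":
    melody = [60, 64, 67, 72, 55]  # C4, E4, G4, C5, G3 (hypothetical example)
    print(octave_compress(melody))
    for name, variant in contour_variants(melody).items():
        print(name, variant)
```

Compressing the hypothetical melody folds C5 and G3 back into the octave above middle C, so leaps of an octave or more disappear while the pitch classes are kept; comparing this with the original by ear is one way of hearing how much of a melody's shape depends on register.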
Chapter 2
1. Claxton (1980: 13) summarizes this aptly:
… [cognitive psychology] does not, after all, deal with whole people, but with a
very special and bizarre—almost Frankensteinian—preparation, which consists of a
brain attached to two eyes, two ears, and two index fingers. This preparation is only
to be found inside small, gloomy cubicles, outside which red lights burn to warn
ordinary people away. It stares fixedly at a small screen, and its fingers rest lightly
and expectantly on two small squares of black plastic—microswitches. It does not
feel hungry or tired or inquisitive; it does not think extraneous thoughts or try to
understand what is going on. It simply processes information. It is, in short, a com-
puter, made in the image of the larger electronic organism that sends it stimuli and
records its responses.
2. To count as a differentiated visualization, the drawing should display any of these
subcategories: (a) a sounding object or action hinting at the temporal unfolding of the
sound stimulus; (b) an analogous image, that is, the change in one musical parameter
expressed through an image; (c) nonformal notation of the music using abstract shapes,
e.g. lines, circles, dots; or (d) formal-conventional notation. Juxtaposed to this category are
global visualizations, further subdivided into (a) depiction of one instrument or (b) several
instruments; (c) evocation, that is, any associated pictorial response capturing the sound/
music as a whole; and (d) music icon, that is, depiction of music notes without referring to
specific aspects of pitch, duration or loudness.
3. The Suzuki method was developed by Japanese violinist Shin’ichi Suzuki after the
Second World War (Suzuki, Mills and Murphy 1973). Likening musical training to lan-
guage acquisition, its central tenet is that every child is able to acquire musical skills given
the right environment and instruction. Generally, training the ear and memorizing music
are seen as more important than music reading skills.
4. http://www.uio.no/english/research/groups/fourms/ (accessed 9 April 2017).
5. Patient and control groups also performed equally well in a simple synchronization
task of tapping along to a musical beat and in a task assigning emotional labels to musi-
cal excerpts, though the latter took participants with ASD five times longer than controls.
6. There is another fairly recent strand of empirical research concerned with free move-
ments to music (i.e. dance) in adults (Burger 2013; Thompson 2012; Van Dyck 2013) and
children (Maes and Leman 2013).
7. For an overview of Rudolf Laban’s work, see for example Davies (2006).
8. They may be regarded as spontaneous if the presented stimulus is novel for the par-
ticipant, or as unspontaneous if participants are familiar—due to prior exposure in the
experiment or elsewhere—with the stimulus.
Chapter 3
1. Notably, while the first stanza takes place at dusk, the third stanza depicts sunrise.
Assuming temporal continuity, this implies that the protagonist has spent the entire night
rowing in front of the darkened town. This darkened gap separates the second and third
stanzas (see also Youens 2007).
2. Note that none of the action and motion described in the second and third stanzas is the
protagonist’s: the acting or moving forces are external (wind, water, oarsman, sun). Even the act
of seeing is passive and forced: the sun ‘shows me’ (zeigt mir) the town. The only act the protago-
nist is able to perform is that of losing (verlor), whose disclosure terminates the poem.
3. We are grateful for the advice of Clemens Wöllner, who suggested this and the follow-
ing German expressions.
4. As several analyses of ‘Die Stadt’ have noted (e.g. Morgan 1976; Schwartz 1986), the
modified melody of stanza 3 presents ‘cover tones’ above the structural melodic line. The
latter, directly reflected in the vocal line of stanza 1, is maintained in the upper line of the
piano accompaniment (e.g. E♭–D–C, bars 34–35). Notably, the modified, disjunct vocal line
at the stanza’s highpoints (bars 29–31, 33–35) is aligned with the bass.
5. Note that in comparing the performances, the profile of intensity variation is of par-
ticular interest. The absolute values (e.g. actual maximum or minimum intensity) strongly
depend on the recording and recording conditions.
6. Morgan (1976) links the unresolved ending of ‘Die Stadt’ with the opening of the next
song in the published Schwanengesang, ‘Am Meer’. Note that though this ordering is based
on Schubert’s autograph, several scholars have suggested that it does not represent Schubert’s
original intention (presumed to follow Heine’s ordering), but a revision made at the pub-
lisher’s request. For a discussion of this issue, see Litterick (1996) and Reed (1997: 258–61).
7. http://www.sonicvisualiser.org/ (accessed 9 April 2017).
Chapter 4
1. See in particular Juslin and Sloboda (2010), Robinson (2005), Huron (2006) and
Nussbaum (2007).
2. Private communication.
3. Produced on Sonic Visualiser. The darker line maps tempo, the lighter, energy.
4. For a compelling exception to this apparent norm, see Leech-Wilkinson’s (2013: 50) dis-
cussion of Alfred Cortot’s 1920 recording of Chopin’s Berceuse, where the pianist matches
the rising and falling of the melody with slowing and speeding up of the beats.
5. The first group comprised first-year undergraduate music students at the University
of Liverpool. The second group were musicologists, including three renowned Bach schol-
ars. Another striking finding was that, when asked to select an emotion for the movements
(out of the set sadness, anger, tenderness, fear) nonexpert listeners (undergraduates) identi-
fied the Fuga with fear, while the experts identified it with anger, perhaps reflecting more
advanced structural listening.
Chapter 5
1. An adjectival form of the term ‘perspect’, a contraction of ‘perceived aspect’, used in
contradistinction to the term ‘parameter’, which in zygonic theory is reserved for the physi-
cal correlates of perceptual domains (Ockelford 2005: 10).
2. From the Greek word ‘zygon’, meaning ‘yoke’, and implying a union of two similar
things.
3. Although two dots sharing the same location can be distinguished functionally, as is
the case in which two lines of dots both converge on the same point, for example.
4. Observe that to depict the physical production of sounds visually requires the use of
symbolic representation (see Figure 5.16).
Reflection: Alice Eldridge
1. Improvisers were recruited via UK and European free improvisation organizations
and forums.
2. http://toplap.org/wiki/ManifestoDraft (accessed 9 April 2017).
3. Personal communication.
Chapter 6
1. For the sake of space, examples in this section were not reprinted. They can be accessed
online through the public domain IMSLP/Petrucci Library at
http://imslp.org/wiki/Violin_Concerto_in_D_major,_Op.61_(Beethoven,_Ludwig_van) (accessed 9 April 2017).
Chapter 7
1. Violinists were asked to play François Devienne’s (1759–1803) Sonata for Clarinet
in B♭ and Pianoforte No. 2, I: bars 1–12; harpsichordists were asked to play Thomas
Roseingrave’s (1690/91–1766) ‘Sarabande’ from Complete Keyboard Music, Musica
Britannica, Vol. LXXXIV, ed. Johnstone and Platt, p. 60. None of the participants was
familiar with the piece given to them.
2. References for quotations from participants link to the complete model available on
the companion website. This quote can be found under Musical level: Phrase, L5.10.
3. L2.4.
4. H12.11.
5. L2.19.
6. L6.1.
7. Tina, T1.4.
8. Bridget, T1.1; Nathaniel, T1.8.
9. T1.3.
10. T1.5.
11. T2.4–7.
12. T2.8.
13. Bridget, T2.1; Tina, T2.9.
14. Jane, T2.10; Julian, T2.13; Yoshi, T2.16.
15. T2.2.
16. Jane, T3.4.
17. Bridget, T3.2.
18. Elsie, T3.3.
19. Bridget, T4.1.
20. Victor, T4.14.
21. Elsie, T4.3.
22. Julian, T4.26.
23. See Table 14.T4, online.
24. Darragh, T5.1; Elsie, T5.3; Tina, T5.4; Katharine, T5.16; Yoshi, T5.20–1.
25. Jane, T5.5–8; Julian, T5.9–12; Katharine, T5.13–14; Nathaniel, T5.18; Yoshi,
T5.19.
26. Bridget, T6.1; Darragh, T6.2–4; Elsie, T6.6; Tina, T6.7–8; Victor, T6.11; Yoshi,
T6.23.
27. Katharine, T6.16; Nathaniel, T6.18; Yoshi, T6.22.
28. Jane, T6.12–14; Julian, T6.15; Katharine, T6.17; Nathaniel, T6.19–21.
29. Jane, T7.8.
30. Julian, T7.12–13 and T7.15; Katharine, T7.18; Yoshi, T7.22–3.
31. Elsie, T7.1.
32. Victor, T7.4.
33. Victor, T7.5–7; Jane, T7.9–11; Julian, T7.14; Katharine, T7.16–17 and T7.19;
Nathaniel, T7.20–21.
34. T7.3.
36. T9.6.
38. See Tables 7.M1 and 7.M2, online.
39. Julian, M3.3; Katharine, M3.4.
40. Julian, M3.2; Nathaniel, M3.7.
41. See Tables 7.M4 and 7.M5, online.
42. See Table 7.M6, online.
43. Tina, M1.26; Yoshi, S6.53.
44. H11.18.
45. H4.11.
46. As this is a situational factor, it is not included in the online tables. The quote is
taken from approximately 00:55:30–00:56:30 in the interview.
47. S7.27.
48. This is also not included in the online tables. The quote is taken from approximately
01:01:00–01:02:00 in the interview.
49. L1.2, H1.17, S1.1 and S2.2.
50. L2.1, L3.1, L5.4, H4.2 and H12.2.
51. L2.5, T2.5, H1.4, H5.5 and H13.3.
52. L2.6, T2.6 and H1.5.
53. H5.22.
54. T4.38, H5.26, H14.30 and S6.45.
55. L5.20, T4.23, T10.8, H6.8, H7.19, H9.7, H12.13, M5.5, S5.15 and S7.37.
56. L5.18, T4.16, T6.10, T7.7 and H4.12.
57. L5.13, T4.10, T6.7, T9.4, H1.8, H4.8, H7.14, H10.3, S6.7 and S7.19.
58. L5.8, T5.2, T6.5, H2.2, H3.1, H12.4, M1.10, M2.3, S5.3, S6.4, S7.11 and S8.1.
59. L6.4, T4.8, T7.2, H1.7, H5.9, H9.3, H12.9, M1.18, M2.7, S5.4, S6.6, S7.16 and S8.4.
60. L6.16, H6.15, H7.25, H9.20, M4.11, S5.34 and S7.50.
Reflection: Simon Desbruslais
1. I recommend that the reader listens to a trumpet masterclass at a university or music
college to experience first-hand how technically accomplished trumpeters can create diver-
gent sound characters.
Reflection: Malcolm Bilson
1. Excerpts from Knowing the Score and a second video, Performing the Score, can be
seen at http://www.malcolmbilson.com (accessed 9 April 2017).
2. These three subjects, rhythmic notational conventions, tempo fluctuations and tempo
rubato, are covered in detail in the DVD Knowing the Score. Excerpts can be seen at
http://www.malcolmbilson.com. The Prokofiev and Haydn examples in this Reflection are
excerpts from that video.
Chapter 8
1. For a discussion of the ontology of recorded music, see Kania (2008).
2. Participants were asked to indicate which of twenty-nine categories of music they per-
formed in; the categories were derived from normative classifications of music and refined
through a pilot study (Prior 2012b).
3. Théberge (2012: 90) argues that ‘the rise of the Internet and what has been called, in
more general terms, “digital culture” has posed challenges and opportunities to contempo-
rary musicians, engineers and producers; indeed the widespread dissemination of software-
based tools for recording music challenges the very idea of who can lay claim to those roles
in contemporary culture.’
Reflection: Mark Applebaum
1. Robert Arnold’s documentary film about the project, There Is No Sound in My Head,
appears on Vimeo.com and the Mark Applebaum DVD The Metaphysics of Notation
(Innova 787; 2010).
2. A three-dimensional score would afford countless other possibilities.
3. A more exhaustive discussion might consider everything from music in Kandinsky’s
art to a survey of today’s young visual artists for whom the employment of sound in their
work is more typical than atypical.
4. They even gather for scholarly conferences on the topic, such as ‘Time Stands
Still: Notation in Music Practice’ at Wesleyan University, April 2013.
5. This is a particular pet peeve of mine: Why must nonstandard notation always stimu-
late an improvised response? Can’t a performer work out a matching of notational signs to
musical sounds and actions in advance?
Reflection: Alex Reuben
1. See for example Black on Maroon, Tate Gallery,
http://www.tate.org.uk/whats-on/exhibition/rothko/room-guide/room-3-seagram-murals (accessed 9 April 2017).
2. 1:00, DV, UK 2001, http://www.alexreuben.com/home/que-pasa (accessed 9 April
2017).
3. 5:00, 35 mm, DKTV, UK 2003, http://www.alexreuben.com/home/big-hair (accessed
9 April 2017).
4. My current cognitive research is supported by awards from the Wellcome Trust and
Arts Council England for a movie project, Cinderella (RockaFela).
5. 3:00, Digi., Channel 4 TV/ACE/MJW, UK 2004,
http://www.alexreuben.com/home/line-dance (accessed 9 April 2017).
6. http://www.metmuseum.org/toah/works-of-art/57.92 (accessed 9 April 2017).
7. http://www.youtube.com/watch?v=7bICqvmKL5s (accessed 9 April 2017).
8. 48:00, HD, ACE, UK 2008.
9. http://www.youtube.com/watch?v=LRmRBrnQq8o (accessed 9 April 2017).
10. http://www.youtube.com/watch?v=pbFvQo6Ao6o (accessed 9 April 2017).
11. http://www.youtube.com/watch?v=y651C7aNXRc (accessed 9 April 2017).
12. 64:00, HD, Sadler’s Wells/ACE, UK 2008-11. Excerpt: Rosemary Lee’s Meltdown,
5:00, http://www.alexreuben.com/home/newsreel-two (accessed 9 April 2017).
Notes 395
Chapter 10
1. A term first used by Paul Hodgins (1992).
2. http://www.eadweardmuybridge.co.uk (accessed 9 April 2017).
3. The closest thing in music would be the graphic score, but reading the score requires
translating the marks on the page to spatial properties in sound.
4. https://web.archive.org/web/20160119165847; http://www.curriculumsupport.education.nsw.gov.au/primary/creativearts/dance/elements (accessed 9 April 2017).
5. http://synchronousobjects.osu.edu/media/inside.php?p=gallery#projectgallery (accessed
9 April 2017).
6. Wayne McGregor | Random Dance is now known as Studio Wayne McGregor and
the dance company is referred to as Company Wayne McGregor, both with the web address
http://waynemcgregor.com (accessed 9 April 2017).
7. https://blogs.montclair.edu/creativeresearch/2011/04/19/creativity-and-bridging-by-philip-barnard-and-scott-delahunta/ (accessed 9 April 2017).
8. McGregor was interviewed by Philip Barnard in London on 22 August 2013; the
transcript is in the archives at Studio Wayne McGregor.
9. http://www.ted.com/talks/wayne_mcgregor_a_choreographer_s_creative_process_in_real_time (accessed 9 April 2017).
10. Available at http://waynemcgregor.com/learning/resources/ (accessed 9 April 2017).
11. An extensive self-analysis of the creation of four of her seminal works made between
1981 and 1986.
Chapter 11
1. The easiest way to experience expression-free performance today is to listen to the
output of a plain MIDI encoding of just pitches and durations. The qualities that a ‘musical’
or ‘expressive’ performance brings are then very obvious.
2. Kim (2013: 166–7) brings together Lipps (1903), the mirror neuron literature, and the
philosopher Gregory Currie (2011) in support of a similar point. The literature on flow is
also relevant (for a recent survey, see Nakamura and Csikszentmihalyi 2009).
3. Aksnes (2001) first brought together the earlier work of Stern and Johnson. For this
reference I am grateful to Alessandro Miani, commenting on a draft of this chapter.
4. Should this web address change, a search on ‘Hamlet cigar ad’ is likely to work for
some time to come.
INDEX
AbletonLive, 266–7 Applebaum, M., 283–4, 394n1. See also
Abramson, L., 103 The Metaphysics of Notation
abstract art, 302, 324–6 Applied Music Research Centre, 135
abstract–concrete divide, 11 appraisal theories, 102
Achilles, 123 Aristotle, 123
Acknowledgement (Coltrane), 172–5, Arnold, R., 394n1
183, 189 art
acoustic environment, 278–9 abstract, 302, 324–6
acoustic features, of sadness, 103 Dream Garden, Series II (Hwang), 304–5
acoustic practice, 165–7 emotion in, 302
acoustics modern, 302
in live performance, 258 music and, 302–3
in recording studio, 263 painting, 302–5
space and, 278–9 rhythm in, 325–6
action, 375–6 zygonic theory in, 134–7
action tendency, 97 ASD. See Autism Spectrum Disorder
active listening, 48–9 Ashton, F., 335
adult drawings and visual asynchronous cross-modal mapping,
representations, 40–3 49–50
affect Athanasopoulos, G., 41–2
emotion and, 121–3 atomism, 103
Massumi on, 121–2 atonal music, 302
performance shapes and, 109–16 audience, 3, 9, 18, 33, 48, 128, 170, 222,
shaping of, 104–8 229–36, 238, 258, 262, 328, 332,
in Sonata for Unaccompanied Violin 334, 351–4
No. 1 in G minor, 97–123 DJs relating to, 267–70
vectors and, 121–3 film, 324–7
vitality, 367–8 in live performance, 262
affective epithets, 104 auditory stimuli, 58–9
affective trajectories, 107–8 auditory-visual correspondences, 34. See also
Aksnes, H., 395n3 cross-modal mapping
Alloy, L., 103 aura, 383
‘Am fernen Horizonte’ (Heine). See also Autism Spectrum Disorder (ASD),
‘Die Stadt’ 43, 390n5
metaphors in, 63–4 avant-garde, 329, 331
narrative in, 61–4, 390nn1–2
original text and English translation Bach, C. P. E., 249
of, 62 Bach, J. S., 30, 40, 98, 300, 332. See also
perception and emotion in, 61–4 Sonata for Unaccompanied Violin
Schubert’s reframing of, 64–8 No. 1 in G minor
analysis-by-synthesis, 10–11 B minor Mass by, 243–5, 244, 245
Anderson, W. R., xxvi Cello Suite No. 5 by, 166
anger, 117–23 Complete Trumpet Repertoire by, 245–6
Anthoni, N., xxvii Fugue in C major by, 6
anthropomorphic projection, 21 Second Brandenburg Concerto by, 244
Bach, J. S. (cont.) Braille, L., 156
solo violin music by, 208–15 braille music notation, 156, 157
Sonata for Unaccompanied Violin in brain, 361
C major by, 209–12 cross-domain mapping and, 373
Balanchine, G., 335–6 emotion and, 97–8
Bamberger, J., 37, 149–50 multisensory perception and, 373, 375–7
Barnard, P., 342 synaesthesia and, 307
Beethoven, L., 192 bridging representations, 334–5
Léonard’s cadenza for Violin Concerto brightness, 68–9
and, 193–201 Britten, B., xxvi
Piano Concerto No. 1, 23 Brooks, H., 192
Piano Sonata in F minor Op. 2 No. 1 by, 248 Bueti, D., 375
behaviour, 97–8 Burrows, J., 333
Beresford, S., 166 Butler, R., 152
Berg, A., 89
Wozzeck by, 90–5 cadenza, in musical space, 192–201
Bergonzi, J., 175, 177, 179, 181 Cage, J., 290, 331
Berkowitz, A., 170, 171 Cambridge Companion to Recorded Music,
Berliner, P. F., 170 xxviii
Berthoz, A., 12 Canada, 152
Big Hair (2003) (Reuben), 324–5 Caramiaux, B., 44
Björk, 266 Cardew, C., 138, 300–1
Bleuler, E., 311 Cassidy, A., xxvii
blind people Cassone, G., 247
blind children’s tactile representation chains of thought methodology, 171–5
of musical sound, 150–4 character
braille music notation for, 156, 157 determining, 248–50
tactile form for, 148 in trumpet performance, 242–3
body, 3 children
acoustic practice and integrating shapes blind children’s tactile representation
in, 165–7 of musical sound, 150–4
choreography and, 337–8 cognitive psychology for, 37
in dance, 328–9, 330, 337–8 cross-modal correspondences in, 360
embodiment and, 370–2 drawings and visual representation
in improvisation, 168 by, 37–40
in live performance, 262 IDS and motherese, 117–18
in Que Pasa (Reuben), 324, 327 infant synaesthesia, 308, 311, 360
shape and, 5, 30–1 with learning difficulties, 135
sound-accompanying body motion, 5, 18 picture scores by, 149–50, 151
sound-producing body motion, 5, 18–19 Chinese music, 45
body-motion constraints, 21–3 Chomsky, N., 194
body-motion features Chopin, F., xxvi
phase-transitions, 15–16, 21 choreogeography, 325, 326
shape and, 18–20 A Choreographer’s Handbook
sonic features and, 5, 15 (Burrows), 333
sound-accompanying body motion, 5, 18 choreography
sound-producing body motion, 5, 18–19 body and, 337–8
timescales and, 14 choreographic thinking tools, 345–7
body-motion trajectories, 5, 22–3 film, 325–6
Boltz, M. G., 254 music in, 335–6
Borodin, A., 89 scores in, 336–7
Bostridge, I., 75 shape in, 332
Boteler, E. H., 310 choreomusical analysis, 335–6
Brabazon, T., 266 chromatic transposition dimension, 184
Brahms, J., 45, 300 city metaphor, 97–8
city skylines, 284 conventional jazz improvisation repertoire,
Clarke, E., 114–15, 116 171, 175–81
coarticulation, 21 Corelli, A., 192
cognition, xxxi, 325 Cortot, A., 391n4
distance and, 71 Couperin, F., 249
embodied, 12 Cox, F., xxvi
emotion relating to, 103 Crook, H., 171, 175, 177, 193
film and, 254 cross-cultural comparisons, 41–2
interacting cognitive systems, 342, 343 cross-domain mapping, 9
language relating to, 152–3 brain and, 373
motor theory and, 12–13 embodiment and, 375
shape, 9–12, 22 enculturation and, 375
cognitive-affective patterns, 339–40 mechanics of, 372–5
cognitive psychology, 389n1 perception and, 373–4
for children, 37 in ‘Die Stadt’, 68–79
dance and, 333, 339–45, 348 cross-modal associations, 58–9
Cohen, A. J., 254 cross-modal correspondences
Coleman, G., 264 auditory stimuli and, 58–9
collaboration, 334–8 in children, 360
colour traditional experimental paradigms
in music, 308–9, 310 of, 35–6
shape and, 307 cross-modal imitation, 139, 140, 142, 154
in synaesthesia, 307, 308–9, 310, 359 cross-modal mapping, 33–4
Coltrane, J., 171, 172–5, 182, 183, 189 active listening versus passive
commercial memes, 284 listening, 48–9
Complete Trumpet Repertoire (Bach), 245–6 elaborate, 49
composition, 386–7 emotion relating to, 60–1, 79–83
approaches to, 283, 300 experiences and, 34–5
in film, 351–4 future directions for, 50–1
importance of understanding, 127–8 isolated tones versus concurrently varied
improvisation, performance, and, 170–1 tones, 47–8
musical excerpts versus whole language and, 58–9, 60
compositions, 48 live performance versus recorded
musical notation, for composer, 138 performance, 48
narrative and, 127–8 mandatory, 49
performance and, 127–8 methodological issues, 46–50, 80–1
punctuation, 128 multiplicity of, 59–61
Schillinger System of Musical musical excerpts versus whole
Composition, 284 compositions, 48
shape and, 57 in musical notation, 140–8
compositional tools, 300 music versus sound, 46–7
concert halls, 278–9 nature of task and, 49
concert level, 230–1 performance and multiplicity of, 81–2
concurrently varied tones, 47–8 pure tones versus MIDI, 47
congruence effect, 35–6 spontaneous, 49
consciousness, 326 in ‘Die Stadt’, 72–3, 79–83
constraint-based shapes synchronous versus asynchronous, 49–50
anthropomorphic projection and, 21 in zygonic theory, 140–8
body-motion constraints, 21–3 cross-modal relationship, 147–8
musical instrument constraints, 20–1 culture
vocal constraints, 20–1 digital, 394n3
containment, 104 enculturation, 375
contemporary dance, 334–5 perception and, 152–3
control space, 10–11 pitch and, 152–3
control theory, 22 Cumming, N., 99–104, 109
Currie, G., 395n2 on shape, of popular music, 266–9, 271–2
Curwen, J., 166 shaping by, 271
dance, 324–5. See also choreography technology for, 266–9
body in, 328–9, 330, 337–8 distance
bridging representations in, 334–5 cognition and, 71
choreographic thinking tools, 345–7 pitch direction and, 70–2
cognitive psychology and, 333, 339–45, 348 in ‘Die Stadt’, 70–2
contemporary, 334–5 DJs. See disc-jockeys
developments in, 329, 331 Doğantan-Dack, M., xxx, xxxi
ethnographic studies in, 338–9 Dolscheid, S., 152, 360
meaning in, 329, 332–3, 338–45 Doppler effect, 70–1
mental architecture and shape in, 334–5, drama, 123
339–45 drawings
Mind and Movement and, 345–7 by adults, 40–3
musical notation in, 329, 331–3 by children, 37–40
musicianship and, 384 children’s picture scores, 149–50, 151
music intersecting with, 332–3 of sound and music, 37–43
music theory and, 329 of synaesthesia, 310
organization in, 332 Dream Garden, Series II (Hwang), 304–5
patterning, emotion, and, 339–40 Dreyfus, L., 104, 192
phrases in, 329, 331 dynamics
popular music and, 254 dynamic markings, 226
radical experimentation in, 331 dynamic shapes, 15–16
senses in, 338–9 dynamic shaping, 75–6
shape in, 328–49 dynamic tension, 30–2
space in, 333 of experiences, 364–70
tools to support interdisciplinary
collaboration and exchange, 334–8 editing, 327
Eitan, Z., 43, 361–3, 373–4
Danto, A., 104 embodied cognition, 12
Dark Glistening (Layden), 322 embodiment
Darwin, C., 97 cross-domain mapping and, 375
Davidson, J. W., 262 of shape, 370–2
Davidson, L., 38–40 emotion, 170
Davies, S., 102, 104 affect and, 121–3
Davis, M., 263–4 affective trajectories and, 107–8
De Bruyn, L., 43 in ‘Am fernen Horizonte’, 61–4
Debussy, C., 89 anger, 117–23
de Muris, J., 387 in art, 302
depressive realism, 103, 106 behaviour and, 97–8
detail-oriented thinking, 103 brain and, 97–8
developmental synaesthesia, 308 cognition relating to, 103
Devienne, F., 392n1 cross-modal mapping relating to,
differentiated visualization, 390n2 60–1, 79–83
digital culture, 394n3 Cumming on, 99–102
directionality, 136 fear, 97, 117–21
disc-jockeys (DJs), 252, 253, 258, 324 Freud on, 97–8
audience relating to, 267 in film, 352–3
equalization by, 260–1 language and, 60
improvisation by, 266–7 metaphorical mapping of, 98
interview study with, 267–9 motivic recurrence relating to, 290
live performance by, 266–7 in Mozart’s Trio, 98, 108
mixing by, 260–1, 269 nostalgia, 107
record selection by, 267–8 Oatley on, 102–3, 107, 117
scratching, 269 patterning and, 339–40
perception relating to, 103 feeling, 357–8
performance and, 3, 109–16 empathy and, 368
pitch direction relating to, 72–3 environment and, 370–1, 373–4
popular music and, 254 experiences and, 363–4, 376
sadness, 102–8 expressive gestures and, 367–8
shape of, 96–123, 332–3 expressive shape, 368
shaping of, 104–6 for Johnson, 370–2
social behaviour and, 97–8 multisensory perception and, 372–8
in Sonata for Unaccompanied Violin present moment and, 364–8
No. 1 in G minor, 97–123 research on, 372–7
in ‘Die Stadt’, 62–4, 72–3, 76, 78–83 shape and, 359–79
in Symphony No. 40 in G minor, 98 for Stern, 364–70, 372
tenderness, 107, 117–21 suprasensory modalities and, 372, 374–5,
theory, 96–8, 102 377, 378
vectors and, 121–3 underlying principles on, 372–7
vehement states, 123 vitality affects and, 367–8
empathy, 368 vitality form and, 368–9, 378
Empirical Musicology Review, xxviii, xxx film
enculturation, 375 audience, 325, 326–7
energy, 57 choreography, 325–6
in ‘Die Stadt’, 73–5 cognition and, 254
Enescu, G., 243 composition in, 351–4
environment directors, 351
acoustic, 278–9 emotion in, 352–3
feeling and, 370–1, 373–4 music in, 351–4
organism-environment interactions, 370–1 music videos, 254
performance and, 3 perception and, 254
in recording studio, 263–5, 266 scores in, 351
space and, 278–9 shape on, 324–7, 351–4
synaesthesia relating to, 308 sound in, 325–6
equalization, 260–1 space in, 325
ethnographic studies, 338–9 texture and, 27
exchange, 334–8 first-person perspectives, 311–12
experiences Fischer, W., 104
cross-modal mapping and, 34–5 Fischer-Dieskau, D., xxvii, 75
dynamics of, 364–70 Fisher, P., 123
feeling and, 363–4, 376 flight, 123
narrative and, 363–4 flow, 395n2
of performers, xxviii–xxix fluidity, 3
of present moment, 364–8 force lines, 30–2
shape and, 5, 24 form, xxvi
visual, to synaesthesia, 308–11, 321–2 large-scale, 90–5, 102
expression, 217, 228–9, 238 tactile, 148, 150–4
shape in relation to, xxvii Forsythe, W., 331, 332
expressive freedoms, 242, 243–7, 393n1 forte indication, 77–8
expression-free performance, 395n1 Fortspinnung module, 118
expressive gestures, 367–8 Freud, S., 97
expressiveness, 115–16 Friedrich, R., 247
expressive shape, 368 Frith, C., 325
Frith, S., 255
Fabian, D., 115–16
Fast, S., 258, 262 Gallese, V., 375–6
fear, 97 Gander, J., 264
in Sonata for Unaccompanied Violin Garner, W. R., 35–6
No. 1 in G minor (Bach), 117–23 GERMS model, 217, 238
gestalt theory, 10, 364–5 Hooper, P. P., 42
gestural representations, 43–6 Horning, S., 272
gestures How to Improvise (Crook), 171
expressive, 367–8 Hubbard, E. M., 360
musical gestures, sound, and Hughes, T., 261
listening, 99–102 Hummel, J., 243
shape in relation to, xxvii Humphrey, D., 329, 331
Ghazanfar, A. A., 376 Huron, D., 103, 107, 122
Gjerdingen, R. O., 103 Husserl, E., 10, 24, 366
global visualizations, 390n2
goal-points, 22–3 icon, 139, 144–5
Godøy, R. I., xxxi, 44, 50, 51 iconic representation, 139–41
Goodrick, R., 175, 177, 193 ideational resources, 334
Granot, R. Y., 361–3, 362 IDS. See Infant-Directed Speech
graphic imagery, 338–9, 377–8
connection between sound and, imitation
144–5 cross-modal, 139, 140, 142, 154
design, 32 in zygonic theory, 136–7, 138
score, 395n3 imperfect zygonic relationship, 132, 133, 134
graphical representations, 7 improvisation
Greasley, A., 254–5 Bergonzi on, 175, 177, 179, 181
Greek theory, 123 body in, 168
Grieg, E., 89 cadenza, in musical space, 192–201
Griffiths, D., 167 chains of thought methodology in, 171–5
Gromko, J. E., 42 chromatic transposition dimension in, 184
guided improvisation, 177 by Coltrane, 171, 172–5, 182, 183, 189
guitar composition, performance, and, 170–1
chord symbols, 158, 159 conclusion on, 201–2
distortion guitar sound, 19 conventional jazz improvisation repertoire,
171, 175–81
Haga, E., 44–5 by Crook, 171, 175, 177, 193
Hair, H. I., 39 by DJs, 266–7
hallucinogenic synaesthesia, 308 exercises, 177
‘Happy Birthday’ song, 39–40 guided, 177
Harding, P., 272 improvised classical cadenza, 171,
harmonic cartography, 209–12 192–201
harmonic comet, 208 by Léonard, 193–201
harmonic rhythm, 211–15 limitation and variation of musical topics
harmony, 225–6 approach to, 175–81
Harvey, T., xxvii modelling and navigating space in, 182–92
Haydn, J., 98 M-Space and, 171, 184, 186–92
Sonata C Major, Hob. 50 by, 250 musical refractions, 176
Heine, H., 30, 59. See also ‘Am fernen phrase-based, 175, 176, 178–81
Horizonte’; ‘Die Stadt’ phrases in, 178–81, 184
Heinichen, J. D., 107 Pressing’s model for, 177–9, 191, 194
Hennion, A., 269 proximity and, 182–8
Herbert, G., 386 randomness in, 187–8
heuristics, xxviii, 221–2, 227–9, 231, research on, 170, 191–2
233–9, 259, 272, 369 rhythmic patterns and, 179, 183–4
Higginbottom, E., 386 shape of, 167–8, 170–1
historically informed performance (HIP) strategy comparison, 189–91
intricacy, 116 transformational dimensions in, 184,
movement, 243–4 186, 187
historical uses, of shape, xxvi–xxvii The Improvising Mind (Berkowitz), 170
Homer, 123 index, 139, 144–5
indigenous Canadian ethnic groups, 152 Kremer, G., 109, 113, 114, 115–16
Infant-Directed Speech (IDS), 117–18 Kubrick, S., 353
infant synaesthesia, 308, 311, 360 Kupka, F., 304
inspiration, 334 Küssner, M. B., 41, 42–4, 152
intensity, 363, 374
loudness and, 144 Laban Movement Analysis, 43, 45–6
maximum, 65–6, 67 Lakoff, G., 130, 371, 375–6
mean, 65–6, 67 Lamont, A., 254–5
in musical notation, 144 Lang, L., xxvi
in performance, 144 Langer, S., 102, 364
phrase intensity and forte indication, 77–8 language, 12, 390n3
pitch and, 75–6 cognition relating to, 152–3
in ‘Die Stadt’, 73–6, 78–9 cross-modal mapping and, 58–9, 60
interdisciplinary collaboration, 334–8 emotion and, 60
interperspective relationships live coding languages, 167
diagram of, 135 music software languages, 167
directionality of, 136 perception relating to, 152–3
in zygonic theory, 131–6 pitch and, 152–3, 360
interperspective values, 130–1, 139 Larcombe, G. K., 310
Interpretative Phenomenological Analysis large-scale form, 90–5
(IPA), 220 shaping of, 102
inversion, 293–4 large-scale formal connectivity, 292–8
IPA. See Interpretative Phenomenological Layden, T. B., 322
Analysis learning
isolated tones, 47–8 children with learning difficulties, 135
practice and, 165–6
James, L., 324 shape, 165–7
Japan, 41–2 Leech-Wilkinson, D., xxxi, 41, 42–3, 152
Jarre, J.-M., 129, 158–9, 160 on motherese and IDS, 117–18
jazz, 326 Leeds International Pianoforte
conventional jazz improvisation repertoire, Competition, 250
171, 175–81, 201 Légende (Enescu), 243
parameters in, 179 Leman, M., xxxi, 45, 50, 51
pedagogy of, 170, 175–81, 201 Léonard, H., 193–201
rhythmic patterns in, 179, 249 Levinson, J., 102, 104, 107
and synaesthesia, 257–8, 309 Ligeti, G., 25, 26, 89
Johnson, M., 370–2 light, 68–70, 78–9
Jordan, S., 335–6 limitation and variation of musical
Josquin, 300 topics approach, 175–81
Juslin, P., 102, 103, 238 Line Dance (2004) (Reuben), 325, 327
link schema, 130
Kanach, S., 191 Lipps, T., 395n2
Karwoski, T. F., 310 listener, 368
Katz, M., 252, 269 listening, 3
Kelly, M. E., 42, 48 active, 48–9
key, 89 musical gestures and, 100–2
key-postures, 5, 22–3 passive, 48–9
Kim, J. H., xxx, 368, 395n2 live coding languages, 167
kinaesthetic responses, 39–40 live performance
Kivy, P., 102 acoustics in, 258
Klavierschule (Türk), 249–50 audience in, 262
Klee, P., 303 body in, 262
Knowing the Score (Bilson), 248, 393n1 by DJs, 266–7
Kohn, D., 43 microphones in, 261
Kozak, M., 44 performer’s role in, 255–62
live performance (cont.) shape, 4, 7–8, 168, 259, 389n3
Prior on, 255–6 in ‘Die Stadt’, 63–4
recorded performance versus, 48 The Metaphysics of Notation
shape in, 255–62 (Applebaum), 394n1
shaping in, 271 descending shields, 286
loudness, 144 handbook for, 283–4
‘Love bade me welcome’ (Herbert), 386 interpretation of notation in, 298–301
Luca, S., 109, 110, 112–16 inversion in, 293–4
lullabies, 117–18, 123 large-scale formal connectivity in, 292–8
Lussana, F., 308 musical notation and score in, 283–4,
298–301
Macero, T., 263–4 panel 3, 292–4, 295
macro timescale, 14, 25 panel 4, 284–7, 290–8
Maes, P.-J., 45–6 panel 5, 286, 288, 289, 290–8
Magnusson, T., 167 panels 6 and 7, 292–3, 297
Mahler, G., 98, 286 panels 9 and 10, 298, 299
Mahnkopf, C-S., xxvi premise for, 284
Malloch, S., 262 rectilinear forms, 289
mandatory cross-modal mapping, 49 on rhetorical development, 284–92
Marchand, P., 270–1 sinusoidal curve, 285, 286, 287
Margulis, E., 107 slope in, 290
Marks, L. E., 363, 374–5 stacked arrangement in, 292–3
Martino, G., 375 vertical links in, 294, 297–8
Marwood, A., xxvi Metheny, P., 189
Massumi, B., 121–2 Meyer, L., 98
maximum intensity, 65–6, 67 microphones, 261
McGregor, W., 330, 332, 334, 336–9, micro timescale features, 14
395n6, 395n8 MIDI. See Musical Instrument Digital
Mind and Movement and, 345–7 Interface
McLachlan, N., 373 Mills, C. B., 310
McLachlan, S., 270–1 MIR. See Music Information Retrieval
McLean, A., 167 mirror neurons, 375–6, 395n2
meaning mixing, by DJs, 260–1, 269
in dance, 329, 332–3, 338–45 mixing metaphor, 278–9
shape in, 332–3, 340–1 Moby, 266
The Meaning of the Body (Johnson), 370 modern art, 302
mean intensity, 65–6, 67 modern music, 90–5
melodic contour, 226 Molnar-Szakacs, I., 376
melodic rhythm, 212–15 Monelle, R., 99
melodic shaping, 100–2 Montano, E., 268–9
memory, 166–7, 338–41, 373 Montgomery, W., 189
Mendelssohn, F., 107 Moorefield, V., 265
Menominee Indian reservation, 383–4 Moran, N., 41–2
mental architecture, 334–5, 339–45 morphodynamical theory, 10
meso timescale, 5 morphology, 303
features, 14–15 morphology of sonic objects, 16–17
motion-sound chunks and, 23–4 morphology space, 10–11, 16
sonic features at, 15–16 Morris, M., 335
metaphorical mapping, 98 mother-child dialogue, 117–18
metaphors, 60 motherese, 117–18. See also
in ‘Am fernen Horizonte’, 63–4 Infant-Directed Speech
city, 97–8 motion, 72
mixing, 278–9 motiongrams, 25
in performance, 217 motion-sound chunks, 23–4
pitch, 152 motion-sound scripts, 24–6
motivic recurrence, 290, 297 musicality, 338
motor control, 10, 22 musical notation, 4, 127–8
motormimetic sketching, 12–13 blind children’s tactile representation of
motor systems, 375–6 musical sound, 150–4
motor theory, 5 braille music notation, 156, 157
cognition and, 12–13 in children’s picture scores, 149–50, 151
embodied cognition and, 12 for composer, 138
perception and, 12–13 cross-modal mapping in, 140–8
shape and, 12–13 in dance, 329, 331–3
movement, xxvii film scores, 351
movement level, 231–3 graphic score, 395n3
movement notation systems, 329 guitar chord symbols, 158, 159
Moving Music (Jordan), 335 intensity in, 144
Moyse, M., 384 Knowing the Score on, 248
Mozart, W. A., 192, 250 in The Metaphysics of Notation, 283–4,
Symphony No. 40 in G minor by, 98 298–301
Trio by, 98, 99, 108 movement notation systems, 329
M-Space. See multidimensional music theory and, 329
musical space nonstandard, 394n5
Mudd, S., 141–2, 152 for performance, 138, 171
multidimensional musical space (M-Space), rhythmic conventions and, 249–50, 393n2
171, 184, 186, 187–92 scores, in choreography, 336–7
multimodality, 8, 373–5 scores produced through
multimodal phenomenon, 33–5 synaesthesia, 158–9
multimodal synthesis, 340–1 shape in, using zygonic theory, 138–48
multisensory perception, 372–8 shape of, 387
multitrack recordings, 264 sound–shape relationships, in scores,
muscle memory, 166–7 149–59
music. See also popular music staff notation, 154–6
art and, 302–3 stave notation, 143
in choreography, 335–6 view of score, 225
classical, 252, 259 western notation, 5–7, 9, 58–9, 154–6
colour in, 308–9, 310 words on score, 225
dance intersecting with, 332–3 in zygonic theory, 129–48
drawings and visual representations of musical profile, 284
sound and, 37–43 musical refractions, 176
in film, 351–4 musical semiotics, 99
gestural representations of sound and, 43–6 musical space. See space
as lifelike, 363, 367 musical structure, 225
modern, 90–5 musical training, 39–43, 361, 390n3
shape and, xxv–xxxii musicianship
sound versus, 46–7 dance and, 384
musical excerpts, 48 muscle memory in, 166–7
musical expression translation and, 207–8
essential components of, 217 Music Information Retrieval (MIR), 7
GERMS model of, 217, 238 music producers, 252
heuristics for, 228–9, 238 influences on, 266
musical genres, 254–5, 256 performer as, 266
musical gesture, 99–102 of popular music, 262–6, 271, 272
musical imagery, 338–9 role of, 263–6
musical instrument constraints, 20–1 shaping by, 271
Musical Instrument Digital Interface (MIDI), music production, 11, 265–6
389n1, 395n1 music software languages, 167
popular music and, 264–5 music theory, 329
pure tones versus, 47 music videos, 254
musique concrète, 15, 40 Peircean semiotics, 99–102
Muybridge, E., 326, 328 Peircean tripartite classification, 148, 149
Peircean typology, 139, 144–6, 148,
Näätänen, R., 373, 375 149, 161
Nancarrow, C., 290, 292 perception
narrative, 326–7 in ‘Am fernen Horizonte’, 61–4
in ‘Am fernen Horizonte’, 61–4, 390nn1–2 cross-domain mapping and, 373–4
composition and, 127–8 culture and, 152–3
experiences and, 363–4 emotion relating to, 103
performance and, 127–8 film and, 254
in ‘Die Stadt’, 62–8 language relating to, 152–3
story and, 383 motor theory and, 12–13
structure of, 363–4 multisensory, 372–8
Nattiez, J., 99 pitch and, 361
nature of task, 49 senses and, 327
Newsreel (Reuben), 327 in ‘Die Stadt’, 62–4
Niedt, F. E., 107 perceptual features, 6
No Blues (Montgomery), 189 performance. See also live performance
noise, 341 composition, improvisation, and, 170–1
nonsense words, 360 composition and, 127–8
nonstandard musical notation, 394n5 cross-modal mapping multiplicity
normalized phrase duration, 76–8 and, 81–2
nostalgia, 107 emotion and, 3, 109–16
Notations (Cage), 331 environment and, 3
notes, 236–7, 248 expression-free, 395n1
Nunn, J. A., 307 expressiveness in, 115–16
Nussbaum, C., 102, 104 harmonic cartography and, 209–12
Nussbaumer brothers, 308 harmonic comet and, 208
Nymoen, K., 44 harmony and, 225–6
heuristics, for musical expression, 228–9, 238
Oatley, K., 97–8, 102–3, 107, 117 HIP movement, 116, 243–4
Odbert, H. S., 310 intensity in, 144
One Flat Thing (Forsythe), 331, 332 Knowing the Score on, 248
orchestra performance, 243 metaphors in, 217
orderliness, in zygonic terms, 137 musical notation for, 138, 171
organism-environment interactions, 370–1 narrative and, 127–8
organization, in dance, 332 orchestra, 243
overtone singing, 314, 316 preparation, 216–17, 238
Overy, K., 376 recorded, 48
Oxygène (Jarre), 129, 158–9, 160 rhythmic patterns and, 226
of ‘Die Stadt’, 73–8, 390n4, 391nn5–6
painting. See also art technology and, 252
Dream Garden, Series II (Hwang), 304–5 tempo fluctuation and, 250, 393n2
shape in, 302–5 tempo rubato and, 250, 393n2
Papua New Guinea, 41–2, 152–3 trumpet, 244–7, 393n1
parameters, 179 performance shapes, 250
Parker, C., 171, 326, 383 affect and, 109–16
participation, 384 interview aim and method, 217–21
passive listening, 48–9 interview data analysis, 220–1
pathos, 104 interview discussion, 237–9
patterning, 339–40 interviewer, 219
pattern representations, 167 interview participants, 218–19
patterns, 226. See also rhythmic patterns interview procedure, 219–20
pedagogy, 170, 175–81 interview results and proposed
Peirce, C., 129 model, 221–37
model of shape in, 221–2 mean weighted, 65–6
multiple levels of musical shaping in, 222–4 metaphors, 152
performer’s role in, 255–66 perception and, 361
research on, 216–17, 237–9 pitch direction, 360–1
shaping at concert level, 230–1 distance and, 70–2
shaping at level of whole piece, 231 emotion relating to, 72–3
shaping at movement level, 231–3 in zygonic theory, 141–2, 143
shaping at note level, 236–7 pitch-related features, 16
shaping at phrasing level, 234–6 Plaistow, S., xvi, xxvii
shaping at section level, 233 Podger, R., xxvii
situational factors, 230 Pollock, J., 325–6
for Sonata for Unaccompanied Violin polyphony, 226
No. 1 in G minor (Bach), 109–16 popular music
sound changes and, 229–30 classical music compared to, 252, 259
technical modifications and, 227–8 contributors to, 255, 270
technology and, 259–61 dance and, 254
triggers for musical shaping, 224–7 defining, 253–5
performer DJs’ perspective on shape of, 266–9, 271–2
experiences of, xxviii–xxix emotion and, 254
listener and, 368 generalizations about, 254–5
as music producer, 266 layers of shaping in, 270–3
of popular music, 255–66 MIDI and, 264–5
position of, 142–4 multitrack recordings, 264
role of, in live performance, 255–62 musical genres, 254–5, 256
role of, in performance shapes, 255–66 music producers of, 262–6, 271, 272
role of, in recording studio, 262–6 music videos, 254
shaping by, 271 performer of, 255–66
Perlman, I., 109, 111, 112–16 recording engineers in, 264
Perry, L., 327 in recording studio, 255, 262–6
perspective scope of term, 253–5
of DJs, on shape, 266–9, 271–2 shape of, 252–73
first-person, of synaesthesia, 311–12 Shuker on, 253, 255
value, 130, 139 summary of shape in, 269–73
phantom sounds, 357–8 Tagg on, 253
phase-transitions, 15–16, 21 technology and, 252, 259–61, 270
phenomenology, 364, 366, 370 Poschardt, U., 268
phrase Powell, E. R., 42
-based improvisation, 175, 176, 178–81 practice, 165–6
in dance, 329, 331 Pratt, C., 152
in improvisation, 178–81, 184 present moment, 364–8
intensity, 77–8 Pressing, J., 177–9, 191, 194
markings, 226–7 primary zygonic relationship, 131–2, 133, 134
normalized phrase duration, of Prince, J. B., 361
‘Die Stadt’, 76–8 Prior, H. M., 254–6, 261, 268, 360–1, 377–8
phrasing, 234–6, 243–4 Pritchard, D., 246
Piano Concerto No. 1 (Beethoven), 23 projection, 325
pianola representation, 6 Prokofiev, S., 248, 249, 250
Piano Sonata in F minor Op. 2 No. 1 proximity, 182–8
(Beethoven), 248 punctuation, 128
pitch pure tones, 47
culture and, 152–3
height, 361, 373–4 Quasthoff, T., 75
intensity and, 75–6 Queen, 258
language and, 152–3, 360 Que Pasa (2001) (Reuben), 324, 327
maps, 361 Quinn, M., xxvi
Rainer, Y., 329, 331 nostalgia and, 107
Ramachandran, V. S., 360 shape of, 107–8
Ramone, P., 265 tenderness and, 107
randomness, 187–8 sampling reverbs, 278
Ránki, D., 250 SCALARITY schema, 371
Raup’s cube, 182 scale
Ravel, M., 89 large-scale form, 90–5, 102
reaction-time paradigms, 35–6 shape and, 89–95
recorded performance, 48 scenography, 328–9
recording engineers, 264, 271 Schaeffer, P., 7–8, 11, 40–1, 191
recording studio schemas, 130, 371
acoustics in, 263 schematization, 30, 31
changes in, 265 Schillinger, J., 192
environment in, 263–5, 266 Schillinger System of Musical
home studio, 266 Composition, 284
performer’s role in, 262–6 Schmuckler, M. A., 361
popular music in, 255, 262–6 Schoenberg, A., 300
shape in, 262–6 Schroeder, C. E., 376
shaping in, 271 Schubert, E., 115–16
studio as instrument, 265 Schubert, F., 59, 391n6. See also ‘Die Stadt’
technology in, 264–6, 327 Schumann, R., 30
records, 268 Schurig, W., xxvi
record selection, 267–8 scores, 4, 127–8. See also musical notation
rectilinear forms, 289 children’s picture scores, 149–50, 151
reflection, 104 in choreography, 336–7
repetition, 89 in film, 351
retrograde, 300 graphic, 395n3
reverb programmes, 278 in The Metaphysics of Notation
Reynolds, R., 284 (Applebaum), 283–4
rhetorical development, 284–92 produced through synaesthesia, 158–9
rhythm, 325–6 sound-shape relationships in, 149–59
rhythmic conventions, 249–50, 393n2 view of, 225
rhythmic motives, 250 words on, 225
rhythmic patterns in zygonic theory, 138–48
harmonic versus melodic rhythm, 211–15 scratching, 269
improvisation and, 179, 183–4 Scripp, L., 38, 39
in jazz, 179, 249 secondary relationships, 131, 134, 136
performance and, 226 Second Brandenburg Concerto (Bach), 244
Risset, J., 8 section level, 233
ritornello scheme, 104, 106, 118, 121 sense modalities, 8
Roberts, J., 158–9, 160 senses
Robertson, A., xxvii in dance, 338–9
Robinson, J., 102 multisensory perception, 372–8
Roffler, S., 152 perception and, 327
Rosch, E., 11 sensorimotor system, 375–7
Roseingrave, T., 392n1 sensory feature traces, 373, 375
Routes (2008), 326, 327 sequence, 300
Russell, J., 117, 167 Serato, 268
The Sermon (Smith), 188
Sachs, G., 308 Shakespeare, W., 123
Sadie, S., xxvii shape. See also performance shapes
sadness acoustic practice and integrating shapes
acoustic features of, 103 in body, 165–7
in Sonata for Unaccompanied Violin automatic and learned responses to, 359–64
No. 1 in G minor (Bach), 102–6, 107 body and, 5, 30–1
  body-motion features and, 18–20
  in choreography, 332
  cognition, 9–12, 22
  colour and, 307
  composition and, 57
  constraint-based, 20–3
  in dance, 328–49
  defining, 4, 248–50, 323
  differentiation and ambiguity of, 328–32
  DJs’ perspective on, 266–9, 271–2
  in Dream Garden, Series II (Hwang), 304–5
  embodiment of, 370–2
  of emotion, 96–123, 332–3
  experiences and, 5, 24
  expression in relation to, xxvii
  expressive, 368
  feeling and, 359–79
  on film, 324–7, 351–4
  fluidity of, 3
  force lines of, 30–2
  as form, xxvi
  gesture in relation to, xxvii
  historical uses of, xxvi–xxvii
  of improvisation, 167–8, 170–1
  inside and outside, 170
  learning, 165–7
  in live performance, 255–62
  in meaning, 332–3, 340–1
  metaphors, 4, 7–8, 168, 259, 389n3
  morphology and, 303
  motion-sound chunks relating to, 23–4
  motion-sound scripts relating to, 24–6
  motor theory and, 12–13
  movement in relation to, xxvii
  multisensory perception and, 372–8
  of musical notation, 387
  music and, xxv–xxxii
  music theory on, 329
  ontologies, 8–9
  in painting, 302–5
  of popular music, 252–73
  in Que Pasa (Reuben), 324, 327
  in recording studio, 262–6
  of records, 268
  representations, 5–8
  of sadness, 107–8
  scale and, 89–95
  shaping and, 104–8
  in Sonata for Unaccompanied Violin No. 1 in G minor (Bach), 97–123
  sonic, 5
  of sonic features, 15–17
  sound and, 4–5, 144–5, 321–2, 357–9
  spaces relating to, 278–9, 361–2
  Stern on, 369–70
  as structure, xxvi, 383
  suprasensory modalities and, 372, 374–5, 377, 378
  in synaesthesia, 306–17
  synonyms for, 360, 361
  thinking shapes, in music, 26–7
  timescales and, 5, 13–15
  tone, 309–10
  ubiquity of, 328–32
  unnatural, 278–9
  in zygonic theory, 138–48
shapelessness, 323
shape mapping, 9. See also cross-domain mapping; cross-modal mapping
shape-reflecting representations, 7
shaping
  of affect, 104–8
  at concert level, 230–1
  Cumming on, 99–102
  by DJs, 271
  dynamic, 75–6
  of emotion, 104–6
  of large-scale form, 102
  layers, in popular music, 270–3
  at level of whole piece, 231
  in live performance, 271
  melodic, 100–2
  model of, 221–2, 222
  at movement level, 231–3
  multiple levels of, 222–4
  by music producers, 271
  at note level, 236–7
  by performer, 271
  at phrasing level, 234–6
  in recording studio, 271
  at section level, 233
  shape and, 104–8
  of Sonata for Unaccompanied Violin No. 1 in G minor (Bach), 104–8
  temporal, 76–8
  triggers for, 224–7
shells, 182
Shocklee, H., 327
Shuker, R., 253, 255
sight, 30
signal-based representations, 7
sillon fermé, 15
singing, 314, 316
Sinha, J., 312, 317
sinusoidal curve, 285, 286, 287
skeletonization, 286
Skyspace (Pritchard), 246
Sloboda, J. A., 254–5, 333
slope, 290
slur, 248
Smith, J., 188
social behavior, 97–8
Söffing, C., 311–12
software, 260, 266–9, 378, 394n3
solfège handsigns, 166
solo violin music, by Bach, 208–15
Sonata C Major, Hob. 50 (Haydn), 250
Sonata for Unaccompanied Violin in C major (Bach, J. S.), 209–12
Sonata for Unaccompanied Violin No. 1 in G minor (Bach, J. S.)
  Adagio in, 99–116, 122–3
  affect and shape in, 97–123
  affective trajectories in, 107–8
  anger in, 117–23
  Cumming’s analysis of, 99–104, 109
  cycle of, 116–23
  emotion in, 97–123
  fear in, 117–23
  fifth cycle, 105–8
  Fortspinnung module in, 118
  Fuga in, 106, 116–19, 122–3
  Kremer’s performance of, 109, 113, 114, 115–16
  Luca’s performance of, 109, 110, 112–16
  musical gestures in, 99–102
  performance shapes for, 109–16
  Perlman’s performance of, 109, 111, 112–16
  Presto in, 106, 116–17, 120–3
  ritornello scheme in, 104, 106, 118, 121
  sadness, in Adagio, 102–6, 107
  shaping of, 104–8
  Siciliana in, 106, 116–17, 119–23
  tenderness in, 117–21
  vectors in, 121–3
sonic features
  body-motion features and, 5, 15
  dynamic shapes, 15–16
  main dynamic shapes of, 15–16
  at meso timescale, 15–16
  shape of, 15–17
  timescales and, 14
The Sonic Self (Cumming), 99–102
sonic shapes, 5
sound
  blind children’s tactile representation of musical sound, 150–4
  connection between graphic and, 144–5
  drawings and visual representation of music and, 37–43
  in film, 325–6
  gestural representations of music and, 43–6
  as multimodal phenomenon, 33–5
  musical gestures and, 99–102
  music versus, 46–7
  performance shape and changes in, 229–30
  phantom, 357–8
  shape and, 4–5, 144–5, 321–2, 357–9
  texture and, 327
  in trumpet performance, 242–3
  vision and, 310–11
  visual of, 321–2, 383
sound-accompanying body motion, 5, 18
sound-producing body motion, 5, 18–19
soundscapes, 302–4
sound-shape relationships, 149–59
‘Sounds of Intent’ project, 135
sound-tracings, 19
  musical training on, 41
space
  acoustics and, 278–9
  cadenza in, 192–201
  city skylines, 284
  concert halls, 278–9
  control, 10–11
  in dance, 333
  environment and, 278–9
  in film, 325
  modelling and navigating, in improvisation, 182–92
  morphology, 10–11, 16
  M-Space and, 171, 184, 186, 187–92
  shape relating to, 278–9, 361–2, 362
  synaesthesia relating to, 308
  time and, 332
  2D, 167, 325
  unnatural, 279
Spector, P., 327
spectral structure, 69–70
speed, 361–2
  time and, 373
speeded responses, 36
spontaneity, 104, 171
spontaneous cross-modal mapping, 49
‘Die Stadt’ (Schubert), 59, 61
  average rhythmic durations, 65–6
  brightness in, 68–9
  contrasting and parallel dimensions in, 63
  cross-domain mapping in, 68–79
  cross-modal mapping in, 72–3, 79–83
  distance in, 70–2
  dynamic shaping in, 75–6
  emotion in, 62–4, 72–3, 76, 78–83
  energy in, 73–5
  forte indication in, 77–8
  intensity in, 73–6, 78–9
  light in, 68–70, 78–9
  mean intensity and maximum intensity, 65–6, 67
  mean weighted pitch, 65–6
  metaphors in, 63–4
  motion in, 72
  narrative in, 62–8
  normalized phrase duration of, 76–8
  perception in, 62–4
  performance comparison summary, 78–9
  performances of, 73–8, 390n4, 391nn5–6
  phrase intensity and forte indication in, 77–8
  pitch and intensity in, 75–6
  Schubert’s reframing of Heine for, 64–8
  spectral structure of, 69–70
  standard deviation, 65–6, 68
  structure in, 68–73
  tempo in, 74–5
  temporal shaping in, 76–8
staff notation, 154–6
standard deviation, 65–6, 68
Stanford, T. R., 376
stave notation, 143
Stein, B. E., 376
Stern, D., 359, 395n3
  on experience, 364–70
  feeling for, 364–70, 372
  on present moment, 364–8
  on shape, 369–70
  on time, 366–7
  vitality affects and, 367–8
  on vitality form, 368–9, 378
Sternklang (Stockhausen), 142–5
Stevens, S. S., 140–2
stimulus-response compatibility, 36
Stockhausen, K., 142–5
story, 383
structure
  musical, 225
  shape as, xxvi, 383
  spectral, 69–70
  in ‘Die Stadt’, 68–73
Stuart, M., 331
Stumpf, C., 50
suprasensory modalities, 372, 374–5, 377, 378
Suzuki method, 39, 390n3
Swan Lake Suite (Tchaikovsky), 246
swell-paper, 148
symbols, 139, 145–6
  guitar chord, 158, 159
symmetry, 89, 383
Symphony No. 40 in G minor (Mozart), 98
synaesthesia
  brain and, 307
  colour in, 307, 308–9, 310, 359
  common examples of, 306
  core characteristics of, 307
  cross-modal relationship in, 147–8
  definition of, 306–8
  developmental, 308
  discussion on, 317
  drawings of, 310
  dual components of, 306
  environment relating to, 308
  first-person perspectives of, 311–12
  first recorded case of, 308
  hallucinogenic, 308
  in infants, 308, 311, 360
  interview with synaesthete, 313–15, 316, 317
  in overtone singing, 314, 316
  scores produced through, 158–9
  shape in, 306–17
  space relating to, 308
  summary of, 317
  texture in, 307
  tone shapes in, 309–10
  types of, 306
  visual experiences to, 308–11, 321–2
  Ward on, 306, 307, 311, 313–15
  in zygonic theory, 146–8
synchronous cross-modal mapping, 49–50
synonyms, for shape, 360, 361
synthesis, 102

tactile form
  blind children’s tactile representation of musical sound, 150–4
  for blind people, 148
Tagg, P., 253
Tan, S., 254
Tan, S.-L., 42, 48
Tchaikovsky, P., 246
technical modifications, 227–8
technology
  digital culture, 394n3
  for DJs, 266–9
  internet and, 394n3
  performance and, 252
  performance shapes and, 259–61
  popular music and, 252, 259–61, 270
  in recording studio, 264–6, 327
  software, 260, 266–9, 378, 394n3
tempo, 74–5
tempo fluctuation, 250, 393n2
temporal density, 71
temporal shaping, 76–8
tempo rubato, 250, 393n2
tenderness
  sadness and, 107
  in Sonata for Unaccompanied Violin No. 1 in G minor, 117–21
Ten Pieces for Wind Quintet (Ligeti), 25
tertiary relationships, 131, 140–1
tertiary zygonic level, 136, 137
texture, 304
  film and, 27
  sound and, 327
  in synaesthesia, 307
Théberge, P., 252, 265, 394n3
thematic material, 89
There Is No Sound in My Head (Arnold), 394n1
thermoforming, 148
thinking
  detail-oriented, 103
  shapes, in music, 26–7
Thinking in Jazz (Berliner), 170
Thom, R., 10
Thompson, W. F., 361
Thorpe, M., 141–2, 152
timbral features, 6, 7
timbre, 71, 309
time
  present moment, 364–8
  space and, 332
  speed and, 373
  Stern on, 366–7
timescales
  body-motion features and, 14
  macro, 14, 25
  meso, 5, 14–16, 23–4
  micro, 14
  shape, 5, 13–15
  sonic features and, 14
Timmers, R., 363, 373–4
tonality
  components of, 302–3
  defining, 302
  in soundscapes, 302–4
tone
  concurrently varied, 47–8
  isolated, 47–8
  pure, 47
  shapes, 309–10
traditional experimental paradigms, 35–6
tragic catharsis, 123
Traktor, 268
transformational dimensions, 184, 186, 187
translation, 207–8
Treatise Handbook (Cardew), 300–1
Trevarthen, C., 117
Trio (Mozart), 98, 99, 108
trumpet performance
  character in, 242–3
  expressive freedoms in, 242, 243–7, 393n1
  orchestra performance, 243
  solo trumpeter, 243–7
  sound in, 242–3
Tudor, 335
Türk, D. G., 193, 194, 249–50
two-dimensional (2D) space, 167, 325

Unités Sémiotiques Temporelles (UST), 7–8
unnatural shape, 278–9
unnatural spaces, 279
Unquity Road (Metheny), 189
unspeeded responses, 36
upbeat, 248
Upitis, R., 37–9
UST. See Unités Sémiotiques Temporelles

valence, 60, 74
vectors, 121–3
vehement states, 123
verticality, 362
vertical links, 294, 297–8
Violin Concerto Op. 3 No. 6 (Vivaldi), 104–6
vision, 310–11
visual, of sound, 321–2, 383
visual experiences, 308–11, 321–2
visualization, 390n2
visual representation
  adult drawings and, 40–3
  children’s drawings and, 37–40
  cross-cultural comparisons of, 41–2
  of sound and music, 37–43
  in zygonic theory, 134–7, 144
visual turntablism, 254
vitality affects, 367–8
vitality form, 368–9, 378
Vivaldi, A., 104–6
vocal constraints, 20–1

Wagner, R., 89–90
Walker, P., 360, 361
Walker, R., 150, 152, 153
Waller, F., 192
Walsh, V., 375
Ward, J., 306, 307, 311, 313–15
Warner, T., 252
Welch, G., 150, 152, 153
Welsh, P., 39
western notation, 5–7, 9, 58–9, 154–6
whole compositions, 48
Whorfian hypothesis, 152
Wilbraham, J., 242
Wilson, S., 373
Winkler, I., 373, 375
Wishart, T., 191
words, on score, 225
Wozzeck (Berg), 90–5
Xenakis, I., 7, 13, 191

Zagorski-Thomas, S., 255
Zeki, S., 325
Zigler, M. J., 309, 317
zygonic theory
  in art, 134–7
  braille music notation and, 156
  cross-modal imitation in, 139, 140, 142, 154
  cross-modal mapping in, 140–8
  iconic representation in, 139–41
  imitation in, 136–7, 138
  imperfect zygonic relationship, 132, 133, 134
  interperspective relationships in, 131–6
  interperspective values in, 130–1, 139
  link schema in, 130
  Mudd on, 141–2
  musical notation in, 129–48
  orderliness, in zygonic terms, 137
  overview of, 129–34
  Peircean tripartite classification in, 148, 149
  Peircean typology in, 139, 144–6, 148, 149, 161
  perspective value in, 130, 139
  pitch direction in, 141–2, 143
  primary zygonic relationship in, 131–2, 133, 134
  scores in, 138–48
  secondary relationships in, 131, 134, 136
  shape in, 129–48
  Stevens on, 140–2
  synaesthesia in, 146–8
  tertiary relationships in, 131, 140–1
  tertiary zygonic level, 136, 137
  Thorpe on, 141–2
  visual representation in, 134–7, 144–9
  zygonic relationships, 131–4

Published in the United States of America by Oxford University Press

© Oxford University Press 2017
